Recently it was discovered that a researcher in my specific area, social niche construction in animals, likely committed egregious errors, if not outright fraud, throughout his career. While I don't consider myself a behavioral ecologist, my research often dovetails with this group. I was therefore shocked that other researchers were using the case to lambast the study of individual differences in animal behavior as a whole, comparing the actions of a single researcher to the replication crisis that recently plagued the entire field of psychology.
At the core of this controversy is a common dynamic within most collaborations: a single individual has privileged access to both the organisms and the data collection process, and then shares the data with other collaborators. For me, getting an Excel file and a short description of how the data were collected isn't sufficient to truly assess the quality of the data; I need to know about the context of the experimental design, the data production, and the behavior of the organisms themselves. A lot of my work is observational, looking at patterns of social interactions in semi-naturalistic settings and measuring how those patterns change over development or in response to group changes. Such studies require the skill (dare I say the art) of keen observation, a skill that can only be cultivated through months (if not years) of rigorous training and reliability checks with oneself and others.
All data is a limited and indirect representation of the system you are observing. No data is ever “raw”. Each dataset is collected under a particular set of circumstances and is always subject to the limitations of human judgement and design. It takes somebody with broad expertise in an animal system to design around and control for such human influences so that they interfere with the data as little as possible. Even tools that attempt to bypass such human fallibilities, such as deep learning, are themselves deeply shaped by the structure of information in the world, which is often a reflection of human effort and judgement. For better or worse, human judgment is going to be both the producer and the consumer of data, as no outside force will hand us pure truth in spreadsheet form. So, what can we do when such human judgment fails us? How can we prevent such failures from recurring?
One proposed solution is to keep sharing data, to make data as open and transparent as possible. While I support such efforts, I feel they are barely sufficient to protect the integrity of data from those with strong careerist motivations. Already stretched thin, researchers don't have the time to constantly police others' work, and I worry that the vagueness of “open science” will lead to more segregated classes of “data collectors” and “data analyzers” (cue the “behavior-omics”), with the latter getting the credit for making discoveries. Furthermore, shifting the responsibility for data integrity onto those with the time does not automatically free us of fallible human biases and nefarious motives.
I am not under any illusions that I have a solution to this issue, but one thing that has consistently troubled me is the willingness of researchers to use data when they have little understanding of how the data were produced in relation to the animals being investigated. I use the word “produced” here intentionally: human labor produces data that we expect to approximate real observations of real organisms. Thus, how “good” or “bad” a dataset is reflects the relationship between the data production process and the natural history of the organisms being studied. During analysis, many potential confounds, misinterpretations, and limitations of the data are clearly visible to those intimately familiar with the organismal system but completely lost on those unfamiliar with it. The idea that researchers need some grounded experience with the data production and the organisms they are investigating often conflicts with the solutions proposed by strong advocates of open data, who believe that transparency alone may be enough. However, it's my opinion that researchers must have expertise with the organisms they are investigating to collectively assess when the data is so out of character as to suggest either potential errors or exciting discoveries.
This brings me back to the opportunistic use of one misguided researcher to discredit the whole concept of social niche construction. No, the field itself is not in crisis. Many in the field (myself included) were suspicious of the data and the near-supernatural productivity of the person in question. But yes, those in animal behavior need to be more cautious about how they approach behavioral data. Frankly put, those who analyze data from other labs on animal systems unfamiliar to them should at least gain extensive experience with those animals, and at best have a role in collecting the data themselves (if possible, running the experiments or observations in their own lab before analyzing shared data). When fraudulent players push their data beyond what the animals are capable of doing, those with deep expertise in those animals are best suited to catch such intentional errors. Keep your eyes on the animals, as they are ultimately the ground truth against which the accuracy of data will be assessed.