The need for transparency in science stems from the fact that most societies make the majority of their decisions based on evidence derived through the guidance of science. If scientific evidence is at the heart of decision making, then the collection of this evidence must be transparent both to those who make the resulting policies and to those who challenge them in a democracy. Moreover, as publicly funded scientists, we should be setting a community standard of transparency for the rest of society to follow. Scientific publishing is currently in transition to address this wicked problem, and it is vital that the scientific community leads the way forwards, and that we are not led by for-profit publishers. One way to achieve transparency is to preregister your research project, thereby avoiding confirmation bias, which is, in part, a product of commercial publishers and the metrics they promote. To make this effective, we need the gatekeepers of our journals to support the preregistration of research hypotheses and methods. Right now, journals should be openly advocating and encouraging preregistration, with a plan to transition in future to a system embracing rigour, reproducibility, and transparency (RRT; Valdez et al., 2020). However, many editors are resisting this move because they feel that there is no support from the community. This may well be the case, but inequalities in science, and particularly in publishing, mean that editors can either be instruments of change or sit at the heart of inequality in publishing (see Part IV and Figure 1.1). Either our editors will lead us toward transparency, or we as a community will simply need to demand that they change their practices. Currently, editors are responding to calls for transparency with small steps (for example, asking for open coding: Powers & Hampton, 2019), rather than adopting transparency wholesale through the badge system set up by Kidwell et al. (2016).
Confirmation bias is the increasingly common phenomenon in science whereby most published studies accept the alternative hypothesis, even though this is the least likely outcome of any experiment. Confirmation bias happens in publishing because editors prefer to accept papers that report a positive outcome. It has been suggested that this leads to a culture of ‘bad science’, and even fraud. One convincing piece of evidence for confirmation bias is the decline of null results in the literature over time (Fanelli, 2012).
At the outset of our scientific research we pose a hypothesis with the expectation that we will be able to reject, or fail to reject, our null hypothesis. We often think of rejecting the null hypothesis as the only result we are interested in, but if we only ever reported such results we would not be acting responsibly in moving our field forwards. That is, in a world where only significant results (i.e. those that reject the null hypothesis) are reported, we would necessarily keep repeating experiments in which the null hypothesis is retained, because the literature would never contain the evidence that the hypothesis had already been tested. This is exactly what is practised by the majority of scientific journals, which won’t consider a null result, and it produces ‘publication bias’. It’s easy to see why this is a bad policy, but it is the prevailing culture in science publishing.
If journals only publish manuscripts that reject the null hypothesis (cf. Franco, Malhotra & Simonovits, 2014), researchers are more likely to mine their data for positive results (p-hacking), or to rewrite their hypotheses in order to reject the null (HARKing) (Measey, 2021). Deceptive practices such as p-hacking, HARKing and salami-slicing are not in the interests of any journal, or of the scientific project in general (Ioannidis, 2005; Nissen et al., 2016; Forstmeier, Wagenmakers & Parker, 2017; Measey, 2021).
But positive results don’t only come from deliberate manipulation (see Part IV). As humans we are predisposed towards positive results (Trivers, 2011), and there are plenty of reasons why researchers might inadvertently reach a false positive outcome. Forstmeier et al. (2017) draw attention to cryptic multiple testing during stepwise model simplification, and to the two types of researcher degrees of freedom (sensu Simmons, Nelson & Simonsohn, 2011): stopping rules and flexibility in analysis.
Cryptic multiple testing during stepwise model simplification arises because each predictor added to a model inflates the total number of models tested, making it necessary to adjust alpha accordingly (as for any repeated tests). However, Forstmeier and Schielzeth (2017) report that, even with Bonferroni-adjusted alpha levels, models fitted to random data contained a significant effect around 70% of the time. The only way to keep this under control is to use sample sizes large enough to maintain the power to distinguish true positives from false positives. A handy rule of thumb from Field (2013) is that sample size needs to be eight times the number of model predictors plus 50. Better still would be to run a power analysis on your study design, and to critically reassess your predictors so as to eliminate as many as you can before you begin your study.
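The inflation from testing many candidate predictors, and Field’s rule of thumb, can be illustrated with a minimal simulation sketch in Python. This is not from the cited studies: it is a simplified setup (each predictor tested separately against a pure-noise response, with a normal approximation to the t-test), so the exact rates will differ from the ~70% reported for stepwise model simplification, but the qualitative point — that the chance of at least one spurious ‘significant’ effect grows rapidly with the number of predictors, and that a Bonferroni correction pulls it back towards the nominal alpha — holds.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

def p_value(r, n):
    """Two-sided p for a Pearson correlation, using a normal
    approximation to the t-distribution (adequate for moderate n)."""
    t = abs(r) * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(t / math.sqrt(2))

def family_false_positive_rate(n=50, k=10, alpha=0.05,
                               sims=2000, bonferroni=False):
    """Fraction of pure-noise datasets in which at least one of k
    candidate predictors tests 'significant' against a random response."""
    threshold = alpha / k if bonferroni else alpha
    hits = 0
    for _ in range(sims):
        y = rng.standard_normal(n)           # response is pure noise
        X = rng.standard_normal((n, k))      # k unrelated predictors
        ps = [p_value(np.corrcoef(X[:, j], y)[0, 1], n) for j in range(k)]
        hits += min(ps) < threshold
    return hits / sims

def rule_of_thumb(k):
    """Field's (2013) rule of thumb: n = 8 * predictors + 50."""
    return 8 * k + 50

print(family_false_positive_rate())                 # far above 0.05
print(family_false_positive_rate(bonferroni=True))  # back near 0.05
print(rule_of_thumb(10))                            # 130
```

With ten predictors tested at alpha = 0.05, the family-wise false positive rate is roughly 1 − 0.95¹⁰ ≈ 0.40, eight times the nominal level — which is exactly why either alpha must be adjusted or, better, the predictor list trimmed before data collection.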
Researcher degrees of freedom is the term Simmons et al. (2011) used to describe the ways in which researchers may inadvertently increase their chances of obtaining false positive results during analysis. The first is simply the way in which researchers decide when to stop collecting data. If a preliminary collection shows a trend, but not a significant result, then collecting more data sounds like a good idea. However, because the first set of data is retained, the second test is not independent of the first, and so the chance of a Type I error accumulates with each look at the data. Even if multiple datasets are collected, those that are non-significant should also be considered and reported in order to obtain an unbiased estimate. The second major way in which analyses can produce false positives is through the potentially infinite flexibility of the analyses themselves. There are many ways to analyse your data, and given enough trials it is quite likely that you’ll find one that gives you a significant result. Moreover, on the road to conducting the test, there are many choices that can change the outcome of the analysis:
- Inclusion or exclusion of an outlier
- Inclusion or exclusion of a covariate
- Transforming dependent variables
- Inclusion or exclusion of baseline measures
- Controlling for sex (or another variable) as a fixed effect
- Excluding individuals with incomplete datasets
The potential list of ways in which the outcome of your analysis could change quickly grows as the number of ways in which you could analyse the data also grows. But don’t despair. Transparent help is at hand.
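The first researcher degree of freedom above — stopping data collection as soon as the test looks significant — can be made concrete with a short simulation sketch in Python. This is an illustrative setup of my own (two groups drawn from the same distribution, a normal approximation to the two-sample t-test, peeking after every batch), not the exact procedure in Simmons et al. (2011), but it shows the same effect: repeatedly testing accumulating data and stopping at the first p < 0.05 pushes the Type I error rate well above the nominal 5%.

```python
import math
import numpy as np

rng = np.random.default_rng(7)

def two_sample_p(a, b):
    """Two-sided p for a difference in means, normal approximation."""
    se = math.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
    z = abs(a.mean() - b.mean()) / se
    return math.erfc(z / math.sqrt(2))

def optional_stopping_rate(start=20, step=10, max_n=100,
                           alpha=0.05, sims=2000):
    """Fraction of null experiments (no true difference) declared
    'significant' when the researcher peeks after every batch of new
    data and stops as soon as p < alpha."""
    false_pos = 0
    for _ in range(sims):
        a = rng.standard_normal(start)   # both groups: pure noise
        b = rng.standard_normal(start)
        while True:
            if two_sample_p(a, b) < alpha:
                false_pos += 1           # stopped early on a fluke
                break
            if len(a) >= max_n:
                break                    # gave up: correctly null
            a = np.concatenate([a, rng.standard_normal(step)])
            b = np.concatenate([b, rng.standard_normal(step)])
    return false_pos / sims

print(optional_stopping_rate())  # well above the nominal 0.05
```

Because the tests at n = 20, 30, …, 100 share most of their data, each new look adds another chance of a Type I error without starting from scratch — which is why the stopping rule must be fixed (or formally corrected for) before the data come in, as preregistration requires.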
One criterion for many journals is that the research should be novel. This is increasingly enforced by journal editors as you move up the Impact Factor levels. Novelty sells (just think of the meaning of “new” in newspaper), and that is the basis for selling novel stories from higher Impact Factor journals. The perils of testing increasingly unlikely hypotheses, and how this inflates Type II errors as well as increasing the proportion of Type I errors, are widely acknowledged (Forstmeier, Wagenmakers & Parker, 2017; Measey, 2021). Novelty also stifles repeatability (Cohen, 2017). If we can never repeat studies in science, then a fundamental tenet of the methodology is suppressed. Reproducibility in science has received a lot of attention recently, as attempts to reproduce the results of highly cited research have failed (Begley & Ellis, 2012; Open Science Collaboration, 2015). This has been followed by general outrage among scientists that things should change (Anderson, Martinson & De Vries, 2007; Munafò et al., 2017), including among a majority of those in the biological sciences (Baker, 2016). The irony that these reports and requests are published in exactly the journals that refuse to publish research that seeks to repeat work (i.e. is not novel) is clearly lost on their editors. However, more nuanced views are also coming forward: actively introducing variable conditions and sampling of biological variation into the study design more fully represents the nature of biological variation, making studies more likely to be replicated (Voelkl et al., 2020).
The ways in which editors choose reviewers and interpret their reviews can either reinforce editors’ own prejudices, or help to make publication more open and transparent for everyone. The first step along this road is to move from double-blind to triple-blind review, so that editors cannot make decisions with prejudice towards certain reviewers. Next is the need for public reviews with DOIs, allowing open assessment of what the reviews contained. For more details about problems in peer review, see Part IV.
In order to change this culture towards a more transparent selection of scientific studies for publication, we need journals to sign up to transparency. Sadly, when most journals are approached, the editors either ignore the email or make an excuse about why it is not possible (Advocating for Change in How Science is Conducted to Level the Playing Field, 2019). Of course, some journals have taken the road to transparency, and we should be encouraged by the fact that they still exist, and that we can build on these front runners. In addition, there is a growing number of excellent frameworks pointing the way forwards (e.g. Macleod et al., 2021). This is a cultural change that we can expect will take time (Figure 1.3).
Taking the profiteering out of publishing will require a more concerted approach, especially now that publishers are attempting to capture our entire workflow (Brembs et al., 2021). But the reality is that we have only ourselves to blame. Biological scientists do not challenge the publishing model because we are used to getting all of the ‘frills’ associated with it. These include the designer layout, custom websites, editorial management systems and, increasingly, free tools such as Mendeley, Overleaf, Peerwith and Authorea. Indeed, these and other tools can be used as spyware to capture data from individual academics and sell it (see Brembs et al., 2021). Instead, I suggest using not-for-profit repositories and Open Source tools (Kramer & Bosman, 2016). An excellent way to learn and implement these tools is to form an Open Science Community (Armeni et al., 2021) in your institution. There you can learn from your peers which tools are best suited to your area and your institutional resources, and help spread the word about the need to move towards Open Science among your colleagues.
The reality is that we really don’t need any of the frills or tools that publishers offer. An entire workflow using Open Source tools is available, and it is up to us to make it convenient for our own use. If we cared more about our science and less about so-called prestige, we’d all be better off. Mathematicians and physicists are way ahead of us; given that they’ve shown the way, it’s simply up to us to embrace openness and transparency.