Lies, Damned Lies, and the Peer Review Process

February 6th, 2009 by Potato

The peer review process is one of the most important parts of sharing scientific findings: before a paper is published, 2-3 (or more) scientists are contacted by the journal to anonymously tear the article to shreds. This process is what helps ensure that references to peer-reviewed publications are respected more than references to newspaper articles, websites, self-published books, and Wikipedia. As a scientist I’ve now had the opportunity to be on both sides of the peer review process, and while it can be a bit of a pain as an author if a reviewer is very nit-picky or, worse yet, doesn’t understand something and wants drastic changes from the wrong point of view, it can help fix common errors that would be very misleading down the road. Some truly terrible papers have come across my desk, so bad that I often have to wonder why they weren’t stopped by an editor before even getting to peer review, but anonymity, and letting slide the mistakes that do get corrected, are vital parts of the process. So instead I’ll just talk about a few generalities:

Statistics and their misuse are the number one weakness in otherwise good papers. Usually it’s minor things, like saying p = x.xx instead of p < α. That’s just the distinction that you typically threshold your statistics: you choose to accept that two populations are separate when the chance of their means being that different by luck alone is less than 5% (or some other percentage, but 5% is the most common). You don’t typically say that the chance of two samples from the same population having mean differences that large is 4.5% or 5.1% or what have you (no matter how close to your 5% threshold 5.1% is, there are other ways to report that).

Often parametric ("regular") statistical tests are run on measures that should be tested with non-parametric ("fancy but less powerful") methods. This is a pretty fine distinction as well, and we usually try not to get our heads up our asses about it, especially since many readers are only familiar with the "normal" statistical tests, so reporting those makes it easier for them to grasp... we just like to see the non-parametric tests done as well when it’s appropriate (there’s a toy comparison of the two at the end of the post).

But there have been a few doozies, where the authors apparently learned how to do stats from the Excel help files. Things like running hundreds of t-tests: roughly speaking, if you’re looking for differences that have less than a 5% chance of occurring by random chance alone, and you then test 100 differences, you should find about 5 "significant" results by chance alone (a quick simulation of this is also at the end of the post). One paper in particular tried this trick, then had the balls to cite a (non-peer-reviewed) book justifying the move. I showed it to the statistician here, and when he got to that part he threw his keys across the room. "What the fuck!" was all he could manage to say until he calmed down. I suppose you’d have to be a statistician to feel that strongly about it. Anyway.

Sometimes papers are well-written, with good references and stats, but are doomed from the start because the experiment just wasn’t done very well. This happens a lot with "let’s just see what this does" type studies, where no control group is included. It can also happen when systematic errors or artifacts creep in: cases where something you’re doing (testing a drug or whatever) affects the thing you’re trying to measure directly, rather than through the mechanism it’s hypothesized to act by.

Unfortunately, if someone is set on deceit, it’s very difficult to root out scientific fraud when all you’re given is a manuscript, so the peer review process is not good at catching fraud (and it wasn’t really designed to do that, despite the burden some would like to put on it). Sometimes I’d like to see the peer review process encompass things like visiting a lab in person, and better yet, to have independent replications arranged for nearly every paper published. Unfortunately, the reality is that it’s difficult enough for an impassioned scientist to get funding to do their project once; it’s pure fantasy to think that funding would appear to do studies twice over so that replications could be done as a matter of course.
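Since the multiple-comparisons problem is really just arithmetic, here’s the quick simulation I promised above. This is only an illustrative sketch, not anything from the papers in question: the sample sizes, the seed, and the use of scipy are my own arbitrary choices.

```python
# Run 100 t-tests on pairs of samples drawn from the SAME population and
# count how many come out "significant" at alpha = 0.05 by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility
alpha = 0.05
n_tests = 100

false_positives = 0
for _ in range(n_tests):
    # Both samples come from the identical distribution, so any
    # "significant" difference is a false positive by construction.
    a = rng.normal(loc=0.0, scale=1.0, size=30)
    b = rng.normal(loc=0.0, scale=1.0, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        false_positives += 1

print(f"{false_positives} of {n_tests} t-tests were 'significant' at p < {alpha}")
# Expect roughly alpha * n_tests = 5 spurious hits on average, which is
# exactly why hundreds of uncorrected t-tests in a paper is a red flag.
```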
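And here’s the parametric vs. non-parametric toy comparison: the same skewed data run through the familiar t-test and a non-parametric counterpart, the Mann-Whitney U test. Again just a sketch under made-up assumptions; the log-normal distributions, the shift, and the sample sizes are invented for illustration.

```python
# Compare a parametric and a non-parametric test on skewed data, where
# the t-test's normality assumption doesn't hold.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # arbitrary seed
control = rng.lognormal(mean=0.0, sigma=1.0, size=25)
treated = rng.lognormal(mean=0.5, sigma=1.0, size=25)

t_stat, t_p = stats.ttest_ind(control, treated)  # parametric ("regular")
u_stat, u_p = stats.mannwhitneyu(control, treated,
                                 alternative="two-sided")  # non-parametric

print(f"t-test:          p = {t_p:.3f}")
print(f"Mann-Whitney U:  p = {u_p:.3f}")
# On data like this the two can disagree; reporting both is the safe move.
```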
