Survivor bias is one of the most common thinking errors that plague statistical analysis, and it often leads to the wrong conclusions.
01-May-2015 •Vivek Kaul
During the course of the Second World War, the British Royal Air Force (RAF) wanted to attach heavy plating to its airplanes. This was to be done to protect the planes from the German anti-aircraft guns as well as fighter planes.
The plates were heavy, and hence they had to be strategically attached to the planes. Interestingly, data on where exactly bullets struck the planes was available. As Jordan Ellenberg writes in How Not to Be Wrong: The Hidden Maths of Everyday Life: "The damage wasn't uniformly distributed across the aircraft. There were more bullet holes in the fuselage, not so many in the engines."
If the data were to be interpreted in a straightforward manner, it would mean plating the area around the fuselage because that was what got hit the most. But a statistician called Abraham Wald realised that things were not as straightforward as they appeared to be. As Ellenberg writes: "The armour, said Wald, doesn't go where bullet holes are. It goes where bullet holes aren't: on the engines...The missing bullet holes were on the missing planes. The reason planes were coming back with fewer hits to the engine is that planes that got hit in the engine weren't coming back."
A slightly gory comparison to this can also be found in a recovery room in a hospital. There will be more people with bullet holes in legs vis-à-vis people with bullet holes in chests. This in no way means that people don't get hit in chests. They sure do. It's just that people who get hit in the chest don't recover.
As Gary Smith writes in Standard Deviations: Flawed Assumptions Tortured Data and Other Ways to Lie With Statistics: "Wald...had the insight to recognize that these data suffered from survivor bias...Instead of reinforcing the locations with the most holes, they should reinforce the locations with no holes." Wald's recommendations were implemented and ended up saving many planes which would have otherwise gone down.
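Wald's reasoning can be checked with a toy simulation. The sketch below is only an illustration and not a reconstruction of his actual analysis: the function name simulate_raid, the two-section plane, the three hits per plane and the lethality figures are all made-up assumptions. Hits land on the engine and the fuselage equally often, but an engine hit is far more likely to bring the plane down.

```python
import random

def simulate_raid(n_planes=10_000, hits_per_plane=3, seed=0):
    """Count bullet holes on all planes vs. only the planes that return."""
    rng = random.Random(seed)
    # Hypothetical chance that a single hit in a section downs the plane.
    lethality = {"engine": 0.6, "fuselage": 0.05}
    sections = list(lethality)
    holes_all = {s: 0 for s in sections}
    holes_returned = {s: 0 for s in sections}
    for _ in range(n_planes):
        # Each hit is equally likely to land on either section.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if it survives every single hit.
        survived = all(rng.random() > lethality[s] for s in hits)
        for s in hits:
            holes_all[s] += 1
            if survived:
                holes_returned[s] += 1
    return holes_all, holes_returned

holes_all, holes_returned = simulate_raid()
print("holes on all planes:      ", holes_all)
print("holes on returning planes:", holes_returned)
```

Under these assumptions, the planes that make it home show far fewer engine holes than fuselage holes, even though both sections were hit about equally often. Anyone looking only at the returning planes would be tempted to armour the wrong place.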
Interestingly, the survivor bias is a part of a lot of other data as well and leads to wrong analysis at times. Take the data for judging the performance of mutual funds over a long period of time. The numbers typically end up overstating the returns earned, primarily because something's missing. As Ellenberg writes: "Mutual funds don't live forever. Some flourish, some die. The ones that die are, by and large, the ones that don't make money. So judging a decade's worth of mutual funds by the ones that still exist at the end of ten years is like judging our pilot's evasive manoeuvres by counting the bullet holes in the planes that come back." Hence, it makes sense to be sceptical about any mutual fund study that shows high returns. The first question you should be asking is whether the study has taken the performance of dead funds into account or not.
So where else does the survivor bias show up? Business books are one place where it shows up a lot. Take the case of Jim Collins' all-time bestselling book Good to Great. In this book, Collins identified 11 stocks that clobbered the average stock after looking at the forty-year history of 1,435 companies.
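The mutual fund point can be illustrated with a small simulation as well. Everything in this sketch is hypothetical: the function simulate_funds, the 5% average annual return with a 15% spread, and the rule that a fund is wound up once it falls below 80% of its starting value. No real fund data is used.

```python
import random
from statistics import mean

def simulate_funds(n_funds=2000, years=10, seed=1):
    """Compare the average outcome of all funds with that of survivors only."""
    rng = random.Random(seed)
    final_values = []  # every fund, whether it survived or was wound up
    survivors = []     # only the funds still open after `years`
    for _ in range(n_funds):
        value, alive = 1.0, True
        for _ in range(years):
            # Hypothetical annual return: 5% on average, 15% spread.
            value *= 1 + rng.gauss(0.05, 0.15)
            if value < 0.8:  # hypothetical rule: the fund is wound up
                alive = False
                break
        final_values.append(value)
        if alive:
            survivors.append(value)
    return mean(final_values), mean(survivors), len(survivors)

all_mean, survivor_mean, n_survivors = simulate_funds()
print(f"average final value, all funds:      {all_mean:.2f}")
print(f"average final value, survivors only: {survivor_mean:.2f}")
print(f"funds still open: {n_survivors} of 2000")
```

In this toy setup every dead fund ends below 0.8 and every survivor ends at or above it, so as long as at least one fund dies and at least one survives, the survivors-only average has to come out higher than the average across all funds. That is the overstatement a study creates when it quietly drops the dead funds.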
Collins and his research team then started to look for the principles of success from these companies. As Collins wrote, the "search [was] for timeless, universal answers that can be applied by an organization...Almost any organization can substantially improve its stature and performance, perhaps even become great, if it conscientiously applies the framework of ideas we've uncovered."
The only trouble was that Collins' entire study suffered from the survivor bias. As Smith writes: "Here is how the study should have been done. Start with a list of companies that existed at the beginning of this forty-year period. It could be all the companies in the S&P 500 index, all the companies traded on the New York Stock Exchange, or some other list...Then, use plausible criteria to select eleven companies predicted to do better than the rest. These criteria must be applied in an objective way, without peeking at how companies did over the next forty years. It is not fair and meaningful to predict which companies will do well after looking at which companies did well."
Further, how did the 11 companies identified by Collins in Good to Great perform over the years? Fannie Mae made a mess and was taken over by the American government. Circuit City went bankrupt. Between the publication of the book and 2012, five of the 11 stocks in the sample did better than the stock market, six did not.
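Smith's objection, that it is not meaningful to pick winners after looking at who won, can be sketched in code too. The example below deliberately assumes the extreme case in which every company's return is pure luck, drawn from the same made-up distribution in both periods. It makes no claim that this is how the real companies behaved, and the name hindsight_picks and all the numbers are illustrative assumptions.

```python
import random
from statistics import mean

def hindsight_picks(n_companies=1435, n_picks=11, seed=2):
    """Pick the best past performers, then see how they do afterwards."""
    rng = random.Random(seed)
    # Assumption: returns are pure luck, identical for all companies.
    past = [rng.gauss(0.08, 0.20) for _ in range(n_companies)]
    future = [rng.gauss(0.08, 0.20) for _ in range(n_companies)]
    # Select the "great" companies by peeking at how they already did.
    ranked = sorted(range(n_companies), key=lambda i: past[i], reverse=True)
    picks = ranked[:n_picks]
    # Edge over the average company, before and after the selection.
    past_edge = mean(past[i] for i in picks) - mean(past)
    future_edge = mean(future[i] for i in picks) - mean(future)
    return past_edge, future_edge

past_edge, future_edge = hindsight_picks()
print(f"edge of the picks in the period used to pick them: {past_edge:+.2%}")
print(f"edge of the same picks in the following period:    {future_edge:+.2%}")
```

Even when success is nothing but chance, the 11 companies chosen in hindsight look spectacular over the period used to choose them. Their edge in the following period is far smaller, because nothing set them apart in the first place.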
Tom Peters and Robert Waterman, who wrote the bestselling In Search of Excellence, ran into the same problem. Of the 43 companies the authors studied, 35 are still listed. Of these, 15 have done better than the overall stock market and 20 have done worse. To conclude, the survivor bias shows up in a lot of books which try to give us the formula for success. As Smith writes: "This problem plagues the entire genre of books on formulas/secrets/recipes for successful business, a lasting marriage, living to be one hundred, and so on and so forth. Such books are based on backward-looking studies of successful businesses, marriages and lives and have an inherent survivor bias." Next time you come across a statistical study or a success formula or impressive numbers, do be mindful of the survivor bias.
Vivek Kaul is the author of Easy Money.