Wednesday, July 17, 2013

The Signal and the Noise

Nate Silver, The Signal and the Noise: Why So Many Predictions Fail but Some Don’t (NY: Penguin Press, 2012), 454 pages.

Nate Silver gained national attention with PECOTA, a forecasting system he developed in 2003 that uses baseball data to forecast the performance of major league players. It was very successful, and the attention it brought gave him the time and money to expand into other areas and to write a thoughtful and reflective book on forecasting.

The Signal and the Noise is a book for people who like data and the possibilities for using it in the latest information revolution, the one brought on by cheap and powerful computers. In the Introduction Silver tells a cautionary tale from the first information revolution, which came in 1440 with the printing press. Before the printing press, knowledge was easily lost for want of a way to store it. After the printing press, knowledge could be stored, but ideas could also be circulated to make arguments to the masses and promote controversies like Martin Luther’s ninety-five theses. The printing and distribution of 300,000 copies was followed by centuries of religious warfare.

Silver uses the introduction to draw similar contrasts and set up a philosophy of forecasting. Forecasting implies planning under uncertainty, which calls for prudence, wisdom and industriousness along with a dose of humility. Silver thinks of forecasting as an ongoing process of revision where the risk of failure is always present but the possibility of progress makes it worth the trouble.

The book has 13 chapters. The first chapter, titled “A Catastrophic Failure of Prediction,” describes what went wrong with the predictions for the recent stock market and housing bubbles. Then it is on to six more chapters on prediction for political polls, baseball, weather, earthquakes, economic forecasting and swine flu.

Each topic involves something we would like to predict in advance that has a random and unpredictable element. Silver gives readers some historical background, some basic theory or science where it’s relevant, and then an assessment of the forecasting record: weather forecasting, better; earthquake forecasting, no progress; and so on. Each chapter suggests a common principle or two of success or failure, and by the end of the book these add up to a general theory and practice of forecasting.

Chapter 8 introduces Bayesian statistical inference in a descriptive form and applies it to six more topics of prediction: gambling, chess, poker, stock prices, greenhouse effects and terrorism. Bayesian inference is a branch of statistics that combines a prior probability with new evidence to produce an updated probability estimate, the posterior probability. Silver tries to convince readers to think of Bayes as a procedural method that combines new evidence with prior beliefs in a repeated process of revision.
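
For readers who like to see the arithmetic, here is a minimal sketch of that repeated revision using Bayes' rule. It is my own illustration, not code from the book: the function name and all of the numbers (a 5% starting belief, and evidence that is four times as likely if the hypothesis is true as if it is false) are hypothetical.

```python
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Return the posterior probability of a hypothesis after one piece of evidence.

    Bayes' rule: posterior = P(E|H) * prior / (P(E|H) * prior + P(E|not H) * (1 - prior))
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Hypothetical numbers, chosen only for illustration.
belief = 0.05  # prior: a 5% chance the hypothesis is true
for step in range(1, 4):
    # Each new observation is assumed to be 80% likely if the hypothesis
    # is true and 20% likely if it is false.
    belief = bayes_update(belief, 0.8, 0.2)
    print(f"after observation {step}: {belief:.3f}")
```

With these made-up inputs the belief rises from 5% to roughly 17%, 46% and 77% over three observations. Each posterior becomes the prior for the next round, which is the process of revision described above.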

Given the variety of topics from technology and the social and physical sciences, readers will prefer some topics over others. I spent more time on the economics chapters. In Chapter 6 Silver does what forecasters hope no one will do: he goes back to check and compare old forecasts. Most are way off, but as a labor forecaster I appreciate Silver’s cautions: correlation does not mean causation, explanatory variables change frequently, never throw out data, and tell a story using credible economic reasoning.

I especially liked the comments he quotes from Jan Hatzius, chief economist at Goldman Sachs. Hatzius’s correct forecast of the 2007 financial collapse came from looking at mortgage data and evaluating the size of the leveraged mortgage market and the risk of default by unqualified buyers. On page 196 Silver writes “Hatzius refers to this chain of cause and effect as a ‘story.’ It is a story about the economy – and although it might be a data-driven story, it is one grounded in the real world.”

Chapter 11 takes a close look at the age-old question: Can you make money predicting stock prices? Those who believe in markets always answer no, but Silver uses the volumes of stock data to run a variety of fun experiments comparing strategies (buy and hold, manic momentum and a few more) before reaching that conclusion.

The Signal and the Noise is a very readable statistics book with good graphics, thorough documentation and source notes. I would never have predicted publication of a general audience book with so much detail, but as I finished reading I decided that the volume of data on the Internet promotes wider interest along with wider access. The growing combination of access and interest makes it a timely book that suggests a structure and philosophy for people who want to pursue their own interests and separate the signal from the noise.
