Next, instead of the range of forecasts, we will concern ourselves with the accuracy of forecasts, i.e., the ability to predict the number itself.
We can also learn about prediction errors from trading activities. We quants have ample data about economic and financial forecasts – from general data about large economic variables to the forecasts and market calls of the television “experts” or “authorities”. The abundance of such data and the ability to process it on a computer make the subject invaluable for an empiricist. If I had been a journalist, or, God forbid, a historian, I would have had a far more difficult time testing the predictive effectiveness of these verbal discussions. You cannot process verbal commentaries with a computer – at least not so easily. Furthermore, many economists naively make the mistake of producing a lot of forecasts concerning many variables, giving us a database of economists and variables, which enables us to see whether some economists are better than others (there is no consequential difference) or if there are certain variables for which they are more competent (alas, none that are meaningful).
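A minimal sketch of the kind of comparison such a database allows – the column names and numbers below are invented, synthetic data for illustration, not anyone's actual forecast record:

```python
import numpy as np
import pandas as pd

# Hypothetical forecast database: each row is one forecast of one
# economic variable by one economist, plus the realized value.
rng = np.random.default_rng(0)
n = 1_000
df = pd.DataFrame({
    "economist": rng.choice(["A", "B", "C", "D"], size=n),
    "variable":  rng.choice(["CPI", "GDP", "payrolls"], size=n),
    "forecast":  rng.normal(2.0, 1.0, size=n),
    "realized":  rng.normal(2.0, 1.5, size=n),
})
df["abs_error"] = (df["forecast"] - df["realized"]).abs()

# Is any economist consistently better? Is any variable easier to call?
print(df.groupby("economist")["abs_error"].mean())
print(df.groupby("variable")["abs_error"].mean())
```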
I was in a good position to observe, from very close, our ability to predict. In my full-time trader days, a couple of times a week, at 8:30 A.M., my screen would flash some economic number released by the Department of Commerce, or Treasury, or Trade, or some such honorable institution. I never had a clue about what these numbers meant and never saw any need to invest energy in finding out. So I would not have cared the least about them except that people got all excited and talked quite a bit about what these figures were going to mean, pouring verbal sauce around the forecasts. Among such numbers you have the Consumer Price Index (CPI), Nonfarm Payrolls (changes in the number of employed individuals), the Index of Leading Economic Indicators, Sales of Durable Goods (dubbed “doable girls” by traders), the Gross Domestic Product (the most important one), and many more that generate different levels of excitement depending on their presence in the discourse.
The data vendors allow you to take a peek at forecasts by “leading economists”, people (in suits) who work for the venerable institutions, such as J. P. Morgan Chase or Morgan Stanley. You can watch these economists talk, theorizing eloquently and convincingly. Most of them earn seven figures and they rank as stars, with teams of researchers crunching numbers and projections. But the stars are foolish enough to publish their projected numbers, right there, for posterity to observe and assess their degree of competence.
Worse yet, many financial institutions produce booklets every year-end called “Outlook for 200X”, purporting to read into the following year. Of course they do not check how their previous forecasts fared.
The problem with prediction is a little more subtle. It comes mainly from the fact that we are living in Extremistan, not Mediocristan. Our predictors may be good at predicting the ordinary, but not the irregular, and this is where they ultimately fail. All you need to do is miss one interest-rate move, from 6 percent to 1 percent in a longer-term projection (what happened between 2000 and 2001), to have all your subsequent forecasts rendered completely ineffectual in correcting your cumulative track record. What matters is not how often you are right, but how large your cumulative errors are.
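A toy illustration of that last point – invented numbers, not a trading record: a forecaster who nails nine small moves out of ten still ends up with a cumulative error dominated by the single large miss.

```python
import numpy as np

# Ten forecast errors: nine small misses and one large one
# (e.g., failing to foresee a 6% -> 1% rate move).
errors = np.array([0.1] * 9 + [5.0])

hit_rate = np.mean(np.abs(errors) < 0.25)   # "how often you are right"
cumulative_error = np.sum(np.abs(errors))   # "how large your errors are"

print(f"hit rate:         {hit_rate:.0%}")             # 90%
print(f"cumulative error: {cumulative_error:.1f}")     # 5.9, mostly one miss
```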
And these cumulative errors depend largely on the big surprises, the big opportunities. Not only do economic, financial, and political predictors miss them, but they are quite ashamed to say anything outlandish to their clients – and yet events, it turns out, are almost always outlandish.
Since my testing has been informal, for commercial and entertainment purposes, for my own consumption and not formatted for publishing, I will use the more formal results of other researchers who did the dog work of dealing with the tedium of the publishing process. I am surprised that so little introspection has been done to check on the usefulness of these professions. There are a few – but not many – formal tests in three domains: security analysis, political science, and economics. We will no doubt have more in a few years. Or perhaps not – the authors of such papers might become stigmatized by their colleagues. Out of close to a million papers published in politics, finance, and economics, there have been only a small number of checks on the predictive quality of such knowledge.
A few researchers have examined the work and attitude of security analysts, with amazing results, particularly when one considers the epistemic arrogance of these operators. In a study comparing them with weather forecasters, Tadeusz Tyszka and Piotr Zielonka document that the analysts are worse at predicting, while having a greater faith in their own skills. Somehow, the analysts’ self-evaluation did not decrease their error margin after their failures to forecast.
Last June I bemoaned the dearth of such published studies to Jean-Philippe Bouchaud, whom I was visiting in Paris. He is a boyish man who looks half my age though he is only slightly younger than I, a matter that I half jokingly attribute to the beauty of physics. Actually he is not exactly a physicist but one of those quantitative scientists who apply methods of statistical physics to economic variables, a field that was started by Benoit Mandelbrot in the late 1950s. This community does not use Mediocristan mathematics, so they seem to care about the truth. They are completely outside the economics and business-school finance establishment, and survive in physics and mathematics departments or, very often, in trading houses (traders rarely hire economists for their own consumption, but rather to provide stories for their less sophisticated clients). Some of them also operate in sociology with the same hostility on the part of the “natives”. Unlike economists who wear suits and spin theories, they use empirical methods to observe the data and do not use the bell curve.
He surprised me with a research paper that a summer intern had just finished under his supervision and that had just been accepted for publication; it scrutinized two thousand predictions by security analysts. What it showed was that these brokerage-house analysts predicted nothing – a naïve forecast that simply carried the figures over from one period to the next would not have done markedly worse.
Tetlock studied the business of political and economic “experts”. He asked various specialists to judge the likelihood of a number of political, economic, and military events occurring within a specified time frame (about five years ahead). The outcomes represented a total of about twenty-seven thousand predictions, from close to three hundred specialists. Economists represented about a quarter of his sample. The study revealed that experts’ error rates were clearly many times what they had estimated. His study exposed an expert problem: there was no difference in results whether one had a PhD or an undergraduate degree. Well-published professors had no advantage over journalists. The only regularity Tetlock found was the negative effect of reputation on prediction: those who had a big reputation were worse predictors than those who had none.
But Tetlock’s focus was not so much to show the real competence of experts (although the study was quite convincing with respect to that) as to investigate why the experts did not realize that they were not so good at their own business, in other words, how they spun their stories. There seemed to be a logic to such incompetence, mostly in the form of belief defense, or the protection of self-esteem. He therefore dug further into the mechanisms by which his subjects generated ex post explanations.
I will leave aside how one’s ideological commitments influence one’s perception and address the more general aspects of this blind spot toward one’s own predictions.