The cause wasn’t original. James was hardly the first person to notice that there was still stuff to be figured out about baseball, and that the game’s underlying rationalities might be discerned through statistical analysis. Going right back to the invention of the box score in 1845, and its subsequent improvement in 1859 by a British-born journalist named Henry Chadwick, there had been numerate analysts who saw that baseball, more than other sports, gave you meaningful things to count, and that by counting them you could determine the value of the people who played the game. But what got counted was often simply what was easiest to count, or what Henry Chadwick, whose reference point was cricket, had decided was important to count.
Chadwick was the critical figure in this history. To anyone who asked, “How could baseball statistics be so screwed up?,” Henry Chadwick was usually the beginning, and occasionally the end, of the answer. Chadwick’s stated goal in counting the events that occurred on a ball field was reform: he wanted players to be judged by their precise contributions to victory and defeat. He was as upset about the immorality he witnessed on the baseball diamond as he was about the drinking and gambling he found on the city streets—about which he also never tired of complaining. He longed to affix blame and credit for baseball plays, and to do it, he grossly oversimplified matters. Fielding errors were just one example of Chadwick’s moralistic mind at work. Another was his interpretation of the base on balls. In cricket there was no such thing as a walk: Chadwick had to get his mind around a new idea. The tool was ill-designed for the task: Chadwick was better at popularizing baseball statistics than he was at thinking through their meaning. He decided that walks were caused entirely by the pitcher—that the hitter had nothing to do with them. In his initial box score Chadwick recorded a walk as an error; even in the later box scores, after he had listened to, or at least heard, the obvious objections from others, Chadwick never credited the hitter. He simply removed the walk altogether from the record books. “There is but one true criterion of skill at the bat,” he wrote, “and that is the number of times bases are made on clean hits.” Enter the batting average, ever since the chief measure of a player’s offensive value.*
* For a fuller, more respectable account of the history of the box score, see Jules Tygiel’s Past Time: Baseball As History (2000).
The more you examined these old measurement devices, the less apt they seemed. Chadwick, with help from others, had created a system of perverse incentives for anyone who trotted out onto a baseball field. The fetish made of “runs batted in” was another good example of the general madness. RBI had come to be treated by baseball people as an individual achievement (free agents were paid for their reputation as RBI machines) when clearly they were not. Big league players routinely swung at pitches they shouldn’t to lard their RBI count. Why did they get so much credit for this? To knock runners in, runners needed to be on base when you came to bat. There was a huge element of luck in even having the opportunity, and what wasn’t luck was, partly, the achievement of others. “The problem,” wrote James, “is that baseball statistics are not pure accomplishments of men against other men, which is what we are in the habit of seeing them as. They are accomplishments of men in combination with their circumstances.”
The failure of baseball people to acknowledge that fact in their statistics led to exactly the sort of moral corruption Henry Chadwick, in creating them, had sought to eliminate. The many little injustices and misunderstandings embedded in the game’s records spawned exotic inefficiencies. Baseball strategies were often wrongheaded and baseball players were systematically misunderstood. Chadwick succeeded in creating a central role for statistics in baseball, but in doing it he created the greatest accounting scandal in professional sports.
Between Chadwick and James there had been fitful efforts to rethink old prejudices. The legendary GM Branch Rickey employed a professional statistician named Allan Roth who helped to compose an article under Rickey’s byline in Life magazine in 1954 that argued for the importance of on-base and slugging percentages over batting average. A professor of mechanical engineering at Johns Hopkins, Earnshaw Cook, wrote two pompous books, in prose crafted to alienate converts, that argued for the relevance of statistical analysis in baseball. In the early 1960s, a pair of brothers employed by IBM used the company’s computers to analyze baseball strategies and players. But the desire to use statistics to make baseball efficient—to measure and value precisely the events that occur on a baseball field, to give the numbers new powers of language—only became potent when it became practical.
When Bill James published his 1977 Baseball Abstract, two changes were about to occur that would make his questions not only more answerable but also more valuable. First came radical advances in computer technology: this dramatically reduced the cost of compiling and analyzing vast amounts of baseball data. Then came the boom in baseball players’ salaries: this dramatically raised the benefits of having such knowledge. “If we’re going to pay these guys $150,000 a year to do this,” James concluded in his essay on fielding, “we should at least know how good they are—which means knowing how much they allowed in the field just as much as it means knowing how much they created at bat.” If this sounded compelling when baseball players were paid $150,000 a year, it sounded one hundred times