Thursday, September 25, 2008

"Never make predicitions, especially about the future" -- Casey Stengel

I somehow missed this the other day, but our friend Vegas Watch broke down the preseason predictions of various experts. The results are such that your average ink-stained wretch will likely have a stroke. Basically the computer won, followed by two guys -- Neyer and Law -- who are partial to computers. Here are the overall standings of prediction accuracy, using mean squared error, with the lower number being better. "O/U" means a perfectly-balanced guess of every team going 81-81*:

To sum up: the three sets of predictions that sprang from slide rules in mothers' basements were the only ones that did better than wild-ass guesses, while the professionals were basically shooting in the dark (with the non-sabermetrically-inclined ESPN professionals at the absolute bottom).

Remember this when everyone starts trotting out their playoff prediction columns Sunday evening.

*UPDATE: Keith Law and Vegas Watch hisself correct me in the comments. O/U is not an 81-81 guess. Rather, it's (duh) the Vegas over/under lines. As Keith notes, "The 81-81 predictions' RMSE was around 10.6, although it's a little bit rigged in that case, since RMSE penalizes large errors."

Thanks guys.
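
For anyone wondering what Keith means by RMSE penalizing large errors, here is a rough sketch in Python. The win totals below are invented purely for illustration (they are not anyone's actual 2008 picks or results), and the function names are mine. The idea: two forecasts can miss by the same number of wins on average, yet the one that concentrates its misses into a couple of big whiffs gets the worse RMSE.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error: square each miss, average, take the square root."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def mae(predicted, actual):
    """Mean absolute error: the average size of a miss, with no extra penalty."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must be the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical five-team league; every number here is made up.
actual = [97, 89, 81, 73, 65]

flat_guess = [81] * len(actual)   # the "everyone goes 81-81" forecast
steady = [91, 83, 87, 79, 71]     # off by 6 wins on every team
streaky = [97, 89, 81, 88, 80]    # perfect on three teams, off by 15 on two

for name, picks in [("81-81", flat_guess), ("steady", steady), ("streaky", streaky)]:
    print(f"{name:8s} MAE={mae(picks, actual):5.2f}  RMSE={rmse(picks, actual):5.2f}")
```

In this toy league "steady" and "streaky" both miss by 6 wins per team on average, but squaring the errors makes the two 15-win whiffs count for far more than a row of 6-win misses, so "streaky" ends up with the higher RMSE.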


Sara K said...

Neyer and Law apparently failed to meet the BBWAA's standards for MMOE (minimum margin of error).


mooseinohio said...

Curious which stat is used to predict the gagging that is going on in NY and Chicago.

bigcatasroma said...


Why don't you just MARRY Neyer and Law, why don't ya??????

It's amazing just how definitive something like this is, in the realm of stats v. scouts. I mean, with each passing day, it becomes clearer and clearer

Craig Calcaterra said...

bigcats: because all three of us are already married. Beyond that, I really can't think of a good reason. ;-)

Keith Law said...

Pretty sure O/U = Vegas over/under lines. The 81-81 predictions' RMSE was around 10.6, although it's a little bit rigged in that case, since RMSE penalizes large errors.

Craig Calcaterra said...

Thanks Keith. BTW, re: the comment a couple up from yours . . . uh, how are you and the wife these days?

Vegas Watch said...

Just came to say exactly what Keith said. So now I'm totally aimless. Um...go Rays?

Daniel said...

Stats are evil! Look, they can predict the future! This is obviously witchcraft.

I will now begin a hunger strike in protest of all use of stats until David Eckstein and Jason Bartlett are voted the league MVPs (yes, Eckstein's 5 weeks with the D-Backs were THAT VALUABLE*).

*Valuable in this case meaning "full of grit and small, white man effort"

aleskel said...

This reminds me of what someone figured out about picking NFL games: if you always picked the team with the better record, and, in cases where teams had the same record, just picked the home team, you would be right 60% of the time. Most of the "experts" rarely cracked 50%.

rob said...

So wait, if I had guessed 81-81 for every team, I would have done only slightly worse than Law and Neyer?

Sounds like faint praise for them, although it's pretty damning for the former GM and his ilk. You wonder why the Mets sucked so bad under Phillips.

mooseinohio said...


I'd be curious what the trend analysis data reveals, and whether the Neyers and Laws beat the curve on a regular basis or it evens out over time. I suspect that it will be more of a standard bell curve, with all methods (e.g. Sabermetrics, best guesses, gut feelings, highest payrolls, previous year's record) producing similar results. While I believe in much of what Sabermetrics preaches, it is best when used with more qualitative factors, as both methods enhance one another when used together.

For example, end-of-the-year numbers on Manny will not reveal the slacking period that preceded his being traded by the Red Sox, and I doubt there is a statistical model to account for Manny being Manny. Also, what type of variable accounts for Ozzie calling out his pitcher before a big game and the pressure that may add to a pitcher to perform (or underperform, as was Javy's case)? Remember, it was Bill James who highly recommended the 'closer by committee' concept that made great sense on paper but could not account for items such as ability to handle the pressure (M. Rivera v. J. Mesa) or how valuable it is to have a defined role (i.e. 8th inning specialist).