08 September 2010

Weak Sauce? Secret Sauce's Predictive Capacity Wanes in Recent Years

It was late in the 2006 season when Nate Silver, then of Baseball Prospectus, debuted his oft-referenced Secret Sauce formula for predicting playoff success. Secret Sauce, which is based on starter strikeout rates, closer win value, and fielding, is attractive for a few reasons. First, it appeals to the Sabermetric community's fondness for obscure, normalized, performance-based peripheral stats. Second, leaving offense out of the equation resonates with conventional wisdom: offense wins games, defense wins championships. Third, it passes the smell test: all the best postseason teams have had great closers, right?

Finally, and most importantly, it seemed accurate at the time. The three graphs below show the historical relationship between Secret Sauce Score and playoff success:

Methodological note: the first two graphs compare the series-winning success of playoff teams to their Secret Sauce Score, which is the sum of their regular season rankings in starter K-rate (EQSO9), closer win value (WXRL), and team fielding (FRAA); lower is better. The third graph measures the success of the team with home field advantage in postseason series match-ups compared to the difference between the home team and visiting team's Secret Sauce Score. All three are fractional polynomial curves based on logistic regression analysis of the variables.
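To make the scoring concrete, here is a minimal sketch of how a Secret Sauce Score and a match-up gap could be computed from the three rankings described above. The team names, the rank values, and the function names (`secret_sauce_score`, `score_gap`) are all made up for illustration; they are not real team data or anything published by Baseball Prospectus.

```python
# Hypothetical regular-season ranks (1 = best) in each component.
# These values are invented for illustration only.
ranks = {
    "Team A": {"starter_k_rate": 2, "closer_win_value": 5, "fielding": 3},
    "Team B": {"starter_k_rate": 14, "closer_win_value": 9, "fielding": 20},
}


def secret_sauce_score(component_ranks):
    """Sum the three component rankings; lower is better."""
    return sum(component_ranks.values())


def score_gap(home_team, away_team, scores):
    """Home score minus visiting score; negative favors the home team."""
    return scores[home_team] - scores[away_team]


scores = {team: secret_sauce_score(r) for team, r in ranks.items()}
favorite = min(scores, key=scores.get)  # team with the lowest score
gap = score_gap("Team A", "Team B", scores)
print(scores, favorite, gap)
```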

The formula is based on the statistical analysis by Silver and Dayn Perry, which showed that starter strikeout rate, closer win value and team fielding were the only statistically significant correlates between regular season performance and postseason success. Of course, historical trends do not always hold, as has been the case for Secret Sauce since ~2002. Take a look at the success of the Secret Sauce favorites in head-to-head competition since the playoffs expanded to eight teams in 1995:

Years      Record    Win%
2002-2009  28-28-0   0.500
1995-2001  34-13-2*  0.714
1995-2009  62-41-2   0.600

*On two occasions, opposing teams had the same Secret Sauce Score

Since 1995, the Secret Sauce test is certainly better than guessing, correctly predicting the winner of any given series 60% of the time. From 1995-2001, the formula is especially successful, defeating random chance at the same rate that the 1998 Yankees defeated their opponents. However, since 2002, Secret Sauce has performed no better than a coin flip at picking winners.
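As a quick arithmetic check, the winning percentages in the table above are consistent with counting each tied Secret Sauce Score as half a win for the favorite. That convention is an inference from the numbers, not something stated outright, and the helper name `win_pct` is my own.

```python
def win_pct(wins, losses, ties=0):
    """Winning percentage with each tie counted as half a win."""
    games = wins + losses + ties
    return (wins + 0.5 * ties) / games


# Records of the Secret Sauce favorite, copied from the table above.
records = {
    "2002-2009": (28, 28, 0),
    "1995-2001": (34, 13, 2),
    "1995-2009": (62, 41, 2),
}

for span, (w, l, t) in records.items():
    print(f"{span}: {win_pct(w, l, t):.3f}")
```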

Having discovered this, my first thoughts were of sample size. Silver and Perry used a larger data set than I'm testing against, and this may just be a small-sample fluke. However, a logistic regression analysis of the formula shows that Secret Sauce, while rather reliable from 1995-2001, is statistically insignificant in the 2002-2009 period.

Table 1.1: Postseason Success vs. SS Score (P-Values)

Range      Sample   Series W  LDS    LCS    WS
1995-2009  n = 120  0.017     0.030  0.073  0.030
1995-2001  n = 56   0.004     0.008  0.024  0.085
2002-2009  n = 64   0.633     0.746  0.869  0.182

The table above indicates the statistical significance of the relationship between Secret Sauce Score and four different variables: Series Wins (the total number of series a team wins in a given postseason), LDS (whether or not the team wins the first round), LCS (whether it wins the second round) and WS (whether it wins the third round). The p-values for the regressions were statistically significant (using a weak 0.10 standard) for the entire date range and for pre-2002 postseasons, but from 2002 onward there is no statistically significant relationship.

Methodological Note: Ordered logit was used to test Series Wins, binary logit for LDS, LCS and WS.
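For readers who want to see what a binary logit test of this kind involves, here is a rough, standard-library-only sketch. It fits a one-predictor logistic regression by Newton's method and reports a two-sided Wald p-value for the slope. It is not the code behind the tables above, it does not cover the ordered logit used for Series Wins, and the `ss_scores` and `won_lds` lists are invented purely for illustration.

```python
import math


def fit_binary_logit(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton's method.

    Returns (b0, b1, p_value), where p_value is the two-sided Wald
    p-value for the slope b1.
    """
    mean_x = sum(x) / len(x)
    xc = [xi - mean_x for xi in x]  # centering leaves the slope unchanged
    b0 = b1 = 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in xc]
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, xc))
        # Information matrix [[h00, h01], [h01, h11]].
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, xc))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, xc))
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error from the information matrix of the final iteration.
    se_b1 = math.sqrt(h00 / det)
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return b0 - b1 * mean_x, b1, p_value


# Invented data: Secret Sauce Scores and whether each team won its LDS.
ss_scores = [10, 14, 18, 22, 25, 30, 33, 37, 41, 45, 50, 55]
won_lds = [1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0]

intercept, slope, p_value = fit_binary_logit(ss_scores, won_lds)
print(f"slope = {slope:.4f}, p-value = {p_value:.3f}")
```

A negative slope means that a higher (worse) score goes with a lower chance of winning; the p-value is what would be compared against the 0.10 standard.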

This pattern repeated itself when looking at head-to-head postseason match-ups:

Table 1.2: Matchup Success vs. SS Score (P-Values)

Range      Sample   Home Win
1995-2009  n = 105  0.033
1995-2001  n = 49   0.012
2002-2009  n = 56   0.594

Breaking Secret Sauce down into its components only yields more questions. Despite Silver's findings, closer win value since 1995 is almost never correlated with postseason success in any statistically significant way. In fact, by the 2002-2009 period, the only significantly correlated variable is fielding, and then only in some cases!

Table 2.1: Postseason Success vs. SS Components (P-Values)

Range      Component    Wins   LDS    LCS    WS
1995-2009  Fielding     0.007  0.005  0.120  0.051
1995-2009  Starter K/9  0.058  0.042  0.262  0.217
1995-2009  Closer Wins  0.791  0.850  0.641  0.258
1995-2001  Fielding     0.060  0.028  0.721  0.147
1995-2001  Starter K/9  0.045  0.123  0.064  0.121
1995-2001  Closer Wins  0.090  0.250  0.040  0.325
2002-2009  Fielding     0.053  0.106  0.058  0.172
2002-2009  Starter K/9  0.331  0.190  0.903  0.718
2002-2009  Closer Wins  0.206  0.172  0.152  0.563
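To read Table 2.1 at a glance, the short sketch below applies the 0.10 standard to each row and lists which outcomes clear it. The p-values are copied from the table; the names `table_2_1` and `significant_outcomes` are just labels chosen for this example.

```python
ALPHA = 0.10  # the "weak" significance standard used in this post

OUTCOMES = ("Wins", "LDS", "LCS", "WS")

# P-values copied from Table 2.1, in the order given by OUTCOMES.
table_2_1 = {
    ("1995-2009", "Fielding"): (0.007, 0.005, 0.120, 0.051),
    ("1995-2009", "Starter K/9"): (0.058, 0.042, 0.262, 0.217),
    ("1995-2009", "Closer Wins"): (0.791, 0.850, 0.641, 0.258),
    ("1995-2001", "Fielding"): (0.060, 0.028, 0.721, 0.147),
    ("1995-2001", "Starter K/9"): (0.045, 0.123, 0.064, 0.121),
    ("1995-2001", "Closer Wins"): (0.090, 0.250, 0.040, 0.325),
    ("2002-2009", "Fielding"): (0.053, 0.106, 0.058, 0.172),
    ("2002-2009", "Starter K/9"): (0.331, 0.190, 0.903, 0.718),
    ("2002-2009", "Closer Wins"): (0.206, 0.172, 0.152, 0.563),
}


def significant_outcomes(p_values, alpha=ALPHA):
    """Return the outcome names whose p-value is below alpha."""
    return [name for name, p in zip(OUTCOMES, p_values) if p < alpha]


for (span, component), p_values in table_2_1.items():
    hits = significant_outcomes(p_values)
    print(f"{span} {component}: {', '.join(hits) or 'none'}")
```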

The pattern continues when looking at Secret Sauce from a head-to-head perspective:

Table 2.2: Matchup Success vs. SS Components (P-Values)

Range      Component    Home Win
1995-2009  Fielding     0.014
1995-2009  Starter K/9  0.036
1995-2009  Closer Wins  0.889
1995-2001  Fielding     0.058
1995-2001  Starter K/9  0.019
1995-2001  Closer Wins  0.258
2002-2009  Fielding     0.063
2002-2009  Starter K/9  0.303
2002-2009  Closer Wins  0.180

So what gives? Is this just a statistical-historical anomaly, or is it an indicator that Secret Sauce no longer accurately models postseason success in the current era? Allow me to submit this harebrained theory:
  1. Since 1995, the gap between the "poor" and rich teams has grown.
  2. General managers (thanks to the Sabermetric revolution) have gotten smarter.
  3. The increasing payroll gap results in an increasing amount of talent on the board as the trade deadline approaches.
  4. Smarter GMs for the richer teams are spending scarce resources on good relievers and power pitching rather than crafty ground-ball hurlers and overrated offensive talent.
  5. Secret Sauce is failing to incorporate the proper value of these moves because the formula is based on full-season numbers, undervaluing players who were picked up late in the year.
Now, it's only a theory, easily falsifiable by anyone who bothers to take a stab at it. But consider the 2009 Phillies, who ranked dead last in Secret Sauce among playoff teams and 22nd in all of baseball, and yet gave the Yankees a run for their money in the Series. What Secret Sauce failed to account for was the two-thirds of Cliff Lee's numbers that he put up before joining the Phightins (as well as Brad Lidge's capacity for not being terrible during a short period in October). Of course, this still doesn't explain how the 2006 St. Louis Cardinals, perhaps the worst on-paper playoff team in history, took home the Commissioner's Trophy.

What I'd hope you take away from this is not that Secret Sauce is worthless, but rather that we should remember its recent poor performance when playoff prognosticators employ it to predict the 2010 postseason (don't repeat that last clause aloud). That said, the concept and construction of Secret Sauce are so elegant, and the idea of a Unified Field Theory of Postseason Success so alluring, that one doesn't want to abandon it so quickly. I took one last crack at it by exchanging some of the BPro variables with less proprietary ones (such as UZR, FIP and xFIP), but to no avail.

I'm sure someone will come up with something. I'll keep trying, too.

P.S. For what it's worth, Secret Sauce has the Padres winning it all this year, currently holding an SS Score of 10. This would qualify as the best since the playoffs expanded to eight teams. The Rays are nipping at San Diego's heels with a score of 14, which would still rank among the best of all time. I don't think I'd be alone in predicting that a Rays-Padres World Series would be one of the more exciting Fall Classics in recent memory, and the least watched Fall Classic of all time.


Sturgeon General said...

I'm gonna take a stab at the 2006 Cardinals. This is a team that consisted of a roster who had attended the 2004 WS and the 2005 NLCS. They started the season 34-19 before the heavy injuries took their toll.

Defensively the most important positions would be SS, 2B, CF. Well David Eckstein missed nearly 40 games, Aaron Miles missed 27 games and Jim Edmonds missed 52 games. In addition, Albert Pujols missed 19 games and Scott Rolen missed 20 games. That is a lot of missed games from the starting infield.

Next, Jason Isringhausen was the closer the entire year. It wasn't until essentially the postseason that Adam Wainwright took over for the injured closer. He struck out half a batter more per 9 while walking 2 fewer per 9.

Furthermore their top 3 starters all logged 30+ starts. However they had an additional 4 starters get 13-17 starts. When you get to the postseason you get to drop 3 of those starters. Not to mention, their performance is likely to increase as they get their strongest defensive lineup on the field behind them.

I think when you factor all that in, they were a much better team than their record and Secret Sauce numbers showed.

JD Mathewson said...

Thanks, Sturge. That actually makes a lot of sense--SS wouldn't pick up on a lot of that stuff.

obsessivegiantscompulsive said...

Ironically, when I went into BP's data after the season, the 2010 Giants would have ranked among the Top 10 that was highlighted in the book, if I got the numbers right, or nearly so.

Also sadly ironic, while you note that the Secret Sauce is not worthless, your post caused BP to reexamine things, decide that the Secret Sauce was not valid, and thus retire it.

I still think that there is validity to the premise of that original article and another study by someone on THT that took a different route to the answer of success in the playoffs, that defense - pitching and fielding - is significant to winning and that offense is not.


JD Mathewson said...

Thanks, OGC. That article you linked to predated mine by five years, so that's data that's included in this post. Regardless, it is important to revisit these findings every so often.