04 June 2010

Competitive Balance Part I: MLB Competition Most Balanced Despite Payroll Disparities

In the prologue to RPBlog's Competitive Balance series, I demonstrated that the disparity among team payrolls in Major League Baseball is greater than that of the NFL, NBA and NHL. I also presented evidence that the level of payroll inequality is sensitive to the collective bargaining regimes that govern spending in the four leagues. But perhaps most importantly I claimed that, when it comes to competitive balance, payroll disparity isn't as important as some would have you believe.

Below I explain (part of the reason) why.

To understand the level of competitive balance across the four leagues, we need a way to measure it. There are several tools for this job, and we will use a few of them. The first is, conveniently, the same tool we used to measure payroll disparities: the Gini Coefficient.
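For readers who want to see the arithmetic, here is a minimal sketch of a Gini calculation applied to win totals. The function name `gini` and the win totals are made up for illustration, and this is the plain population formula, which may not be the exact routine behind the charts in this post.

```python
def gini(values):
    """Gini coefficient of non-negative values (0.0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form of the mean absolute difference:
    # G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n, ranks 1..n ascending
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical four-team leagues, not real standings.
equal_league = [81, 81, 81, 81]
lopsided_league = [120, 100, 62, 42]

print(round(gini(equal_league), 4))
print(round(gini(lopsided_league), 4))
```

A league where every team finishes with the same record scores 0, and the score rises as wins concentrate in fewer teams.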

Competitive imbalances should reveal themselves in an unequal distribution of wins across teams in a league during a given period of time. A chart illustrating the variance of win inequality in the MLB, NFL, NBA and NHL (across time) is posted below:

The chart above paints a rather interesting and unexpected picture: the MLB has the lowest level of win inequality, followed by the NHL, NBA and NFL. In other words, the average maldistribution of wins is in fact the perfect opposite of what we see when we model the disparity of payrolls. As you can see, these figures are rather stable over time.

What might account for this reversal in disparities? Some of this variance is certainly due to granularity issues in calculating the Gini coefficient, considering the fact that there are only 16 games in an NFL season versus 162 games in an MLB season (the NBA and NHL falling in the middle at 82).
Why should this matter? Consider the following thought experiment: if we determined game outcomes by coin flip, no team should fare any better than another if they played an infinite number of games. Just the same, the Gini coefficient of the win distribution in this universe should approach 0.00 as the season length approaches infinity. But as we reduce the number of games per season (and coin flips), a disparity should arise as random chance takes over: 162 flips per team per season and we should see a small disparity; 16 flips per team and we should see a much larger one, with a Gini coefficient that grows roughly in proportion to one over the square root of the season length. At one game per team, half the league finishes 1-0 and the other half 0-1, which works out to a Gini Coefficient of about 0.50 from chance alone.
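The thought experiment is easy to simulate. The sketch below is an illustration under simplifying assumptions: every team flips its own fair coin for each game (so head-to-head scheduling is ignored), the league has 30 teams, and the helper names `gini` and `mean_coin_flip_gini` are invented here rather than taken from the analysis behind the charts.

```python
import random


def gini(values):
    """Population Gini coefficient of non-negative values."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def mean_coin_flip_gini(games, teams=30, seasons=100, seed=0):
    """Average win Gini across simulated seasons decided purely by coin flips."""
    rng = random.Random(seed)
    running = 0.0
    for _ in range(seasons):
        # Each team's win total is the number of heads in `games` fair flips.
        wins = [sum(rng.random() < 0.5 for _ in range(games))
                for _ in range(teams)]
        running += gini(wins)
    return running / seasons


for games in (16, 82, 162):
    print(games, round(mean_coin_flip_gini(games), 3))
```

Even with no difference in team quality at all, the 16-game schedule should show a noticeably higher average Gini than the 82-game or 162-game schedules, which is the short-season effect described above.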

Therefore, we should expect to see an unusually high disparity in the win distribution for the NFL simply due to the short schedule. However, there are ways of compensating for this measurement error, and this effect does not account for the entire variance between coefficients (more on that below). Moreover, this effect does not account for the variance in competitive balance in the NBA and NHL (each with 82-game seasons), or the fact that the MLB is more balanced than either of those leagues (the level of error in a 162-game season not being that much smaller than in a season of 82 games).

Something interesting is going on here, and I can confidently say that I'm not sure what it is. On the other hand, I have a pretty good idea of what this means, especially after looking at the chart below:

The graph above plots the disparity in the distribution of regular season wins against the disparity of payroll distribution, separately labeling each league. Even though there is a small but positive relationship between payroll inequality and competitive imbalance in hockey and baseball, the overall relationship is just the opposite: league-seasons with the greatest level of payroll inequality have the best competitive balance when we don't control for the leagues individually. The relationship is confirmed by an Ordinary Least Squares analysis of the data, presented below:

IV                    Coefficient   Std. Error   P-value
Payroll Inequality     0.1347       0.0603       0.031
MLB                   -0.0422       0.0076       0.000
NFL                    0.1139       0.0080       0.000
NBA                    0.0565       0.0075       0.000
Season                 0.0000       0.0005       0.992
Y-Intercept            0.1035       0.9874       0.917

n = 48
Adjusted R^2 = 0.9324
*NHL Dropped

The relationship we see in the previous chart is entirely explained by the inherent levels of competitive balance within (but not across) the leagues. In other words, the non-league-controlled trend is inverse simply because the MLB and NHL enjoy greater levels of competitive balance than the NFL and NBA. When we control for this effect, the source of which is unclear, there is a significantly positive relationship between payroll inequality and competitive imbalance.

In layperson's terms, a 0.10 increase in the level of payroll inequality should result in an approximately 0.01 increase in the level of competitive imbalance. At the same time, the regression supports the hypothesis that the MLB is inherently more balanced* than the NFL and NBA: holding payroll inequality and season constant, a typical baseball season has a win Gini about 0.04 lower than the NHL baseline, while NFL seasons come in about 0.11 higher and NBA seasons about 0.06 higher.

*Analysis with dummy variables requires the researcher to drop one of the leagues from the analysis. When the NHL is included and the MLB excluded, we see a negative but insignificant relationship between the NHL binary and competitive imbalance. Moreover, a more rigorous analysis would have included interaction terms rather than simple controls, but such an analysis would have required a larger sample.
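To make the coefficients concrete, here is a small sketch that plugs the values from the regression table above into the difference between two league-seasons. The function name `expected_gap` is hypothetical. Comparing two leagues in the same season lets the intercept and the season term cancel, so only the payroll slope and the league dummies matter (the NHL, as the dropped category, has an offset of zero).

```python
# Coefficients copied from the regression table; NHL is the dropped baseline.
PAYROLL_SLOPE = 0.1347
LEAGUE_OFFSET = {"NHL": 0.0, "MLB": -0.0422, "NFL": 0.1139, "NBA": 0.0565}


def expected_gap(league_a, payroll_gini_a, league_b, payroll_gini_b):
    """Predicted win Gini of league A minus league B within the same season."""
    dummy_gap = LEAGUE_OFFSET[league_a] - LEAGUE_OFFSET[league_b]
    payroll_gap = PAYROLL_SLOPE * (payroll_gini_a - payroll_gini_b)
    return dummy_gap + payroll_gap


# Same league, payroll Gini higher by 0.10: the payroll effect on its own.
print(round(expected_gap("NFL", 0.30, "NFL", 0.20), 4))

# MLB vs. NFL at identical payroll inequality: the league effect on its own.
print(round(expected_gap("MLB", 0.25, "NFL", 0.25), 4))
```

The payroll Gini values passed in here are placeholders, not figures from the data set.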

As I wrote above, some of the variance among Gini Coefficients is the result of short-season bias. We can compensate for this by looking at ten full seasons for each NFL team rather than one. But as the graph above indicates, even when compensating for this bias, the NFL retains the largest competitive imbalance of all the leagues, with a 10-year Gini Coefficient higher than any single-year MLB value (although the NFL coefficient drops a full 0.10 when expanding the sample to 160 games).

So there you have it--based on win distributions, the MLB is clearly the most balanced American sports league, and the NFL the least balanced, contrary to popular opinion. This tells us that there is something inherent in baseball that is generating a great deal of fairness for the teams that play, regardless of payroll disparity. This also raises the possibility that Baseball's "competitive balance problem" may be nothing more than a public relations problem (which isn't insignificant, it's just not a problem that can be fixed by modifying the distribution of payrolls).

That said, we don't yet have enough evidence to make this claim. A deeper investigation of the level of competitive balance in baseball and other sports requires more than a look at regular season win distributions. We also need to look at the distribution of playoff appearances, as well as the volatility of win totals from year to year (what sociologists and economists would refer to as "mobility" were we discussing household and personal incomes rather than success in sport).

Stop by next week as we investigate these phenomena in greater detail for Part II of RPBlog's Competitive Balance series.

Competitive Balance Series
Prologue: Payroll Inequality
Part I: Regular Season Competitive Balance


BMMillsy said...

Just a side note: be careful using a standard Gini Coefficient to measure payroll disparity. Because of unbalanced schedules, and the fact that the worst teams play each other at some point, the 'perfectly unbalanced league' isn't as perfectly unbalanced as a standard Gini might suggest. (check out this paper that talks all about it: http://rodneyfort.com/Academic/ElectronicPubs/UttFortJSE02.pdf)

In addition, the leagues seem to be roughly ordered from left to right with respect to the number of games played in a season, and we'd expect the leagues with fewer games to have higher W% disparity to begin with just because of the small sample size, as you mention. So I'm not sure the W% variation itself is the best way to measure balance given the correlation between season length and payroll disparity (however, I think it's suitable in measuring "uncertainty of outcome").

JD Mathewson said...

@Millsy: Thanks for your comment. I feel that the unbalanced league schedule is definitely an issue, but that it should be a small one when looking at the results over time or when combining the full 10 years' worth of data.

Also, I believe I addressed the small sample size issue well enough to get numbers that we can have confidence in.

But this data is still preliminary, and I'll be disaggregating much further in the weeks to come.

BMMillsy said...

The unbalanced league schedule is less of a problem than the fact that you cannot accomplish half the teams being undefeated and half being all-defeated, because the top teams all play each other, and the bottom teams all play each other. This leaves you with an unclear picture of what a 'perfectly unbalanced' league actually is.

I actually don't worry too much about the unbalanced schedules and their effect on balance of the league (as I agree the aggregation pretty much throws most of it away).

But here's the problem: if they were in fact perfectly balanced (everyone plays each other once), you can come up with the theoretically perfectly unbalanced league by making one team undefeated, the next beating everyone but the undefeated team, and so on down the line. Unfortunately, knowing which teams should be the undefeated one with an unbalanced schedule isn't an easy task (one I've been fiddling with in some simulations to no avail). Either way, you're not using a standard Gini Coefficient as you would be with a league where all teams are either undefeated or all-defeated.

And I mentioned you addressed the season length in the article well...but its correlation with exactly the issue being addressed here, payroll dispersion, can cause some problems.

With that said, I enjoy reading.

JD Mathewson said...


I'm not quite sure why you say a season such as the one you mention (one team undefeated, the next beating everyone but the undefeated team...) would be perfectly unbalanced.

If I'm reading you right, you're describing a 17 team league with a 16 game season in which the best team is 16-0, the second best team is 15-1, and so on until the 17th best team is 0-16. When I feed that data into the software I get a Gini of 0.3125--far from perfectly unbalanced.

That said, thanks again, and I'll be reinforcing my findings here with more data in the future. Glad you enjoyed reading, please keep sharing your thoughts.

BMMillsy said...

Right, the Gini you get from your program is exactly my point. The scheduling of sports competitions makes it impossible to get a perfectly unbalanced Gini in the standard sense (or, a 1.000 comparison).

Normally, we use a Gini Coefficient in the way you describe to evaluate proportions of the binomial compared to one group having all 1's and the other having all 0's. Or, just this kind of data. Unfortunately, with a sports league with more than 2 teams that all play one another at some point, you can't get all 1.000 W% for half the teams and all 0.000 W% for the other half. The 1.000 W% teams will play one another at some point, and someone will have to win, meaning they won't all be 1.000%. Same goes for the terrible teams. So comparing to the standard 1.000 Gini isn't sufficient, and the 'perfectly unbalanced' league changes each year depending on the scheduling.

From my reading, the best you can do is the theoretical outcome I lay out (though, that comes from the Utt & Fort paper, as well as a new proposal by Dorian Owen in a forthcoming one). If you find a more unbalanced way, with the scheduling constraint intact, I'd actually be extremely interested.

BTW-Your graph is interesting, especially the reversal of slopes for the leagues. You would expect the intensity of the slope to change, but why the sign reversal? Kind of cool.
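The "ladder" outcome described in this thread can be sketched in a few lines. This is an illustration only, assuming a balanced round-robin schedule where every pair of teams meets the same number of times. The names `ladder_wins` and `rescaled_gini` are hypothetical, and the rescaling shown is one simple way to compare an observed Gini against that ceiling, not a reproduction of the method in the linked paper.

```python
def gini(values):
    """Population Gini coefficient of non-negative values."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def ladder_wins(teams, meetings=1):
    """Win totals when each team beats every team ranked below it, every time."""
    return [meetings * (teams - 1 - rank) for rank in range(teams)]


def rescaled_gini(wins, meetings=1):
    """Observed Gini as a share of the ladder (most lopsided) Gini."""
    ceiling = gini(ladder_wins(len(wins), meetings))
    return gini(wins) / ceiling


for teams in (5, 17, 30):
    print(teams, round(gini(ladder_wins(teams)), 4))
```

Under this formula the ladder Gini works out to (n + 1) / (3n) for n teams, so it settles near one third as leagues grow, well short of 1.00. Other software may use a different Gini convention, so its figures need not match these exactly.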

JD Mathewson said...

Okay, I kind of see what you're getting at.

I'm not exactly sure what's responsible for the reversal. I'll need to think on it for a while.

BMMillsy said...

Yea, it's tough to really say with a small number of points (like the NFL one could be pulled down by that single point furthest right). Is there a reason there are fewer points for one league than another? Or is that just a result of overlapping points?

Unknown said...

Honestly, the only thing MLB needs to do to change perceptions of competitive balance is expand their postseason. If more teams make the playoffs, the game will seem more balanced even if nothing else has actually changed.

JD Mathewson said...

I'll actually be using a different methodology next week that completely contradicts this one. They both tell interesting stories. Stay tuned!

Larry said...

J-Doug -

Jason at IIATMS pointed me over here.

I like this analysis, and as I'm not all that statistically trained, I'll need to go through this again more carefully. But I think you may be missing an important factor, which is that baseball contains a higher element of chance than basketball and football. A top football team can go undefeated; the best baseball team can win maybe 70% of the time. This is inherent in the game, and is part of its charm, but does not say anything about competitive balance.

It would be interesting to redo all of these charts, and instead of looking at the absolute number of wins, to look instead at the record of each team in comparison to the best and worst team records posted during the period you're examining.

Unknown said...

J-Doug - thanks for all your articles - I really appreciate your work.

I would be interested to see how these computations come out if you removed the Rays, Marlins, and Twins from MLB's data set. Those clubs seem to have vastly superior scouting and player development that doesn't reflect the economics of most of the league. Would the data still show MLB to be most competitively balanced using the other 90% of the teams? Just how badly do those 3 clubs skew the wins/payroll figures?

Thanks again - I love the site and look forward to reading more.

jimgannonjr said...

I don't think the problem is a public relations problem as much as it is a public expectations problem. The public and the media expect more out of baseball than other sports (fair or unfair). Baseball is just judged more harshly than other sports. See the steroids scandal: the NFL wasn't hauled in front of Congress, but baseball was.

Unknown said...

An important factor missing is the amount of scoring that happens in each sport. Some games are designed with more discrete scoring events (NBA, NFL) than others (MLB, NHL). In the MLB teams average around 4 runs per game. With so few discrete scoring events the shot noise is higher. This leads to luck playing a more significant factor in deciding the winner. The NBA is an example of a league with many more discrete scoring events per game.

Decoud said...

In baseball, if you go 2-2 and then win the fifth game throughout the season you are a great team and will win 100 games and probably be the best in the league. If you go 2-2 and then lose the fifth game throughout the season then you will lose 100 games and likely be the worst team in the league.

Effectively, teams in baseball finish with a winning percentage in the .400 to .600 range. In basketball the winning percentage range is about .250 to .750 and in football that range is about .200 to .800.

How is winning inequality calculated? Is anything done to standardize the range of winning percentage outcomes inherent to each sport?

BeefMaster said...

I think you're barking up the wrong tree here. The complaints about MLB's inequality are on a persistent basis, not within a given season. No one cares that the Yankees go 100-62 and the Royals go 62-100 - good and bad teams are a part of baseball. The problem is that those teams stay in that relative position year after year.

As Decoud and Larry mentioned above, it's simply in the nature of baseball that even the best teams are less-dominant than their counterparts in football or basketball. People don't complain about the issue in football because there is (likely) a smaller correlation between team revenue and team success, and because teams are rarely dominant for more than a couple years at a time.

It looks like Part II will be where you really tackle the issues people have with competitive balance.

JD Mathewson said...

@Larry, Decoud and Jarid: Yes, I'm getting that comment quite a bit. I feel I addressed the issue of chance rather sufficiently in the article. Much of that has to do with season length, but even when you add 10 years' worth of win distribution data together in the NFL (which approximates one season's worth of MLB games) the Gini Coefficient still doesn't approach that of the MLB.

@Bill: That's a rather interesting consideration, how much skill is involved. I may take a look at that as well in an epilogue.

@Lucas: I hadn't actually thought about that. I'll take that into consideration.

@Jim: Well that's really the same thing, isn't it? It's the marketing department's job to manage expectations in such a way that makes the product more appealing.

Next week I'll be looking at the issue of team "mobility" (how often teams can move up and down in the standings) and the data I'll be presenting is much more in line with public beliefs and expectations. Please stop by next week and take a look!

Unknown said...

Every sport has injuries that affect how well teams perform, but one reason why great baseball teams are "less dominant" than great teams in other sports is because in baseball teams frequently and intentionally field sub-optimal lineups simply because it's impossible to use the same starting pitcher in every single game. Even position players get a day off from time to time. You don't see Peyton Manning or Kobe Bryant taking a game off when they're healthy. The only comparable situation would be in hockey when teams occasionally give the goalie the day off and run their backup out there, but in baseball a staff ace will start roughly 20% of his team's games.

Brian Burke said...


I commend you for a thoughtful and well-presented bit of research. However, as mentioned, the comparative nature of the sports themselves isn't taken into account. In the NFL, the worst team can beat the best team once in a blue moon, probably less than 5% of the time. But in MLB, it happens all the time. The Royals can beat the Yankees 1/3 of the time, sometimes even win their season series.

Put another way, in the NFL the best team finishes about 14-2, which would be equivalent to 141-21 in MLB. There, the best team each year finishes about 100-62, which would be 10-6 in the NFL (which may not even be enough to get into the playoffs).

The NFL is more like arm-wrestling where the stronger contestant nearly always wins the game, but MLB is more like flipping coins, where team strength isn't so important in any one contest. We are naturally going to see a wider distribution of win% in a 16-game season of arm-wrestling than a 162-game season of coin flipping, but this says nothing about the connection to salary inequity.

The 162-game season nearly guarantees the stronger teams will finish on top, even if by only a tiny sliver of win%. So a comparison of league win% distribution doesn't mean much. I'm sorry to say, but the results you've presented are illusions and are very misleading.

One suggestion would be to compare the correlation between a team's payroll and its *rank order* of finishing in its respective league. I bet you would find that MLB is clearly the least balanced league in terms of competitiveness, probably by far.

Besides, as Jarid says above, people aren't concerned about the width of the win% distributions. Fans are frustrated when the same few teams are the only ones able to compete for a championship year-after-year.

Looking forward to part II.

Shiyam Pillai said...

Please. People complain about competitive balance due to the lack of playoff appearances by their teams - not a general lack of winning.

You want to prove that MLB has real competitive balance? Do a study of the AL East since it went to a 5-team division. See how the payroll disparities have driven Baltimore, Tampa and Toronto to a "lightning in a bottle" contention strategy.

JD Mathewson said...

@Brian: Yes, you're clearly not the first person to mention this. In terms of the sample size of the season length, this should still even out when you look at 10 years' worth of data, but it doesn't. As for the team mobility, that's the subject of the next post.

@Shiyam: Want to talk about playoff appearances? Fine. Playoff baseball has more turnover per playoff spot than football. Haven't looked at hockey or basketball yet. I'll be demonstrating that in future posts as well.

Teams make the playoffs more often in the NFL, NBA and NHL because they allow 37.5%, 53.3% and 53.3% of the league to make the playoffs every year, whereas baseball only allows 26.7% of its teams to reach the postseason. Please explain to me what that has to do with payroll.

Finally, one division does not a pattern make. Baltimore and Toronto have failed because they have terrible front office staff. It's one thing to fail for lack of resources, it's another thing to use those resources poorly. As for Tampa, I don't know how you can include them in any argument regarding payroll and success nowadays.

JD Mathewson said...

Part II, for those of you who are interested.

Jacob said...

I think that there is a flaw here in your reasoning. I'm not a statistician, but imagining that we ought to judge pure wins as a level of wealth seems a bit odd to me. For instance, I have never looked forward to a baseball season thinking "I hope my team wins 30 games this year." And it isn't the length of the season that accounts for this. It is the nature of the game. The inherent variability in the fact that historically great teams can only win ~65% of their games: take any reasonable sample and you will never find streaks in baseball that rival those in football or basketball. Attempting to equate pure wins is just an apples and oranges argument.

That being said, it occurred to me you might be able to do something reasonable with wins over least expected. Essentially marginal wins. So I pulled the data for baseball for the last 20 years (excluding the shortened 1994 and 1995 seasons) and found that the mean minimum wins in each of those seasons was ~58. So I subtracted the minimum number of wins from the total wins (anything <0 is set to 0) and then used those numbers to compute the gini coefficient for each year in baseball. Here is how it looks:

1989 0.2036199
1990 0.1771679
1991 0.1897951
1992 0.2100592
1993 0.2596495
1996 0.1924266
1997 0.1970405
1998 0.2826129
1999 0.2709364
2000 0.2128876
2001 0.2786822
2002 0.3152477
2003 0.2560606
2004 0.2832614
2005 0.2261079
2006 0.2122883
2007 0.1940183
2008 0.2373062
2009 0.2450242
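The "marginal wins" adjustment described in this comment can be sketched as follows. This is an illustration under stated assumptions: the floor of 58 wins is the approximate figure quoted in the comment, the function names are hypothetical, and the win totals are invented, so the output will not reproduce the yearly values listed above.

```python
def gini(values):
    """Population Gini coefficient of non-negative values."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def marginal_wins(wins, floor=58):
    """Wins above the floor, with anything below the floor set to 0."""
    return [max(w - floor, 0) for w in wins]


# Invented 12-team season, not real standings.
season = [103, 95, 92, 88, 85, 81, 79, 75, 70, 64, 59, 57]

print(round(gini(season), 4))
print(round(gini(marginal_wins(season)), 4))
```

Subtracting a floor leaves the gaps between teams almost unchanged while shrinking the average, so the Gini on marginal wins comes out larger than the Gini on raw wins.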

Anonymous said...

This is being overthought - what you want to graph is a team's finish in its league in each of say 10 years or so vs their payroll that year.

If there were no inequality you would expect no correlation. If the sport were highly unequal you would expect a 1:1 correlation.

I suspect if you graph THAT you will see that baseball is more unequal than the other sports in its outcomes.

JD Mathewson said...

That's really not what you want. That doesn't tell you what the impact of other, non-payroll factors is on competitive balance in the league.

If you want to measure a macro dependent variable, you want to compare it to macro independent variables, not micro independent variables.