Comments on Rod's Sports Economics: Ruminations on RSD

Thanks, Guy.

2016-04-27T10:07:14.687-05:00

Thanks, Guy.

I'm not sure I follow your exercise. In any ca...

2016-04-27T10:06:04.368-05:00

I'm not sure I follow your exercise. In any case, we may be at an impasse. Our objection is that RSD does not effectively control for length of schedule. I believe that you agree, but say that was not its true purpose. So that is really our disagreement.

Obviously, I can't speak to *your* purpose in using the metric. But I would make two points: 1) it is in fact frequently used by sports economists, explicitly, for exactly that purpose (Vrooman 1995, Berri et al.). You obviously know that body of work very well. If this is all a huge misunderstanding about the true purpose of RSD, why have you never pointed out this error to your colleagues? And 2) using RSD leads to false conclusions about competitive balance (e.g. that the NFL is highly competitive), undermining the value of work that relies on the metric. I'm not a sports economist, but if I were I think that would trouble me.

But at least we can agree that RSD does not control for schedule length, which is something. And perhaps you are (or will be) persuaded that Phil has correctly calculated "Z" (which he has). So this has been a productive discussion, even if we cannot reach consensus.

Sorry about that. I had the link access set incorr...

2016-04-27T09:28:56.174-05:00

Sorry about that. I had the link access set incorrectly. I think you can get it now. And I think the spreadsheet there does go to your points. Look forward to your response.

2016-04-27T09:22:33.320-05:00

This comment has been removed by the author.

Rod: Yes, I am the same "Guy." As a n...

2016-04-27T07:30:30.033-05:00

Rod: Yes, I am the same "Guy."

As a non-UMich person, I cannot access your link. But in any case, the reason that RSD does not provide a common "footstep" size for comparison is that the "units" are each league's respective ISD, which of course varies by G. And if your answer to that is "that's OK, we want to account for the difference in G," then my reply is your metric should reflect the actual impact of G on the ASD. But RSD does not, as you acknowledge. And if ISD has no relation to the actual effect of G on ASD, then RSD is just ASD divided by an arbitrary and shifting denominator.

Here is an analogy: We know that the dimensions of a particular ballpark add 5 HR for a player on that team. A 10 HR hitter will hit 15, and a 40 HR hitter will hit 45. To adjust for that in comparing these players to the rest of the league, we should subtract 5 HR from their total. But instead, we say the average player here hits 20 HR (but only 15 HR elsewhere), so we will subtract 25% from each player's HR total when comparing to the league. That is what RSD does: it takes an additive relationship (fewer games adds variance) and pretends it is multiplicative. And that gives you the wrong answer.
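To make the arithmetic in the analogy concrete, here is a minimal sketch. The numbers are the ones given above; the function names are just labels chosen for illustration.

```python
# Numbers from the ballpark analogy: the park adds a flat 5 HR to every hitter.
PARK_BONUS = 5
LEAGUE_AVG_ELSEWHERE = 15
LEAGUE_AVG_HERE = LEAGUE_AVG_ELSEWHERE + PARK_BONUS  # 20 HR at this park


def additive_adjust(hr_at_park):
    """Correct adjustment: remove the flat 5 HR the park added."""
    return hr_at_park - PARK_BONUS


def multiplicative_adjust(hr_at_park):
    """RSD-style adjustment: scale by 15/20, i.e. subtract 25%."""
    return hr_at_park * LEAGUE_AVG_ELSEWHERE / LEAGUE_AVG_HERE


for true_hr in (10, 40):
    observed = true_hr + PARK_BONUS
    print(
        f"true {true_hr} HR, observed {observed}: "
        f"additive -> {additive_adjust(observed)}, "
        f"multiplicative -> {multiplicative_adjust(observed)}"
    )
```

The additive version recovers 10 and 40. The multiplicative version overrates the weak hitter and underrates the strong one, even though it gets the league average right.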

And note that RSD is not just "less good" than other competitive balance metrics. It is actively misleading. Berri, for example, uses it to show that the NFL is much more competitive than the NBA, despite similar ASD, once you adjust for schedule length. That is simply a false claim. So the use of RSD is literally reducing sports economists' understanding of these issues.

Part II: As for a demonstration: I would predict ...

2016-04-26T15:46:14.765-05:00

Part II:

As for a demonstration: I would predict that if you take a random sample of X games for each team, out of a season of 162 games, you will find that:

1. For Noll-Scully, the lower the X, the lower the Noll-Scully. (ASD squared is about SD(talent) squared plus .25/X, and ISD squared is .25/X, so N-S is about the square root of (1 + 4X * SD(talent) squared), which shrinks toward 1 as X gets smaller.)

2. For the Z above, Z will not depend on X at all. It'll jump around for random reasons, but will "average" about the same for all values of X. (Somewhere between .05 and .06, probably).

I'd predict that if you calculated N-S and Z *right now*, after 20 games or whatever it is in the MLB season right now, N-S will look very low, and Z will look "normal". But Z does have a large SD for a 20-game sample, so you never know. But I'd easily bet a substantial amount that today's Z divided by last year's Z is greater than today's N-S divided by last year's N-S.

And if you gave me 2:1 odds, I'd bet that today's Z is less than last year's Z. (the correct odds are probably ... 1.2:1, or something?) But if you wanted me to bet that today's N-S is greater than last year's N-S, I probably wouldn't even take 25:1.
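The sampling experiment described in this comment can be sketched as a small simulation. This is only an illustration under stated assumptions: team talents are drawn from a normal distribution with an assumed SD of .06, each team's games are independent coin flips (teams do not actually play each other), and the function name, trial count and seed are arbitrary. Under these assumptions the algebra gives N-S of about sqrt(1 + 4X * SD(talent)^2), so N-S should shrink as X falls while Z stays near the assumed talent SD.

```python
import random
import statistics


def mean_variance_of_pcts(x, trials=200, n_teams=30, season=162,
                          talent_sd=0.06, seed=1):
    """Average (over simulated leagues) variance of winning pct in X sampled games."""
    rng = random.Random(seed)
    variances = []
    for _ in range(trials):
        pcts = []
        for _ in range(n_teams):
            p = rng.gauss(0.5, talent_sd)  # assumed true talent of this team
            games = [rng.random() < p for _ in range(season)]
            sampled = rng.sample(games, x)  # random X games out of the season
            pcts.append(sum(sampled) / x)
        variances.append(statistics.variance(pcts))
    return statistics.mean(variances)


results = {}
for x in (20, 40, 81, 162):
    asd_sq = mean_variance_of_pcts(x)
    isd_sq = 0.25 / x  # ".500 case ISD" squared
    noll_scully = (asd_sq / isd_sq) ** 0.5
    z = max(asd_sq - isd_sq, 0.0) ** 0.5
    results[x] = (noll_scully, z)
    print(f"X={x:3d}  N-S={noll_scully:.2f}  Z={z:.3f}")
```

The variances are averaged across simulated leagues before taking the ratio or the difference, which keeps the small-X noise from swamping the comparison.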

Part I (had to split to get the comment to work): ...

2016-04-26T15:46:03.333-05:00

Part I (had to split to get the comment to work):

Let me try to demonstrate it by starting the other way.

We agree that for a league of .500 teams, ISD = sqrt(0.25/G). That is, the expected observed SD of that league is sqrt(0.25/G). (I say "expected," but that might not be precisely true in the mathematical sense -- the expected value of the SD might not be that. Maybe the expected value of the variance is the square of that. So take "expected" in a colloquial sense, like how the SD is the "typical deviation".)

Now, suppose the league is half .600 teams, and half .400 teams (that's their expected record, like 60% and 40% biased coins). What's the expected observed SD now?

Well, first, the SD of the team expectations themselves is .100. That's the SD of "half the teams at .400 and half the teams at .600". (For the earlier case, where every team was at .500, the SD of team expectations was zero, since the SD of a bunch of identical ".500"s is zero.)

I'm going to call this, the SD of team expectations, "SD(talent)".

Now, let's talk about the SD of *deviations* from .600/.400. Because, of course, not every .600 team will finish at exactly .600, just like in your example, not every team finished at exactly its expectation of .500.

The SD of the difference from the expected record -- that is, the SD of the deviations from .600 or .400, respectively -- is sqrt(0.24/G). That's by the same binomial approximation to normal as in the .500 case. Well, in the .500 case, it was sqrt(0.25/G). But they're close.

I'm going to call this "SD(luck)".

Now, a team's record is exactly its "talent" plus its "luck", the way we defined talent and luck here. That is, if one of the teams with .600 talent (about 97-65 over 162 games, say) goes .620 (about 100-62), its "talent" is .600, its "luck" is +.020, and its record is .620. That's by definition.

So

actual = talent + luck

So, by properties of variance,

Var(actual) = var(talent) + var(luck) + 2 cov(talent, luck)

Since talent and luck are independent, the covariance term is zero, so

Var(actual) = var(talent) + var(luck)

Which means

ASD squared = SD(talent) squared + SD(luck) squared

Which means, if we treat SD(luck) as the same as "ISD of the .500 case" -- which it almost is, except that it's sqrt(.24/G) instead of sqrt(.25/G) -- we get

ASD squared = SD(talent) squared + ".500 case ISD" squared

So SD(talent) squared = ASD squared - ".500 case ISD" squared

By definition, SD(talent) doesn't depend on G. ASD and ".500 case ISD" *do* depend on G, but they must "cancel each other out" in the subtraction, for this identity to work.

Which is why I say, the statistic Z, where

Z = square root of (ASD squared - ".500 case ISD" squared)

is a "successful Z" that doesn't depend on G. Furthermore, it's exactly what we want to know, because we defined it as the SD of team expectations, the quantity of interest!

Does that make sense?
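As a quick numeric check of the argument above, here is a minimal sketch that plugs the .600/.400 league into the formulas for a few season lengths. It uses expected values only (no simulation), and the function names are made up for illustration.

```python
import math

TALENT_SD = 0.100  # half the league at .600, half at .400


def expected_asd_squared(g):
    """var(talent) + var(luck), with luck variance .6 * .4 / G for every team."""
    return TALENT_SD ** 2 + 0.6 * 0.4 / g


def z(g):
    """Z = sqrt(ASD squared - '.500 case ISD' squared)."""
    isd_squared = 0.25 / g
    return math.sqrt(expected_asd_squared(g) - isd_squared)


for g in (16, 82, 162):
    asd = math.sqrt(expected_asd_squared(g))
    print(f"G={g:3d}  ASD={asd:.4f}  Z={z(g):.4f}")
```

ASD changes a lot with G, but Z stays close to .100 throughout: about .097 at G=16 and about .0997 at G=162. The small shortfall comes from using .25/G where the exact luck variance is .24/G.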

------

Sure, will write something up.

2016-04-26T15:16:34.337-05:00

Sure, will write something up.

Also, I apologize since it appears one of your rep...

2016-04-26T12:26:58.535-05:00

Also, I apologize since it appears one of your replies is not here! I never get comments at my blog so I'm not proficient at managing them. I'm trying to figure out how I goofed up, but given this reply of yours made it, I think the flow of the discussion is intact. Again, my apologies.

Hopefully, you are Guy Molyneaux? If hope so sinc...

2016-04-26T12:25:07.400-05:00

Hopefully, you are Guy Molyneaux? I hope so, since you were part of the Twitter discussion and I'm glad you found your way here.

I disagree with your point that a useful RSD requires proportional changes in ASD and ISD, but that is moot if I can handle your very important second issue.

I think I now see what the problem is with the "step size" issue: a game in MLB is worth .0062 points of winning percentage (1/162), a game in the NFL is worth .0625 (1/16), and so on.

But that suggests that the "distance" can be normalized relative to a particular league so that comparisons across leagues are again viable.

I did so and you can see my try here:

https://umich.box.com/s/i44dxbz578suyavnfq1m16i6lakulsgu

The results are extremely important for RSD. Both the magnitude of relative RSD and the ranking given by relative RSD suggest big issues for the past use of RSD.

I get what the measure is, but I still don't s...

2016-04-26T12:18:08.965-05:00

I get what the measure is, but I still don't see how it satisfies the Z* idea (not mine, but the one that others keep stressing) that Z(G1) = Z(G2). Maybe you can just show this by demonstration if not by proof?

Rod, I’m afraid you’ve started your rumination at ...

2016-04-25T16:13:44.468-05:00

Rod, I’m afraid you’ve started your rumination at what should be the conclusion: the “idealized” SD. Let’s instead start at the beginning: We want to compare competitiveness in two leagues of different season lengths G (or the same league at different times, with intervening change in G). We could simply compare ASDs and say one league is more competitive. But instead we sometimes use RSD. The only conceivable reason to do this is because G influences the ASD -- if it didn’t, we’d just use ASD and be done! So, the purpose of RSD must be to allow a fair comparison of leagues correcting for the influence of different season lengths.

To control for season length, we should adjust ASD in a way consistent with the actual impact of G. If a given increase in G reduced both ASD and ISD proportionately, then your RSD metric would be an elegant solution. Unfortunately, increasing G does *not* reduce ASD and ISD proportionately. And thus, dividing ASD by ISD cannot provide an estimate of competitive balance independent of G.
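A small sketch can show the lack of proportionality. It assumes a fixed talent SD of .06 (an illustrative value, not one from the discussion) and uses the decomposition ASD squared = talent variance + .25/G, which is the approximation used elsewhere in this thread.

```python
import math

TALENT_SD = 0.06  # assumed for illustration; identical in both leagues


def isd(g):
    return math.sqrt(0.25 / g)


def asd(g, talent_sd=TALENT_SD):
    # Approximation: luck variance taken as .25/G for every team.
    return math.sqrt(talent_sd ** 2 + 0.25 / g)


def rsd(talent_sd, g):
    return asd(g, talent_sd) / isd(g)


short, long_ = 16, 162
print(f"ISD shrinks by a factor of {isd(long_) / isd(short):.3f}")
print(f"ASD shrinks by a factor of {asd(long_) / asd(short):.3f}")
print(f"RSD at G={short}: {rsd(TALENT_SD, short):.2f}")
print(f"RSD at G={long_}: {rsd(TALENT_SD, long_):.2f}")
```

Going from 16 to 162 games, ISD shrinks much more than ASD does, so two leagues with exactly the same spread of talent end up with quite different RSDs.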

Nor does RSD tell us the “distance” from ideal balance, because the size of the “footsteps” in each league’s RSD is different, depending on G. And since ISD does not change proportionally to the change in ASD as G varies, the differing size of the footsteps makes RSD an apples-to-oranges comparison.

The RSD solution, “hey, let’s just divide by the ISD”, never had any statistically valid justification, which is why none has ever been offered. It was an intuition, but one that turns out on further inspection to take us nowhere. It would never be created today. The only reason to use it is that many have used it in the past. But that is really no reason at all.

Let me try to make the previous "proof" ...

2016-04-25T15:25:21.149-05:00

Let me try to make the previous "proof" clearer.

ISD is defined as the expected SD of an idealized league where every team is .500 talent.

I say: *Redefine* ISD as the expected SD of all the teams' deviations from their expected record based on their talent.

For all teams being .500 talent, the definition is identical, since SD(team record) = SD(team record - constant .500).

For teams being different from .500, you can still use the same formula you're using (sqrt of .25/G). It will be close enough, because even for a .600 team, the actual value is (sqrt of .24/G), and for a .700 team, the actual value is (sqrt of .21/G), which is still close.

If you accept that approximation, and the redefinition of ISD to make it work for non-.500 leagues, then

ASD squared = ISD squared + TSD squared

Where TSD is the (estimated) SD of team talent, and its expected value does not depend on G.
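To see how rough the "close enough" approximation is, here is a minimal sketch comparing the exact binomial luck SD for .500, .600 and .700 teams. G = 162 is just an example; the ratio does not depend on G.

```python
import math


def luck_sd(p, g):
    """Binomial SD of winning percentage for a team with win probability p."""
    return math.sqrt(p * (1 - p) / g)


G = 162
for p in (0.5, 0.6, 0.7):
    ratio = luck_sd(p, G) / luck_sd(0.5, G)
    print(f"talent {p:.3f}: luck SD = {luck_sd(p, G):.4f} "
          f"({ratio:.1%} of the .500 value)")
```

The .600 team's luck SD is about 98% of the .500 value and the .700 team's is about 92%, so using sqrt(.25/G) for everyone slightly overstates luck but not by much.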

2016-04-25T15:14:57.400-05:00

This comment has been removed by the author.

In this measure, how is Z(G1)=Z(G2), that is, dZ/d...

2016-04-25T14:53:36.099-05:00

In this measure, how is Z(G1)=Z(G2), that is, dZ/dG = 0? Both ASD and ISD change with G. Do you show this at your blog? Or can you email me a proof?

Hi, Dr. Fort, My argument is that there is a "...

2016-04-25T14:47:38.219-05:00

Hi, Dr. Fort,

My argument is that there *is* a "successful" Z statistic:

Z = square root of (ASD squared - ISD squared).
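For anyone who wants to try the statistic on real standings, a minimal sketch follows. The function name and the six-team, 16-game league are hypothetical, made up purely to show the calculation. Returning zero when the observed variance falls below the idealized variance is an assumption of this sketch, not something specified above.

```python
import math
import statistics


def z_statistic(win_pcts, games):
    """Z = sqrt(ASD squared - ISD squared), with ISD squared = .25 / G."""
    asd_squared = statistics.pvariance(win_pcts)
    isd_squared = 0.25 / games
    diff = asd_squared - isd_squared
    # Assumption: treat a negative difference (possible by chance) as zero.
    return math.sqrt(diff) if diff > 0 else 0.0


# Hypothetical 16-game standings, wins per team (not real data).
wins = [12, 10, 9, 7, 6, 4]
games = 16
pcts = [w / games for w in wins]

asd = statistics.pstdev(pcts)
isd = math.sqrt(0.25 / games)
print(f"ASD = {asd:.4f}, ISD = {isd:.4f}, RSD = {asd / isd:.2f}")
print(f"Z   = {z_statistic(pcts, games):.4f}")
```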
