This is spurred by a couple of blog posts at Phil Birnbaum's site ("Noll-Scully doesn't measure anything real") and TangoTiger's site ("Trap of Noll-Scully"). And I also hope it helps with some Twitter discussion that didn't work out there (@BMMillsy, @guymolyneux, @dataandme).
Here's what I understand about the Noll-Scully RSD measure plus the best that I can make out about the issues raised above.
Thought process. The standard deviation of final winning percent is useful in relation to competitive balance (which also follows from the underlying distribution of talent in the league). Consider one version of a perfectly balanced league, given by Pr(win)=0.5 for all teams and all games. Start this completely competitively balanced league at G1 games. Change it to G2 games. What happens?
Let ISD be the standard deviation of this version of a completely balanced league.
Fort and Quirk (Journal of Economic Literature, 1992) show that ISD=0.5/sqrt(G) for the binomial without ties. Thus, moving from G1 to G2, ISD(G1) will differ from ISD(G2) because ISD depends on G. This helps to make clear that, in general, the standard deviation of winning percent depends on G as well.
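A quick way to see the ISD=0.5/sqrt(G) result is to simulate it. The sketch below is only an illustration under a simplifying assumption that is mine, not Fort and Quirk's: each team-season is treated as G independent coin flips, ignoring the fact that in a real schedule one team's win is another team's loss. The function names, the number of simulated team-seasons, and the season lengths are all made up for the example.

```python
import math
import random
import statistics


def ideal_sd(games):
    """ISD for a league where every game is a coin flip: 0.5 / sqrt(G)."""
    return 0.5 / math.sqrt(games)


def simulated_sd(games, team_seasons=20000, seed=0):
    """Standard deviation of winning percent over many simulated team-seasons.

    Assumption: each team-season is `games` independent fair coin flips.
    Counting the 1-bits in `games` random bits is an exact Binomial(G, 0.5) draw.
    """
    rng = random.Random(seed)
    win_pcts = [
        bin(rng.getrandbits(games)).count("1") / games
        for _ in range(team_seasons)
    ]
    return statistics.pstdev(win_pcts)


# Hypothetical season lengths, chosen only to span a wide range of G.
for games in (16, 82, 162):
    print(games, round(ideal_sd(games), 4), round(simulated_sd(games), 4))
```

The simulated value should sit close to the formula at each G, and both shrink as the season gets longer, which is the whole point: even with identical teams, the standard deviation moves with G.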
Let ASD be any standard deviation of winning percent from the league.
Here is what I get from the discussion/Twitter noted above. Suppose we have a statistic Z that measures the outcome of applying the talent distribution in league play. In a league with season length G1, we get Z(G1). If the league changes to season length G2, we get Z(G2). Define a successful Z to have the following characteristic: Z(G1) = Z(G2) because the underlying talent distribution is the same in either case, just applied in leagues of different season length.
So, how to reconcile Z and ASD? Even in a league of equal playing strength,
so that Pr(win)=0.5, ISD changes with G and so will ASD generally.
Let’s consider three alternatives (there may be more).
Alternative 1: Z* = ASD ± dASD/dG. Now it will be the case that Z*(G1) = Z*(G2) because the impact of just changing season length will be netted out by calculation of the impact of G on ASD and addition or subtraction. This is sort of like an “inflation adjustment”. The distribution of talent didn’t change and our Z* provides the comforting result that it is the same for either G1 or G2. Of course, this requires knowing dASD/dG.
Alternative 2: RSD = ASD/ISD. It is immediately clear that there is no way that RSD can be a successful Z. It will always change with G and it contains no adjustment that would make it stay the same regardless of G.
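To make the RSD point concrete, here is a small sketch that holds a talent distribution fixed and varies only G. It rests on an assumption that is not in the discussion above: each team has a fixed true win probability and plays G independent games, so the expected variance of winning percent splits into a talent part plus a luck part that shrinks with G. The talent values and function names are hypothetical, picked only for illustration.

```python
import math


def expected_asd(talents, games):
    """Expected SD of winning percent for fixed true win probabilities.

    Assumption: independent games, so by the law of total variance
    Var(win%) = Var(talent) + mean(p * (1 - p)) / G.
    """
    n = len(talents)
    mean_p = sum(talents) / n
    talent_var = sum((p - mean_p) ** 2 for p in talents) / n
    luck_var = sum(p * (1 - p) for p in talents) / n / games
    return math.sqrt(talent_var + luck_var)


def rsd(talents, games):
    """RSD = ASD / ISD, with ISD = 0.5 / sqrt(G)."""
    return expected_asd(talents, games) / (0.5 / math.sqrt(games))


# Hypothetical league: true win probabilities spread evenly from 0.40 to 0.60.
talents = [0.40 + 0.02 * i for i in range(11)]

for games in (16, 82, 162):
    print(games, round(expected_asd(talents, games), 4), round(rsd(talents, games), 3))
```

Under these assumptions ASD falls and RSD rises as G grows, even though the talent list never changes. For a league where every team is at 0.5, RSD comes out at 1 for any G.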
Alternative 3: Just dump the standard deviation as useful because you don’t like either of the above. There are other measures of final season competitive balance as a reflection of the distribution of talent. But don’t propose a game-level or playoff access or dynasty alternative since we’re talking about final season competitive balance. There are other measures of those other aspects of balance as well.
So the point is well made that RSD cannot be a candidate for Z. But it was never intended to be such (I know Noll quite well and knew Scully well prior to his death). It really is meant to be a distance comparison measure: 5 steps from my door is farther than 2 steps from your door, so I am farther from my door than you are from yours.
Perhaps there is just a semantics misunderstanding when the literature using RSD states that it "controls" for G? Surely the Z* measure does this forcefully. But RSD does it relatively, so maybe a better way of saying it is that RSD "recognizes" G in its relative comparison.
Some concluding comments...
So far, I haven’t seen anybody take a crack at calculating dASD/dG. I wonder if the related critics Owen & King (Economic Inquiry, 2015) are actually just simulating dASD/dG, in which case they are an ally to those seeking Z*. One could just use their simulation results rather than trying to determine the derivative, dASD/dG. Or perhaps the Pythagorean discussion in the blog posts referenced at the top of this post handles this problem already? If not, then there is still a ways to go with Z* development.
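For what it is worth, here is what dASD/dG looks like under one simple model. This is only a sketch under an assumption of mine, not a result from Owen & King or from the blog posts: each team i has a fixed true win probability p_i and plays G independent games, so the variance of winning percent is a talent term plus a luck term. The symbols \sigma_T^2 and \bar{v} are introduced here just for the illustration.

```latex
% Assumed model: independent games, fixed true win probabilities p_i.
% \sigma_T^2 = variance of the p_i across teams (does not depend on G)
% \bar{v}    = average of p_i (1 - p_i) across teams
\mathrm{ASD}(G)^2 = \sigma_T^2 + \frac{\bar{v}}{G}
\qquad\Longrightarrow\qquad
\frac{d\,\mathrm{ASD}}{dG} = -\frac{\bar{v}}{2\,G^{2}\,\mathrm{ASD}(G)}
```

In the perfectly balanced case, \sigma_T^2 = 0 and \bar{v} = 0.25, so ASD collapses to ISD = 0.5/sqrt(G). Whether this simple model is good enough for real schedules is exactly the kind of thing a simulation could check.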
But I’m still not so sure that taking the "inflation adjustment approach" is any more informative than what is done with RSD. Z* distills the dASD/dG problem to an absolute level. RSD just puts the comparison at a relative level.
It does seem to me that the Z* devotees are not really critiquing RSD as a normalization. They would just prefer to take the direct Z* approach rather than taking the relativist approach.
And I chose the word “prefer” carefully. While it is easy to see that Z* is different from RSD, I still don’t see how Z* is superior to RSD. And it is not enough to just say so. In any event, if Z* is shown to be better at a later date, future work will be the better for it.
In the meantime, I have competitive balance to compare, within a league where season length changes and across leagues with different season lengths. And I haven't yet been dissuaded on RSD as one useful measure.