Turnover Conversions Revisited. Stu Lantz was… “kinda” right?

One of the first posts that christened this blog came after Lakers announcer Stu Lantz’s on-air comment that away teams tend to give up more points from turnovers than home teams.

I parsed the play-by-plays of about 90 games and visually compared the estimated densities. By eye, the means / locations “looked” close enough, and I stopped there. Given the glaring non-normality and small sample size, I wanted to pursue a non-parametric test, which I’ve been saving in my pocket to use now. (That way we can compare its results to a parametric test that uses a larger sample.)

Since cool kids eat dessert before dinner, we’ll first jump into the classic parametric test.

Being in the dog days of the off season, we have the privilege of looking at ALL of the games. A plot of the densities and their corresponding medians shows a much more “normal” distribution. You also see a difference… (of about 1) in the means.

To quantify this and embed the problem in a probabilistic framework we use the, you guessed it, t-test for a hypothesis involving the mean/location of the number of possessions scored off of turnovers.

[R] spews out the printout

Welch Two Sample t-test

data:  home.to.score and away.to.score
t = -4.0508, df = 1953.926, p-value = 2.652e-05
alternative hypothesis: true difference in means is less than 0
95 percent confidence interval:
-Inf -0.548823
sample estimates:
mean of x mean of y
35.97137  36.89571
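A printout like this presumably comes from a call along the following lines. This is only a sketch: the vector names home.to.score and away.to.score come from the printout, but the data here are simulated placeholders (the Poisson draws, the sample size of 1000 and the means of 36 and 37 are all my assumptions), so the numbers it prints will not match the ones above.

```r
# Placeholder data: simulated counts standing in for the real per-game values.
# The sample size and the means are assumptions, not the blog's data.
set.seed(1)
home.to.score <- rpois(1000, lambda = 36)
away.to.score <- rpois(1000, lambda = 37)

# t.test() runs the Welch version by default (var.equal = FALSE).
# alternative = "less" tests H1: mean(x) - mean(y) < 0,
# which is the alternative shown in the printout.
res <- t.test(home.to.score, away.to.score, alternative = "less")
print(res)
```

With alternative = "less" the confidence interval is one-sided, which is why its lower end shows up as -Inf.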

So… yes, there is a statistical “difference” (between the Home Team “X” and the Away Team “Y”) in the number of scored possessions stemming from turnovers.

Specifically, we reject the null hypothesis that the mean # of Scores from TO (Home Team) is greater than or equal to the mean # of Scores from TO (Away Team).

Stu Lantz is right on that account. But, the effect size (difference) is… about 1. Stu Lantz is “kinda” right.
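To put a number on “about 1,” the effect size and its standard error can be backed out of the t-test printout above. The sketch below uses only the values printed there; the variable names are mine.

```r
# Numbers copied from the t-test printout above.
mean.home <- 35.97137
mean.away <- 36.89571
t.stat    <- -4.0508
df        <- 1953.926

diff.means <- mean.home - mean.away   # raw effect size, home minus away
std.error  <- diff.means / t.stat     # since t = difference / standard error

# Upper end of the one-sided 95% interval: difference + t quantile * SE
upper.95 <- diff.means + qt(0.95, df) * std.error

round(c(difference = diff.means, std.error = std.error, upper.95 = upper.95), 4)
```

The difference works out to roughly -0.92, and the recomputed upper bound should land very close to the -0.548823 in the printout, which is a handy sanity check on the arithmetic.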

Halftime

If we were in a more restrictive context where things were less “normal,” the inference from the previous parametric test might mislead us. The workaround? We ditch any distributional assumptions about our data.

In the earlier analysis, which only had data for the first 90 games of the season, the data were distributed oddly. This is the opportunity to use the Kolmogorov-Smirnov test (which utilizes the sexier cousin of the PDF, the CDF).

Looking at the “Y-Axis,” we see that the CDF of the Home Team is consistently “above” the CDF of the Away Team. In terms of the “X-Axis” values, this corresponds to the quantiles of the Home Team sitting to the “left” of the quantiles of the Away Team.
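The “above on the Y-Axis means left on the X-Axis” reading can be checked numerically with empirical CDFs. A minimal sketch, again on simulated placeholder data (the Poisson draws, sample size and means are assumptions, not the real play-by-play counts):

```r
# Placeholder data standing in for the real per-game values (assumed).
set.seed(1)
home.to.score <- rpois(1000, lambda = 36)
away.to.score <- rpois(1000, lambda = 37)

# ecdf() returns a step function: F(t) = share of observations <= t.
F.home <- ecdf(home.to.score)
F.away <- ecdf(away.to.score)

# Vertical gap between the two curves at every observed value.
# Positive entries are where the home CDF lies above the away CDF.
grid <- sort(unique(c(home.to.score, away.to.score)))
gap  <- F.home(grid) - F.away(grid)
summary(gap)

# The same comparison read horizontally: quantile by quantile.
probs <- c(0.25, 0.50, 0.75)
rbind(home = quantile(home.to.score, probs),
      away = quantile(away.to.score, probs))
```

If the home curve sits above the away curve, the home quantiles in the last table should come out no larger than the away ones.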

Through [R], Kolmogorov and Smirnov declare

Two-sample Kolmogorov-Smirnov test

data:  home.to.score and away.to.score
D^+ = 0.0879, p-value = 0.0005196
alternative hypothesis: the CDF of x lies above that of y

Warning message:
In ks.test(home.to.score, away.to.score, alternative = "g") :
  p-values will be approximate in the presence of ties
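The call is visible in the warning above; "g" is R’s partial match for "greater". Here is a hedged sketch of what it does, once more on simulated placeholder data (my assumption, so the statistic will differ from the 0.0879 above). It also recomputes D^+ by hand as the largest vertical gap between the two empirical CDFs.

```r
# Placeholder data standing in for the real per-game values (assumed).
set.seed(1)
home.to.score <- rpois(1000, lambda = 36)
away.to.score <- rpois(1000, lambda = 37)

# One-sided two-sample KS test. "greater" means the alternative is
# "the CDF of x lies above that of y". Integer counts contain ties,
# so R may warn as in the printout; the warning is silenced here.
ks <- suppressWarnings(
  ks.test(home.to.score, away.to.score, alternative = "greater")
)
print(ks)

# D^+ by hand: the maximum of F_home(t) - F_away(t) over the pooled values.
grid   <- sort(unique(c(home.to.score, away.to.score)))
d.plus <- max(ecdf(home.to.score)(grid) - ecdf(away.to.score)(grid))

c(ks.test = unname(ks$statistic), by.hand = d.plus)
```

The two numbers in the last line should agree, which makes the point that D^+ is a statement about the gap between CDFs, not about the gap between means.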

There you have it, Stu Lantz is “kinda” right again. However, the test statistic is a gap between CDFs, so it doesn’t tell us much about the size of the “difference of means.” For that we go back to the density plots, which show a gap of about 1 between home and away teams.

Since we had dessert first (results of the parametric test utilizing all observations), our main course of the non-parametric dish tastes pretty good too.

I should have known better than to question the Stu. Look at Joel Meyers. Who? You can find him on the back of a milk carton. Respect goes out to Stu Lantz’s mustache and sick burns.
