So TGC and I are sittin’ around the other night watching a little college basketball on a weeknight. Of course it’s an ESPN broadcast, and to be honest, the old announcers weren’t doing too bad of a job. Then, about halfway through the first half, the announcers start talking about the NCAA tournament selection committee considering the last 10 games on a team’s schedule. We then get Announcer #1 spouting the following statement: “There is no correlation between how you finish the season and how you do in the tournament.”
Well, well, well. We couldn’t just take that information as the gospel here at APIAS. That wouldn’t be our style. So follow the jump to find out why ESPN needs to train their guys to watch their words.
So immediately after this came over the speakers we hit the Internets. An Excel sheet was quickly developed comparing the results of every team in the 2007 NCAA tournament. The key statistics? We compared a team’s number of wins in their last 10 games to what round they got to in the tournament. Every team got to round 1 and only Florida got to the proverbial “round 7” as they won the champeenship.
The next step in our process was to axe the #15 and #16 seeds. They were just straight up deleted from the data. Why? Because they never win. It doesn’t matter if they come in at 1-9 or 10-0 in their last ten games. They don’t win in the tournament so they only taint the data.
So then we take a look at the correlation between wins (in last ten games) and round achieved. We get the following chart.
Now that looks pretty rough. The correlation coefficient, or R-value, is 0.1948. For those not in the know, a perfect positive correlation is 1.0 and no linear correlation at all is 0.0. So we’re looking pretty grim on proving our announcer hero wrong. But let’s take a little closer look at the situation.
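For anyone who wants to see what that R-value actually is, here is a minimal sketch of the standard Pearson correlation coefficient in plain Python. It is not the Excel sheet we used. The function name `pearson_r` is just our label for this example, and the sample numbers are made up for illustration, not taken from the 2007 bracket.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative numbers, NOT the 2007 tournament data.
wins_last_10 = [5, 6, 7, 8, 9]
round_reached = [1, 2, 2, 3, 4]
r = pearson_r(wins_last_10, round_reached)
print(f"r = {r:.4f}")
```

Values near 1.0 mean more wins down the stretch line up with deeper runs, values near 0.0 mean no straight-line relationship, and negative values would mean the relationship runs the other way.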
The teams that usually get in based on their recent performance are big time schools. A team like Nevada gets an at-large bid because they’re 27-4 for the season, not because they’re 9-1 in their last ten. A team like Clemson could get in because they’re 8-2 in their last 10 and made a run into their conference tournament. So we go back to the data and take out all the mid-majors. Only teams from BCS conferences remain in the data (a set of 36 data points).
Re-plot that data and what do we find? An R-value of 0.4424. Well, that’s getting closer but it still doesn’t say much. But when you look at that chart you notice that Florida and OSU are putting a kink in the data. Why? Because they had 7 and 10 wins, respectively, in their last 10 games. They also constitute only 2 data points in our set and throw off the data because they’re “top-heavy”, or basically they come at the end of the data set and upset the general trend.
Off they go. Florida and OSU’s data points are out and what do we find? With 34 data points left, made up only of major conference teams that lost before the championship game, we have a correlation coefficient of 0.9839.
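If you want to retrace the cuts, the three filtering steps can be sketched roughly like this. Everything here is an assumption for illustration: the field names (`seed`, `conference`, `wins_last_10`, `round_reached`), the helper names, and the four placeholder rows, which are not real 2007 teams. The round numbering follows ours, where round 6 is losing the title game and round 7 is winning it all.

```python
# The six BCS conferences as of the 2007 season.
BCS_CONFERENCES = {"ACC", "Big East", "Big Ten", "Big 12", "Pac-10", "SEC"}

def drop_low_seeds(teams):
    """Step 1: remove the #15 and #16 seeds."""
    return [t for t in teams if t["seed"] < 15]

def keep_bcs(teams):
    """Step 2: keep only teams from BCS conferences."""
    return [t for t in teams if t["conference"] in BCS_CONFERENCES]

def drop_finalists(teams):
    """Step 3: remove the two teams that reached the championship game."""
    return [t for t in teams if t["round_reached"] < 6]

def filter_steps(teams):
    """Apply the three cuts in order and return the data set after each one."""
    step1 = drop_low_seeds(teams)
    step2 = keep_bcs(step1)
    step3 = drop_finalists(step2)
    return step1, step2, step3

# Placeholder rows for illustration only, NOT real tournament data.
teams = [
    {"name": "Team A", "seed": 1, "conference": "SEC", "wins_last_10": 7, "round_reached": 7},
    {"name": "Team B", "seed": 16, "conference": "SWAC", "wins_last_10": 9, "round_reached": 1},
    {"name": "Team C", "seed": 7, "conference": "WAC", "wins_last_10": 9, "round_reached": 2},
    {"name": "Team D", "seed": 4, "conference": "ACC", "wins_last_10": 8, "round_reached": 3},
]

step1, step2, step3 = filter_steps(teams)
for label, rows in (("no 15/16 seeds", step1), ("BCS only", step2), ("no finalists", step3)):
    print(label, len(rows))
```

Keeping each cut as its own function makes it easy to check how many data points survive each step and to rerun the correlation on any of the intermediate sets.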
That, friends, is significant. In fact, it says there is a direct relationship between how a team plays down the stretch run of the season and how it will do in the tournament. We threw some data points out to get to this conclusion, but that’s how you go about a statistical analysis. You take the pertinent information and see how it pans out, not the entire data set.
So, this is why we say that ESPN announcers need to watch their words. We have no problem with one of them saying that they don’t “think” or “believe” that how a team does down the stretch should contribute to the selection committee’s decisions. What we do have a problem with is when they make a distinct statement that they have in no way investigated.
In a way, this whole experience has been enlightening. We feel a bit like Carl Monday right now. Doing all the ground work really does pay off in the end when you catch some guy masturbating in a public library. Or an ESPN announcer spouting off at the mouth. Either way, satisfying.