Apr 03 2013

Big Data Still Hasn’t Cracked the March Madness Code

Don’t hate the data; hate the game. Sports analytics is a hot field, but the results haven’t matched the hype so far.

When college basketball fans sat down to fill out their March Madness brackets, many of them probably wondered, “Isn’t there an algorithm that can figure this out?”

The answer? Yes, there is. But the real question is whether machine-selected brackets are accurate. Well, it turns out that big data still has a lot of work to do when it comes to picking upsets in the Big Dance.

According to a report from Science 2.0, Georgia Tech's Logistic Regression/Markov Chain (LRMC) college basketball ranking system predicted that Indiana University, Gonzaga University, the University of Florida and the University of Louisville would be in the Final Four in Atlanta.

The model got one out of four correct: Of the teams scheduled for the Final Four this weekend (Louisville, Wichita State, Michigan and Syracuse), only Louisville was among LRMC's picks.
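
For the scorekeepers at home, here is a minimal sketch of that tally in Python, using only the teams named above. The variable names are illustrative assumptions, and this is not how LRMC itself works; it simply compares the predicted Final Four with the actual one.

```python
# Teams as named in the article; variable names are hypothetical.
lrmc_predicted = {"Indiana", "Gonzaga", "Florida", "Louisville"}
actual_final_four = {"Louisville", "Wichita State", "Michigan", "Syracuse"}

# Set intersection: teams that appear in both the prediction and the real Final Four.
correct = lrmc_predicted & actual_final_four
hit_rate = len(correct) / len(actual_final_four)

print(sorted(correct))  # ['Louisville']
print(f"{len(correct)} of {len(actual_final_four)} correct ({hit_rate:.0%})")  # 1 of 4 correct (25%)
```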

Clearly, the big-data revolution is still in its infancy.

You might think we should give up trying to use big data to predict sports outcomes, but that’s not the scientific way. After all, Thomas Edison is said to have failed some 1,000 times before he arrived at a working light bulb.

Data analytics generally improves as the data sets get larger. We just have to accept that there’ll be a lot more misses before we consistently see big-data slam dunks.

Vasu Kulkarni, a longtime basketball fan and an entrepreneur in the sports analytics space, is candid about the success rate of machines when it comes to betting on the winners.

“The thing everyone’s trying to figure out is: Is there a way for you to find the two or three or four things that will guarantee you a win or at least tip the scale in your favor at any given time. I don’t think anyone has cracked the code,” Kulkarni said in a recent GigaOm article.

Kulkarni is the founder and CEO of Krossover, a startup that processes game footage and breaks out player and game statistics for teams. He attended the MIT Sloan Sports Analytics Conference in Boston earlier this year and realized that, while technology has allowed us to collect more data than ever, we (and the machines) still aren’t doing such a great job of putting the data to good use.

Although big data isn’t great at predicting March Madness winners, it does tell us how fans respond to a game’s final score: They either console themselves with food, or they’re too excited to eat.

GrubHub, an online food-delivery service, tracked food orders and found that requests for comfort food spiked in regions where NCAA teams lost. In regions that were home to March Madness winners, fans skipped the late-night snacks, according to a BostInno article:

In the hour following the game, order data shows a post-game order spike of 10-15% in markets mourning big losses, while post-game eating actually decreased 5-10% in markets celebrating wins.

Here are specific examples from GrubHub’s data:

  • Following the Pitt loss to Wichita St., post-game orders in Pittsburgh spiked by more than 60 percent.
  • Following the Temple loss to Indiana, post-game orders in Philly spiked by more than 30 percent.
  • Following the Cal loss to Syracuse, post-game orders in San Francisco spiked by more than 20 percent.
  • Following the Georgetown loss to Florida Gulf Coast, post-game orders in D.C. spiked by more than 15 percent.

You’ve got to love technology. It might lead you down the wrong path for bracket selection, but at least you can look forward to greasy fried delights if your team loses.

Photo: Adam Glanzman/Flickr (http://www.flickr.com/photos/adamglanzman/8594968091/sizes/l/)