This was my second year at the Sloan Sports Analytics Conference and it was interesting to see the changes that they have implemented from one year to the next. The biggest differences were more attendees and extending the conference to a second day. There were a few people I had wanted to meet up with early on the first day, but given the increased attendance, meeting up with them was like finding a needle in a haystack. Fortunately, the conference attracts a lot of really great people so I was able to meet some wonderful folks through random encounters. A lot of the value of the conference is in the hallway conversations and unfortunately that isn’t something you get when watching the taped sessions online.
The conference overloads you with a wealth of information that is difficult to take in all at once, and scheduling conflicts make it impossible to attend everything you’d like to. My key takeaways were:
- Analysts need to build trust with those within the organization
- The way the results are presented is crucial
- The human element can never be taken out of the game (e.g., shooting free throws in key moments is incredibly difficult)
- Teams are interested in how they can win more games, not player rating systems
- Teams can measure things that outsiders cannot (e.g., missed blocking assignments)
- Season-long averages are often meaningless; variability and recency are more important
- Impact is more important than averages (averages can be inflated by blowouts; teams want to know who is contributing to wins at key moments)
- Swings of chance are large in football so analytics might not always predict the outcome (this doesn’t mean they aren’t useful)
- The psychological makeup of a player is incredibly useful information and hard to quantify. A coach can get a good read of a player’s ability from watching him but it is almost impossible to know ahead of time if the guy loves the game or when his love for the game will leave him.
- Analytics aren’t a replacement for decision making. At the end of the day, someone needs to look at all of the information (beyond analytics) and make a decision. Analytics helps, but it isn’t a complete solution.
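To make the point about averages, variability and recency concrete, here is a minimal sketch. The two scoring logs are invented purely for illustration, and `summarize` and its `window` parameter are hypothetical names rather than anything presented at the conference. Both players have the same season average, but their recent form and consistency look nothing alike.

```python
from statistics import mean, pstdev


def summarize(values, window=5):
    """Return the season average, the average of the last `window` games,
    and the population standard deviation of a per-game log."""
    recent = values[-window:]
    return {
        "season_avg": mean(values),
        "recent_avg": mean(recent),
        "std_dev": pstdev(values),
    }


# Hypothetical points-per-game logs, made up for this example.
steady = [18, 20, 19, 21, 20, 19, 20, 21, 19, 20]
fading = [30, 29, 28, 27, 25, 15, 12, 11, 10, 10]

steady_summary = summarize(steady)
fading_summary = summarize(fading)

for name, summary in (("steady", steady_summary), ("fading", fading_summary)):
    print(name, summary)
```

The season averages match, yet the second player's last five games and much larger spread tell a very different story, which is the kind of thing a single season-long number hides.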
Day 1 started off with a keynote speech by Malcolm Gladwell followed by a panel discussion with Jeff Van Gundy, Daryl Morey, Justin Tuck and Mark Verstegen. Gladwell continued with his theme of needing 10,000 hours of practice to achieve greatness, and the panel quickly devolved into a discussion of whether it was possible for an athlete to be too talented (and therefore not need 10,000 hours of practice). The example used was Tracy McGrady, a basketball player who never reached his full potential, possibly because of a lack of ambition. Boredom was a term thrown around, and I can’t help but think of Ronaldinho as the soccer equivalent. The game comes so easily to him, and he accomplished so much in a short period of time, that it appears he grew bored with it. Oddly enough, players like Messi and Ronaldo, who are clearly the best of the best right now, seem to be immune to this, at least for the moment. In a conversation I once had with someone from Nike’s Sports Marketing department, they said the best thing for Pete Sampras’s career was Andre Agassi. He needed a rival to keep him interested in the game and keep pushing him forward. Although neither Messi nor Ronaldo will admit to being interested in the rivalry, the fact that neither is easing off the accelerator shows how competitive they are.
I next attended the panel on Injury and Performance Analytics. The main theme was that not much research is being done into injury prevention. According to the panel, 70% of injuries in an NFL season occur in the first two weeks of minicamp, when minimal contact is occurring. Many pregame warm-up routines are archaic and haven’t changed in decades. There is lots of room for improvement here and teams are just beginning to take notice. I had to leave the discussion early, so I don’t know if anyone brought up the work happening at Milan Lab, run by AC Milan. For better or worse, these guys seem to have been able to extend the careers of players through research.
After that I attended a talk on using optical tracking data in the NBA. This talk was referenced later on the Soccer Analytics panel because the NBA is just now getting access to the kind of fine-grained tracking data that soccer has had for almost a decade thanks to ProZone, yet the NBA is much further along in terms of the actual analysis. My favorite part of the talk was Sandy Weil’s explanation of the steps he needed to go through to clean up the data. He was fortunate enough to have a play-by-play feed that he could use to cross-check the events in the tracking data, and he ended up throwing out roughly a third of the shots in the tracking dataset because of quality issues. Quality data is crucial to good analysis.
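The cross-checking step described above can be sketched roughly as follows. This is not the method from the talk, just a minimal illustration under my own assumptions: each shot is a dictionary with hypothetical `player`, `clock` (seconds) and `made` fields, and a tracked shot is kept only if the play-by-play feed has a shot by the same player within a small time tolerance.

```python
def match_shots(tracked, play_by_play, tolerance=2.0):
    """Split tracked shots into those confirmed by a play-by-play event
    (same player, game clock within `tolerance` seconds) and the rest."""
    kept, dropped = [], []
    unused = list(play_by_play)  # each play-by-play event may confirm one shot
    for shot in tracked:
        candidates = [
            event for event in unused
            if event["player"] == shot["player"]
            and abs(event["clock"] - shot["clock"]) <= tolerance
        ]
        if candidates:
            # Use the closest event in time and take the outcome from the feed.
            best = min(candidates, key=lambda e: abs(e["clock"] - shot["clock"]))
            unused.remove(best)
            kept.append({**shot, "made": best["made"]})
        else:
            dropped.append(shot)
    return kept, dropped


# Invented sample records, for illustration only.
tracked = [
    {"player": "A", "clock": 101.2, "x": 4.0, "y": 22.5},
    {"player": "B", "clock": 250.0, "x": 12.1, "y": 30.0},
    {"player": "A", "clock": 400.0, "x": 7.3, "y": 18.2},  # no matching event
]
play_by_play = [
    {"player": "A", "clock": 100.0, "made": True},
    {"player": "B", "clock": 251.0, "made": False},
]

kept, dropped = match_shots(tracked, play_by_play)
print(len(kept), "kept,", len(dropped), "dropped")
```

Dropping unconfirmed shots costs sample size, but it means every shot left in the analysis has an outcome you can trust.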
The Referee Analytics panel was new this year (or new to me at least). I was particularly interested in it because of some questionable calls going against Arsenal and some of the analysis done by A Beautiful Numbers Game. To give a sense of the crowd at the conference, Mike Carey was introduced by Bill Simmons as the head referee of Super Bowl XLII, which got a chuckle out of most people. For those not in the know (this is a soccer blog, after all), Bill Simmons’s favorite football team lost that Super Bowl and there was a questionable call that proved to be the turning point of the game. The right call was made. The issue of instant replay was brought up and Carey’s response was “My job is to get the call right. It helps me get the call right, therefore as a referee I like it.” FIFA, are you listening? Mark Cuban was on the panel and although he was quite restrained, not wanting to be fined some more, he did share how some of his complaints had effected change in the NBA. In particular, he noticed that all hired refs were coming from only two college divisions because the head of NBA referees knew the guys in those divisions. Cuban pushed for change so that the NBA could pull from a larger talent pool. He also brought in an advisor from the US military to help train the referees on how to make correct decisions under pressure. I’m not sure how much of this is done in soccer leagues around the world, but it sounds like there are plenty of lessons that could be learned from the NBA and NFL.
That evening was the soccer analysts get-together and I got to put some faces to the names I have been reading for the last few months. It was great to meet (or see again) everyone and find out that a lot of us have common goals and desires. The two main things people are after are better access to data and a central place to share and discuss ideas. Based on that chat I’ve decided to launch Soccer Analysts (clever, right?), a one-stop shop for all the latest in soccer analytics. If you’re interested in having your blog be a part of Soccer Analysts, send me a note with a link to the blog at srudd@onfooty.com. To address the data issue, I was pleased to learn that StatDNA is having a data analysis competition. They are giving entrants access to several hundred matches’ worth of data and asking them to come up with the best analysis. Details about the contest can be found at their blog.
I was only able to attend a limited number of sessions on Day 2, but they were well worth it. I stopped by the Evolution of Sport talk on data mining and it was pretty interesting. The concept of using data mining for sports isn’t novel, but what caught my attention was the way the results were presented. The scenario presented was trying to determine what factors predict whether a baseball team wins the division. Using traditional regression techniques, the results don’t seem very clear because of what was described as projection shadows. Basically, the data gets muddied when trying to project multiple dimensions onto a two-dimensional space. Using a bivariate plot of the factors in the model allowed the picture to become a lot clearer. Going back to one of the key takeaways about presentation being important, I thought this was particularly useful.
Finally I attended the Soccer Analytics panel, the main reason I went to Boston. The panelists hit on the main themes I mentioned at the beginning of the post and there were a few interesting anecdotes. Gavin Fleig talked about the process Bolton used to change their match model to fit their players using analytics, and it was great to hear Ian Graham talk about how that showed up in the Castrol Index and how it exposed a shortcoming of the index. Bolton finished 7th one year, but their players’ cumulative score for the Castrol Index was 17th. The index was tuned to reflect what skills make the average player good, but it didn’t reflect what made a Bolton player good (i.e., creating set pieces in or near the final third). When the index was adjusted to account for Bolton’s requirements of their players, the players ended up scoring extremely well. This highlighted one of the caveats when dealing with analytics: style of play and the assets of the team make comparisons incredibly difficult.
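A toy version of the Bolton anecdote shows why the weighting matters. I have no idea how the Castrol Index is actually computed, so everything below is an assumption for illustration: the stat names, the weights and the player’s numbers are all invented, and the index is treated as a simple weighted sum.

```python
def index_score(stats, weights):
    """Score a player as a weighted sum of per-match stats.
    Stats without a weight contribute nothing."""
    return sum(weights.get(name, 0.0) * value for name, value in stats.items())


# Hypothetical weights tuned to what makes an "average" player look good.
generic_weights = {"passes_completed": 0.6, "dribbles": 0.3, "set_pieces_won": 0.1}

# Hypothetical weights reflecting what one particular team asks of its players.
team_weights = {"passes_completed": 0.2, "dribbles": 0.1, "set_pieces_won": 0.7}

# Invented per-match numbers for a player who mostly wins set pieces.
player = {"passes_completed": 2.0, "dribbles": 1.0, "set_pieces_won": 8.0}

generic = index_score(player, generic_weights)
tailored = index_score(player, team_weights)
print("generic:", generic, "tailored:", tailored)
```

The same player looks ordinary under the generic weights and far better once the weights reflect what his team actually asks of him, which is why cross-team comparisons on a single index are so tricky.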
It was interesting to see Daryl Morey and Mike Zarren (two key basketball guys) sitting in the back of the room during the soccer analytics panel. From chatting with industry insiders, I gathered that one of their main reasons for attending is to learn what other people are doing. It’s promising that there is lots of cross-pollination going on between soccer and basketball.
I was really impressed with the level of soccer presence at the conference. Many EPL clubs sent representatives, although I was shocked to see almost no one from MLS. It would have been the perfect place to announce the Opta/MLS agreement. Also, given how resource-constrained MLS is, I feel like there are loads of opportunities for MLS clubs to gain a competitive advantage. One interesting takeaway from the soccer panel was that everyone agreed identifying talent like Messi and Torres is easy, but only a few clubs can afford them. Where analytics comes in is identifying less obvious talent, or the right talent for your club. This is the exact problem MLS faces.
If you are interested in attending the Sloan Sports Analytics Conference next year, the conference is held every March and tickets go on sale in late November/early December. The conference sells out, so act quickly.


Picking up on your comment on the NBA analysis – they are much further on than in soccer in terms of ‘actual analysis’. In what way do you think this is?