Archive for 2010 MLS Draft

On the MLS Draft: Linear Regression

In my first two posts on the MLS draft I looked at patterns in the 2009 and 2010 drafts. I’ve now started looking at data from five seasons of drafts (2006 to 2010). Again, I’m using minutes played as an indicator of success. I’m not attempting to predict which players should be picked when or who will be successful in MLS; instead I’m trying to reveal patterns that show a breakdown in the draft decision-making process.

First I wanted to look at how the percentage of minutes played changes with a player’s selection number. Aggregating the data from the first two rounds of the 2006 to 2010 drafts, a nice linear pattern emerges. We can use a linear regression to estimate a player’s expected percentage of minutes played and whether they are under- or overperforming that expectation.
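Here is a minimal sketch of that regression in plain Python. The pick numbers and percentages are made up for illustration (they are not the draft data behind this post), and the names `fit_line`, `expected_pct` and `residual` are hypothetical helpers.

```python
# Hypothetical (pick number, % of available minutes played) pairs.
# These values are invented for illustration, not taken from the real drafts.
picks = [1, 5, 10, 15, 20, 25, 30]
pct_minutes = [70.0, 62.0, 50.0, 41.0, 30.0, 22.0, 9.0]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(picks, pct_minutes)


def expected_pct(pick):
    """Expected % of minutes for a given selection number."""
    return slope * pick + intercept


def residual(pick, actual_pct):
    """Positive means overperforming the regression line, negative under."""
    return actual_pct - expected_pct(pick)
```

A player whose `residual` is positive played more than the line predicts for that draft slot; a negative value means they played less.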

Another thing I noticed was how bad teams were at predicting talent. Approximately 25% of first- and second-round draft picks never make an appearance in MLS. Given that the draft is one of the main ways of acquiring talent (although this is changing), I found this number appalling. I wanted to see if any teams were good at drafting players or if it was a random roll of the dice. To estimate draft success I compared how each team’s picks actually did against what the linear regression estimated they should do.
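One way to turn that comparison into a per-team number is to average each team’s residuals (actual minus expected). The sketch below assumes a line has already been fitted; the slope, intercept, team names and records are all made up for illustration, and averaging is only one of several reasonable ways to aggregate.

```python
from collections import defaultdict

# Hypothetical fitted line: expected % minutes = SLOPE * pick + INTERCEPT.
SLOPE, INTERCEPT = -2.0, 70.0

# Invented records: (team, pick number, actual % of minutes played).
draft_picks = [
    ("Team A", 3, 80.0),
    ("Team A", 20, 40.0),
    ("Team B", 5, 30.0),
    ("Team B", 25, 0.0),
]


def team_scores(records, slope, intercept):
    """Mean of (actual - expected) % minutes for each team's picks."""
    residuals = defaultdict(list)
    for team, pick, actual in records:
        expected = slope * pick + intercept
        residuals[team].append(actual - expected)
    return {team: sum(vals) / len(vals) for team, vals in residuals.items()}


scores = team_scores(draft_picks, SLOPE, INTERCEPT)
```

A team with a positive score drafted players who, on average, played more than their draft slots would suggest.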

Philly’s poor performance can be attributed to their strategy of drafting young players, who have had only one season to develop. A few more seasons are needed to determine whether or not their players are panning out. I was surprised to see the Seattle Sounders performing below expectation because Steve Zakuani has been a wonderful pick. However, looking at their other picks, David Estrada and Evan Brown have not performed up to expectations, and Brown has been released from the team. I was really impressed with the LA Galaxy’s record. Three-quarters of their backline this season came from the draft, including newly capped Omar Gonzalez. In fact, half of their picks were starters in the conference semifinals.

I also wanted to look at which universities produce the most successful players.  I was curious to see if some universities were talent pipelines to MLS.

The darker colors indicate a larger number of players drafted. The data was filtered down to universities that had 2 or more players drafted. Wake Forest and Notre Dame tend to have a lot of players drafted, but those players aren’t very successful. I was a little shocked that teams picked from these universities year after year. The University of Maryland, however, seems to consistently produce talent in large numbers. Also of note is that players who didn’t attend college tended to underperform. Some of them certainly didn’t pan out, but others like Brek Shea, Fuad Ibrahim and Jack McInerney still show promise and might just take longer to become everyday starters.
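The filter and summary described above could be sketched roughly like this. The school names and residual values are placeholders rather than real results, and `school_summary` is a hypothetical helper.

```python
from collections import defaultdict

MIN_PLAYERS = 2  # keep only schools with at least this many draftees

# Invented records: (school, residual in percentage points of minutes played).
records = [
    ("School X", 12.0),
    ("School X", -4.0),
    ("School Y", 30.0),
    ("School Z", -10.0),
    ("School Z", -20.0),
    ("School Z", 0.0),
]


def school_summary(rows, min_players):
    """Return {school: (players drafted, mean residual)} for qualifying schools."""
    by_school = defaultdict(list)
    for school, resid in rows:
        by_school[school].append(resid)
    return {
        school: (len(vals), sum(vals) / len(vals))
        for school, vals in by_school.items()
        if len(vals) >= min_players
    }


summary = school_summary(records, MIN_PLAYERS)
```

The count would drive the color shading, and the mean residual shows whether a school’s draftees played more or less than expected.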

On the MLS Draft Part One

I’ve been playing around with Tableau Public a little more and I have to say I’m impressed. I decided to revisit some work I had done on the MLS draft. Mid-season I decided to look at the correlation between a player’s selection spot and the amount of impact they are having with their new team. Currently there is no good metric to estimate impact, so I used minutes played. Yes, it is a very imperfect metric, but it does provide an easy way to compare players of any position. The logic behind using minutes played is that it shows a baseline ability: player X is good enough to make it onto the playing field. If they perform well, they will be selected again; if not, they won’t see much playing time. It tells us nothing about potential or future performance, nor about the quality of those minutes played. Certainly there are unique circumstances at each team that could affect a player’s minutes, but as a stake in the ground to get started I think it’s a decent metric.

When I first looked at the data, I noticed that minutes played seemed to decay exponentially as the selection number increased, with a handful of outliers. The drop-off was a little surprising. It shows that only a handful of players in the draft class are able to come in and make an impact straight away. Looking at the data again, but this time with the ability to filter by position, I noticed that defenders taken later in the draft outperformed their expected minutes. Something to keep in mind if you’re looking for cover in the back and need someone to step up immediately.
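To check an exponential shape like that, one simple approach is to fit a straight line to the logarithm of minutes played. This is a sketch with invented numbers, not the fit used for the viz; players with zero minutes have to be dropped because the log of zero is undefined, which is a real limitation of this method.

```python
import math

# Invented (pick number, minutes played) pairs for a single draft class.
picks = [1, 4, 8, 12, 16, 24, 32]
minutes = [2400, 1900, 1300, 900, 600, 300, 120]


def fit_exponential(xs, ys):
    """Fit y = a * exp(-b * x) by least squares on log(y), skipping y <= 0."""
    pairs = [(x, math.log(y)) for x, y in zip(xs, ys) if y > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_ly = sum(ly for _, ly in pairs) / n
    sxy = sum((x - mean_x) * (ly - mean_ly) for x, ly in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    a = math.exp(mean_ly - slope * mean_x)
    b = -slope  # positive b means minutes decay as the pick number grows
    return a, b


a, b = fit_exponential(picks, minutes)


def predicted_minutes(pick):
    """Minutes the fitted curve predicts for a given selection number."""
    return a * math.exp(-b * pick)
```

Outliers are then the players whose actual minutes sit far above or below `predicted_minutes` for their pick.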

I wanted to play around with Tableau’s mapping features, so I decided to plot each draftee’s university or club and see what patterns appeared. There’s a heavy East Coast bias, with Southern California getting good representation as well. Given that a lot of the traditional college soccer powerhouses are located in the Mid-Atlantic region and Southern California, this wasn’t too surprising. However, it doesn’t reflect the current ranking of college teams. Midwestern schools like Tulsa and Drake, which were in the top 10 of the NSCAA rankings at the end of 2009, didn’t have a single player selected, while schools outside the top 10 had 17 players selected (out of 32). Most surprising was Notre Dame, with 3 players selected. Notre Dame wasn’t in the top 25 at the end of last season, and the players selected have yet to make an appearance in MLS.

This is just the tip of the iceberg. I can’t wait to look at the previous years’ drafts as well as how players progress over the years. Take a look at the viz below and let me know if you notice anything I missed.