
Impact of Santi Cazorla

Santi Cazorla moved to Málaga at the beginning of the 2011-12 season. Villarreal’s fortunes nose-dived, culminating in relegation. Exactly the opposite happened at Málaga: relegation candidates in 2010-11, they finished fourth in 2011-12 and qualified for the Champions League for the first time in the club’s history. Cazorla moved to Arsenal a few weeks ago. This post attempts to quantify the impact of Santi Cazorla.


  • Compare the overall performance of a club with and without Santi using certain Key Performance Indicators (KPIs)
  • T-tests to determine the statistical significance of the differences in the datasets
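
As a rough sketch of the second step, a per-game KPI with and without Cazorla could be compared with a two-sample t-test. The post does not say which variant was used, so this assumes Welch's unequal-variance form. The match values are made up, and the p-value uses a normal approximation so that only the Python standard library is needed.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate two-sided p-value) for two samples."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    se = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    t = (mean(sample_a) - mean(sample_b)) / se
    # Normal approximation to the t distribution (an assumption of this sketch);
    # it is only reasonable for larger samples, e.g. a full season of games.
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p

# Hypothetical points per match (3 = win, 1 = draw, 0 = loss), not the real data.
with_santi = [3, 1, 3, 0, 3, 1, 3, 3, 0, 1]
without_santi = [0, 1, 0, 3, 0, 1, 0, 1, 0, 0]
t_stat, p_value = welch_t(with_santi, without_santi)
print(f"t = {t_stat:.2f}, p ~ {p_value:.3f}")
```

A small p-value would support calling the difference statistically significant. For small samples an exact t distribution is the safer choice.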


  • Full match event data of Villarreal in 2009, 2010 and 2011 seasons provided by OptaPro


  • Giuseppe Rossi was out injured for most of 2011-12. His absence is not factored into the analysis.
  • Cazorla has played in multiple midfield positions at Villarreal. This analysis uses aggregate data of the player in 2009, 2010, and 2011.


Points, goals and goal difference are the top-level KPIs to measure the success of a football club. I used shots, assists and the performance in the final third as lower level offensive KPIs that have a big impact on the top level KPIs. Since Santi Cazorla is an attacking player, I will only compare the lower level KPIs of the attacking side of the ball.

Top-level KPIs

  • Villarreal 2010-11 vs. Villarreal 2011-12
KPI | With Cazorla 2010-11 | Without Cazorla 2011-12 | Delta | Statistically Significant
Total # of games | 37 | 38 | |
Points per game | 1.59 | 1.08 | -32.34% | Yes
Goals per game | 1.35 | 1.03 | -24.05% | Yes
Goal difference per game | +0.16 | -0.38 | -333% | Yes


  • Villarreal with Cazorla & without Cazorla in 2009-10
KPI | With Cazorla | Without Cazorla | Delta | Statistically Significant
Total # of games | 24 | 14 | |
Points per game | 1.50 | 1.43 | -4.76% | Yes
Goals per game | 1.54 | 1.50 | -2.70% | No
Goal difference per game | 0 | +0.07 | - | No



  • Points per game are down in 2009 (with vs. w/o Cazorla).
  • The deltas in Goals per game and GD per game are not statistically significant.
  • Points per game, goals per game and GD per game are drastically down in 2011 compared to 2010. As stated earlier the absence of Giuseppe Rossi for most of 2011 could also have contributed to this huge difference.
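
For reference, the Delta column can be sketched as a relative change from the "with Cazorla" value to the "without Cazorla" value. That formula is an assumption, since the post does not spell it out. Feeding in the rounded per-game figures from the table gives results close to the published deltas but not identical, presumably because the originals were computed from unrounded data.

```python
def pct_delta(with_value, without_value):
    """Relative change from the 'with' value to the 'without' value, in percent."""
    return (without_value - with_value) / with_value * 100

# Rounded per-game values taken from the 2010-11 vs. 2011-12 table.
rows = [("Points per game", 1.59, 1.08), ("Goals per game", 1.35, 1.03)]
for kpi, with_v, without_v in rows:
    print(f"{kpi}: {pct_delta(with_v, without_v):+.1f}%")
```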

Low level KPIs

The low level KPIs look at the impact of Santi Cazorla in the “offensive high-value” zones – the final third and the 18-yard box.

  • Villarreal 2010-11 vs. Villarreal 2011-12
KPI | With Cazorla 2010-11 | Without Cazorla 2011-12 | Delta | Statistically Significant
Shots on target/game
Total shots/game
Penetration into the area (passes completed into the 18-yard box)
Pass completion % in the final third
Successful dribbles/game in the final third
Dribble success % in the final third
Ball recoveries/game in the final third
Interceptions/game in the final third

  • Villarreal with Cazorla & without Cazorla in 2009-10
KPI | With Cazorla | Without Cazorla | Delta | Statistically Significant
Shots on target/game
Total shots/game
Penetration into the area (passes completed into the 18-yard box)
Pass completion % in the final third
Successful dribbles/game in the final third
Dribble success % in the final third
Ball recoveries/game in the final third
Interceptions/game in the final third

  • Shots on target are down by a significant margin in both datasets.
  • Total shots are down in 2011-12 but slightly up in 2009 without Santi Cazorla, but this difference is not statistically significant.
  • Successful dribbles in the final third are down in both datasets, but the difference is significant only in 2010-11 vs. 2011-12.
  • Penetration into the area (passes completed into the 18-yard box) is down by 7.16% in 2011 vs. 2010. However, it is up by 12.31% in 2009 when Santi Cazorla was not in the line-up. These numbers are not statistically significant, so more data is needed to establish the trend.
  • The heat map of Cazorla in 2009 vs. 2010 (below) explains this: in 2009 Santi Cazorla featured a lot on the left wing, compared with a more advanced and central role in 2010.

More observations on the Santi 2009 vs. 2010 heat map comparison

  • Santi Cazorla played more centrally and closer to the penalty area in 2010 compared to 2009
  • Cazorla has a strong influence in the final third as well as the middle third of the pitch. This highlights his versatility in playing almost any position across the midfield, as well as his contribution to the defensive side of the game.

Shots on Target

  • Villarreal 2010-11 vs. Villarreal 2011-12

Green – 18-yard box; Black – halfway line; Brown – final third (attacking)


  • Most shots on target come through the center in both years.
  • The difference erosion plot shows that the median location of the shots on target moves from the left (inside the 18-yard box) towards the center and backwards (away from the opponent’s goal).
    • This indicates less penetration into the 18-yard area.
    • The chances of scoring increase as the shot distance decreases. Here the median shot distance has increased, which suggests a lower conversion rate of shots on target into goals.
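
The median shift described above can be illustrated with a short sketch. The coordinate system (metres from the opponent's goal line and from the left touchline) and the shot locations are assumptions made for the example. The component-wise median is also a simplification of whatever the plotting tool computes.

```python
from statistics import median

def median_location(shots):
    """Component-wise median of (x, y) shot coordinates."""
    xs, ys = zip(*shots)
    return median(xs), median(ys)

def median_shift(before, after):
    """Vector from the median location of `before` to that of `after`."""
    (bx, by), (ax, ay) = median_location(before), median_location(after)
    return ax - bx, ay - by

# Hypothetical coordinates: x = metres from the opponent's goal line,
# y = metres from the left touchline. Not the OptaPro data.
shots_2010 = [(8, 30), (11, 34), (14, 28), (9, 36), (16, 33)]
shots_2011 = [(12, 33), (18, 35), (15, 34), (21, 32), (10, 36)]
dx, dy = median_shift(shots_2010, shots_2011)
print(f"median moved {dx:+} m away from goal, {dy:+} m laterally")
```

A positive first component means the typical shot on target was taken from further out.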

  • Villarreal with Cazorla & without Cazorla in 2009-10 - Shots on Target Difference erosion plot



  • The median of the location of the shots on target has moved from inside the 18-yard box and the center to outside the penalty box and the left – This is a proxy for the lack of penetration into the final third.

Balance in attack

I looked at the percentage of completions from each zone (right, center & left) of the final third.


Note that the size of the center is almost double the size of either left or right zone. The 18-yard box is a part of the center zone.

  • Villarreal 2010-11 vs. Villarreal 2011-12


Share of completed passes in the final third | With Santi | Without Santi | Statistically Significant
% through the left
% through the center
% through the right

  • Villarreal with Cazorla & without Cazorla in 2009-10
Share of completed passes in the final third | With Santi | Without Santi | Statistically Significant
% through the left
% through the center
% through the right

  • In 2010-11, the completions in the final third are split roughly 1:2:1 between left, center and right respectively. This is very close to the ratio of the surface areas of these zones.
  • However, in 2011-12 (in the absence of Cazorla) the ratio is close to 1:1.2:1, which indicates that the attack through the middle suffered.
  • In the 2009 season, the team seems to have had a slightly more balanced attack without Cazorla in the line-up.
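
One way to make the "balance" comparison concrete is to divide each zone's share of completed passes by its share of the area, so that 1.0 means "in proportion to the zone's size". The sketch below uses invented pass counts chosen to mimic the 1:2:1 and 1:1.2:1 splits, and takes the center to be exactly twice the size of each wing, which is an approximation of the note above.

```python
def zone_shares(completed):
    """Share of completed passes per zone, as fractions summing to 1."""
    total = sum(completed.values())
    return {zone: n / total for zone, n in completed.items()}

def area_adjusted(shares, area_weights):
    """Divide each pass share by the zone's share of the area (1.0 = proportional)."""
    total_area = sum(area_weights.values())
    return {z: shares[z] / (area_weights[z] / total_area) for z in shares}

# Assumption: the center is twice the size of either wing.
AREA = {"left": 1, "center": 2, "right": 1}
# Hypothetical pass counts mimicking the 1:2:1 and 1:1.2:1 splits.
balanced = zone_shares({"left": 250, "center": 500, "right": 250})
wide = zone_shares({"left": 250, "center": 300, "right": 250})
print(area_adjusted(balanced, AREA))
print(area_adjusted(wide, AREA))
```

In the second case the center falls below 1.0 and the wings rise above it, which is the pattern the 2011-12 numbers point to.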

Overall Findings – (aka, what Arsenal fans can expect from Santi Cazorla)

  • Santi Cazorla had a positive impact on all of the top-level KPIs. He also had a positive impact on almost all of the low-level KPIs on the offensive side of the ball.
  • Santi Cazorla had a positive impact on the penetration into the 18-yard box.
  • The median distance of shots on target was shorter with Santi Cazorla in the line-up: more penetration = more close-range shots on target.
  • Cazorla had a positive impact on the overall balance of the attack.
  • Cazorla’s 2009 vs. 2010 heat map shows that he has a strong influence on the final third as well as the middle third of the pitch. This indicates his versatility in playing multiple midfield positions and his commitment to defensive duties.

Villarreal 2011-12 – Breaking down a failed season

Villarreal had a disastrous 2011-12 ending in an agonizing relegation to the 2nd division in the dying minutes of the season. In 2010-11 Villarreal reached the semifinals of the Europa League and finished 4th in La Liga to qualify for the Champions League. I focus on the performance of Villarreal in the attacking third of the pitch in 2011 and compare it with that of 2010.


  • Compare performance in the attacking third in 2011-12 and 2010-11.
  • Identify the possible causes for the decline in performance.


  • Divided the final third into 6 zones based on histogram of passes.
  • Visualize and compare the passing in the final third of the pitch in 2011 & 2010 seasons, by zone, position and individual players.


  • In-game event data of all La Liga games of Villarreal in seasons  2010 and 2011 from OptaPro

Tools & techniques

  • Hexagonal binning using R.
  • Tableau Public
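
The post used R for the hexagonal binning. Purely as an illustration of the idea, and not of the R package's implementation, here is a minimal Python sketch that assigns each pass location to the nearest hexagon centre on a pointy-top grid. The cell radius and the pass coordinates are hypothetical.

```python
from collections import Counter
from math import sqrt

def hex_bin(x, y, radius):
    """Return the (column, row) of the pointy-top hexagon containing (x, y)."""
    dx, dy = sqrt(3) * radius, 1.5 * radius  # horizontal / vertical centre spacing
    best = None
    # The containing hexagon's centre lies in one of the two rows bracketing y.
    for row in (int(y // dy), int(y // dy) + 1):
        offset = 0.5 * (row % 2)  # odd rows are shifted half a cell to the right
        col = round(x / dx - offset)
        cx, cy = (col + offset) * dx, row * dy
        dist2 = (x - cx) ** 2 + (y - cy) ** 2
        if best is None or dist2 < best[0]:
            best = (dist2, (col, row))
    return best[1]

# Hypothetical pass end-locations in metres; not the OptaPro data.
passes = [(0.2, 0.1), (0.4, -0.3), (5.1, 0.2), (2.7, 4.4), (2.5, 4.6)]
counts = Counter(hex_bin(x, y, radius=3.0) for x, y in passes)
print(counts)
```

The per-cell counts are what a hexbin plot then maps to colour or cell size.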


  • Excluded the short passes of the short-corners near the corner flag area to get a granular picture of passing in all of the final third.


Figure – 1: Passing summary in the final third 2010-11 vs. 2011-12

The above diagram visualizes Villarreal’s passing in the final third in 2010 & 2011 seasons.

  • The size of the circle is based on the # of pass attempts made in that zone.
  • The number in the middle of the circle is the # of pass attempts made in that zone.
  • The color gradient is based on pass completion % in that zone.
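
The two quantities encoded in the diagram, attempts per zone and completion % per zone, could be derived from raw pass events along these lines. The record format and the sample passes are assumptions. Only the zone labels come from the post.

```python
from collections import defaultdict

def zone_summary(passes):
    """Map zone -> (attempts, completion %) from (zone, completed) records."""
    attempts, completed = defaultdict(int), defaultdict(int)
    for zone, ok in passes:
        attempts[zone] += 1
        completed[zone] += ok  # True counts as 1, False as 0
    return {z: (attempts[z], 100 * completed[z] / attempts[z]) for z in attempts}

# Hypothetical pass records; each is (zone label, whether the pass was completed).
passes = [("Z2", True), ("Z2", False), ("Z2", True), ("Z1", True), ("18y-box", False)]
for zone, (n, pct) in sorted(zone_summary(passes).items()):
    print(f"{zone}: {n} attempts, {pct:.1f}% completed")
```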


  • Pass completion % in the final third went down from 72.6% in 2010 to 69.6% in 2011.
  • Pass completion % is significantly down in all zones of the final third except Z1.
  • The penalty box (18y-box), the left wing (Z5) and the central zone (Z2) saw the biggest drops in pass completion % (8.1%, 5.6% and 5.1% respectively).
  • Passes attempted in the central zone Z2 dropped by 21%.
  • Pass completion in the central zone Z2 declined by 7%.
  • Passes attempted in Z1 increased by 22%.

Figure – 2: Hexbin of all completed passes in the final third of the 2010 and 2011 seasons.

Darker cells indicate more completed passes in the area covered by the cell. The “Counts” legend gives the # of completed passes represented by each shade in the gradient.


  • Villarreal attack appears more balanced in 2010 than in 2011.
  • There is a huge 21% drop in # of passes attempted in the central zone Z2.
  • The hypothesis is a combination of the following:
    1. Lack of penetration through the middle, forcing the midfield players to pass sideways early in the attack.
    2. No good alternative options to fill the gaps left by Cazorla (sold to Málaga) and Rossi (out injured for four-fifths of the season).

Figure – 3: 2010 vs. 2011 difference plot

How to read an erosion difference plot?

  • In the erosion difference plot
    1. The green cells indicate areas with similar amounts of passes in 2010 & 2011
    2. The red cells indicate areas where there were more passes in 2010 but not in 2011
    3. The cyan cells indicate areas where there were more passes in 2011 but not in 2010
    4. The white cells indicate the median position of passes.
    5. The arrow indicates the shift of the median from 2010 season to 2011 season.
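
To make the colour coding concrete, here is a minimal sketch of how cells could be labelled from two sets of per-cell counts. It follows the reading described above rather than the plotting tool's actual algorithm. The tolerance for "similar" and the counts are invented for the example.

```python
def classify_cells(counts_2010, counts_2011, tolerance=2):
    """Label each cell 'similar', 'more_2010' or 'more_2011' by its count difference."""
    labels = {}
    for cell in counts_2010.keys() | counts_2011.keys():
        diff = counts_2011.get(cell, 0) - counts_2010.get(cell, 0)
        if abs(diff) <= tolerance:
            labels[cell] = "similar"      # green in the plot
        elif diff < 0:
            labels[cell] = "more_2010"    # red in the plot
        else:
            labels[cell] = "more_2011"    # cyan in the plot
    return labels

# Hypothetical completed-pass counts per hexagonal cell, keyed by (column, row).
c2010 = {(0, 0): 12, (1, 0): 20, (2, 0): 9}
c2011 = {(0, 0): 18, (1, 0): 11, (2, 0): 10, (3, 0): 1}
print(classify_cells(c2010, c2011))
```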


  • The red cells in the middle indicate the decline in completed passes through the middle in 2011.
  • The cyan cells on the right and left indicate an increase in completed passes on the wings.

The rest of the post breaks down these numbers by positions and players.


I compared the top 4 starters in the midfield (by minutes played) in 2010 to 2011.

2010 – Borja Valero, Bruno Soriano, Santi Cazorla and Cani

2011 – Borja Valero, Bruno Soriano, Marcos Senna and Cani

Figure – 4: 2010 vs. 2011 Midfield


  • Villarreal midfielders had trouble completing passes in the central zone Z2 in 2011
  • It seems like the midfielders were forced to pass sideways early in the attacks to Z1 & Z3.

Figure – 5: Difference erosion plot – Midfield


  • This plot reinforces what we saw in Figure – 4. Red cells = passing through the middle suffered.
  • The overall median of completed passes shifted from left to right, as indicated by the arrow.

Now let us drill into the data of the players that make up the midfield.

Santi Cazorla

Figure – 6: Santi Cazorla’s 2010 passing in the final third.


  • The two-footed Cazorla had a strong influence in the center (Z2) in 2010.
  • His absence on the field was felt in 2011. Villarreal’s passing through the middle suffered in quantity (down by 21%) and completion % (down by 7%)

Borja Valero

Figure – 7: Completed passes in the final third, 2010 vs. 2011 – Borja Valero


  • Borja Valero’s passing was more balanced across Z1, Z2 & Z3 in 2010 compared to 2011.
  • In 2011 Borja Valero was more active in Z1 & Z3 and less active in the central Z2.
    1. This indicates a lack of penetration through the middle. Opponents seem to have forced Borja to pass to the right or left as soon as he got the ball in the final third.
    2. Borja might not have found outlets early enough to pass the ball through the center and was probably forced to pass it sideways to keep possession.

Figure – 8: Difference erosion plot – Borja Valero


  • The plot highlights the earlier findings: increased passing in 2011 on the right & left (Z1 & Z3) and decreased passing in Z2 compared to 2010.
  • The plot shows that the median of Borja’s passes has shifted to the right.

Bruno Soriano

Figure – 10: Completed passes in the final third, 2010 vs. 2011 – Bruno Soriano


  • Bruno’s zone of influence seems to be the left midfield.
  • He had more influence in the final third in 2011 compared to 2010.
  • Bruno was a lot more adventurous in 2011. The # of hexagons in and around the 18y-box is higher in 2011 than in 2010. Bruno scored his first career La Liga goal, and 3 goals in total, during the 2011 season.

Marcos Senna
Figure – 11: Completed passes in the final third, 2010 vs. 2011 – Marcos Senna


  • Marcos Senna was injured a lot in 2010 and didn’t play much.
  • As a right central midfielder in a double-pivot, his influence is on the center-right side of the pitch.
  • Along with Bruno, he has been the bedrock of this shaky and inconsistent Villarreal side.


Cani

Figure – 12: Completed passes in the final third, 2010 vs. 2011 – Cani


  • The plot shows Cani’s influence is predominantly on the left side.
  • His passing through the middle Z2 and the right zone Z3 seems to have suffered in 2011.
  • Dribbling and running at the opposition defenders is a key aspect of Cani’s game. To that effect his interventions near the 18y-box seem to have reduced in 2011.

Figure – 13: Cani difference erosion plot


  • The median of Cani’s completed passes has moved slightly left and backwards (away from the opponent’s goal).
  • The red cells closer to 18y-box imply that his influence in the vicinity of the 18y-box has decreased in 2011.
    1. This points to lack of creativity and penetration through the middle.
    2. Cani is probably forced to dribble from wide positions too early in the attack, making it easier for defenders to defend him.


2010 – Nilmar, Rossi, Ruben

2011 – Ruben, Nilmar, Martinuccio, Joselu, Rossi

Figure – 15: Completed passes in the final third, 2010 vs. 2011 – Forwards


  • More completed passes by forwards in the final third in 2010 vs. 2011, especially through the middle.
  • Villarreal forwards didn’t get much service through the middle in 2011

Figure – 16: Forwards – Difference erosion plot


  • The median of completed passes for forwards shifted backwards (away from the opponent’s goal) by about 5 meters.
    1. This could be a pointer to something deeper, like a lack of penetration or creativity in the final third, forcing the forwards to come deeper to receive the ball.

Giuseppe Rossi

Figure – 17: Completed passes in the final third, 2010 vs. 2011 – Giuseppe Rossi


  • Rossi plays across all zones in the final third and is especially strong in the central zone Z2.
  • In 2011 Rossi suffered a season-ending cruciate ligament injury in week 8 of La Liga.
  • Rossi’s absence was felt in the central zone of the final third in 2011. Villarreal’s passing through the middle suffered in quantity (down by 21%) and completion % (down by 7%).

Figure – 18: Rossi difference erosion plot


The eroded difference plot gives an idea of the shifts in Rossi’s positioning from 2010 to 2011. Please note that we are comparing a relatively small dataset for 2011 (8 games) to 35 games in 2010.

  • The median of Rossi’s passes has moved about 5 meters backwards (away from the opposition goal).
    1. This is a pointer to something deeper, like a lack of penetration or creativity in the final third, forcing Rossi to come deeper to receive the ball.


Nilmar

Figure – 19: Completed passes in the final third, 2010 vs. 2011 – Nilmar


  • Nilmar missed a lot of the 2011 season through injury or through the coach’s decision not to play him because of the rumors around his transfer in January.
  • When he played, he wasn’t effective.
  • The low # of completed passes in 2011 could be due to:
    1. Lack of service
    2. Villarreal playing a single-striker formation in 25 of the 38 games

Figure – 20: Nilmar difference and erosion plot


  • Nilmar’s median of passing shifted backwards (away from the opponent’s goal) and towards the center.
    1. This indicates a lack of supply to Nilmar in advanced positions, forcing the forward to come deeper to receive service.

Marco Ruben

Figure – 21: Completed passes in the final third, 2010 vs. 2011 – Marco Ruben


  • Ruben wasn’t a regular starter in 2010. This partly explains the bigger influence of Ruben in 2011 compared to 2010.
  • Villarreal played with a lone striker far more often in 2011 (27 of 38 games) than in 2010.
  • The dark hexagons in Z3 (right) and Z2 (middle) could be areas where he came deep to receive long passes.

Figure – 22: Ruben difference and erosion plot


  • The erosion difference plot shows that the median of Ruben’s passes moved about 7-8 meters deeper (away from the opponent’s goal) and shifted to the right from a more central position
    1. This implies Ruben had to come deeper to receive the ball.
    2. Moving away from the center also implies lack of service when Ruben was in advanced positions.


Figure – 23: Completed passes in the final third, 2010 vs. 2011 – Right backs


  • More attacking from the right wing back position in 2011 compared to 2010, especially in Zone 3.

Figure – 24: Completed passes in the final third, 2010 vs. 2011 – Left backs


  • The plot shows more attacking from the left back position in 2011 than in 2010.
  • Joan Oriol, who tends to go forward often, featured a lot at LB in 2011.


Summary of Findings

  • Final third passing data indicates that Villarreal’s attacking in the final third shifted from the center to right and left wings in 2011.
  • Data indicates that the absence of Santi Cazorla and Giuseppe Rossi contributed to the weakness of Villarreal through the middle.
  • Borja Valero’s overall influence increased in 2011. However, his passing through the middle declined in 2011 while his passing on the right and left wings increased.
    1. This is probably due to Villarreal’s attacks being pushed out wide early (and quite far away from the 18y-box), reducing their effectiveness.
  • The median of passing for Rossi, Cani, Marco Ruben and Nilmar has moved backwards (away from the opponent’s goal).
    1. This could be a sign of forwards being starved of service, forcing them to come deeper to get the ball.
    2. The passing median also shifted right or left for all midfielders and forwards, further reinforcing the premise of a lack of penetration through the middle.
  • The wingbacks seem to have supported the attack better in 2011. But the fact that the ball was pushed wide early meant that opponents were able to defend Villarreal’s attacks with greater success and ease.
  • Lack of penetration through the middle could also be due to slow build-up of attacks because of the lack of outlets up front.
    1. The team missed Rossi’s runs off the ball to create space to receive the through ball.
    2. Villarreal also used 1-striker formations a lot in 2011 (25 out of 38 games), as opposed to their more common 4-4-2 formation, due to a variety of personnel and tactical issues (3 different coaches in one season).


Interview with Simon Kuper, author of Soccernomics

In this special podcast recorded for Forza Futbol, we have Simon Kuper, author of the recently published 2nd edition of Soccernomics.
Simon is also the author of “Football Against the Enemy” and “Ajax, the Dutch, the War”, among others. He writes a weekly column in the Financial Times.


Simon joined me from Ukraine to answer a few questions about the Soccernomics second edition, state of Soccer analytics, Soccer in USA and some current topics like the new Premier League TV deal.

Interview length : ~25 minutes


Here are a few quotes from what we discussed.

1. What do you think of Euro 2012 so far?

“Soccer has been great, players seem more relaxed than at world cup. Euro is less important so it has been more fun.”


2. Why a 2nd edition?

3. What is going on at Liverpool?

“I think nobody quite knows what a Moneyball of soccer will be. I think that was the problem at Liverpool”

About the new management set-up at Liverpool

“Liverpool management is using wisdom of crowds”

4. Technical director role in England

5. Do coaches matter? What impact does a coach’s style of play have on the performance of the team, and how does it show up in the coach’s rating?

“Identity of the coach is not as important as people think”

“The idea that the coach is this great motivator, I dismiss. The idea that he is this great tactician, a few not many”

6. Who are the top tacticians in the game today?

“I think Wenger’s great gift is recruiting rather than tactics, probably true with Ferguson as well”

“One problem with successful managers like Mourinho & Louis Van Gaal is they are egomaniacs. They think they know it all”

7. What is your take on Pep Guardiola?

“Guardiola created this system of very elaborate rules almost like an NFL playbook”

8. Soccer clubs don’t make money: What is the objective of financial fair-play rules?

Soccer Analytics

“In areas like freekicks, corners and penalties, data has already become extremely significant in Soccer”

9. Houston Rockets GM Daryl Morey says analysts are a commodity and what matters is the data. What do you think of that when most clubs are using the same data?

“I think the problem in football is that clubs don’t know how to analyse data”

10. Power of agents – We recently did a post on the power of agents in football and found out that 50% of all the players of EPL are represented by 20 player agencies

- What impact does this have on the efficiencies in the transfer market?

 Current Issues

11. The new Premier League deal is worth 3 billion pounds for 2013 – 2016 (it doesn’t include overseas rights). Are you surprised by the size of it? How can you explain that in this tough financial climate?

12. What does this mean for other leagues? Do you think this will create a bubble like we saw a decade ago?

“The EPL is becoming the NBA of soccer”

13. What does the Rangers demise mean to Scottish football?

“Football clubs never disappear. Soccer can survive with less money.”

Soccer in USA

14. How far do you think USA has come as a soccer country?

“US now is the most soccer country it has ever been”

15. What are the 3 things that you would change in US soccer if you had the power?

“getting American 7-year-olds to think in terms of space”

16. What is your next book going to be about?

Also listen to our interview with Prof. Stefan Szymanski, co-author of Soccernomics.

We thank Simon a lot for his time. As always, it was great listening to his insights and sharing them.

Interview with Prof. Stefan Szymanski, author of Soccernomics

In this special podcast I recorded for Forza Futbol, I talked to Professor Stefan Szymanski, economist and author of the recently published Soccernomics 2nd edition. Stefan is a Professor of Kinesiology at School of Kinesiology, University of Michigan and is an author of various books on economics in sports.

We talked about the contents of Soccernomics, soccer analytics, financial fairplay and Soccer in the US.



Here are the excerpts of interest to this blog.


1. Do coaches matter? How much of an impact does a coach’s style have on the performance of the team?

2. Youth development in Spanish League and EPL

3. Soccer clubs don’t make money: What about the Financial Fair Play (FFP) regulations?

Soccer Analytics

4. Houston Rockets GM Daryl Morey says analysts are a commodity and what matters is the data. What do you think of that when most clubs are using the same data?

“Analysts are a commodity and so is data”

“The problem with Moneyball is not the difficulty of getting the data or identifying the right strategy, but how you stop people from copying your strategy”

“Can data help you to be unpredictable in what you do in a way to give you an edge?”

5. Where do you think soccer analytics have made the most impact?
“Fitness and scouting”

6. Where do you think soccer analytics have made the least impact?
“tactics”– Liverpool example

“analytics is only going to work well strategically when people get the game theory issues and actually start developing that kind of randomness which is impossible to defend against”

5.    Power of agents in Football - We recently did a post on the power of agents in football and found out that 50% of all the players of EPL are represented by 20 player agencies

What impact does this have on the efficiencies in the transfer market?

6.    What are your thoughts on match fixing and do you think it is widespread across Europe & UK?

“Legalizing betting will eliminate a lot of the fixing because the betting companies will have a lot at stake to ensure there is no fixing”

 Soccer in USA

7.    How far do you think USA has come as a soccer country?

8.    What are the three things that you would change in US soccer if you had the power?

9.    Why is it that Athletic Club of Bilbao has been so successful over the years and yet their catchment area is very small and big country like US with a huge population that plays soccer is unable to produce consistent # of quality players?

10.    What are you going to write about in your next book?

We thank Stefan a lot for spending time to record this interview.

Performance analysis: Villarreal 2011-12

Villarreal had a disastrous 2011-12. They were relegated to the Spanish Segunda Division in the dying minutes of the season.

The obvious question was “What went wrong?”

The easy answers are the exit of Santi Cazorla and the long-term injury to Giuseppe Rossi. But that doesn’t fully explain how a team that featured in the Champions League group stage could get relegated.

This post compares the 2011-12 season to the 2010-11 season statistically to try and answer the question.


  • Publicly available Player rating, Opposition, Formation and results of every La Liga game of Villarreal in the seasons 2010-11 and 2011-12 from
  • Whoscored rating is based on custom algorithms that use in-game stats collected by Opta.

Data Shortcomings

  • The rating is a single number, and we don’t really know how it is calculated.
  • However it allows us to make relative comparisons.


  1. Compare the ratings of the goalkeeper, the defense, the wing backs, the midfield and the forward lines of 2012 to those of 2011.
    *Excluded the games vs. teams that got relegated (Almeria, Hercules and Deportivo) and promoted (Rayo Vallecano, Granada and Real Betis) at the end of the 2010-11 season. This allows for comparing performance on a per opponent basis.
  2. Demonstrate that the results are statistically significant (T-test)
  3. Compare the overall team ratings with 4231 and 442 in the 2012 season
  4. 2012 progress chart with major events.
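
The per-opponent comparison in step 1 can be sketched as follows: keep only opponents faced in both seasons, which drops the relegated and promoted teams automatically, then take the difference in mean rating. All ratings below are invented, and the data layout is an assumption.

```python
from statistics import mean

def per_opponent_delta(ratings_prev, ratings_curr):
    """Mean rating change per opponent, restricted to opponents faced in both seasons."""
    common = ratings_prev.keys() & ratings_curr.keys()
    return {opp: mean(ratings_curr[opp]) - mean(ratings_prev[opp]) for opp in sorted(common)}

# Hypothetical midfield ratings (home and away game) keyed by opponent.
season_2011 = {"Valencia": [6.8, 6.6], "Getafe": [7.1, 6.9], "Almeria": [7.0, 6.5]}
season_2012 = {"Valencia": [7.0, 6.9], "Getafe": [6.4, 6.6], "Granada": [6.7, 6.2]}
deltas = per_opponent_delta(season_2011, season_2012)
print({opp: round(d, 2) for opp, d in deltas.items()})
```

Negative values mean the rating against that opponent was worse in the later season.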

Midfield (Attacking midfielders + Central midfielders)

Figure 1 – Average midfield rating in 2011 and 2012 seasons by opponents
The opponents are sorted (left to right) based on their final position of the 2011-12 season.

Figure 2 – Delta in midfield rating between 2011 & 2012 by opponent
(negative means the rating was worse in the 2012 season)

2012 better than 2011 (delta >= +0.2) 8
2012 worse than 2011 (delta <= -0.2) 17
Almost even in both seasons (Delta between -0.2 and +0.2) 7
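
The three buckets above can be reproduced from a list of rating deltas with a simple threshold rule. The ±0.2 cut-offs come from the table. The deltas themselves are made up for illustration.

```python
from collections import Counter

def bucket(delta, threshold=0.2):
    """Classify a rating delta using the +/-0.2 cut-offs from the table above."""
    if delta >= threshold:
        return "better"
    if delta <= -threshold:
        return "worse"
    return "even"

# Hypothetical deltas (2012 rating minus 2011 rating), one per fixture.
sample_deltas = [0.35, -0.4, 0.1, -0.2, 0.0, -0.75, 0.2]
tally = Counter(bucket(d) for d in sample_deltas)
print(tally)
```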

Figure 3 – Most frequent midfield starters in 2011 & 2012


  • Figure 1 shows a drop-off of 0.2 points in the average midfield rating from 2011 to 2012 (significant at the 95th percentile).
  • The Standard deviation (-1, 1) bands indicate that the 2012 midfield was consistently worse than the 2011 midfield.
  • Figure 2 shows that Villarreal's midfield played poorly against a lot of teams that finished in the bottom half of the table. Getafe, Espanyol, Málaga and Athletic.  They took just 8 out of the 24 points available against these 4 teams.
  • The midfield fared better in 2012 than in 2011 vs. Valencia but gave up late goals and took only 1 out of 6 points vs. their derby rivals.
  • Against Sporting and Sevilla, the midfield improved on last year both home and away (they took 10 out of 12 points against them this season).
  • Villarreal’s midfield played better against Real Madrid & Barcelona at home than they did last year.  Earned two draws in 2012 vs. 2 losses in 2011.
  • Figure 3 shows Santi Cazorla left a huge gap in the starting line-up for 2012.

Wingbacks (Leftback & Rightback)

Figure 4 – Average wingback rating in 2011 and 2012 seasons by opponents

Figure 5 – Delta in Wingback rating between 2011 & 2012 by opponent

2012 better than 2011 (delta >= +0.2): 10
2012 worse than 2011 (delta <= -0.2): 17
Almost even in both seasons (delta between -0.2 and +0.2): 5

Figure 6 – Most frequent Wingback starters in 2011 & 2012


  • Figure 4: The average rating at wingback dropped only slightly from 2011 to 2012 (significant only at the 80% confidence level), despite six different players starting at least 6 games there in 2012 as opposed to only 4 in 2011 (Figure 6).
  • Figure 4: Standard Deviation band implies that the performances at wingback were more inconsistent in 2012.
  • Figure 5: The worst performance at Wingback was against Barça in a 5-0 loss at the Camp Nou. Joan Oriol and newly signed Cristian Zapata were the starting wingbacks in that game.

Defence – Centerbacks

Figure 7 – Average Centerback rating in 2011 and 2012 seasons by opponents

Figure 8 - Delta in Centerback rating between 2011 & 2012 by opponent

2012 better than 2011 (delta >= +0.2): 10
2012 worse than 2011 (delta <= -0.2): 19
Almost even in both seasons (delta between -0.2 and +0.2): 3

Figure 9 – Most frequent Centerback starters in 2011 & 2012


  • Figure 7 shows that the average Centerback rating dropped 0.3 points (significant at the 95% confidence level).
  • The biggest drop off at the Center back position was against Osasuna at home in a 1-1 draw. In the 2011 season Villarreal won this fixture 4-0.
  • The center-backs have performed poorly against teams that finished in the bottom half of the table.


Figure 10 – Average rating of Forwards in 2011 and 2012 seasons by opponents

Figure 11 – Delta in Forwards’ rating between 2011 & 2012 by opponent

2012 better than 2011 (delta >= +0.2): 7
2012 worse than 2011 (delta <= -0.2): 20
Almost even in both seasons (delta between -0.2 and +0.2): 5

Figure 12 – Most frequent starters at Forward in 2011 & 2012


  • Figure 10: Unsurprisingly, Villarreal saw the worst performance drop-off at the Forward position with a 0.6 point drop in average rating (significant at the 99% confidence level).
  • Figure 12: The team hasn’t been able to find a serviceable replacement for Giuseppe Rossi all season. Nilmar’s injuries and lack of form didn’t help matters at all.
  • The team switched to an unfamiliar 4-2-3-1 formation with Marco Ruben as the lone forward. He too missed significant time due to niggling injuries.
  • The two worst drop offs in the performance of the forward line were against Levante and Espanyol, both at home. In 2011 Villarreal beat Levante 2-1 and beat Espanyol 4-0 at home.
  • The best rises in the ratings of the forward line were against Sevilla (home draw & away win) and the relegated Sporting (2 wins).


Figure 13 – Average rating of Goalkeeper in 2011 & 2012 seasons by opponents

Figure 14 – Delta in Goalkeepers’ rating between 2011 & 2012 by opponent

2012 better than 2011 (delta >= +0.2): 12
2012 worse than 2011 (delta <= -0.2): 9
Almost even in both seasons (delta between -0.2 and +0.2): 9


  1. Villarreal’s goalkeeping saw the least drop-off in the rating from 2011 to 2012 (no significant change)
  2. Based on the Standard deviation bands, Diego Lopez was more consistent in 2012 than in 2011
  3. Diego Lopez’s worst drop-offs in performance came against Málaga (away loss vs. an away in 2011), Sevilla (Home draw vs. a home win in 2011) and Athletic (Away draw vs. an away win in 2011)
  4. Diego Lopez’s best performance was against Sevilla (away win).

Formation comparison: 4-4-2 vs. 4-2-3-1 in 2011-12

Figure 15


  1. Figure 15 shows that despite playing 4-4-2 against tougher opposition (with Barça and Real Madrid home & away) Villarreal has performed better in 4-4-2 than 4-2-3-1/4-3-3
  2. This underlines:
    1. the importance of a serviceable 2nd forward in the squad
    2. Villarreal performs better in 4-4-2.
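
A formation comparison of this kind boils down to grouping the team rating of each game by the formation used and averaging each group. A minimal sketch, assuming one (formation, team rating) pair per game; the ratings are placeholders and `average_rating_by_formation` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

def average_rating_by_formation(games):
    """games: iterable of (formation, team_rating) pairs -> {formation: mean rating}."""
    grouped = defaultdict(list)
    for formation, rating in games:
        grouped[formation].append(rating)
    return {formation: mean(ratings) for formation, ratings in grouped.items()}

# Placeholder games (NOT the real 2011-12 ratings).
games = [("4-4-2", 6.9), ("4-2-3-1", 6.6), ("4-4-2", 7.1), ("4-2-3-1", 6.8)]
averages = average_rating_by_formation(games)
for formation, avg in sorted(averages.items()):
    print(f"{formation}: {avg:.2f}")
```

A plain average ignores the strength of the opposition, which is why the observation above notes that 4-4-2 was used against the tougher opponents.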

Season Progress & major events

Figure 16


  • 2 – Longest winning streak
  • 4 – Longest undefeated streak
  • 14 – Draws
  • 3 – Coaches
  • 14 – Villarreal failed to score in 14 games this season
  • 26 – Villarreal scored 1 goal or less in 26 games this season
  • 39 – Villarreal only scored 39 goals this season.

Nervous endings – the struggles of Villarreal to close out games

Figure 17


  1. 15 – Villarreal lost 15 points in the last 10 minutes. They lost leads in 10 games (lost 5, drawn 5).
  2. 6 – Villarreal was able to gain only 6 points in the last 10 minutes. They came back in 4 games (won 2, drawn 2).
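
The two totals can be reproduced with the usual 3/1/0 points system. The sketch below rests on an assumption about how the games break down, since the post does not spell it out: the 5 late losses are treated as draws that turned into defeats (1 point dropped each), the 5 late draws as wins that turned into draws (2 points dropped each), and the comebacks as 2 draws turned into wins plus 2 defeats turned into draws. The helper name `points_swing` is hypothetical.

```python
POINTS = {"W": 3, "D": 1, "L": 0}

def points_swing(result_at_80, final_result):
    """Points gained (positive) or dropped (negative) after the 80th minute."""
    return POINTS[final_result] - POINTS[result_at_80]

# Assumed breakdown: 5 draws became losses, 5 wins became draws.
dropped = 5 * points_swing("D", "L") + 5 * points_swing("W", "D")

# Assumed breakdown: 2 draws became wins, 2 losses became draws.
gained = 2 * points_swing("D", "W") + 2 * points_swing("L", "D")

print(dropped, gained)  # -15 6
```

Under that reading the net swing in the final 10 minutes is 9 points against Villarreal.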

Whoscored Ranking vs. League position

Figure 18


  • Villarreal’s average rating at the end of the season is at #11 despite finishing 18th in the league.



  • Rossi’s absence and Cazorla’s exit were huge. But there was a drop-off in performances in the other areas of the pitch, most notably at the Centerback position.
  • Injury to Rossi and the lack of an in-form 2nd forward also forced the team to move away from their optimal formation. They lined up in 4-4-2 in only 34% of the games (13/38).

Conceding late goals

  • Losing a lead after the 80th minute in 10 out of 38 games in a season (26.3%) is very bad. They missed out on safety by 1 point: needing one point to stay up, they lost the last game of the season to an 88th minute goal by Falcao.
  • This possibly explains why Villarreal’s overall rating at the end of the season is 11th despite finishing 18th in the league standings.

    • Could this be a fitness issue?
    • Did players get tired mentally or physically towards the end of the games?

 I would love to explore further to find answers for these questions. But I don’t have data to work with.


Lack of goals

  • Villarreal failed to score in 14 games. The best result possible in such games is only a draw.
  • Villarreal scored exactly 1 goal in another 12 games (26 games with 1 goal or less, minus the 14 with none). It is hard to win a lot of games 1-0 with an inconsistent defense.


  • The longest winning streak of the season was 2 games. The longest unbeaten streak of the season was 4 games (WDWW).
  • There was a drop off in performances in all positions except at GK. Most acute drop-off was seen in the Forward and Centerback positions.
  • Poor performances against teams in the bottom half of the table.

Agents in Football – Focus on EPL

Club football seasons across most of Europe have ended. The lack of serious football (apart from the Euro 2012 in a few weeks) means the off-field stuff like transfer rumors and player agents take the center stage. Recent news items like the deal of Bebe to Manchester United highlight the inefficiencies and the lack of transparency in the transfer market.

In the first post of a series of posts on this topic, we take an in-depth look at the top player agencies in the English Premier League over the past 5 years.


- All data is taken from the website Transfermarkt.

- The agent information is not available for about 38.9% of the total number of players. However they accounted for only 15.7% of the overall market value.
This number includes players

  • whose market value is low
  • who have retired over the past 5 years and
  • who have agents but the information is not available

- We have excluded the set of data with no agent info unless explicitly stated.


- Players rarely change agents. They change teams much more often.


- Histograms of Agent-Club linkage based on:

  • # of players of a team having the same agent
  • Total market value of players of a given agent who play for the same team

- Visualized Agent-Club relationship using heat maps
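
Both linkage measures come from the same agent-by-club matrix, filled once with player counts and once with market values. A minimal sketch of how such a matrix could be built, assuming one record per player; the record layout, the helper name `build_agent_club_matrix` and all sample names and values are hypothetical, not Transfermarkt data.

```python
from collections import defaultdict

def build_agent_club_matrix(players):
    """Return (counts, values): nested dicts keyed by agent, then by club."""
    counts = defaultdict(lambda: defaultdict(int))
    values = defaultdict(lambda: defaultdict(float))
    for player in players:
        if player["agent"] is None:  # skip players with no agent info
            continue
        counts[player["agent"]][player["club"]] += 1
        values[player["agent"]][player["club"]] += player["value_mil"]
    return counts, values

# Placeholder records.
players = [
    {"name": "Player 1", "club": "Club A", "agent": "Agency X", "value_mil": 10.0},
    {"name": "Player 2", "club": "Club A", "agent": "Agency X", "value_mil": 5.5},
    {"name": "Player 3", "club": "Club B", "agent": "Agency X", "value_mil": 20.0},
    {"name": "Player 4", "club": "Club B", "agent": "Agency Y", "value_mil": 3.0},
    {"name": "Player 5", "club": "Club A", "agent": None, "value_mil": 1.0},
]

counts, values = build_agent_club_matrix(players)
for agent in sorted(counts):
    for club in sorted(counts[agent]):
        print(agent, club, counts[agent][club], values[agent][club])
```

Each cell of `counts` or `values` then maps to one cell of the corresponding heat map.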

Figure 1:  Histograms – Breakdown by Market Value per Agent (in Mil)


Figure 2: Histogram – Total player per agent

Histogram of % of players per agent

Figure 3: Histogram of player % and market value %



  • More than 50% of the players and of the market value in the transfer market are controlled by a handful of agents.
  • 20 agencies control 50% of the player transfer market. The other 50% is spread across 279 agencies.
  • Some agents have very few clients, but those clients are superstars, so their % of market value is disproportionate to the # of players in their clientele. A good example of this is Gestifute, the agency run by Jorge Mendes, the agent of the likes of Cristiano Ronaldo, Nani et al. Gestifute is ranked 5th in the market value map but only 11th in the player count map. See the histogram of player % & market value %. Another example is Pinhas Zahavi, agent of Carlos Tevez.
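
The "N agencies control 50% of the market" figure is a cumulative share calculation: sort agencies by market value, then count how many are needed to reach the target share. A minimal sketch with placeholder values; `agencies_needed_for_share` is a hypothetical helper, not the code behind the histograms.

```python
def agencies_needed_for_share(value_by_agency, target_share=0.5):
    """Smallest number of top agencies whose combined value reaches target_share."""
    total = sum(value_by_agency.values())
    running = 0.0
    ranked = sorted(value_by_agency.values(), reverse=True)
    for rank, value in enumerate(ranked, start=1):
        running += value
        if running >= target_share * total:
            return rank
    return len(value_by_agency)

# Placeholder market values in millions (NOT real data).
value_by_agency = {"Agency A": 40, "Agency B": 25, "Agency C": 15, "Agency D": 10, "Agency E": 10}
print(agencies_needed_for_share(value_by_agency))
```

The same function works for player counts instead of market values, which is how an agency can rank differently on the two maps.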

Figure 4: Heat Map of Agent-Club by player count

Figure 5: Heat Map of Agent-Club by market value

Figure 6: Heat map by % of players of a team with an agent. (Players whose agent info is unavailable have been included in this calculation)


Figure 7: Player Percentage per team sorted by the highest % of players of a team with the same agent.


  • 10 – Maximum number of players of the same club with the same agent. Sunderland & Stellar Football. (Figure 4 – C)
  • € 145.5m – The highest amount of market value of players of a team linked to the same agent. Manchester United and Gestifute. (Figure 5 – E)
  • 27 – Stellar Football’s players played in 27 of the 29 EPL teams over the last 5 years. This is the maximum number. (Figure 4 – D)
  • 103 – Total number of Stellar Football’s players who played in EPL over the past 5 years (Figure 4 – D)
  • 6 – Arsenal has the highest # & % of players without an agent among the players whose agent info is available. There could be more players with no agents among the players whose agent info is not available. (Figure 4 – A, 6)
  • 14 – EPL clubs dealt with 14.15 agents on average over the last 5 years. The median is 14. (Figure 7)
  • 22 – Arsenal, Tottenham Hotspur and Portsmouth have the highest diversity of agents over the past 5 seasons. Fulham is a close second with 21. (Figure 7 – G)
  • 5 – Norwich City (5) and Swansea City (6) have the least diversity of agents among their player ranks. Both teams were promoted to the EPL at the beginning of this season. (Figure 7 – H)
  • The differences between the market value heat map and the player count heat map are due to the fact that some agencies have a lot of average-to-good players as clients, whereas others (e.g. Gestifute, Pinhas Zahavi) have only a few clients who are superstars. (Figure 4 – B, 5 – E, 5 – F)
  • In the player count heat map the relationship between Gestifute & Chelsea appears very strong. When Mourinho was the coach at Chelsea, all the Portuguese players at Chelsea (e.g.: Carvalho, Paulo Ferreira, Deco) were represented by Jorge Mendes’s agency. (Figure 4 – B)
  • On a similar note Gestifute has a strong relationship with Manchester United in the market value heat map. This is due to the astronomical market value of Cristiano Ronaldo. (Figure 5 – E)

In the next post we will further analyze the linkage between clubs & agents and also look at other leagues.


Top Agents in EPL:

Stellar Football Ltd – David Manasseh, Jonathan Barnett, Ertan Göksu

Base Soccer Agency – Frank Trimboli, Leon Angel           

James Grant Sports Management – Craig Sharon, Lyle Yorks

World in Motion – Andy Evans, Bill Pethybridge, James Lippett, Freddy Akehurst et al

Key Sports Management – John Colquhoun

Gestifute – Jorge Mendes, Luis Correia

Putting into perspective the spending of Manchester City

If you are not from the blue half of Manchester, any discussion that involves Manchester City quickly boils down to buying titles.

A few weeks ago when City played Arsenal at the Emirates, there was this banner:

With the Manchester Derby looming on Monday, there are a slew of articles centered on arguments like buying titles and class.

I don’t know how to quantify “class”. However, I wanted to analyze how Manchester City’s spending stacks up with the rest of the contenders in the Premier League.


1. Compared the inflation adjusted spending numbers from 1999-2011 of United, City, Arsenal, Spurs and Liverpool.

2. I used the Consumer Price Index based inflation numbers of the GBP for the first round of analysis.

But Football transfer fee inflation is hard to measure.  It can fluctuate much more because unlike CPI based inflation (which is based on the price changes of a basket of goods), Football transfers form a very niche segment in a niche industry.

3. I did another view of the data using the definition of inflation based on the average annual transfer fee in the Premier League from the site Transfer Price Index.
A quote from the TPI article summarizes why the CPI based inflation rate might not be a good indicator of football transfer fee inflation:
“The cumulative Transfer Price Index is running at 730% for the 20 year history of the Premier League compared to a Bank of England cumulative Consumer Price Index of 77.1%.”

4. I overlaid the spending patterns of Real Madrid & FC Barcelona who are two very successful clubs in Europe and regularly buy top players.

5. I also looked at the Deloitte Money League rankings over the past 10 years to visualize the size of Manchester City before and after the takeover by Sheikh Mansour.
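
The adjustment in steps 2 and 3 is the same calculation with two different indices: a fee is scaled by the ratio of the index in the target season to the index in the season of the transfer. A minimal sketch under that assumption; the index values are made-up placeholders, not the real CPI or TPI series, and `inflation_adjust` is a hypothetical helper name.

```python
def inflation_adjust(fee, index_at_transfer, index_at_target):
    """Express a fee in target-season prices using a price index."""
    if index_at_transfer <= 0:
        raise ValueError("index must be positive")
    return fee * index_at_target / index_at_transfer

# Placeholder index levels (NOT the real Bank of England or TPI numbers).
cpi = {2001: 100.0, 2011: 125.0}
tpi = {2001: 100.0, 2011: 170.0}

fee_2001 = 10.0  # hypothetical fee in millions of Euros
print(inflation_adjust(fee_2001, cpi[2001], cpi[2011]))  # CPI adjusted
print(inflation_adjust(fee_2001, tpi[2001], tpi[2011]))  # TPI adjusted
```

Because transfer fees have risen much faster than consumer prices, the TPI adjusted figure for an old transfer is usually well above the CPI adjusted one.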


1. All the transfer price data is taken from the site All prices in millions of Euros.
2. The CPI inflation numbers are taken from the Bank of England.
3. Used the average transfer fee chart from the Transfer Price Index
4. Deloitte Money League rankings of the past 10 years from Deloitte website via  Sarah Rudd

Play with the Interactive Visualization of the TPI & CPI based transfer spend from 1999 to 2011.

TPI based transfer spend 1999-2011

CPI based transfer spend 1999-2011



Club | Overall Spending 1999-2011 (€ mil) | Overall Spending 1999-2007 (€ mil)
Chelsea | 1399.46 | 1196.35
Manchester City | 679.70 (2nd) | 195.17 (5th)
Spurs | 556.63 | 501.81
Liverpool | 486.75 | 431.9
Manchester United | 426.07 | 431.61
Arsenal | 63.3 | 93

  • City spent a net total of € 679 mil on transfers from 1999 to 2011, higher than everyone else except Chelsea.
  • However, before City was taken over by the Abu Dhabi United Group, their overall spending was significantly less than everyone's except Arsenal's.
  • The average end-of-season league position of City from 1999-2007 was 14.7. After the takeover in 2008, the average league position of City is 5 (including 2011-12). An impressive improvement in such a short span of time.
  • Teams like Manchester United, Liverpool and Spurs have a longer history of spending. This makes City’s spending in a compressed time-frame look exaggerated.
  • Chelsea did something similar between 2002 and 2005 to break into the top 4.

Comparing City to United

There is no doubt that City has spent a lot more than United between 1999 and 2011.

However, if you discount the extraordinary* sales of Cristiano Ronaldo & David Beckham to Real Madrid, the overall numbers are a lot closer. (*extraordinary sales are explained below)

Scenario | City | United
Overall net spend 1999-2011 | € 679 mil | € 426 mil
Excluding Ronaldo & Beckham | € 679 mil | € 647 mil
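
Net spend is purchases minus sales, so leaving out a sale pushes the net figure up by the amount of that sale. A minimal sketch of that bookkeeping with placeholder players and fees; `net_spend` is a hypothetical helper and the numbers are not the ones behind the comparison above.

```python
def net_spend(purchases, sales, excluded_sales=()):
    """Purchases minus sales, optionally ignoring some sales (fees in millions)."""
    kept_sales = sum(fee for player, fee in sales.items() if player not in excluded_sales)
    return sum(purchases.values()) - kept_sales

# Placeholder fees in millions of Euros.
purchases = {"Player A": 30.0, "Player B": 20.0}
sales = {"Player C": 40.0, "Player D": 5.0}

print(net_spend(purchases, sales))                               # all sales counted
print(net_spend(purchases, sales, excluded_sales={"Player C"}))  # one big sale left out
```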

Here is a list of top transfers of United between 1999 and 2011 with inflation adjusted prices.
Criteria: TPI adjusted price greater than or equal to 30 mil euros.

Season | Player Bought | Actual price (€ mil) | TPI adjusted (€ mil) | CPI adjusted (€ mil)
2001-02 | Juan Veron | 42.6 | 71.1 | 58.4
2001-02 | Van Nistelrooy | 28.5 | 47.6 | 39
2002-03 | Rio Ferdinand | 46 | 95.7 | 62.1
2003-04 | Cristiano Ronaldo | 17.5 | 43.2 | 22.7
2003-04 | Louis Saha | 17.5 | 43.2 | 22.7
2004-05 | Rooney | 37 | 84 | 47
2006-07 | Carrick | 27.2 | 59 | 32.4
2007-08 | Anderson | 31.5 | 45 | 36.2
2007-08 | Nani | 25.5 | 36.5 | 29.3
2007-08 | Hargreaves | 25 | 35.7 | 28.7

In contrast, United made a lot of money on only a few big sales.

Season | Player Sold | Actual price (€ mil) | TPI adjusted (€ mil) | CPI adjusted (€ mil)
2001-02 | Jaap Stam | 25.7 | 43 | 35.3
2003-04 | Beckham | 37.5 | 93.7 | 48.7
2003-04 | Veron | 22.5 | 56.2 | 29.2
2009-10 | Ronaldo | 94 | 117.5 | 104.4

  • Cristiano Ronaldo’s sale was an extraordinary one, as was the Beckham deal in its day. In both cases the buyer was Real Madrid under Florentino Perez.
  • Beckham’s price was driven-up because of Perez openly touting his “Galactico policy” of signing the hottest player on the market each year during his tenure.
  • Cristiano Ronaldo’s price was driven up because one of the election promises of Perez was to sign Ronaldo. This meant Manchester United had all the leverage during the negotiations.

These are extraordinary scenarios that don’t happen on a regular basis.

Here is a list of top transfers of City over this period of time.
Criteria: TPI adjusted price greater than or equal to 30 mil euros.

Season | Player Bought | Actual price (€ mil) | TPI adjusted (€ mil) | CPI adjusted (€ mil)
2002-03 | Nicolas Anelka | 19.8 | 41.2 | 26.7
2008-09 | Robinho | 43 | 42.1 | 47.3
2009-10 | Carlos Tevez | 29 | 36.2 | 32.2
2009-10 | E. Adebayor | 29 | 36.2 | 32.2
2009-10 | J. Lescott | 27.5 | 34.4 | 30.5
2010-11 | Edin Dzeko | 37 | 38.5 | 38.8
2010-11 | Yaya Toure | 30 | 31.2 | 31.5
2010-11 | Mario Balotelli | 29.5 | 30.7 | 312
2010-11 | David Silva | 28.75 | 29.9 | 30.2
2011-12 | Kun Aguero | 45 | 45 | 45

Season | Player Sold | Actual price (€ mil) | TPI adjusted (€ mil) | CPI adjusted (€ mil)
2005-06 | S. Wright-Phillips | 31.5 | 71.5 | 38.7


  1. City has spent a lot but the compressed time-frame of the spending makes it look exaggerated.
  2. City made up almost 10 positions in their average league finish from 14.7 to 5 after the takeover.
  3. Apart from Arsenal, all other top 4 contenders have been spending regularly over a longer period of time.
  4. Sheikh Mansour’s Abu Dhabi United Group took over Manchester City in August of 2008. But Man City was ranked thrice in the top 20 of the Deloitte Money League even before the takeover. This shows that they have always had a sound financial base and fan support.

    Deloitte Money League rankings of City from 2001-2011
Year | Revenue | Matchday | Broadcasting | Commercial | Ranking
2001 | 54 | NA | NA | NA | NR
2002 | 43 | NA | NA | NA | NR
2003 | 71 | NA | NA | NA | NR
2004 | 94 | NA | NA | NA | 16
2005 | 90 | 22.3 | 38.7 | 29.1 | 17
2006 | 89.4 | 22.7 | 35 | 31.7 | 17
2007 | 85 | NA | NA | NA | NR
Post-takeover by Abu Dhabi United Group
2008 | 104 | 23.4 | 54.6 | 26 | NR
2009 | 102.2 | 24.4 | 56.7 | 21.1 | 19
2010 | 152.8 | 29.8 | 66 | 57 | 11
2011 | 169.6 | 29.5 | 76.1 | 64 | 12

Other observations:

  1. Chelsea’s total spending curve is a surprise. It is common knowledge that Abramovich spent a lot in the early 2000s, but the total amount is staggering. They are on par with Real Madrid over the 12 years, the only difference being the steep slope between 2002 and 2005 vs. the fairly linear spending pattern of Real Madrid.
  2. Arsenal is the only club that seems to be consciously balancing the books year after year. Their curve oscillates year to year.
  3. City’s curve between 2006 and 2010 shows a slope similar to Chelsea’s between 2002 and 2005, though not nearly as steep.
  4. Real Madrid and FC Barcelona spend a lot of money annually, especially the former.
  5. For all the hype surrounding “La Masia”, FC Barcelona spent as much as Manchester City between 1999 and 2011.

The Visual Display of Qualitative Information

The astute reader will recognize the title of this post as a play on Edward Tufte’s book of a similar name.  While Tufte’s work focuses on turning quantitative data into an easily consumable format that has a clear message, it’s also important to do so with qualitative data.  Qualitative data can often be the “how” or “why” to go along with the “what” provided by quantitative data.

The New York Times recently did an excellent job illustrating the qualitative aspects of Jeremy Lin’s performances.  The sports media has done a great job covering what Jeremy Lin has done, but this New York Times piece goes into how Lin is accomplishing what he has and why he is a good point guard, all with 3 simple animations.  It reminded me a lot of this video which calls for exactly this type of analysis in soccer.  The closest I’ve found are the brilliant videos that AllasFCB2 puts together.

New Year, New Opportunities

I’m pleased to announce that I’ve joined StatDNA as Vice President of Analytics and Software Development.  This is a super exciting opportunity for me as I’ll be combining my loves of software development, data analysis and soccer.  What could be better?  I’ll hopefully have some blog posts up for StatDNA over at their blog soon using their best in breed data.  I will continue to update this site as well although not as frequently.  Thanks to Jaeson and the rest of the StatDNA team for giving me this opportunity!

Goal Glut November Update

There was an interesting article this morning on Soccernet about Robin Van Persie being in the “injury red zone”.  Hyperbole aside, it raises the point that Arsenal have had the luxury of playing Van Persie in every league match so far (starting 12 of 13) but will have to manage his workload a little more conservatively or risk a decrease in performance or potential injury.  Arsenal aren’t the only club facing this problem, with many top clubs still involved in multiple competitions (Newcastle’s and Liverpool’s league form is probably benefiting from their absence from Europe).

Why do I bring it up?  After much hype about the goal glut in the Premier League this season, things are starting to quiet down.  Goals per match dropped from 3.3 in October to 2.87 in November which is expected based on previous years’ data.  If this season continues to be like others, we can expect the dip to continue through February.

Goals per match by month for the Premier League from 2005-present. Orange marks are for the current season. The grey area represents one standard deviation from the mean.
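
A band like the grey area in the chart can be computed as the mean plus or minus one standard deviation of each month across past seasons. A minimal sketch: the historical values are placeholders rather than the real 2005-onwards data, only the two current-season numbers (3.3 for October, 2.87 for November) come from this post, and the helper names are hypothetical.

```python
from statistics import mean, stdev

def monthly_band(history):
    """history: {month: [goals per match in past seasons]} -> {month: (low, avg, high)}."""
    band = {}
    for month, values in history.items():
        avg = mean(values)
        spread = stdev(values)
        band[month] = (avg - spread, avg, avg + spread)
    return band

def outside_band(band, current):
    """Months where the current season falls outside one standard deviation."""
    return [m for m, v in current.items() if not band[m][0] <= v <= band[m][2]]

# Placeholder history (NOT the real past-season numbers).
history = {"Oct": [2.5, 2.7, 2.6], "Nov": [2.8, 2.9, 3.0]}
current = {"Oct": 3.3, "Nov": 2.87}  # current-season values quoted in the post

band = monthly_band(history)
print(outside_band(band, current))
```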


Looking at how total goals are progressing, this season isn’t much different from previous seasons.

Running total of goals in the Premier League. Orange is this season.
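
The running total is just a cumulative sum of goals per matchday, which makes it easy to compare the pace of two seasons at the same point. A minimal sketch with placeholder goal counts; `running_total` is a hypothetical helper name.

```python
from itertools import accumulate

def running_total(goals_per_matchday):
    """Cumulative goals after each matchday."""
    return list(accumulate(goals_per_matchday))

# Placeholder goals per matchday (NOT real Premier League data).
this_season = [28, 31, 25, 30]
last_season = [26, 27, 29, 28]

for matchday, (now, before) in enumerate(
    zip(running_total(this_season), running_total(last_season)), start=1
):
    print(f"matchday {matchday}: {now} vs {before} ({now - before:+d})")
```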

We have a decent idea of what is going on here (goal scoring pace slows in the middle of the season) but we don’t know why.  Are fatigue and squad rotation responsible?  It certainly is an interesting theory to investigate.