Splunking Euro2016 – an analytics approach to who’s going to win

UPDATED 8th July for France vs Portugal

So I’m having mixed results. I said Germany would win, which was clearly wrong after they lost to France last night. I think that says more about my analysis than about Splunk. It also shows the importance of having the most up-to-date, real-time data and not basing decisions on a 2012 data set! Nevertheless, as the saying often attributed to Winston Churchill goes, “Success is stumbling from failure to failure with no loss of enthusiasm”, so I thought I’d see what the data says for the final of France vs Portugal.

First up, goals scored and conceded in 2012:

FP1

Portugal had the edge in 2012. So far in Euro 2016, France have been pretty good defensively and in terms of scoring goals – especially “Monsieur Griezmann”.

Next was shooting analysis:

FP2

Portugal had a lot more shots in total but, in terms of shots on target, both teams were the same. This, together with the passing and possession data below, shows that France tend to make better use of the ball when they have it. Portugal make up for that with higher scores for tackles made and clearances.

FP3

Last up was looking at the discipline of the two teams:

FP4

Portugal seem to have a worse record when it comes to yellow cards and fouls conceded.

So what does that all mean? Based on 2012 it is really close: France will use the ball better and be more accurate, while Portugal will defend but give away more free kicks. However, Portugal will score more goals on the counter attack and win.

That said, this is historical data and to really make accurate predictions you’d need some real-time data. Most of the bookmakers provide such data streams and APIs, which gives me an idea for the 2018 World Cup…

Who’s going to win? Home advantage for France vs the Ronaldo factor for Portugal… “allez les bleus”.

I hope it has been fun reading and laughing at the poor quality of analysis. Enjoy the final and as always, thanks for reading.

 

UPDATED 7th July for France vs Germany:

So Splunk (and my novice analytical skills) weren’t far wrong when predicting Italy vs Germany (close win for Germany – always bet on the Germans for penalties). France play Germany tonight so I thought I’d try and predict the outcome using the same analysis. I’ve started with the number of goals scored and conceded. In Euro 2012, France went out in the quarter-finals and didn’t score a lot of goals but didn’t concede that many either. Germany clearly have the edge on goals scored:

FG1

Having looked at goals scored, I thought I’d look at the number of shots and how accurate the teams are:

FG4

Germany had a lot more shots on target (more red means higher number and allows you to compare between the two teams). Total shots is higher for Germany but France are quite close with shots on/off target. I thought I’d then look at defence and possession:

FG3

France and Germany are pretty close again but Germany edge it with number of tackles made. In terms of passing and possession, Germany were way ahead of France in passes attempted but passes completed are much closer.

Last up I thought I’d look at the discipline to see if there was any insight that might help predict the outcome.

FG2

Pretty close again. There’s a lot more analysis below that you might find interesting. It looks like a close game again, especially as France have improved since 2012 and have home advantage. I’m still going for Germany, purely based on goals scored and tackles made.

Enjoy the game! I’ll be back tomorrow to analyse the winner against the other finalist – Portugal.

UPDATED for Italy vs Germany:

Italy play Germany tomorrow in one of the biggest games in the Euros to date. I thought I’d spend an hour on a Friday afternoon and have a quick look into Euro 2012 to try and predict the outcome of the game.

I started off with goals scored vs conceded. It looks like Germany have the edge here in terms of scoring more goals:

Dash1

It looks like it is going to be a close game so I looked at the discipline of both teams. Only one penalty was scored by either team in Euro 2012 and there were no red cards for Italy or Germany. However, if you look at fouls conceded and the number of yellow cards then Germany have the edge again by quite a margin:

Dash2

I then thought I’d try and delve into the myth that the Italian team are great in defence and on the counter attack. It was pretty close but Italy seemed slightly stronger in defence than Germany, with higher scores for clearances and tackles. In terms of passing, Italy made more passes but a higher percentage of Germany’s passes were completed.

Dash3

When it comes down to shooting, Italy had more shots in Euro 2012 but they had a higher percentage off target. Number of shots on target was the same for both Italy and Germany.

Dash4
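The "more shots but a higher percentage off target" comparison comes down to simple arithmetic. Here is a minimal Python sketch of it, outside Splunk. The team names and shot counts are made-up placeholders, not the Euro 2012 figures, and `shot_accuracy` is a hypothetical helper name.

```python
def shot_accuracy(on_target, off_target):
    """Return the percentage of shots that were on target (0.0 if no shots)."""
    total = on_target + off_target
    if total == 0:
        return 0.0
    return 100.0 * on_target / total


# Placeholder numbers for illustration only: (shots on target, shots off target).
teams = {
    "Team A": (20, 30),  # more shots overall
    "Team B": (20, 20),  # same number on target, fewer shots overall
}

for name, (on, off) in teams.items():
    print(f"{name}: {on + off} shots, {shot_accuracy(on, off):.1f}% on target")
```

With the same number of shots on target, the team that shoots more often ends up with the lower accuracy, which is the pattern described above.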

The data doesn’t point to a clear winner and both Italy and Germany have had a strong tournament so far. In the post below I went for Germany and I’m going to stick to my guns. The data suggests that Germany will score more goals but Italy have a strong defence. It may just come down to poor discipline so don’t be surprised to see it settled by a German free kick. Also remember – never bet against Germany on penalties!

Enjoy the weekend!

 

Euros

So it is football time again, with Euro 2016 (soccer for US readers) coming up. Prompted by my colleague Emanuele, I thought I’d try and use Splunk for some analysis of the last Euros and the players in this summer’s competition. A couple of caveats: firstly, I was pretty awful at football when I was a kid. Secondly, I did an appalling job trying to predict the winner of the last World Cup (embarrassment here – I went for Spain).

However, I’m not good at learning from previous mistakes so I’m going to try and use Splunk to predict the winner again this time.

The first thing I did was find a couple of sources of data. The first was a CSV file that had all the data from Euro 2012 (that Spain DID win) including goals, shot accuracy, pass data, red/yellow cards, clean sheets etc. I added this to Splunk and got started with the analysis.
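For anyone who wants to try the same kind of file outside Splunk, here is a minimal Python sketch of a goals scored vs conceded comparison. It is not the search used for the dashboards in this post. The column names (`Team`, `Goals`, `GoalsAgainst`) and the three sample rows are assumptions made for illustration, not values from the real Euro 2012 file.

```python
import csv
import io

# Illustrative rows only: team names and numbers are placeholders,
# and the column names are assumed rather than taken from the real file.
SAMPLE_CSV = """Team,Goals,GoalsAgainst
Team A,10,2
Team B,8,6
Team C,4,5
"""


def goal_difference(csv_text):
    """Return (team, scored, conceded, difference) tuples, best difference first."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        scored = int(row["Goals"])
        conceded = int(row["GoalsAgainst"])
        rows.append((row["Team"], scored, conceded, scored - conceded))
    # Sort so the team with the largest gap between scored and conceded comes first.
    return sorted(rows, key=lambda r: r[3], reverse=True)


for team, scored, conceded, diff in goal_difference(SAMPLE_CSV):
    print(f"{team}: scored {scored}, conceded {conceded}, difference {diff:+d}")
```

To run it against a real file, you would swap `io.StringIO(csv_text)` for an opened file handle and adjust the column names to match the header.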

 

Defensive performance

I started by looking at which team had the best defence. Spain and Italy (finalists) had the most clean sheets. They also put in the most tackles throughout the tournament.

Defending1


Next up I wanted to look at defensive performance in terms of interceptions, number of saves made, clearances off the line and clearances in general. I wanted to try some of the newer visualizations in Splunk and settled on a heatmap. The brighter or “hotter” the colour, the higher the defensive performance. You can see Italy did well defensively.

 

Defending2


 

Possession

Moving from defence to midfield, I wanted to look at pass and touch data to get an idea of possession and who looked after the ball best. Spain came in highest (the size of the square shows the best performance). Italy and Germany weren’t far behind.

 


 

I then wanted to look at the accuracy of passing to see if the usual suspects (Spain, Netherlands etc.) are actually better at using the ball. It turns out the data supports that: the higher the passing accuracy, the hotter the colour in the heatmap.

 


 

Attacking

To round it off, I looked at the attacking side of the teams from Euro 2012. Spain scored the most goals, followed by Germany. I then wanted to look at the gap between goals scored and conceded. Germany conceded quite a few goals compared to the number they scored. Spain, however, hardly conceded any goals but scored the most. This might be an indicator of why they won.


I then thought I’d look at the number of shots by team – once again Spain won, followed by Italy and Germany.


I then switched over to a different dataset, in JSON, that was a detailed breakdown of each player in each team going to Euro 2016. The data included DOB, caps, position, goals, club etc.

 

England

As is the way, English people tend to get over-excited that “this is the year we’ll win something” and then lament when we lose on penalties (usually to Germany). I thought I’d look at England first. I looked at goals scored and it seems like England are highly dependent on Wayne Rooney for the goals. I thought I’d then go a bit deeper and look for experience and influence. If you compare goals and caps, the picture looks a bit less Rooney-centric. Other players carry the load with lots of caps but less so on goals.


It was an interesting English Premier League with Leicester winning. I thought I’d look at how many players made up the English squad from each club. Somewhat surprisingly there were a lot of Liverpool players but Manchester United and Tottenham weren’t far behind. Strangely there weren’t many players from Leicester or Arsenal.
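Counting squad members per club is a simple group-and-count. Below is a rough Python sketch of the idea, not the Splunk search behind the chart. The field names (`name`, `club`, `caps`, `goals`) and the three players are hypothetical placeholders, not entries from the real squad data.

```python
import json
from collections import Counter

# Hypothetical squad snippet: names, clubs and field names are placeholders.
SAMPLE_JSON = """
[
  {"name": "Player 1", "club": "Club X", "caps": 50, "goals": 12},
  {"name": "Player 2", "club": "Club Y", "caps": 10, "goals": 1},
  {"name": "Player 3", "club": "Club X", "caps": 33, "goals": 0}
]
"""


def players_per_club(json_text):
    """Return (club, player_count) pairs, most represented club first."""
    players = json.loads(json_text)
    return Counter(player["club"] for player in players).most_common()


for club, count in players_per_club(SAMPLE_JSON):
    print(f"{club}: {count}")
```

The same counting approach would work for the "which clubs are most represented" question across several squads, by feeding in the combined player list.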


 

The big teams

Apologies for leaving any of the other teams out (I know the Splunk teams in The Netherlands and Sweden aren’t going to be happy) but I picked England, France, Italy, Spain and Germany. I looked at the top scorers who will be playing at the event. Germany has 7 of the top 10 scorers, counting goals for their country (they did win the last World Cup).

Europe1


 

I thought I’d then look at the most experienced players at the tournament. There’s a good mix of Spain, Italy and Germany. Lastly I thought I’d just look at which clubs are most represented at the Euros. It turns out Juventus, Manchester United, Liverpool and Bayern Munich make the biggest contributions to those top four or five teams. I expected a few more from Barcelona.


 

I’m reluctant for so many reasons to predict the winner again but due to the experience of the team, the number of top scorers and their win at the World Cup – I’m going for Germany (and that isn’t easy as an Englishman who’s spent most of his life watching Germany beat England on penalties). If they can make sure they concede fewer goals than in 2012 then, as they say, never bet against the Germans.

 

Enjoy the Euros and as always thanks for reading.

When reporting on a European event, I would assume that you take care of the European languages and show names correctly.
e.g.
http://blogs.splunk.com/wp-content/uploads/2016/06/Europe1.jpg

Müller
Schürle
Özil
Götze

June 14, 2016

Hi Matt!
Where did you get those data from? I was curious to upgrade my FIFA2014 Splunk App with those results!!!

Marco

June 16, 2016

Hi Marco – a lot of it came from here: https://github.com/jokecamp/FootballData

July 4, 2016

Hi Herzog

Apologies – the source data I used seems to have corrupted some of the names.

I’ll look into it.

Cheers
Matt

July 4, 2016