

Football modelling and expected goals.

August 24, 2018 By Steve Matthews

Introduction

Over the past few years, football modelling has become a standard tool in the pursuit of success on the pitch. Clubs are seizing every opportunity to improve the performance of their first team, and advances in mathematics and machine learning now make it possible for them to apply data science to their analysis. Evaluating teams and players is essential to gaining an edge in the competitive world of football.

Nowadays, the metrics offered by data companies such as OPTA are detailed and sophisticated. Every event on the pitch is recorded manually, together with additional variables, and in this way the quality of the chances created and conceded can be estimated. That detailed data provided the foundation for the development of so-called “Expected Goals” models.

The probability of a shot being converted is called its “Expected Goal” value (xG for short), and the metric is now widely adopted in football analysis. Expected goals can help you understand what happened in a match, e.g. does the actual score reflect the quality of the chances created? The same metric can tell you more about a team’s performance and a player’s form than other measures such as goal difference and points accumulated.

Toolbox

Reasonable analytics work is possible with basic computing and statistical knowledge. In football modelling, however, reasonable is not enough: you need to excel in areas such as programming, applied mathematics, database design and research.

A well-organized and up-to-date database is critical to your football analysis. You need to be able to extract and prepare datasets for analytics in the quickest possible way. This is probably the most time-consuming process when modelling football and having an easy-to-query data store is simply gold.

You can choose between 2 industry standards: SQL and NoSQL databases.

In a nutshell, SQL databases require you to use pre-defined schemas that determine the structure of your data. The information flowing in from your source must follow the same structure, so significant up-front preparation is required. Ignoring this step will cause all sorts of trouble in the long term! Two of the industry-leading SQL databases are MySQL and PostgreSQL.

NoSQL databases, on the other hand, allow you to populate your data store without defining a schema, and you can add fields as you go. Examples of NoSQL databases include MongoDB, Bigtable and Redis.
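To make the contrast concrete, here is a minimal sketch that uses Python's built-in sqlite3 module for the SQL side and plain JSON-style documents for the schema-less side. The table name `shots`, its columns and the rows are all made up for illustration; a real schema would follow whatever your data source provides.

```python
import json
import sqlite3

# SQL side: the schema is declared up front.
# Table name, column names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE shots (
        shot_id   INTEGER PRIMARY KEY,
        match_id  INTEGER NOT NULL,
        team      TEXT    NOT NULL,
        x         REAL    NOT NULL,
        y         REAL    NOT NULL,
        body_part TEXT    NOT NULL,
        is_goal   INTEGER NOT NULL
    )
    """
)
rows = [
    (1, 100, "Team A", 94.0, 36.0, "foot", 1),
    (2, 100, "Team A", 80.0, 20.0, "head", 0),
    (3, 100, "Team B", 99.0, 34.0, "foot", 0),
]
conn.executemany("INSERT INTO shots VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# A typical analytics query: number of shots and goals per team.
summary = conn.execute(
    "SELECT team, COUNT(*), SUM(is_goal) FROM shots GROUP BY team ORDER BY team"
).fetchall()
print(summary)

# Schema-less side: each record is a free-form document, so a new field
# (here "assist_type") can appear on one record without touching the others.
documents = [
    {"shot_id": 1, "team": "Team A", "is_goal": True},
    {"shot_id": 2, "team": "Team A", "is_goal": False, "assist_type": "cross"},
]
print(json.dumps(documents[1]))
```

The SQL version rejects a row that does not fit the declared columns, which is the up-front discipline described above; the document version accepts anything, which is convenient early on but pushes validation into your own code.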

It is possible to run your database on your personal computer, but the better option is to set up a cloud service that will have auto back-ups and protect you from power failures. Moreover, this will allow you to access your filtered statistics from anywhere.

Programming is an essential part of the football modelling process. Coding is needed when you populate your database, extract the data in a specific form and report the results of the analysis. Connecting to football data APIs and web scraping also require decent programming knowledge. Today there are many educational tools available online that let you learn to code quickly and in a straightforward way.

There are two main languages used in the data science community: Python and R.

Python is usually regarded as a general-purpose language with an easy-to-understand syntax. R, on the other hand, was developed primarily for statistical analysis. Python, together with the Anaconda distribution, seems to be the better choice here for data science. If you put in the effort to learn it to a high standard, you will be able to do miracles on the data science front.

Regardless of the programming language you choose, version control software is essential. It allows you to move between different versions of your code without any hassle and to track the progress of your work. Git is the industry standard here, and Git hosting services offer free and paid accounts that you can set up in no time.

Football modelling is a complex process that requires knowledge of advanced statistics in order to derive robust insights. Simple counting and averaging will not bring you level with the professional betting syndicates and punters. Learning statistics on your own is a long and difficult process, so taking an online course will be more efficient. Generalized linear models (GLMs) are a broad family of statistical methods that will quickly improve your football analysis.
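Logistic regression is one member of the GLM family and a natural fit for a goal/no-goal outcome. The sketch below fits one with plain gradient descent on simulated shots, using only the standard library. The shots are NOT real data: they are generated from an assumed relationship between distance and scoring probability purely so that there is something to fit, and the function name `fit_logistic` is hypothetical. In practice you would likely use a statistics library and far richer features.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Logistic function mapping any real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Simulated shots (not real data): distance in metres and a goal flag
# drawn from an assumed "true" relationship.
random.seed(0)
shots = []
for _ in range(2000):
    distance = random.uniform(5.0, 35.0)
    p_goal = sigmoid(1.0 - 0.15 * distance)
    shots.append((distance, 1 if random.random() < p_goal else 0))


def fit_logistic(data, epochs=500, lr=0.5):
    """Fit P(goal) = sigmoid(b0 + b1 * scaled_distance) by gradient descent."""
    n = len(data)
    mean_d = sum(d for d, _ in data) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d, _ in data) / n)
    b0, b1 = 0.0, 0.0
    for _ in range(epochs):
        g0 = g1 = 0.0
        for d, goal in data:
            z = (d - mean_d) / sd_d          # standardised distance
            err = sigmoid(b0 + b1 * z) - goal
            g0 += err
            g1 += err * z
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    # Return a function that maps a raw distance to an estimated xG.
    return lambda d: sigmoid(b0 + b1 * (d - mean_d) / sd_d)


xg_model = fit_logistic(shots)
for d in (6.0, 16.0, 30.0):
    print(f"distance {d:4.1f} m -> estimated xG {xg_model(d):.3f}")
```

The fitted model should assign a clearly higher probability to close-range shots than to long-range ones, which is the pattern built into the simulated data.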

The main idea behind your data science toolbox is to build a platform that will follow ordered processes leading to efficient football analysis.

Expected Goals

Nowadays, football is described and studied largely through on-the-ball event types such as “tackles”, “big chances”, “aerial duels” and “dribbles”. These events are part of the game, but they are not the fundamental unit of analysis; the main established concept at present is expected goals.

In simple terms, expected goals assigns a value to the chance of a shot resulting in a goal. A model takes data from thousands of shots and weighs factors such as distance, angle, type of shot, pattern of play, assist type, the number of defenders between the shooter and the goal, possession chains, body part and more. All of these factors, and others depending on the football model, are used to produce a percentage chance of a shot becoming a goal.

Expected goals for a shot is usually expressed as a number between 0 and 1, with 1 being a certain goal. An xG of 0.2 means that, on average, one in every five such shots will result in a goal.

Some attributes are common to all expected goals models, e.g. shot distance and angle. Other parameters depend on the richness of the data and include variables such as assist type, pattern of play and possession chain. Discovering which parameters are essential to your model is the key to further development and success.
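Because each shot's xG is a probability, the values can be combined with ordinary arithmetic. The short sketch below uses five hypothetical xG values for one team in one match; it adds them up to get the team's expected goals and, assuming the shots are independent of each other, works out the chance of scoring at least once.

```python
# Hypothetical xG values for five shots by one team in one match.
shot_xg = [0.05, 0.20, 0.10, 0.45, 0.20]

# Adding the per-shot probabilities gives the team's expected goals.
team_xg = sum(shot_xg)

# Probability of scoring at least once, assuming the shots are independent:
# one minus the probability that every single shot is missed.
p_no_goal = 1.0
for xg in shot_xg:
    p_no_goal *= 1.0 - xg
p_at_least_one = 1.0 - p_no_goal

print(f"Team xG: {team_xg:.2f}")
print(f"P(at least one goal): {p_at_least_one:.3f}")
```

Note that a team xG of about 1 does not mean a goal is guaranteed: under the independence assumption these five shots still leave roughly a 30% chance of not scoring at all.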

Let’s go over some of the attributes and explain their influence on expected goals.

Distance/Angle:

Shot location data provides the (x, y) coordinates of the start and end position of each shot/event on the field, and location is by far the most important predictor. These values can be used to calculate shot distance and angle. A shot closer to the goal has a greater chance of being converted than one from further away.

Angle is another important metric, because a shot from in front of the goal has a greater chance of being converted than one from a tight angle. Two lines can be drawn from the shot location to each post, and the angle between these lines reflects how much of the goal the player can see. Distance and angle are standard parameters when calculating expected goals per shot.
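The distance and angle calculations can be sketched in a few lines. The coordinate system here is an assumption and not that of any particular data provider: a 105 x 68 metre pitch with the attacked goal on the line x = 105, and posts 7.32 metres apart. The function names `shot_distance` and `shot_angle` are hypothetical.

```python
import math

# Assumed coordinate system (not from any specific data provider):
# a 105 x 68 metre pitch, attacking the goal on the line x = 105.
GOAL_X = 105.0
GOAL_CENTRE_Y = 34.0
GOAL_WIDTH = 7.32  # metres between the posts


def shot_distance(x: float, y: float) -> float:
    """Straight-line distance from the shot location to the centre of the goal."""
    return math.hypot(GOAL_X - x, GOAL_CENTRE_Y - y)


def shot_angle(x: float, y: float) -> float:
    """Angle in degrees between the lines from the shot location to each post."""
    near_post_y = GOAL_CENTRE_Y - GOAL_WIDTH / 2
    far_post_y = GOAL_CENTRE_Y + GOAL_WIDTH / 2
    a = math.atan2(near_post_y - y, GOAL_X - x)
    b = math.atan2(far_post_y - y, GOAL_X - x)
    return math.degrees(abs(b - a))


# A central shot from 11 metres versus a shot from the same depth out wide.
for label, (x, y) in {"central": (94.0, 34.0), "wide": (94.0, 10.0)}.items():
    print(f"{label}: distance {shot_distance(x, y):.1f} m, "
          f"angle {shot_angle(x, y):.1f} degrees")
```

The central shot sees a much wider angle than the wide one even though both are taken from the same depth, which is why angle carries information that distance alone does not.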

Play type:

Some data sets describe shots as the result of set-pieces, through balls, crosses, corners, dribbles or key passes. Some of these events are more likely than others to produce a shot or a goal, depending on the team and league being analysed.

For example, through balls eliminate one or more defenders and increase the chance of scoring. A further pass after a through ball is even better, as it can take the goalkeeper out of the play along with the defenders. Crosses are an efficient way to create chances, but they do not necessarily create quality attempts. Dribbles have a smaller effect than through balls, but they still eliminate at least one defender and increase the odds of scoring.

These attributes depend on your data source; if they are available, it is a good idea to test them in your analysis and include them in your model.

Body part:

Some data sources include the body part used for the shot, i.e. left foot, right foot or head. In some situations, certain body parts are more likely to be used, e.g. headers from crosses or corners. Generally, shots with the foot are converted more often than headers.

Competition/Country:

It is a no-brainer that shot conversion rates depend on the quality and characteristics of the competition. Analyzing each tournament and adjusting your model for it will definitely give you an advantage.

Big Chance:

A “big chance” is an attribute assigned when a player has a one-on-one chance against the goalkeeper. More generally, data companies like OPTA add this flag when the player is reasonably expected to score. The attribute has a significant impact on expected goals, and it is a perfect example of data being supplemented by human judgement. It is an important attribute to note and include in your model.

Assist:

Some attempts are assisted and some are not, and assists can be intentional or unintentional. An unintentional assist leads to a shot but was not meant to provide a scoring chance. When a player deliberately sets up a team mate to shoot on goal, we record the attempt as having an intentional assist. Intentional assists are very important for your expected goals model, as they indicate a quality attempt.

Game State:

Creating goal scoring chances when 1 or 2 goals down is very different from doing so when the game is goalless or when a team is leading, because teams defend differently in open play depending on the score line. This is another factor that should be taken into account when dealing with expected goals.

The impact defense has on xG:

Defensive positioning that reduces your opponent’s chance of scoring is just as important. For example, defenders can force a player to shoot in a different way or make last-minute adjustments to their movement that make it harder to score.

Analyzing the entire attacking process, from chance creation to the final action, and using the proximity of defenders and their influence on the quality of the shots adds another level of detail to expected goals modelling.

This means that looking at where the goalkeeper and defenders are positioned in relation to where a shot is taken from could produce the most accurate expected goals output of all.
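One simple way to turn defensive positioning into a model feature is to count the defenders standing inside the triangle formed by the shot location and the two posts. The sketch below is only one possible definition, and it assumes positional data for defenders is available. The pitch coordinates (goal line at x = 105, posts at y = 30.34 and y = 37.66) and the function names are assumptions made for illustration.

```python
# Assumed post positions on a 105 x 68 metre pitch (goal line at x = 105).
LEFT_POST = (105.0, 30.34)
RIGHT_POST = (105.0, 37.66)


def _cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if point p lies inside (or on the edge of) triangle a-b-c."""
    d1, d2, d3 = _cross(a, b, p), _cross(b, c, p), _cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # The point is inside when all three signs agree.
    return not (has_neg and has_pos)


def defenders_in_cone(shot, defenders):
    """Count defenders between the shooter and the goal mouth."""
    return sum(in_triangle(d, shot, LEFT_POST, RIGHT_POST) for d in defenders)


# Hypothetical positions: one defender blocking, one out wide, one behind the shooter.
shot = (94.0, 34.0)
defenders = [(100.0, 34.0), (100.0, 20.0), (90.0, 34.0)]
print(defenders_in_cone(shot, defenders))
```

Only the first defender stands between the ball and the goal, so the count is 1; the other two cannot block the shot and are ignored.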


Team/Player Performance

Player performance can be easily understood when we compare the goals a player scored over a full season with the chances available to him, as measured by expected goals. If the player’s goals are significantly above his expected goals, this might be a sign of an unsustainable run. Expected goals can also tell us more about the player’s shot selection: we can find out whether a player is taking high-quality shots by looking at his average expected goals per shot.

Two good examples here are Jamie Vardy and Roberto Firmino in the 2017-2018 EPL season.

As we can see from the table below (courtesy of www.understat.com), Vardy and Firmino outperformed their season expected goals tally, which tells us that they converted more opportunities that had a lower probability of resulting in a goal. Essentially, this tells us about the strikers’ positioning and shot quality.

These expected goals projections can help us reveal the real performance of a team that might be under- or over-performing relative to the actual number of goals it is scoring.

A significant player/team insight comes from plotting xG values for different locations on the field. Such a plot can easily show, for instance, that a player often shoots from his favourite spot but never scores. If the probabilities of these goal scoring opportunities are high, the player is evidently doing something wrong in these cases and his actions could be analyzed in more detail. If, however, the probabilities are low but the player shoots very often, someone could point out to him that shooting might not be the best decision in that part of the field.

Another example comes from the case where players, especially strikers, score many goals in one season. It could, however, be the case that such a player scored a lot but had a much lower expected goals total. This could suggest that the player was lucky during that season.

If a player surpasses his expected goals for a few games and does not have a notable history of being a prolific goal scorer, he is probably on a hot streak that will not last forever.

But someone like Harry Kane, who consistently scores more goals than his chances suggest, is clearly just better in front of goal than the average player, and better at being in the right place. In the 17/18 EPL season he scored 30 goals from 26.86 expected goals, a slight overperformance. Another such player, with an impressive tally of 32 goals, is Mohamed Salah, who had 25.14 xG and clearly did much better. He was the only player with such a high goals/xG difference, 6.86, and deservedly became the EPL top goal scorer.
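The comparison is simple arithmetic, sketched below with the 2017/18 figures quoted above for Kane and Salah. The goals-to-xG ratio is an extra illustration that does not appear in the figures above; it is just another way of expressing the same gap.

```python
# Goals and xG for the 2017/18 EPL season, as quoted in the text.
players = {
    "Harry Kane": {"goals": 30, "xg": 26.86},
    "Mohamed Salah": {"goals": 32, "xg": 25.14},
}

overperformance = {}
for name, stats in players.items():
    diff = stats["goals"] - stats["xg"]    # goals scored above expectation
    ratio = stats["goals"] / stats["xg"]   # goals per expected goal
    overperformance[name] = round(diff, 2)
    print(f"{name}: goals - xG = {diff:+.2f}, goals / xG = {ratio:.2f}")
```

A positive difference over a single season is not proof of superior finishing on its own; as the surrounding text notes, it matters whether the player has a history of sustaining it.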

This can help clubs make decisions in many ways. For example, expected goals can identify players who are good at getting into goal scoring positions before they have started scoring a quantity of goals that actually makes teams take notice.

Of course, more research has to be performed on that player’s performance, but the expected goals indicator could be a useful tool in player acquisition.

Predicting match outcome

The next section is for those who have more than an academic interest in predicting football results. Nowadays, expected goals models are widely used by sports betting syndicates, professional punters and bookmakers. As noted above, there is no perfect formula for an expected goals model that works flawlessly. There are loads of performance metrics waiting to be exploited to build a better view of teams’ or players’ recent performance. Betting syndicates are constantly improving their xG models in order to find the value they need for their sports betting investment. Placing 400-500k GBP in bets, spread over 3 or more Asian handicap lines per football match, is daily business for the world’s top betting syndicates. Moreover, a 4-5% yield on their yearly betting turnover is standard.

Your expected goals model comes in handy here for the simple reason that it suggests the expected goals for team A and team B. The more accurate the model is, the more likely we are to find value bets. To put it simply, your betting profit depends on your model’s capacity to forecast accurate match score lines.

Once we have estimated expected goals numbers for team A and team B, we can use them to generate the odds for home, draw and away in that particular match. Goals in football matches closely follow a Poisson distribution. The Poisson distribution helps you calculate the probability of each possible score line in the match if you have the xG for each team on hand. This means that if we assume team A will score 1.2 goals on average, the Poisson distribution gives the probability of team A scoring exactly 0, 1, 2, 3 goals, etc. From here we can also derive the probability of team A scoring more goals than team B.

Microsoft Excel has an easy implementation of the Poisson Distribution in the following formula:

=POISSON(x, mean, cumulative)

In the above formula, “x” is the exact number of goals whose probability we want to find. The “mean” is the expected goals value produced by our expected goals model. Finally, the “cumulative” argument has to be set to FALSE, so that the POISSON function returns the probability that a random variable takes a value exactly equal to x.

Once we have calculated the probability of each team scoring 0, 1, 2, 3 or more goals, we can find the chances of the match finishing as a home win, draw or away win. We do that by estimating the probability of every possible result (e.g. 1-0, 2-0, 1-1, 1-2, etc.) and adding up the results that belong to each outcome.

All this leads us to calculated home, draw and away odds, from which, again with the help of xG, we can also create Asian handicap and total goals markets. Here comes the fun part, where you have to find odds that are better than your projections and bet on them. The logic is that when the odds offered by a bookmaker are significantly higher than your own, the bookmaker’s price is wrong according to your model and you are taking advantage of it. This is the whole logic behind value betting: comparing the odds from your football betting model with those of a bookmaker and finding where the value is. Professional players do this on an hourly basis, and being first is key in this part of the betting industry.
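The same non-cumulative Poisson probability can be sketched outside Excel with the Python standard library. The function name `poisson_pmf` is hypothetical; the mean of 1.2 is the example average used earlier in this section.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals when the average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


mean_goals = 1.2  # example average from the text
probs = [poisson_pmf(k, mean_goals) for k in range(4)]
p_four_plus = 1.0 - sum(probs)  # everything not covered by 0, 1, 2 or 3 goals

for k, p in enumerate(probs):
    print(f"P({k} goals) = {p:.3f}")
print(f"P(4+ goals) = {p_four_plus:.3f}")
```

With an average of 1.2, scoring exactly one goal is the single most likely outcome, and four or more goals is rare.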
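The summation over score lines can be sketched as a double loop. This assumes the two teams' goal counts are independent Poisson variables, which is a common simplification rather than an exact description of football, and it truncates the grid at 10 goals per team. The away xG of 1.0 is a hypothetical value chosen for the example.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals when the average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def match_probabilities(home_xg: float, away_xg: float, max_goals: int = 10):
    """Return (home win, draw, away win) probabilities from two xG values.

    Assumes independent Poisson goal counts and ignores score lines where
    either team scores more than `max_goals`.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    return home, draw, away


# 1.2 is the example average from the text; 1.0 is a hypothetical opponent xG.
home, draw, away = match_probabilities(1.2, 1.0)
print(f"home {home:.3f}  draw {draw:.3f}  away {away:.3f}")
```

The three probabilities add up to almost exactly 1; the tiny shortfall comes from the truncated score lines.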
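The value check itself is a one-line comparison. In the sketch below both the model probabilities and the bookmaker's decimal odds are hypothetical numbers chosen for illustration; the function names `fair_odds` and `expected_value` are likewise assumptions. A bet has value when the probability multiplied by the offered decimal odds is greater than 1.

```python
def fair_odds(probability: float) -> float:
    """Decimal odds at which a bet with this probability breaks even."""
    return 1.0 / probability


def expected_value(probability: float, bookmaker_odds: float) -> float:
    """Expected profit per unit staked at the given decimal odds."""
    return probability * bookmaker_odds - 1.0


# Hypothetical model output and hypothetical bookmaker prices.
model_probs = {"home": 0.40, "draw": 0.28, "away": 0.32}
bookmaker_odds = {"home": 2.70, "draw": 3.30, "away": 2.90}

value_bets = {}
for outcome, p in model_probs.items():
    ev = expected_value(p, bookmaker_odds[outcome])
    print(f"{outcome}: fair odds {fair_odds(p):.2f}, "
          f"offered {bookmaker_odds[outcome]:.2f}, EV {ev:+.3f}")
    if ev > 0:
        value_bets[outcome] = ev
```

Here only the home win is priced above the model's fair odds (2.70 offered against 2.50), so it is the only value bet. Whether that edge is real depends entirely on how accurate the model's probabilities are.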

Conclusion

Expected goals is a complicated metric that has become an irreplaceable part of football analytics. It explains what has happened in a match better than other parameters, including goals scored, but it still has some significant limitations. For example, expected goals models don’t capture dangerous phases of play that don’t end in shots.

We have to address again the problem of data acquisition. The top data companies use special panoramic cameras in order to obtain such a detailed level of statistics for each match. This is why the data is very expensive, which is a problem for the average data scientist. Some analysts resort to scraping the data from free data websites such as www.whoscored.com and www.squawka.com. Having a well-organized and up-to-date data source is probably the most important part of your expected goals adventure.

Obviously, any expected goals model has limitations when it comes to subjective factors such as unrest in the squad, new managers, injuries to top players, and a hard and demanding weekly schedule. That kind of information can easily be researched and noted in order to adjust the expected goals model.

If you manage to note these non-statistical factors and adjust your xG model, and combine it with well-formulated and accurate attack and defense ratings for each team, you will have a sophisticated match performance metric that will help you with your analysis and value hunting.


The expected goals concept has been extended to other attributes in soccer, from assists and saves to passes and even defensive actions. One thing is sure: expected goals deserves its place in the toolbox of football analytics.
