My 2015/16 Fantasy Premier League (FPL) season was ruined by the onset of the double gameweeks from week 33 onwards. I felt like I was prepared; I had mapped out my squad weeks in advance and knew when to play the second wildcard of the season, who to bench, who to transfer in and when, and even left a little flexibility in there to account for the unforeseen. My plan was to rinse the double gameweeks for all they were worth. My theory was that if I had multiple players playing twice each double gameweek, my odds of getting points were increased. In summary, I favoured the fixtures based on the form of the opposition, and the form of the players became subservient to that.
It was a terrible idea.
Out went numerous players in form (who of course scored big points immediately after I’d transferred them) and in came out-of-form players based solely on their fixtures. I will take some limited solace in not being the only FPL manager to be stung by the ineptitude of Everton’s so-called ‘star’ attackers; I know from his stats that Romelu Lukaku must be a decent striker, but my word he was dreadful for his DGW 33 and 34 captain appearances.
What made the Belgian’s points return even more frustrating for me was that it wasn’t the first time I’d let the fixtures influence my selection. Indeed, most FPL managers assess forthcoming games when deciding who to transfer in. It was how earlier in the season I was able to look at the ten goals in 16 games of Watford’s bargain striker Odion Ighalo and think ‘no, I won’t bring him in, he’s playing Liverpool, Chelsea, Tottenham and Man City in the next four’. Four goals in three games later his value had risen a further £0.4m and I was left wondering what the hell I was doing.
In both these cases I ignored form – good and bad – and placed an emphasis on fixtures to dictate my thinking, which I have suspected ever since was a contributing factor in me falling some way short of my target for the season. Therefore, for this analysis I have resolved to look at whether the form of the player or of the opposition has any bearing on the points that a player will score, with the hope that it will either confirm or, and this will be more difficult to adapt to, disprove my assumptions.
This data set comprises 38,254 rows from the last two seasons of FPL, one row for each game a player is recorded by the FPL as having been an option to transfer in. However, 17,350 of these rows are games where the player recorded zero minutes on the pitch, which means there are 20,904 rows of data where the player has completed at least one minute.
There are a couple of important points to note before analysing the data. First, each game is treated as an individual data point, so this analysis does not factor in double gameweeks. If a player scores four points followed by three points for a double gameweek total of seven, this analysis counts his output as one game of four points, and another of three.
Second, some of the analysis here looks at past and future form, defined as the average points from the previous six games and the following six games. For example, if I want to know the past form of a player in game 14, I will look at games 8-13, and future form will be from games 15-20. I will include as many games as possible up to six, but it must be noted that past form values for any game before game seven will not draw on the full six games. For example, if I look at the past form of a player from game four, the average will be taken from games 1-3. The same is true when looking at future form towards the end of the season. For this reason, there will be occasions in this analysis when I need to remove games 1 and 38 because they have no past or future form respectively.
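As a rough illustration of the windowing described above, here is a minimal Python sketch. The function name, the list-of-points input and the `WINDOW` constant are illustrative assumptions, not the code used for this analysis.

```python
from statistics import mean
from typing import List, Optional, Tuple

WINDOW = 6  # the six-game window described in the article


def past_and_future_form(
    points: List[float], game: int, window: int = WINDOW
) -> Tuple[Optional[float], Optional[float]]:
    """Return (past_form, future_form) for a 1-indexed game number.

    Each value is the mean of up to `window` games on that side of the
    current game, or None when no games exist there (game 1 has no past
    form, the final game has no future form).
    """
    idx = game - 1  # convert to a 0-indexed list position
    past = points[max(0, idx - window):idx]
    future = points[idx + 1:idx + 1 + window]
    past_form = mean(past) if past else None
    future_form = mean(future) if future else None
    return past_form, future_form


# Hypothetical player who scores n points in game n, purely so the
# windows are easy to check by eye.
season = list(range(1, 39))
print(past_and_future_form(season, 14))  # games 8-13 and 15-20 -> (10.5, 17.5)
```

Returning `None` rather than zero for a missing window keeps "no form available" distinct from "averaged zero points", which matters when those rows are later dropped.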
I will periodically remove the data described in the conditions above, and will annotate each chart with descriptions of what we’re seeing for the sake of clarity.
The first thing to note here is that there is hardly any relationship between the points from a current game and the points that player will score next week. When we run the current game’s points and the next game’s points (e.g. game one and two, two and three, three and four, etc) through a statistical technique called regression analysis, we discover that the R² value (a measure of the strength of the relationship between the two variables) is 0.12, which is very weak. In simple terms, it means that only around 12% of the variation in next week’s points total can be explained by this week’s points total.
Figure 1: Game Points vs. the Game Points for the following week, 2014-16 (n=37,183, the final game of the season for each player removed)
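For readers who want to reproduce this kind of check, the sketch below pairs each game with the one that follows and computes R². It relies on the fact that, for a straight-line regression with a single predictor, R² equals the square of the Pearson correlation coefficient. The function names and the sample scores are hypothetical; this is not the tool used to produce the figure above.

```python
from typing import List, Sequence, Tuple


def current_vs_next(points: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Pair each game with the following one (game 1 with 2, 2 with 3, ...)."""
    return list(points[:-1]), list(points[1:])


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """R² of a simple linear regression of y on x (squared Pearson r)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be the same length, with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("R² is undefined when either variable never changes")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical run of scores for one player.
scores = [2, 6, 1, 9, 2, 0, 3, 8, 2, 5]
this_week, next_week = current_vs_next(scores)
print(round(r_squared(this_week, next_week), 3))
```

A value near 0 means the fitted line explains almost none of the spread in next week's points; a value near 1 would mean it explains almost all of it.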
So we can immediately disabuse ourselves of the hope of building a predictive model based purely on form from the previous week; the points a player scores next week will not be determined by the points he’s scored this week. However, what about long-term form? Are the points he scores today foretold by the points he has scored in recent games? Before we address this, let’s look at how the points are distributed throughout the game.
The data shows that for players contributing at least one minute in a game, the likelihood is that they will be scoring between zero and three points; 37.2% will score 0-1, and a strikingly similar 37.2% will score 2-3. An FPL manager will tell you that a good return is six or more points, and this happens on 19.2% of occasions.
Figure 2: Occurrences of points distribution, 2014-16 (n=20,904, an entry must have a game total of at least one minute to qualify)
This shows us that major point hauls are relatively rare in the game. There is a temptation for FPL managers to ‘follow the form’; if a player has a positive points haul then a manager will transfer him in, hoping that he will continue this form. In a previous article I wrote that my strategy for next season will include looking for form in the hope of longevity to minimise the need for transfers in many key positions. The theory goes that once a player is on a roll then that should continue, even though the data above suggests that in a completely random system he will score fewer than six points in four of every five games.
Over the course of the season, there is some evidence to support this. The chart below shows the average points scored in the preceding (up to) six games against the current game’s points.
Figure 3: Average points scored in preceding (up to) six games vs. current game points, 2014-16 (n=20,904, entries must have a game total of at least one minute to qualify; first game of the season removed)
The data shows a definite upward trend, but this just indicates that the probability of higher points is linked to form over the course of the season. It doesn’t explain whether it is likely to happen in any given game.
Therefore, the question we need to address is whether there is a predictive element to the previous and forthcoming games, evidence that we are not dealing with a completely random system. A player might score high this week, and over the course of the season be statistically more likely to continue to do so more frequently than a player who has just scored zero, but does that mean he is actually going to?
In short, the answer is no. There is essentially no way of predicting points based purely on form on a week-by-week basis. Figure 4, below, shows the relationship between the current game’s points and the average points per game for the previous six games, whilst Figure 5 shows the current game and the following six games.
Figure 4: Game Points vs. the Game Points for the previous six weeks, 2014-16 (n=37,183, the first game of the season for each player removed)
Figure 5: Game Points vs. the Game Points for the next six weeks, 2014-16 (n=37,183, the last game of the season for each player removed)
The data reveals that there is a slightly more cohesive relationship than that which exists between the current game and the following game as demonstrated in Figure 1, but still nowhere near enough to be predictive. What this information tells us in simple terms is that the average performance of a player in the last six games explains just 16% of the variation in the current game’s points. Similarly, this game’s points explain only 16% of the variation in what will happen in the next six games. This effectively means that if you look to the current week’s points to explain what type of season a player is having and is going to have, with no other information to cloud your judgement, you are not going to be able to do so.
Consider as examples the following two players picked at random from within the database.
Figure 6: Game Points, Player A
Figure 7: Game Points, Player B
The examples above show spikes in productivity, dips in form and sustained periods of relatively consistent points, but with no regular pattern to indicate the ability to predict what would happen from one week to the next. It is only with subjective context that we can make sense of the points and have any hope of predicting what will happen next. Knowledge that Player A in this scenario is the mercurial yet inconsistent midfielder from a mid-table team where the strikers don’t score regularly will help you to decide whether the potential points of Yannick Bolasie (2014/15) are worth the price tag, whilst knowing that Player B is the playmaker for a title-chasing team (Christian Eriksen, 2015/16) makes the dips in form more forgivable considering the potential for high points regardless of the opposition.
So what does this mean for game strategy? It suggests that form alone should not be a guiding principle when attempting to predict where points are going to come from. In crude terms, if you pick a player who has just scored big and hope that this will be a guide for his performance over the coming run of games, you may well hit the jackpot, but recent form explains only around one-sixth of the variation in points. Further evidence comes in the form of the standard deviation of points. This is a statistical measure of how widely a set of values is spread around its average. If a player is consistent, such as scoring two points in each of his 38 games, the standard deviation will be zero, but if a player is inconsistent, such as scoring two points one week, then 12 points, then five, etc. then the standard deviation will be a lot higher. The charts below show the standard deviation of the last two seasons’ players ranked from highest points (left) to lowest (right), and they show a clear pattern that suggests the more points a player scores, the more inconsistent he will be.
Figure 8: Standard deviation of points, 2014/15 (all players with at least one point, ranked by points total)
Figure 9: Standard deviation of points, 2015/16 (all players with at least one point, ranked by points total)
This does stand up to reason, as players at the lower end of the spectrum will play less and so have more zero scores, thus keeping their standard deviation low. But at the top of the rankings, the ‘explosive’ points totals increase the standard deviation, which has the impact of making the top scorers – the players we want to choose – more unpredictable on a week-by-week basis.
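The standard deviation comparison described above can be sketched in a few lines of Python. The source does not say whether a population or sample standard deviation was used, so the choice of `statistics.pstdev` (population) here is an assumption, as are the function name and the example scores.

```python
from statistics import pstdev
from typing import Sequence


def points_consistency(points: Sequence[float]) -> float:
    """Population standard deviation of a player's game-by-game points.

    0.0 means every game produced the same score; larger values mean
    the scores swing more widely around the player's average.
    """
    return pstdev(points)


# The consistent player from the text: two points in each of 38 games.
steady = [2] * 38
# A hypothetical explosive player whose scores swing from week to week.
explosive = [2, 12, 5, 1, 15, 2, 0, 9]

print(points_consistency(steady))  # 0.0
print(points_consistency(explosive) > points_consistency(steady))  # True
```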
In summary, player form alone should not be used as the predictor. For many experienced FPL managers, this will seem apparent and intuitive. Transfer decisions are made utilising a number of factors, of which form is just one. Another key variable to consider when picking a team is the forthcoming opponent’s form.
For this analysis we have to develop a method of assessing how ‘easy’ a fixture is to determine whether this has a bearing on points. Logically, it seems reasonable to assume the ‘easier’ an opponent, the greater the potential for points, but then football has a habit of throwing up unexpected results such as a flying Manchester City’s 0-0 against a dire Aston Villa team in 2015/16, or conversely the up-and-coming, £5.4m Harry Kane’s 18 points against the eventual 2014/15 champions Chelsea.
To quantify how easy a game is then, we will use a similar system as we did for measuring form by looking at the average performance of the last six games. The values attributed to performance are as follows:
- Defeat = 0 points
- Home draw = 1 point
- Away draw = 1.5 points
- Home win = 2 points
- Away win = 2.5 points
So, for example, if a club’s previous six games read: home win, away defeat, home draw, away win, home draw, away defeat, then the total is 2+0+1+2.5+1+0=6.5 and the opponent’s form is the average, 6.5/6=1.0833. Note again that for all players their first game of the season has been removed because there is no previous form to call upon. In the early weeks there will be fewer than six games, which explains how a team can have an average form of 2.5: it is not that they have won six away games in their last six matches, but that they won away in round one and we are looking at it from the position of game 2.
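The scoring scheme and the worked example above translate directly into a short Python sketch. The `FORM_VALUES` table uses the values listed in the article; the function name and the (venue, result) tuple format are illustrative assumptions.

```python
from typing import List, Optional, Tuple

# Values from the list above; a defeat scores 0 regardless of venue.
FORM_VALUES = {
    ("home", "win"): 2.0,
    ("away", "win"): 2.5,
    ("home", "draw"): 1.0,
    ("away", "draw"): 1.5,
    ("home", "defeat"): 0.0,
    ("away", "defeat"): 0.0,
}


def opponent_form(
    results: List[Tuple[str, str]], window: int = 6
) -> Optional[float]:
    """Average form value over a club's most recent (up to `window`) games.

    `results` is ordered oldest to newest. Returns None when the club has
    not yet played, i.e. before its first game of the season.
    """
    recent = results[-window:]
    if not recent:
        return None
    return sum(FORM_VALUES[game] for game in recent) / len(recent)


# The worked example: home win, away defeat, home draw, away win,
# home draw, away defeat.
last_six = [
    ("home", "win"), ("away", "defeat"), ("home", "draw"),
    ("away", "win"), ("home", "draw"), ("away", "defeat"),
]
print(round(opponent_form(last_six), 4))   # 1.0833
print(opponent_form([("away", "win")]))    # 2.5, one away win viewed from game 2
```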
The distribution of an opponent’s form, shown below, indicates that variations of the scenario described above (with the form of 1.0833) are amongst the most common occurrences over the last two seasons.
Figure 10: Occurrences of opponent form distribution, 2014-16 (n=20,904)
If we look at the average points scored against the opponents ranked by form, we see a downward trend that suggests the better the current form of the opposition, the harder it is to score points against them. However, the pattern is far from smooth; for example, teams with a form rating of 0.2-0.4 (i.e. really terrible form) are more difficult to score points against than teams with a form rating of 1.2-1.4 (i.e. average form).
Figure 11: Average points scored vs. form of the opponents, 2014-16 (n=20,904, entries must have a game total of at least one minute to qualify; first game of the season removed)
Of course this shows the average of what is occurring over the course of a season and is not the most stable trend. If we try once more to identify the predictive ability of the data on the game points, we again come up short. We have already seen that there was a very weak correlation between player form and a particular game’s points, but here we see that the relationship is quite literally non-existent.
Figure 12: Game Points vs. the form of the opponent, 2014-16 (n=20,904, the first game of the season for each player removed; entries must have a game total of at least one minute to qualify)
The R² value of less than 0.01 indicates that there is no identifiable pattern at all. This means that picking a player because he is, for example, playing at home against a Newcastle side that has lost all of its previous six games is not a guarantee of points; on the contrary, players coming up against a side with a form rating of 0 (all defeats in their last six, or fewer if at the start of the season) have recorded everything between -3 (Carl Jenkinson for West Ham in a 3-4 defeat to Bournemouth in game 3, 2015/16; newly promoted Bournemouth had lost their two games to date before Jenkinson was sent off) and 16 (two-goal Pedro for Chelsea away to Aston Villa who had lost their previous six in 2015/16).
Of course, what I haven’t done here is to incorporate the inherent strength of the opposition into the calculations. There are potential ways of assessing the strength of the opposition, but I feel that with regards to FPL this is still a subjective judgement on the part of the manager to determine who he/she feels is strong and weak. Some clubs will retain a ‘fear factor’ in spite of poor form, such as Manchester United or Chelsea in 2015/16, whereas some teams are perceived to be punching above their weight and will still be seen as the weaker team in spite of the evidence, most famously Leicester in 2015/16. Form of the opposition, therefore, is an objective view of the recent form but makes no effort to account for other factors such as home advantage or real or perceived inherent strength.
- There is an overall pattern that indicates higher points are preceded by better form of the players. However, the predictive capacity of such a relationship is extremely limited, with recent form explaining only around one-sixth of the variation in points.
- Similarly, teams in poor form will generally concede more points than teams in good form; however, the trend is less steady than that seen for player form, and the predictive relationship between a player’s points and his opponent’s form is non-existent.
What we’re seeing here is evidence that, on a week-by-week basis, FPL and indeed football in general are very difficult to predict based on recent form. A previous article I wrote shows that season-wide points can be forecast to a certain extent using the underlying metrics attributed to a player, by which I mean we have the information (shots on target, touches in the final third, etc) to understand what is supposed to happen and, when aggregated over the course of the season, this is what is probably going to happen. The same is true here. We know that a player who has experienced good recent form is more likely than a player in poor form to score high this week, although on average only slightly more.
The problem is that the data doesn’t provide us a way of understanding what will happen next week. A player in good form might be more likely to score higher next week, but there is an almost equally high chance that he will get very little next week.
The limitation of this analysis is that we are studying a single factor at a time. When we combine both form of players and opposition into a single chart, we get an alternative view of the data, although not an entirely surprising one based on what we have seen already.
Figure 13: Average of Game Points vs. the average form of the opponent (up to last six games) and previous player form (up to six games), 2014-16 (n=20,807, the first game of the season for each player removed; entries must have a game total of at least one minute to qualify; all totals of 10 and under have been removed)
The numbers in the middle represent the average number of points scored. Reading the above chart left to right we see that, with only a couple of exceptions, the average points scored increase the better the player’s form going into the game is. However, reading from top to bottom, the pattern is a lot less stable when we consider the average form of opponents faced in the last six games. This data seems to indicate that if the player is in good form then he will have a greater chance of scoring more points regardless of his opponent’s form.
This seems to be the key insight from this analysis, that form of the player is slightly more important than the form of the opposition.
However, mirroring what we have already seen, there is absolutely no predictive relationship between the two, as an R² value of 0.001 indicates that the two values are essentially unrelated to each other. For the sake of clarity, I have also looked at the player vs. opposition relationship for the top 25 players in each season only, to see if narrowing the field to players who were a success would reveal something more compelling, but the distribution of data points was equally random.
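A grid like the one in Figure 13 can be approximated by binning each game on both form measures and averaging the points in each cell. The sketch below is only an illustration: the bin widths, the function name and the row format are assumptions, and the `min_count` default is one reading of the caption's note that totals of 10 and under were removed.

```python
from collections import defaultdict
from math import floor
from typing import Dict, Iterable, Tuple

Row = Tuple[float, float, float]  # (player_form, opponent_form, game_points)


def form_grid(
    rows: Iterable[Row],
    player_bin: float = 1.0,    # assumed bin width for player form
    opponent_bin: float = 0.5,  # assumed bin width for opponent form
    min_count: int = 11,        # drop cells with 10 or fewer games
) -> Dict[Tuple[int, int], float]:
    """Average game points per (player-form bin, opponent-form bin) cell."""
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [points sum, game count]
    for player_form, opp_form, pts in rows:
        cell = (floor(player_form / player_bin), floor(opp_form / opponent_bin))
        totals[cell][0] += pts
        totals[cell][1] += 1
    return {
        cell: total / count
        for cell, (total, count) in totals.items()
        if count >= min_count
    }


# Hypothetical rows: 22 games in one cell, 3 games in a sparse cell.
rows = [(2.5, 1.1, 4)] * 11 + [(2.5, 1.1, 6)] * 11 + [(0.5, 0.2, 2)] * 3
print(form_grid(rows))  # {(2, 2): 5.0}; the sparse cell is dropped
```

Dropping sparse cells avoids reading too much into an average built from a handful of games, which is presumably why the small totals were removed from the chart.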
FPL managers will subconsciously weigh up several variables when making selection decisions, including form but also factors such as likely team formations, the likelihood of rotation, fixture congestion, competition priorities, the inherent strength of the opposition and indeed subjective analysis and belief in what might happen. Skilled FPL managers learn to trust their intuition (and, indeed, the wisdom of the crowd) and understand how to weigh these prejudices against the data they see in front of them. This analysis has done nothing to dissuade me from believing this remains the best approach.
Previous analysis I conducted showed that key underlying stats that do not result in direct points will correlate with points over the course of the season, and this analysis has shown something similar: that points follow form when looked at as an aggregate over the course of the season. However, this time I have attempted to go deeper into week-by-week analysis to see whether we could be more predictive in the short term, if not on a game-by-game basis then at least over a six-game period rather than a 38-game period, but alas the answer is we can’t, at least not with two key variables monitored by FPL managers, namely player and opposition form.
The best that can be offered from the lessons of the last two seasons is that monitoring form and making transfers based on the performance of a player in the last six games provides an increased probability that your player will score slightly more average points per game than someone who has been in poor form, but the relationships are nowhere near strong enough to treat this as a golden rule.
What is potentially more useful is that the recent form of the opposition appears to have so little impact as to be almost non-existent. It is surely common practice to look at the fixtures and think, for instance, “this in-form team is playing that weak team, therefore I must bring in a striker because the defence is playing poorly and liable to concede goals.” In our heads we assume we are playing the probabilities well, but in reality there are numerous other variables in play; the opposition manager may play ultra-defensive and restrict the goal-scoring opportunities, or the complacency of the striker may come into play as he is thinking about keeping himself fit for the forthcoming Champions League game. On this evidence, the striker is no more or less likely to score against the out-of-form team than against the in-form team, and the statistics show that we’d be (slightly) better off looking at the form of the player (points, shots on target, etc) than the form of who he is playing. If there is one benefit that I have taken from this analysis it is that my dependence on forthcoming fixtures as a guide for transfers will be reduced.
Reduced, but not eliminated. As I mentioned at the start of this conclusion, an FPL manager will need to look at multiple factors when making a decision on whom to play. There is skill in being able to read the cues that the season is giving us, and despite what you may believe based on how I have approached this article, a dependence on data alone is not the preferred tactic. The trick is in finding a balance between understanding what data to listen to, what to ignore, and crucially, when.