Summary¶

Gather historical Premier League data

Analyze results to generate standings at any given time

Find teams that held the same position in the league table for longer than Manchester United's 104 days in 6th place (2016-17 PL season)

One of the biggest stories of the 2016-17 Premier League season has been the ~~rise~~ ~~fall~~ stationary grace of Manchester United Football Club. United held sole possession of 6th place for 104 consecutive days. This included a 17-game unbeaten streak during which the clubs ahead of and behind them did everything in their power to keep pace with the consistent Red Devils.

Unsurprisingly, the Internet has turned United's stranglehold on 6th into one of this season's spiciest memes.

It also got me thinking: which team has the dubious distinction of holding the same position in the league table for longer than Manchester United? If you'll allow me: let's not let Manchester United's 6th place reign distract us from the fact that the Golden State Warriors blew a 3-1 lead in the NBA finals.

Let's use our PyData skills and find an answer!

What You Need to Follow Along¶

Development Tools (Stack)¶

Python 3.6 (Yes, 3.6. We will be using f-strings)
PyData stack (pandas, NumPy)

Code¶

Jupyter Notebook on Github

Examining the Problem¶

The hardest thing in Data Science is asking the right question.

Let's take a closer look at our problem so we can get a sense of:

the kind of data we need to gather
the workflow we need to follow to answer our question
the metrics we will use to judge success
- i.e. when do we consider our analysis to be complete?

Working backwards from our desired result gives us the following workflow for our analysis:
[Figure: Data Study Workflow]

Premier League Data Analysis¶

Gathering Data¶

There are a ton of data sources available for us to use. For our analysis, we will be using match results from football-data.co.uk.

Let's download the file which contains results from the 2016-17 Premier League season.

In [1]:

!mkdir data
!wget -P data/ http://www.football-data.co.uk/mmz4281/1617/E0.csv

--2017-04-10 17:15:38--  http://www.football-data.co.uk/mmz4281/1617/E0.csv
Resolving www.football-data.co.uk... 217.160.223.109
Connecting to www.football-data.co.uk|217.160.223.109|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 82422 (80K) [text/csv]
Saving to: 'data/E0.csv'

data/E0.csv         100%[=====================>]  80.49K   290KB/s   in 0.3s   

2017-04-10 17:15:39 (290 KB/s) - 'data/E0.csv' saved [82422/82422]

Setting Up Environment¶

In [2]:

import math
import numpy as np
import pandas as pd

Loading Data¶

After glancing at the data notes and our csv, we will load data into a pandas DataFrame:

In [3]:

results = pd.read_csv(
    'data/E0.csv',
    usecols=list(range(11)),  # first 11 columns: Div through Referee
    parse_dates=['Date'],
    dayfirst=True)

In [4]:

results.head()

Out[4]:

	Div	Date	HomeTeam	AwayTeam	FTHG	FTAG	FTR	HTHG	HTAG	HTR	Referee
0	E0	2016-08-13	Burnley	Swansea	0	1	A	0	0	D	J Moss
1	E0	2016-08-13	Crystal Palace	West Brom	0	1	A	0	0	D	C Pawson
2	E0	2016-08-13	Everton	Tottenham	1	1	D	1	0	H	M Atkinson
3	E0	2016-08-13	Hull	Leicester	2	1	H	1	0	H	M Dean
4	E0	2016-08-13	Man City	Sunderland	2	1	H	1	0	H	R Madley
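
To see why dayfirst=True matters, here is a minimal sketch on an inline sample. It assumes the file stores dates as day/month/two-digit year (so 01/10/16 means 1 October 2016). The sample rows, the team names and the variable names sample_csv and sample are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row sample, assuming dd/mm/yy dates in the raw file.
sample_csv = """Div,Date,HomeTeam,AwayTeam,FTHG,FTAG
E0,01/10/16,Team A,Team B,2,1
E0,02/10/16,Team C,Team D,0,0
"""

sample = pd.read_csv(
    io.StringIO(sample_csv),
    parse_dates=['Date'],
    dayfirst=True)  # read 01/10/16 as 1 October, not 10 January

print(sample['Date'])
```

Without dayfirst=True, an ambiguous date such as 01/10/16 could be read month-first, which would scramble the order of matches later on.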

Wrangling Data¶

As previously mentioned, we are using a match results dataset to conduct our analysis. Each row in this DataFrame represents a single match result. Leaving data in this format will make it difficult to write idiomatic pandas expressions to slice-and-dice our DataFrame later on.

Why? Going back to our workflow diagram, we require a function to calculate the league table at the end of each day. In its current format, we would need two passes over the data for every team: one for the matches where current_team == HomeTeam and another for the matches where current_team == AwayTeam.

A better data structure would have each row in our DataFrame represent a result for each team, regardless of whether they are playing home or away. We can use the pandas.melt() function to transform our data as follows:

In [5]:

## converting each matchup into 2 rows
## one where each team is 'current_team' and opponent is identified
results['H'] = results['HomeTeam']
results['A'] = results['AwayTeam']
cols_to_keep = ['Div', 'Date', 'HomeTeam', 'AwayTeam', 'FTHG',
                'FTAG', 'FTR', 'HTHG', 'HTAG', 'HTR', 'Referee']

team_results = pd.melt(
    results, 
    id_vars=cols_to_keep, 
    value_vars=['H', 'A'],
    var_name='Home/Away',
    value_name='Team')

team_results['Opponent'] = np.where(team_results['Team'] == team_results['HomeTeam'],
                                    team_results['AwayTeam'],
                                    team_results['HomeTeam'])

In [6]:

team_results.head(2)

Out[6]:

	Div	Date	HomeTeam	AwayTeam	FTHG	FTAG	FTR	HTHG	HTAG	HTR	Referee	Home/Away	Team	Opponent
0	E0	2016-08-13	Burnley	Swansea	0	1	A	0	0	D	J Moss	H	Burnley	Swansea
1	E0	2016-08-13	Crystal Palace	West Brom	0	1	A	0	0	D	C Pawson	H	Crystal Palace	West Brom
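
For comparison, here is a sketch of another way to reach the same one-row-per-team shape without melt: build a "home view" and an "away view" of the matches and stack them. This is not the approach used in this post, just an illustration. The toy matches DataFrame and the names home, away and team_rows are hypothetical.

```python
import pandas as pd

# Toy data: two made-up matches using the same column names as the dataset.
matches = pd.DataFrame({
    'HomeTeam': ['Team A', 'Team C'],
    'AwayTeam': ['Team B', 'Team D'],
    'FTHG': [0, 2],
    'FTAG': [1, 1],
})

# Home view: the home side is 'Team', its goals are 'Goals'.
home = (matches
            .rename(columns={'HomeTeam': 'Team', 'AwayTeam': 'Opponent',
                             'FTHG': 'Goals', 'FTAG': 'Goals_Opp'})
            .assign(**{'Home/Away': 'H'}))

# Away view: same rows, with the roles swapped.
away = (matches
            .rename(columns={'AwayTeam': 'Team', 'HomeTeam': 'Opponent',
                             'FTAG': 'Goals', 'FTHG': 'Goals_Opp'})
            .assign(**{'Home/Away': 'A'}))

# concat aligns on column names, so the differing column order is fine.
team_rows = pd.concat([home, away], ignore_index=True)
print(team_rows)
```

Either way we end up with two rows per match, which is what makes the later groupby('Team') calls straightforward.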

We need to transform 'home' and 'away' goals to goals scored for each team and then calculate a result given the combined score. This is also a good spot to calculate how many points the team was awarded for the match result.

We will use this post from StackOverflow (Praise Be) as a guide and proceed as follows:

In [7]:

points_map = {
    'W': 3,
    'D': 1,
    'L': 0
}

def get_result(score, score_opp):
    if score == score_opp:
        return 'D'
    elif score > score_opp:
        return 'W'
    else:
        return 'L'

In [8]:

# full time goals
team_results['Goals'] = np.where(team_results['Team'] == team_results['HomeTeam'],
                                 team_results['FTHG'],
                                 team_results['FTAG'])
team_results['Goals_Opp'] = np.where(team_results['Team'] != team_results['HomeTeam'],
                                     team_results['FTHG'],
                                     team_results['FTAG'])
team_results['Result'] = np.vectorize(get_result)(team_results['Goals'], team_results['Goals_Opp'])
team_results['Points'] = team_results['Result'].map(points_map)

# 1st half goals
team_results['1H_Goals'] = np.where(team_results['Team'] == team_results['HomeTeam'],
                                    team_results['HTHG'],
                                    team_results['HTAG'])
team_results['1H_Goals_Opp'] = np.where(team_results['Team'] != team_results['HomeTeam'],
                                        team_results['HTHG'],
                                        team_results['HTAG'])
team_results['1H_Result'] = np.vectorize(get_result)(team_results['1H_Goals'], team_results['1H_Goals_Opp'])
team_results['1H_Points'] = team_results['1H_Result'].map(points_map)

# 2nd half goals
team_results['2H_Goals'] = team_results['Goals'] - team_results['1H_Goals']
team_results['2H_Goals_Opp'] = team_results['Goals_Opp'] - team_results['1H_Goals_Opp']
team_results['2H_Result'] = np.vectorize(get_result)(team_results['2H_Goals'], team_results['2H_Goals_Opp'])
team_results['2H_Points'] = team_results['2H_Result'].map(points_map)

In [9]:

# Drop unnecessary columns and sort by date
cols_to_drop = ['HomeTeam', 'AwayTeam', 'FTHG', 'FTAG', 'FTR', 'HTHG', 'HTAG', 'HTR']
team_results = (team_results
                    .drop(cols_to_drop, axis=1)
                    .sort_values(by=['Date', 'Referee']))

In [10]:

team_results.head()

Out[10]:

	Div	Date	Referee	Home/Away	Team	Opponent	Goals	Goals_Opp	Result	Points	1H_Goals	1H_Goals_Opp	1H_Result	1H_Points	2H_Goals	2H_Goals_Opp	2H_Result	2H_Points
1	E0	2016-08-13	C Pawson	H	Crystal Palace	West Brom	0	1	L	0	0	0	D	1	0	1	L	0
314	E0	2016-08-13	C Pawson	A	West Brom	Crystal Palace	1	0	W	3	0	0	D	1	1	0	W	3
0	E0	2016-08-13	J Moss	H	Burnley	Swansea	0	1	L	0	0	0	D	1	0	1	L	0
313	E0	2016-08-13	J Moss	A	Swansea	Burnley	1	0	W	3	0	0	D	1	1	0	W	3
5	E0	2016-08-13	K Friend	H	Middlesbrough	Stoke	1	1	D	1	1	0	W	3	0	1	L	0
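
As an aside, the result and points columns can also be derived without np.vectorize. The sketch below takes the sign of the goal difference and maps it to a result and to points. The helper name results_and_points and the toy scores are hypothetical; this is one possible alternative, not the code used in the rest of this post.

```python
import numpy as np
import pandas as pd

def results_and_points(goals, goals_opp):
    """Return (result, points) Series from two Series of goal counts."""
    # sign of the goal difference: 1 = win, 0 = draw, -1 = loss
    outcome = np.sign(goals - goals_opp)
    result = outcome.map({1: 'W', 0: 'D', -1: 'L'})
    points = outcome.map({1: 3, 0: 1, -1: 0})
    return result, points

# Toy scores: a loss, a win and a draw.
goals = pd.Series([0, 1, 1])
goals_opp = pd.Series([1, 0, 1])
result, points = results_and_points(goals, goals_opp)
print(result.tolist(), points.tolist())
```

The same helper would work for the first-half and second-half columns, since it only looks at the two goal counts it is given.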

Calculating League Table (Standings Table)¶

In [11]:

# Sanity check: total points per team against real world data
# (select 'Points' before summing so non-numeric columns are not aggregated)
(team_results
     .groupby('Team')['Points']
     .sum()
     .sort_values(ascending=False))

Out[11]:

Team
Chelsea           75
Tottenham         68
Liverpool         63
Man City          61
Man United        57
Everton           54
Arsenal           54
West Brom         44
Southampton       40
Watford           37
Stoke             36
Leicester         36
Burnley           36
West Ham          36
Bournemouth       35
Crystal Palace    34
Hull              30
Swansea           28
Middlesbrough     24
Sunderland        20
Name: Points, dtype: int64

This matches the current table (as of April 10 2017)

In [12]:

def standings(frame, result_col, goals_col, goals_opp_col, points_col):
    """Calculate one team's league-table record from its match rows.

    Takes a DataFrame and the names of the columns holding the results,
    goals scored, goals conceded and points. Keeping the column names as
    parameters lets us build tables from first-half or second-half scores too.
    """
    record = {}
    
    record['Played'] = np.size(frame[result_col])
    record['Won'] = np.sum(frame[result_col] == 'W')
    record['Drawn'] = np.sum(frame[result_col] == 'D')
    record['Lost'] = np.sum(frame[result_col] == 'L')
    record['GF'] = np.sum(frame[goals_col])
    record['GA'] = np.sum(frame[goals_opp_col])
    record['GD'] = record['GF'] - record['GA']
    record['Points'] = np.sum(frame[points_col])
    
    return pd.Series(record,
                     index=['Played', 'Won', 'Drawn', 'Lost', 'GF', 'GA', 'GD', "Points"])

In [13]:

# Get League Table
results_byteam = team_results.groupby(['Team'])

(results_byteam 
     .apply(standings,
            result_col='Result',
            goals_col='Goals',
            goals_opp_col='Goals_Opp',
            points_col='Points')
     .sort_values('Points', ascending=False))

Out[13]:

	Played	Won	Drawn	Lost	GF	GA	GD	Points
Team
Chelsea	31	24	3	4	65	25	40	75
Tottenham	31	20	8	3	64	22	42	68
Liverpool	32	18	9	5	68	40	28	63
Man City	31	18	7	6	60	35	25	61
Man United	30	15	12	3	46	24	22	57
Arsenal	30	16	6	8	61	39	22	54
Everton	32	15	9	8	57	36	21	54
West Brom	32	12	8	12	39	41	-2	44
Southampton	30	11	7	12	37	37	0	40
Watford	31	10	7	14	36	52	-16	37
Leicester	31	10	6	15	39	51	-12	36
Stoke	32	9	9	14	34	47	-13	36
Burnley	32	10	6	16	32	44	-12	36
West Ham	32	10	6	16	42	57	-15	36
Bournemouth	32	9	8	15	45	59	-14	35
Crystal Palace	31	10	4	17	42	50	-8	34
Hull	32	8	6	18	33	64	-31	30
Swansea	32	8	4	20	37	67	-30	28
Middlesbrough	31	4	12	15	22	37	-15	24
Sunderland	31	5	5	21	24	56	-32	20
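
The same table can also be sketched with named aggregation instead of a custom function passed to apply. This assumes a pandas version that supports named aggregation in groupby().agg() (0.25 or later), which is newer than what this post was written against. The function name league_table_agg and the toy rows are hypothetical.

```python
import pandas as pd

def league_table_agg(frame):
    """Build a league table from one-row-per-team-per-match data."""
    # Boolean flags so that wins, draws and losses can simply be summed.
    flags = frame.assign(
        Won=frame['Result'].eq('W'),
        Drawn=frame['Result'].eq('D'),
        Lost=frame['Result'].eq('L'))
    table = flags.groupby('Team').agg(
        Played=('Result', 'size'),
        Won=('Won', 'sum'),
        Drawn=('Drawn', 'sum'),
        Lost=('Lost', 'sum'),
        GF=('Goals', 'sum'),
        GA=('Goals_Opp', 'sum'),
        Points=('Points', 'sum'))
    table['GD'] = table['GF'] - table['GA']
    return table.sort_values(by=['Points', 'GD', 'GF'], ascending=False)

# Toy data: two made-up teams that played each other twice.
toy = pd.DataFrame({
    'Team': ['Team A', 'Team B', 'Team A', 'Team B'],
    'Result': ['W', 'L', 'D', 'D'],
    'Goals': [2, 0, 1, 1],
    'Goals_Opp': [0, 2, 1, 1],
    'Points': [3, 0, 1, 1],
})
toy_table = league_table_agg(toy)
print(toy_table)
```

The custom standings function is more flexible, since the column names are parameters, but this version avoids calling a Python function once per team.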

In [14]:

# Get League Table for First Half Goals only
(results_byteam
     .apply(standings,
            result_col='1H_Result',
            goals_col='1H_Goals',
            goals_opp_col='1H_Goals_Opp',
            points_col='1H_Points')
     .sort_values('Points', ascending=False))

This call reads the 1H_* columns built in In [8], so the resulting table ranks teams on first-half scores alone.

In [15]:

# Get League Table for Second Half Goals only
(results_byteam
     .apply(standings,
            result_col='2H_Result',
            goals_col='2H_Goals',
            goals_opp_col='2H_Goals_Opp',
            points_col='2H_Points')
     .sort_values('Points', ascending=False))

This call reads the 2H_* columns built in In [8], so the resulting table ranks teams on second-half scores alone.

Ranking Teams¶

We can rank each team on the Premier League tiebreakers (Total Points -> Goal Difference -> Goals Scored) by building a (Points, GD, GF) tuple per team and calling Series.rank() on the result.

In [16]:

# Rank Teams in Standings
league_table = (results_byteam
                    .apply(standings,
                           result_col='Result',
                           goals_col='Goals',
                           goals_opp_col='Goals_Opp',
                           points_col='Points')
                    .sort_values(by=['Points', 'GD', 'GF'], ascending=False))

In [17]:

league_table['rank'] = (league_table
                            .apply(lambda row: (row['Points'], row['GD'], row['GF']), axis=1)
                            .rank(method='min', ascending=False)
                            .astype(int))
league_table

Out[17]:

	Played	Won	Drawn	Lost	GF	GA	GD	Points	rank
Team
Chelsea	31	24	3	4	65	25	40	75	1
Tottenham	31	20	8	3	64	22	42	68	2
Liverpool	32	18	9	5	68	40	28	63	3
Man City	31	18	7	6	60	35	25	61	4
Man United	30	15	12	3	46	24	22	57	5
Arsenal	30	16	6	8	61	39	22	54	6
Everton	32	15	9	8	57	36	21	54	7
West Brom	32	12	8	12	39	41	-2	44	8
Southampton	30	11	7	12	37	37	0	40	9
Watford	31	10	7	14	36	52	-16	37	10
Leicester	31	10	6	15	39	51	-12	36	11
Burnley	32	10	6	16	32	44	-12	36	12
Stoke	32	9	9	14	34	47	-13	36	13
West Ham	32	10	6	16	42	57	-15	36	14
Bournemouth	32	9	8	15	45	59	-14	35	15
Crystal Palace	31	10	4	17	42	50	-8	34	16
Hull	32	8	6	18	33	64	-31	30	17
Swansea	32	8	4	20	37	67	-30	28	18
Middlesbrough	31	4	12	15	22	37	-15	24	19
Sunderland	31	5	5	21	24	56	-32	20	20

Fantastic! This is the exact ranking we see online!
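
If ranking a Series of tuples does not behave as expected on your pandas version, the same "min" ranking can be sketched with a sort and a comparison against the previous row. This is an alternative under that assumption, not the method used in this post. The function name rank_by_keys and the toy table are hypothetical.

```python
import pandas as pd

def rank_by_keys(table, keys=('Points', 'GD', 'GF')):
    """Rank rows by several columns (descending); exact ties share a rank."""
    keys = list(keys)
    ordered = table.sort_values(by=keys, ascending=False)
    # True where a row differs from the row above it on any tiebreaker
    starts_group = ordered[keys].ne(ordered[keys].shift()).any(axis=1)
    position = pd.Series(range(1, len(ordered) + 1), index=ordered.index)
    # tied rows inherit the position of the first row in their group
    return position.where(starts_group).ffill().astype(int)

# Toy table: Team B and Team C are tied on all three tiebreakers.
toy_table = pd.DataFrame(
    {'Points': [36, 36, 36, 54],
     'GD': [-13, -12, -12, 22],
     'GF': [34, 39, 39, 61]},
    index=['Team A', 'Team B', 'Team C', 'Team D'])
toy_rank = rank_by_keys(toy_table)
print(toy_rank)
```

Teams that are level on all three columns share the lower rank number, and the next team skips a place, which matches rank(method='min').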

This is great, but there is one problem. How do we rank teams at the start of the season when some have played a game while others have not?

Series.rank() will work most of the time, except for the edge case where not every team has completed their first game. Teams that have not played yet have no rows in team_results, so they are missing from the league table entirely. We will need to take this into account when we create our custom rank_teams() function.

In [18]:

def rank_teams(league_table, team_list):
    """Return a Series of ranked teams, including those who have yet to play
    
    Args:
        * league_table - League Table DataFrame
        * team_list - List of all teams in league
    """
    
    # sort by tiebreaker and rank
    team_rank = (league_table
                     .apply(lambda row: (row['Points'], row['GD'], row['GF']), axis=1)
                     .rank(method='min', ascending=False)
                     .astype(int))
    
    # if not all teams are ranked (i.e. some of them might not have played yet)
    if team_rank.size < len(team_list):
        # get all teams that need to be added to the table
        ranked_teams = team_rank.index.values
        teams_to_add = {team for team in team_list if team not in ranked_teams}  
        
        # position to rank remaining teams
        rank_to_assign = team_rank.size + 1
        
        # add teams that haven't played a game to rankings
        team_pos = {}
        for team in teams_to_add:
            team_pos[team] = rank_to_assign
        # pd.concat works on old and new pandas (Series.append was removed in 2.0)
        team_rank = pd.concat([team_rank, pd.Series(data=team_pos)])
    
    return team_rank

In [19]:

# Let's test our function to make sure it works
all_teams = np.sort(team_results['Team'].unique())
rank_teams(league_table, team_list=all_teams)

Out[19]:

Team
Chelsea            1
Tottenham          2
Liverpool          3
Man City           4
Man United         5
Arsenal            6
Everton            7
West Brom          8
Southampton        9
Watford           10
Leicester         11
Burnley           12
Stoke             13
West Ham          14
Bournemouth       15
Crystal Palace    16
Hull              17
Swansea           18
Middlesbrough     19
Sunderland        20
dtype: int64
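
The test above uses a full table, so it does not exercise the edge case. Here is a small sketch of just the "fill in teams that have not played" step, written with reindex instead of a loop. The function name fill_unplayed and the toy ranks are hypothetical, and note that, unlike rank_teams, the result comes back in team_list order.

```python
import pandas as pd

def fill_unplayed(team_rank, team_list):
    """Give every team missing from team_rank the next position after the ranked teams."""
    rank_to_assign = team_rank.size + 1
    return (team_rank
                .reindex(team_list)       # missing teams become NaN
                .fillna(rank_to_assign)
                .astype(int))

# Toy ranks after an opening day where only three teams have played.
partial_rank = pd.Series({'Team B': 1, 'Team C': 1, 'Team D': 3})
everyone = ['Team A', 'Team B', 'Team C', 'Team D', 'Team E']
full_rank = fill_unplayed(partial_rank, everyone)
print(full_rank)
```

Both unplayed teams end up tied in 4th here, which mirrors what rank_teams does with rank_to_assign.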

Tracking Team Rankings Across Entire Season¶

Going back to our workflow diagram, we need to calculate and store Team Rankings at the end of each day during the season. Once we have this list for each team, we can calculate the longest streak at a single position.

In [20]:

# get list of days
rank_history = []
all_dates = team_results['Date'].unique()

# calculate ranks after each day there is a game
for day in all_dates:
    # get results up to current day
    dailyresults_byteam = (team_results[team_results['Date'] <= day]
                               .groupby(['Team']))
    
    # create league table with ranking
    # premier league ranking goes: Points, GD, GF
    league_table = (dailyresults_byteam
                        .apply(standings,
                               result_col='Result',
                               goals_col='Goals',
                               goals_opp_col='Goals_Opp',
                               points_col='Points')
                        .sort_values(by=['Points', 'GD', 'GF'], ascending=False))
    team_rank = rank_teams(league_table, team_list=all_teams)
    
    rank_history.append(team_rank)
    
# create historical ranking dataframe from list of ranks
rank_history_df = (pd.DataFrame
                    .from_records(rank_history, index=all_dates))

# Reindex and include all dates
idx = pd.date_range(start=rank_history_df.index.min(), end=rank_history_df.index.max())
rank_history_df = rank_history_df.reindex(idx, method='ffill')

In [21]:

rank_history_df.head()

Out[21]:

	Arsenal	Bournemouth	Burnley	Chelsea	Crystal Palace	Everton	Hull	Leicester	Liverpool	Man City	Man United	Middlesbrough	Southampton	Stoke	Sunderland	Swansea	Tottenham	Watford	West Brom	West Ham
2016-08-13	15	15	13	15	13	5	1	11	15	1	15	5	5	5	11	3	5	5	3	15
2016-08-14	13	18	16	19	16	7	3	14	2	3	1	7	7	7	14	5	7	7	5	19
2016-08-15	14	20	18	3	18	8	3	15	2	3	1	8	8	8	15	6	8	8	6	15
2016-08-16	14	20	18	3	18	8	3	15	2	3	1	8	8	8	15	6	8	8	6	15
2016-08-17	14	20	18	3	18	8	3	15	2	3	1	8	8	8	15	6	8	8	6	15

In [22]:

rank_history_df.tail()

Out[22]:

	Arsenal	Bournemouth	Burnley	Chelsea	Crystal Palace	Everton	Hull	Leicester	Liverpool	Man City	Man United	Middlesbrough	Southampton	Stoke	Sunderland	Swansea	Tottenham	Watford	West Brom	West Ham
2017-04-06	5	13	14	1	16	7	17	11	3	4	6	19	9	12	20	18	2	10	8	15
2017-04-07	5	13	14	1	16	7	17	11	3	4	6	19	9	12	20	18	2	10	8	15
2017-04-08	5	15	12	1	16	7	17	11	3	4	6	19	9	13	20	18	2	10	8	14
2017-04-09	6	15	12	1	16	7	17	11	3	4	5	19	9	13	20	18	2	10	8	14
2017-04-10	6	15	12	1	16	7	17	11	3	4	5	19	9	13	20	18	2	10	8	14
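
The last step above (reindexing with method='ffill') is what turns "rank after each matchday" into "rank on every calendar day". A minimal sketch on toy data, with a made-up team and made-up ranks:

```python
import pandas as pd

# Toy ranks recorded only on two matchdays, three days apart.
matchday_ranks = pd.DataFrame(
    {'Team A': [3, 6]},
    index=pd.to_datetime(['2016-08-14', '2016-08-17']))

# Build a full daily calendar and carry the last known rank forward.
idx = pd.date_range(start=matchday_ranks.index.min(),
                    end=matchday_ranks.index.max())
daily_ranks = matchday_ranks.reindex(idx, method='ffill')
print(daily_ranks)
```

Days without a match inherit the rank from the most recent matchday, so counting rows later is the same as counting days.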

Now that we have Team Rankings across the entire season, we can adapt code found on StackOverflow (Praise Be) to find consecutive segments in the rank_history_df DataFrame.

We will also use a lambda expression found on StackOverflow (Praise Be) to output the ordinal suffix for each position (e.g. 6 becomes 6th, 3 becomes 3rd).

In [23]:

ordinal = lambda n: "%d%s" % (n,"tsnrhtdd"[(math.floor(n/10)%10!=1)*(n%10<4)*n%10::4])

# for each team in the league, get the length of the longest streak
for team in all_teams:
    rank_history_team = rank_history_df[team].to_frame()
    rank_history_team.columns = ['A']
    rank_history_team['block'] = ((rank_history_team.A.shift(1) != rank_history_team.A)
                                      .astype(int)
                                      .cumsum())

    streak_lengths = (rank_history_team
                          .reset_index()
                          .groupby(['A','block'])['index']
                          .apply(np.size))
    
    # idxmax() returns the (position, block) label of the longest streak
    pos = streak_lengths.idxmax()[0]
    max_length = streak_lengths.max()
    print(f'{team} was {ordinal(pos)} for {max_length} days')  # f-string! =)

Arsenal was 4th for 27 days
Bournemouth was 14th for 46 days
Burnley was 14th for 28 days
Chelsea was 1st for 121 days
Crystal Palace was 17th for 35 days
Everton was 7th for 82 days
Hull was 18th for 42 days
Leicester was 15th for 33 days
Liverpool was 4th for 27 days
Man City was 1st for 55 days
Man United was 6th for 104 days
Middlesbrough was 16th for 31 days
Southampton was 10th for 31 days
Stoke was 9th for 31 days
Sunderland was 20th for 69 days
Swansea was 19th for 35 days
Tottenham was 5th for 63 days
Watford was 14th for 28 days
West Brom was 8th for 101 days
West Ham was 18th for 27 days
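
The ordinal one-liner above is compact but hard to read. A more explicit sketch of the same idea follows. The function name ordinal_suffix is hypothetical, chosen so it does not clash with the lambda.

```python
def ordinal_suffix(n):
    """Return n with its English ordinal suffix, e.g. 6 -> '6th'."""
    # 11th, 12th and 13th are the exceptions to the last-digit rule
    if 11 <= n % 100 <= 13:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f'{n}{suffix}'

print([ordinal_suffix(n) for n in (1, 2, 3, 6, 11, 20, 22)])
```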

We know that Manchester United held 6th for 104 days. This matches the output of our program so we know our function is working!
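
The shift/cumsum trick used to find consecutive segments can also be checked in isolation. The sketch below wraps it in a small function and runs it on a toy series of daily positions. The name longest_streak and the toy values are hypothetical.

```python
import pandas as pd

def longest_streak(ranks):
    """Return (position, length) of the longest run of identical values."""
    position = ranks.rename('position')
    # a new block starts whenever the value differs from the previous day
    block = position.ne(position.shift()).cumsum().rename('block')
    lengths = position.groupby([position, block]).size()
    pos, _ = lengths.idxmax()
    return int(pos), int(lengths.max())

# Toy daily positions: 5th, then three days in 6th, then two days in 5th.
toy_positions = pd.Series([5, 6, 6, 6, 5, 5])
print(longest_streak(toy_positions))
```

Grouping on the block number as well as the position is what keeps the two separate spells in 5th from being added together.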

Calculating Longest Streaks For Each Season¶

To complete our task, we will need to run the above workflow across all seasons available on football-data.co.uk match results.

I wrote a script that downloads and analyzes each season's data to find teams that stayed put in the league table. When run, it prints every team with a streak of at least 104 days.

The code is a bit messy, but it gets the job done. We can download the script from GitHub and run it as follows:

$ python longest_streak.py

9394
Blackburn was 2nd for 131 days
Man United was 1st for 259 days
Swindon was 22nd for 260 days

9495

9596
Bolton was 20th for 105 days
Newcastle was 1st for 174 days

9697

9798
Man United was 1st for 175 days

9899
Nott'm Forest was 20th for 149 days

9900
Man United was 1st for 107 days
Sheffield Weds was 20th for 154 days
Watford was 20th for 114 days

0001
Arsenal was 2nd for 110 days
Bradford was 20th for 155 days
Man United was 1st for 218 days

0102
Derby was 19th for 120 days
Leicester was 20th for 137 days

0203
Arsenal was 1st for 126 days

0304
Arsenal was 1st for 105 days

0405
Chelsea was 1st for 191 days
Everton was 4th for 139 days

0506
Chelsea was 1st for 257 days
Sunderland was 20th for 191 days
Tottenham was 4th for 155 days

0607
Chelsea was 2nd for 204 days
Man United was 1st for 204 days

0708
Chelsea was 3rd for 111 days
Derby was 20th for 197 days
Fulham was 19th for 126 days
Tottenham was 11th for 114 days

0809
Everton was 6th for 128 days
West Brom was 20th for 127 days

0910
Portsmouth was 20th for 244 days

1011
Chelsea was 1st for 105 days
Man United was 1st for 127 days

1112
Man City was 1st for 119 days
Man United was 2nd for 119 days

1213
Man City was 2nd for 177 days
Man United was 1st for 177 days

1314

1415
Chelsea was 1st for 268 days
Leicester was 20th for 140 days

1516
Aston Villa was 20th for 206 days
Leicester was 1st for 116 days

1617
Man United was 6th for 104 days

Ahem. Don't let Manchester United's 104 days in 6th place distract you from the fact that Everton were 6th for 128 days in the 2008-09 Premier League season.
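
The season labels in the output (9394 through 1617) are easy to generate. The sketch below builds them and plugs each one into the URL we downloaded earlier. That every season follows the same URL pattern as the 2016-17 file is an assumption here, and season_codes and BASE_URL are hypothetical names, not taken from longest_streak.py.

```python
def season_codes(first_start_year, last_start_year):
    """Return labels such as '9394' for every season in the given range."""
    return [f'{year % 100:02d}{(year + 1) % 100:02d}'
            for year in range(first_start_year, last_start_year + 1)]

# Assumed pattern, inferred from the single 2016-17 URL used earlier.
BASE_URL = 'http://www.football-data.co.uk/mmz4281/{season}/E0.csv'

codes = season_codes(1993, 2016)
urls = [BASE_URL.format(season=code) for code in codes]
print(codes[0], codes[-1], len(codes))
```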

Note¶

From the above output, we can get a high-level sense of how competitive each PL season was. This might be a good topic for further study.

Conclusion¶

In this post, we took a Premier League results dataset and transformed it into a pandas DataFrame. We also created functions to make it easy to generate a standings table along with a list of each team's rank.

Finally, we ran historical Premier League match results through the above functions to find the longest stretches of time in which a team held the same position in the league table.

Siv Scripts

Premier League Analysis: Holding Steady in the League Table