On the Fantasy Premier League website, there's an endpoint called bootstrap-static which returns a JSON document containing data on various parts of the game.

The bulk of the data is a list of player objects under the elements key. Each player object contains detailed information about a player, for example:

```json
{
  "id": 257,
  "photo": "92217.jpg",
  "web_name": "Firmino",
  "team_code": 14,
  "status": "d",
  "code": 92217,
  "first_name": "Roberto",
  "second_name": "Firmino",
  "squad_number": 9,
  "news": "Ankle injury - 50% chance of playing",
  "now_cost": 92,
  "news_added": "2019-02-24T17:31:21Z",
  "chance_of_playing_this_round": 50,
  "chance_of_playing_next_round": 50,
  "value_form": "0.2",
  "value_season": "13.3",
  "cost_change_start": -3,
  "cost_change_event": 0,
  "cost_change_start_fall": 3,
  "cost_change_event_fall": 0,
  "in_dreamteam": false,
  "dreamteam_count": 2,
  "selected_by_percent": "15.5",
  "form": "2.0",
  "transfers_out": 2518790,
  "transfers_in": 1463787,
  "transfers_out_event": 52014,
  "transfers_in_event": 758,
  "loans_in": 0,
  "loans_out": 0,
  "loaned_in": 0,
  "loaned_out": 0,
  "total_points": 122,
  "event_points": 0,
  "points_per_game": "4.5",
  "ep_this": "1.5",
  "ep_next": "1.5",
  "special": false,
  "minutes": 2058,
  "goals_scored": 9,
  "assists": 6,
  "clean_sheets": 16,
  "goals_conceded": 11,
  "own_goals": 0,
  "penalties_saved": 0,
  "penalties_missed": 0,
  "yellow_cards": 0,
  "red_cards": 0,
  "saves": 0,
  "bonus": 17,
  "bps": 499,
  "influence": "580.2",
  "creativity": "503.5",
  "threat": "952.0",
  "ict_index": "202.9",
  "ea_index": 0,
  "element_type": 4,
  "team": 12
}
```
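Several of the numeric fields in the player object are encoded as strings (for example `selected_by_percent` and `form`), and `now_cost` appears to be expressed in tenths of a million pounds, so 92 would be a £9.2m player. Both points are assumptions drawn from the sample rather than documented behaviour. A minimal sketch of tidying a player object before analysis, where `normalise_player` and `cost_millions` are hypothetical names:

```python
import json

# Fields that the sample player object encodes as strings holding decimal numbers.
STRING_NUMERIC_FIELDS = (
    "value_form", "value_season", "selected_by_percent", "form",
    "points_per_game", "ep_this", "ep_next",
    "influence", "creativity", "threat", "ict_index",
)


def normalise_player(player):
    """Return a copy of a player object with string-encoded numbers parsed.

    Assumption: now_cost is in tenths of a million pounds (92 -> 9.2).
    """
    out = dict(player)
    for field in STRING_NUMERIC_FIELDS:
        if out.get(field) is not None:
            out[field] = float(out[field])
    out["cost_millions"] = out["now_cost"] / 10
    return out


# A trimmed copy of the sample player object.
sample = json.loads(
    '{"id": 257, "web_name": "Firmino", "now_cost": 92,'
    ' "selected_by_percent": "15.5", "form": "2.0"}'
)
player = normalise_player(sample)
print(player["web_name"], player["cost_millions"], player["selected_by_percent"])
```

The original dictionary is left untouched, so the raw snapshot can still be written back out unchanged.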

Despite the “static” in its name, the data returned by the bootstrap-static endpoint is updated fairly frequently; “static” seems to mean that the response is not specific to the logged-in user. By recording the endpoint's output over time, we can build a time series of FPL player data!

In fact, I've been saving the information from this endpoint twice daily since September 2018, using a scheduled AWS Lambda function that writes gzip-compressed snapshots to an S3 bucket. The bucket is publicly accessible and available here:

FPL 2018-19 data
Twice-a-day snapshots of the FPL bootstrap-static endpoint since 12 September 2018
http://fpl-2018-19-data.s3.amazonaws.com/
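The Lambda function itself isn't shown here, so the following is only a sketch of the naming and compression step under stated assumptions: the filename pattern is inferred from the object names in the bucket, the timestamp is assumed to be UTC, and `snapshot_key` and `compress_snapshot` are hypothetical helper names. Fetching the endpoint and uploading to S3 are deliberately left out.

```python
import gzip
from datetime import datetime, timezone

# Filename pattern inferred from the object names used in the bucket.
KEY_FORMAT = "bootstrap-static-%Y-%m-%dT%H%MZ.json.gz"


def snapshot_key(now):
    """Build the object name for a snapshot taken at `now` (converted to UTC)."""
    return now.astimezone(timezone.utc).strftime(KEY_FORMAT)


def compress_snapshot(body):
    """Gzip-compress the raw JSON bytes returned by the endpoint."""
    return gzip.compress(body)


# Stand-in bytes; a real run would use the endpoint's response body.
body = b'{"elements": []}'
taken_at = datetime(2018, 9, 12, 8, 51, tzinfo=timezone.utc)
key = snapshot_key(taken_at)
payload = compress_snapshot(body)
print(key, len(payload))
```

Putting the timestamp in the object name means each snapshot is self-describing, and the names sort chronologically.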

Getting the data

Using the AWS CLI, you can synchronise the contents of the bucket to your local machine:

$ aws s3 sync s3://fpl-2018-19-data fpl-2018-19-data
download: s3://fpl-2018-19-data/bootstrap-static-2018-09-12T0851Z.json.gz to fpl-2018-19-data/bootstrap-static-2018-09-12T0851Z.json.gz
download: s3://fpl-2018-19-data/bootstrap-static-2018-09-15T0856Z.json.gz to fpl-2018-19-data/bootstrap-static-2018-09-15T0856Z.json.gz
download: s3://fpl-2018-19-data/bootstrap-static-2018-09-13T0220Z.json.gz to fpl-2018-19-data/bootstrap-static-2018-09-13T0220Z.json.gz
download: s3://fpl-2018-19-data/bootstrap-static-2018-09-12T1452Z.json.gz to fpl-2018-19-data/bootstrap-static-2018-09-12T1452Z.json.gz
download: s3://fpl-2018-19-data/bootstrap-static-2018-09-12T0852Z.json.gz to fpl-2018-19-data/bootstrap-static-2018-09-12T0852Z.json.gz
download: s3://fpl-2018-19-data/bootstrap-static-2018-09-13T0856Z.json.gz to fpl-2018-19-data/bootstrap-static-2018-09-13T0856Z.json.gz
...

This will download the contents of the bucket to the fpl-2018-19-data directory.

Here is a short shell script to uncompress the data to a sibling data directory:

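#!/bin/sh -

destdir=data
# Create the destination directory if it does not exist yet.
mkdir -p "$destdir"

for f in fpl-2018-19-data/*.gz; do
	# Strip the directory and the .gz suffix to get the output name.
	dest="$destdir/$(basename "$f" .gz)"
	# Skip files that were already uncompressed on a previous run.
	if [ ! -f "$dest" ]; then
		echo "$dest"
		gunzip -c "$f" > "$dest"
	fi
done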

Each filename contains the timestamp at which the snapshot was downloaded:

$ ls data
bootstrap-static-2018-09-12T0851Z.json
bootstrap-static-2018-09-12T0852Z.json
bootstrap-static-2018-09-12T1452Z.json
bootstrap-static-2018-09-13T0220Z.json
bootstrap-static-2018-09-13T0856Z.json
bootstrap-static-2018-09-13T2056Z.json
bootstrap-static-2018-09-14T0856Z.json
bootstrap-static-2018-09-14T2056Z.json
bootstrap-static-2018-09-15T0856Z.json
bootstrap-static-2018-09-15T2056Z.json
...

I was able to extract the timestamp from the filename using the following strptime pattern in Python:

>>> from datetime import datetime
>>> def extract_timestamp(filename):
...     return datetime.strptime(filename, 'bootstrap-static-%Y-%m-%dT%H%MZ.json')
... 
>>> extract_timestamp('bootstrap-static-2018-09-15T2056Z.json')
datetime.datetime(2018, 9, 15, 20, 56)
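With the timestamp parsed from each filename, the uncompressed snapshots can be stitched into a time series for a single player. The sketch below assumes the files live in a `data` directory, that the trailing `Z` means UTC, and that each snapshot has an `elements` list of player objects keyed by `id`; `load_snapshots` and `series_from_snapshots` are hypothetical helper names.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

PATTERN = "bootstrap-static-%Y-%m-%dT%H%MZ.json"


def extract_timestamp(filename):
    """Parse the timestamp from a snapshot filename (Z is assumed to mean UTC)."""
    return datetime.strptime(filename, PATTERN).replace(tzinfo=timezone.utc)


def load_snapshots(data_dir):
    """Yield (filename, parsed JSON) for every snapshot in data_dir."""
    for path in Path(data_dir).glob("bootstrap-static-*.json"):
        with open(path, encoding="utf-8") as f:
            yield path.name, json.load(f)


def series_from_snapshots(snapshots, player_id, field):
    """Return (timestamp, value) pairs for one player's field, oldest first."""
    series = []
    for filename, snapshot in snapshots:
        for player in snapshot["elements"]:
            if player["id"] == player_id:
                series.append((extract_timestamp(filename), player[field]))
                break  # a player appears at most once per snapshot
    return sorted(series, key=lambda pair: pair[0])
```

For example, `series_from_snapshots(load_snapshots("data"), 257, "now_cost")` would give the price history of the player with ID 257.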

Example visualisations

As an experiment to see what I could do with the data, I loaded it into Elasticsearch and made some graphs with Kibana (I'll write about how I did this in a future post):

Average team value over time

[Figure: Average team value over time]

Manchester City players are far and away the most expensive, although their value seems to have taken a hit recently, with Liverpool a distant second. The unlucky club with ID 3, sitting just ahead of the tight pack at the bottom and dropping consistently in value, is Arsenal.
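The graph was made with Kibana, but the same aggregation can be approximated in plain Python for a single snapshot. This sketch assumes that "team value" means the mean `now_cost` of a club's players, that `now_cost` is in tenths of a million, and that clubs are keyed by the `team_code` field (which seems to match the "ID 3" above, though that is a guess). `average_team_value` is a hypothetical helper name.

```python
from collections import defaultdict


def average_team_value(elements):
    """Map team_code to the mean now_cost of its players, in millions."""
    totals = defaultdict(lambda: [0, 0])  # team_code -> [sum of now_cost, player count]
    for player in elements:
        entry = totals[player["team_code"]]
        entry[0] += player["now_cost"]
        entry[1] += 1
    return {code: total / count / 10 for code, (total, count) in totals.items()}


# Made-up players for illustration only; real input is snapshot["elements"].
elements = [
    {"team_code": 14, "now_cost": 92},
    {"team_code": 14, "now_cost": 130},
    {"team_code": 3, "now_cost": 55},
]
averages = average_team_value(elements)
print(averages)
```

Running this over every snapshot and pairing each result with the snapshot's timestamp would give one line per club.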

Player value over time (top 10)

[Figure: Player value over time (top 10)]

The undisputed top two most valuable players are Mohamed Salah and Harry Kane, with Sergio Agüero in tenuous third place.

Player ownership over time (top 5)

[Figure: Player ownership over time (top 5)]

Fantasy managers are fickle: even the most valuable players see their popularity fluctuate wildly. The precipitous drop in the middle of the graph is Sergio Agüero, who fell from being owned by more than half of all teams to just a third in a single week.

Salah's ownership vs value over time

[Figure: Salah's ownership and value over time]

While the FPL pricing algorithm is not fully understood, it's heavily influenced by demand. Here, we can see how closely Salah's price tracks the percentage of teams that own him, and the same pattern holds for most players.
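One way to put a number on how closely price follows ownership is the Pearson correlation between the two series. This is a rough sketch and not how the graphs were produced; the sample values are made up for illustration, and a real run would use `selected_by_percent` and `now_cost` taken from successive snapshots.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up values for illustration, not taken from the bucket.
ownership = [40.0, 42.5, 45.0, 43.0]  # selected_by_percent per snapshot
price = [130, 131, 132, 131]          # now_cost per snapshot
print(round(pearson(ownership, price), 3))
```

A value close to 1 would support the claim that price rises and falls with ownership. It says nothing about which one drives the other.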