If you just browse the site, you’ll find that EveryPolitician simply links to the most recently updated version of its data files.
But if you’re accessing the data programmatically, those links might not work
well for you: the data is hosted on GitHub, which doesn’t serve the right
Content-Type when your application fetches the data. Instead, you should get it from
cdn.rawgit.com (you should also read our
notes on using the data in general).
Get the latest data file URLs from the
csv_url fields within countries.json.
EveryPolitician data is frequently updated. Typically you'll be interested in the most recent data, but you can also get old versions if you need them, because all the files are stored in git.
This means that every time we change the data, that change is given a unique
code (called a hash, which is a string of hexadecimal characters). You
can use this code to uniquely refer to the files as they were at that moment.
If you're familiar with git or GitHub, this will seem like second nature. But if you're not, all you really need to know is that the URLs to the data always have such a hash in them. This means that EveryPolitician data URLs are really pointing at a snapshot of the data.
So if you want to get the most recent version of the data, make sure
you're using the latest hash in the URL. We put the data URLs in
countries.json, as shown below. So to get the most
recent data programmatically, you need to look in the current version of
countries.json, possibly also inspecting the value of the
lastmod field (which
contains the Unix epoch
time at which the data was last updated) for the legislature you're interested in.
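As a minimal sketch of the lastmod check described above, the following Python converts the field to a date and compares it with a timestamp you stored on an earlier run. The legislature entry here is a hypothetical excerpt: its slug, URL, and timestamp are invented for illustration, and the helper names are not part of EveryPolitician. It also assumes lastmod may arrive as either a JSON number or a numeric string.

```python
from datetime import datetime, timezone

# Hypothetical excerpt shaped like one legislature entry in countries.json.
# The slug, URL and timestamp are invented for illustration.
legislature = {
    "slug": "example-assembly",
    "popolo_url": "https://example.org/data/example-assembly/ep-popolo.json",
    "lastmod": "1431043200",
}


def lastmod_as_datetime(legislature):
    """Convert the lastmod field (Unix epoch seconds) to an aware UTC datetime."""
    # int() accepts both a JSON number and a numeric string.
    return datetime.fromtimestamp(int(legislature["lastmod"]), tz=timezone.utc)


def has_newer_data(legislature, last_seen_epoch):
    """Return True if the legislature was modified after last_seen_epoch."""
    return int(legislature["lastmod"]) > last_seen_epoch


print(lastmod_as_datetime(legislature).isoformat())
print(has_newer_data(legislature, 1400000000))
```

Storing the lastmod value you last processed lets your application skip a download when nothing has changed.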
EveryPolitician publishes a list in JSON format of
all the countries we have data for, in a file called countries.json.
This file includes metadata about what’s available to download, including the
URLs to use.
Key fields that
countries.json has for each country include:
name: the name of the country to help human-reading of the data
slug: the name with spaces replaced by hyphens and punctuation stripped, so suitable for use in URLs and directory names
code: the country code (for example, EE for Estonia), which helps with programmatic lookups. We’re using ISO 3166-1 alpha-2 country codes where possible (with additional ISO 3166-2 three-letter codes where appropriate).
legislatures: a list of legislatures within this country. There are lots of ways governments are organised. Some have a single house, others have separate upper and lower chambers. Sometimes a country changes its legislative process completely, such as when the Council of Deputies in Libya replaced the General National Congress. We list each of these legislatures separately.
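To illustrate how the code field helps with programmatic lookups, here is a small Python sketch that finds a country by its code and lists the slugs of its legislatures. The sample data is hypothetical: it only mimics the fields described above, the slugs are invented for illustration, and it assumes the top level of the file is a JSON list of country entries. The find_country helper is likewise an illustrative name.

```python
import json

# Hypothetical sample shaped like countries.json; values are illustrative only.
SAMPLE = """
[
  {"name": "Estonia", "slug": "Estonia", "code": "EE",
   "legislatures": [{"slug": "Riigikogu"}]},
  {"name": "Libya", "slug": "Libya", "code": "LY",
   "legislatures": [{"slug": "Council-of-Deputies"},
                    {"slug": "General-National-Congress"}]}
]
"""


def find_country(countries, code):
    """Return the country entry whose code matches (case-insensitive), or None."""
    wanted = code.upper()
    for country in countries:
        if country.get("code", "").upper() == wanted:
            return country
    return None


countries = json.loads(SAMPLE)
libya = find_country(countries, "ly")
legislature_slugs = [leg["slug"] for leg in libya["legislatures"]]
print(legislature_slugs)
```

Looking up by code rather than by name avoids problems with spelling variants and translated country names.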
Within the legislatures, in addition to a similar
slug, these fields help you identify and locate the
data you want:
popolo_url: the fully-qualified URL to the Popolo JSON file
popolo: the path to the Popolo JSON file
lastmod: Unix time when the data was last modified
legislative_periods: a list of each of the terms, or periods. Each one of those has these fields:
csv_url: the fully-qualified URL to the CSV file of data for this term
The countries.json file contains the path and the URL of two types
of data file:
popolo_url: the Popolo JSON data for the whole legislature
csv_url: the comma-separated data for individual terms within the legislature
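The two kinds of URL can be collected in a single pass over the file. The sketch below assumes the nesting described above (countries contain legislatures, which contain legislative_periods) and uses a hypothetical sample whose slugs, hash-like path segment, and example.org URLs are invented for illustration. The iter_data_urls helper is an illustrative name, not part of EveryPolitician.

```python
# Hypothetical data shaped like a parsed countries.json; all values are invented.
countries = [
    {
        "slug": "Exampleland",
        "legislatures": [
            {
                "slug": "Assembly",
                "popolo_url": "https://example.org/abc1234/Assembly/ep-popolo.json",
                "legislative_periods": [
                    {"csv_url": "https://example.org/abc1234/Assembly/term-2.csv"},
                    {"csv_url": "https://example.org/abc1234/Assembly/term-1.csv"},
                ],
            }
        ],
    }
]


def iter_data_urls(countries):
    """Yield (country_slug, legislature_slug, kind, url) for every data file."""
    for country in countries:
        for leg in country.get("legislatures", []):
            # One Popolo JSON file covers the whole legislature.
            if "popolo_url" in leg:
                yield country["slug"], leg["slug"], "popolo", leg["popolo_url"]
            # Each term (legislative period) has its own CSV file.
            for period in leg.get("legislative_periods", []):
                if "csv_url" in period:
                    yield country["slug"], leg["slug"], "csv", period["csv_url"]


for row in iter_data_urls(countries):
    print(row)
```

Reading the URLs from the file on each run, rather than hard-coding them, means your application always picks up whichever hash is current.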
This is what an EveryPolitician data URL looks like:
You might expect the domain to be
https://github.com/, but for
your application we recommend you use cdn.rawgit.com instead.
For example: the URLs for the most recent (at time of writing — and that’s already out of date, which neatly demonstrates why you need to know how to determine these SHAs) politicians’ data for the UK’s House of Commons legislature, for the 56th term (which started 2015-05-08), look like this: