Getting the most recent data

If you just browse the site, you’ll find that EveryPolitician simply links to the most recently updated version of its data files.

But if you’re accessing the data programmatically, those links may not work well for you, because the data is hosted on GitHub, which doesn’t serve files with the right Content-Type header when your application fetches them. Instead, you should get the data from cdn.rawgit.com (you should also read our notes on using the data in general).

In summary: get the latest data file URLs from the popolo_url and csv_url fields within countries.json.

The details

EveryPolitician data is frequently updated. Typically you'll be interested in the most recent data, but if you need to get old versions you can. This is because all the files are stored in git.

This means that every time we change the data, that change is given a unique code (called a hash, which looks something like 0b47c0c). You can use this code to uniquely refer to the files as they were at that moment.

If you're familiar with git or GitHub, this will seem like second nature. But if you're not, all you really need to know is that the URLs to the data always have such a hash in them. This means that EveryPolitician data URLs are really pointing at a snapshot of the data.

So if you want the most recent version of the data, make sure you're using the latest hash in the URL. We put the data URLs in countries.json, as shown below. To get the most recent data programmatically, look in the current version of countries.json. You may also want to inspect the lastmod field for the legislature you're interested in, which contains the Unix epoch time at which its data was last updated.
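As a minimal sketch of the lastmod check described above, the following Python compares a legislature's lastmod against a value your application saved earlier. The function names, the sample entry, and its timestamp are made up for illustration, and whether lastmod arrives as a number or a numeric string is an assumption, so the code converts it with int() either way.

```python
from datetime import datetime, timezone


def lastmod_datetime(legislature):
    """Convert a legislature's lastmod (Unix epoch seconds) to a UTC datetime."""
    # int() accepts a number or a numeric string; the JSON type is an assumption.
    return datetime.fromtimestamp(int(legislature["lastmod"]), tz=timezone.utc)


def has_changed(legislature, cached_lastmod):
    """Return True if the data was modified after the lastmod we saved earlier."""
    return int(legislature["lastmod"]) > int(cached_lastmod)


# Hypothetical legislature entry, trimmed to the fields used here.
legislature = {"slug": "Commons", "lastmod": "1441798435"}

print(lastmod_datetime(legislature).isoformat())
print(has_changed(legislature, 1441000000))
```

If has_changed returns True, fetch the file again from the URL in the current countries.json rather than reusing a URL you stored earlier, because the stored URL still points at the old snapshot.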

Explaining countries.json

EveryPolitician publishes a list in JSON format of all the countries we have data for: countries.json. This file includes metadata about what’s available to download, including the URLs to use.

Key fields that countries.json has for each country include:

  • name: the name of the country to help human-reading of the data
  • slug: the name with spaces replaced by hyphens and punctuation stripped, so suitable for use in URLs and directory names
  • code: the country code (for example, EE for Estonia), which helps with programmatic lookups. We’re using ISO 3166-1 alpha-2 country codes where possible (with additional ISO 3166-2 subdivision codes where appropriate: for example, GB-WLS for Wales).
  • legislatures: a list of legislatures within this country. There are lots of ways governments are organised. Some have a single house, others have separate upper and lower chambers. Sometimes a country changes its legislative process completely, such as when the Council of Deputies in Libya replaced the General National Congress. We list each of these legislatures separately.

Within the legislatures, in addition to a similar name and slug, these fields help you identify and locate the data you want:

  • popolo_url: the fully-qualified URL to the Popolo JSON file
  • popolo: the path to the Popolo JSON file
  • lastmod: Unix time when the data was last modified
  • legislative_periods: a list of each of the terms, or periods. Each one of those has these fields:
    • id, name, start_date, and slug
    • csv_url: the fully-qualified URL to the CSV file of data for this term
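The fields above can be walked with a few lines of Python. This is only a sketch under stated assumptions: the helper names are hypothetical, load_countries expects you to supply the URL of countries.json (this page does not give it), and the sample data is a hand-written stand-in whose slug and id values are invented. Only the field names and the two URLs come from this page.

```python
import json
from urllib.request import urlopen


def load_countries(url):
    """Download and parse countries.json from a URL supplied by the caller."""
    with urlopen(url) as response:
        return json.load(response)


def find_legislature(countries, country_code, legislature_slug):
    """Return the matching legislature dict, or None if nothing matches."""
    for country in countries:
        if country["code"] != country_code:
            continue
        for legislature in country["legislatures"]:
            if legislature["slug"] == legislature_slug:
                return legislature
    return None


def term_csv_urls(legislature):
    """Map each term's id to the csv_url for that term."""
    return {term["id"]: term["csv_url"]
            for term in legislature["legislative_periods"]}


# Hypothetical, heavily trimmed stand-in for the parsed countries.json.
BASE = "https://cdn.rawgit.com/everypolitician/everypolitician-data/0b47c0c"
sample_countries = [{
    "name": "United Kingdom",
    "slug": "United-Kingdom",   # invented value
    "code": "GB",
    "legislatures": [{
        "name": "House of Commons",
        "slug": "Commons",      # invented value
        "popolo_url": BASE + "/data/UK/Commons/ep-popolo-v1.0.json",
        "lastmod": "1441798435",  # invented value
        "legislative_periods": [{
            "id": "term/56",    # invented value
            "name": "56th Parliament of the United Kingdom",
            "start_date": "2015-05-08",
            "csv_url": BASE + "/data/UK/Commons/term-56.csv",
        }],
    }],
}]

commons = find_legislature(sample_countries, "GB", "Commons")
print(commons["popolo_url"])
print(term_csv_urls(commons))
```

In a real application you would replace sample_countries with the result of load_countries, then read popolo_url and csv_url from it each time rather than hard-coding them.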

Data URLs: popolo_url and csv_url

The countries.json file contains the path and the URL of two types of data file:

  • popolo_url — the Popolo JSON data for the whole legislature
  • csv_url — the comma-separated data for individual terms within the legislature

This is what an EveryPolitician data URL looks like:

https://domain
   /everypolitician/everypolitician-data
   /commit-SHA
   /path-to-data-file

You might expect the domain to be https://github.com/, but for your application we recommend you use https://cdn.rawgit.com.

For example, at the time of writing, the URLs for the most recent politicians’ data for the UK’s House of Commons legislature, for the 56th term (which started 2015-05-08), look like the ones below. They are already out of date, which neatly demonstrates why you need to know how to determine these SHAs:

  • JSON:
    https://cdn.rawgit.com/everypolitician/everypolitician-data/0b47c0c/data/UK/Commons/ep-popolo-v1.0.json
  • CSV for the term “56th Parliament of the United Kingdom”:
    https://cdn.rawgit.com/everypolitician/everypolitician-data/0b47c0c/data/UK/Commons/term-56.csv
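To see which snapshot a given URL points at, you can split it along the four-part layout described above. The sketch below assumes every data URL follows that layout exactly; split_data_url and build_data_url are hypothetical helper names, not part of any EveryPolitician library.

```python
from urllib.parse import urlsplit


def split_data_url(url):
    """Split a data URL into (domain, repository, commit, path)."""
    parts = urlsplit(url)
    # Assumed path layout: /<owner>/<repo>/<commit-SHA>/<path-to-data-file>
    owner, repo, commit, path = parts.path.lstrip("/").split("/", 3)
    domain = f"{parts.scheme}://{parts.netloc}"
    return domain, f"{owner}/{repo}", commit, path


def build_data_url(domain, repository, commit, path):
    """Reassemble a data URL from its four parts."""
    return f"{domain}/{repository}/{commit}/{path}"


url = ("https://cdn.rawgit.com/everypolitician/everypolitician-data"
       "/0b47c0c/data/UK/Commons/term-56.csv")
domain, repository, commit, path = split_data_url(url)
print(commit)
print(path)
```

Comparing the commit part of a URL you have stored with the one in the current countries.json is one way to tell whether your copy is a stale snapshot.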