Use EveryPolitician data

We gather together all this lovely data, curate it, and make it reliably available... so you can use it. Here’s the bare bones of how your application can get it.

This page is for you if you’re a developer who is planning on building something wonderful or brilliant with the data we have on your country’s politicians. It tells you how to get the data and, broadly, how it’s structured (spoiler: CSV or Popolo JSON).

We believe that making this data available in consistent, useful formats helps everyone. If you write something neat that uses, say, the data for your country’s politicians, and release it under a free software or open source license, then other people will be able to adapt what you’ve done to run in their country too. That works because their data will be formatted the same way that yours is: that’s our side of the deal.

How to find what data we’ve got for you

Humans look at the interactive map. But you’re a developer: you look in countries.json.

That’s because countries.json is effectively EveryPolitician’s index. It contains a simple JSON array of objects, one for every country we’ve got data for. These contain identifying fields (like name and country code) as well as metadata for each legislature within those countries. From countries.json you can get the URLs of the data files you want.
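
To make that concrete, here is a minimal sketch of walking the index to collect data file URLs. It runs against a small invented sample rather than the real file, and the key names it uses ("code", "legislatures", "popolo_url", "legislative_periods", "csv_url") are assumptions for illustration, so check them against the actual countries.json before relying on them.

```python
import json

# Invented sample shaped like the index described above. The key names
# are assumptions for illustration, not a guaranteed schema.
SAMPLE_INDEX = json.loads("""
[
  {
    "name": "Exampleland",
    "code": "XX",
    "legislatures": [
      {
        "name": "National Assembly",
        "popolo_url": "https://example.org/xx/assembly/popolo.json",
        "legislative_periods": [
          {"name": "Term 2", "csv_url": "https://example.org/xx/assembly/term-2.csv"},
          {"name": "Term 1", "csv_url": "https://example.org/xx/assembly/term-1.csv"}
        ]
      }
    ]
  }
]
""")


def data_urls(index, country_code):
    """Return a list of (legislature name, JSON URL, [CSV URLs]) for one country."""
    results = []
    for country in index:
        if country.get("code", "").upper() != country_code.upper():
            continue
        for legislature in country.get("legislatures", []):
            csv_urls = [
                period["csv_url"]
                for period in legislature.get("legislative_periods", [])
                if "csv_url" in period
            ]
            results.append(
                (legislature.get("name"), legislature.get("popolo_url"), csv_urls)
            )
    return results


for name, json_url, csv_urls in data_urls(SAMPLE_INDEX, "xx"):
    print(name, json_url, len(csv_urls))
```

In a real application you would load the index from wherever you fetched countries.json instead of from the inline sample.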

Every time we change any of the data, we update countries.json. See getting the most recent data for details about using this crucial file.

Four approaches to getting the data

You’re going to write the code. All you need is the data.

But before you go any further, we want you to think about how you’re going to be getting it. We deliberately don’t provide an API — instead, we make the data available and encourage you to adopt a good policy on how and when you actually go about getting it. There are various approaches, depending on your need:

use the libraries
download it once, never worry again
download it periodically
download it whenever it changes: event-driven

Use the libraries (for Python, Ruby and R)

If you’re a programmer and you just want to use the data, you can use the Ruby gem or the Python package. These fetch the latest data for you, and you don’t have to spare a moment wondering about file formats.

If you’re a researcher you can use EveryPolitician directly from R using the everypoliticianR package. This downloads the latest information directly, and a series of examples is included on the package page.

Download it once, never worry again

If you’re doing a one-off (maybe building an infographic or plotting a graph for a printed magazine), this could work for you. But remember that political data changes: elections happen, politicians die, errors get corrected. So if it matters that the data you’re using is timely and reflects these sorts of changes, it’s probably better to use one of the following approaches.

Download it periodically, so you keep up to date

You can download the data as often as makes sense for your application — perhaps every day, or every week, or whenever you press a button. This means your code will often be updating existing data, as well as adding new data. You need to consider things like: what happens if a politician’s name changes? Someone is removed? A whole new group of politicians is elected?

Once you’ve got this working, you can safely download the data whenever you know it has changed... or even automate it — for example, setting up a task to pull down the latest data every night.
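
The questions above (renamed, removed and newly elected politicians) mostly come down to comparing what you already have with what you just downloaded. Here is a rough sketch of that comparison. It assumes each record carries a stable "id" field that survives name changes; that is an assumption of this example, and the sample records are invented.

```python
def diff_people(old, new):
    """Compare two snapshots of people, each a list of dicts with an "id" key.

    Returns three lists: records added, records removed, and records whose
    contents changed (the new version of each).
    """
    old_by_id = {person["id"]: person for person in old}
    new_by_id = {person["id"]: person for person in new}

    added = [p for pid, p in new_by_id.items() if pid not in old_by_id]
    removed = [p for pid, p in old_by_id.items() if pid not in new_by_id]
    changed = [
        p for pid, p in new_by_id.items()
        if pid in old_by_id and p != old_by_id[pid]
    ]
    return added, removed, changed


# Invented snapshots: p1 changes name, p2 disappears, p3 is new.
last_week = [
    {"id": "p1", "name": "Ada Example"},
    {"id": "p2", "name": "Ben Sample"},
]
today = [
    {"id": "p1", "name": "Ada Example-Smith"},
    {"id": "p3", "name": "Cy Placeholder"},
]

added, removed, changed = diff_people(last_week, today)
print(len(added), len(removed), len(changed))
```

Matching on an identifier rather than a name is what lets a name change show up as an update to an existing record instead of as one removal plus one addition.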

Download it whenever it changes: event-driven

We run a service that will notify you whenever the data is updated.

You can register your application with the EveryPolitician app manager (you’ll need to login with your GitHub account) and whenever the data changes we’ll make a POST request to the URL you nominate. This operates like a webhook: your application can run any code in response to the EveryPolitician data having been updated. Typically this would be to pull down the latest data files.
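
As a sketch of the receiving end, here is a tiny endpoint built on Python’s standard library. It makes no assumption about what the POST body contains and simply treats any POST as a signal that new data is available. The names updates, on_notification, UpdateHandler and make_server, and the port number, are all made up for this example.

```python
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

# Notifications wait here until a worker picks them up, so the HTTP
# response goes back quickly even if the download itself is slow.
updates = queue.Queue()


def on_notification(body):
    """Record that new data is available; return the HTTP status to send."""
    updates.put(body)
    return 204  # "No Content": acknowledged, nothing to return


class UpdateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body, whatever it is, and hand it over.
        length = int(self.headers.get("Content-Length") or 0)
        body = self.rfile.read(length)
        self.send_response(on_notification(body))
        self.end_headers()


def make_server(port=8000):
    # The port is arbitrary; the URL you nominate must reach this server.
    return HTTPServer(("", port), UpdateHandler)


# To run it: make_server().serve_forever()
# A worker would then loop on updates.get() and re-fetch countries.json.
```

Queuing the notification instead of downloading inside the request handler is a design choice for this sketch: it keeps the response fast and means a burst of notifications can be collapsed into a single refresh.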

We use this event-triggered mechanism for the Gender Balance application. It’s registered with the app manager, so it is alerted when there is new data available to download — basically, it can keep itself up-to-date. So whether there’s an election in Sweden or a politician in Thailand changes their name, Gender Balance will incorporate the change as soon as EveryPolitician has the data.

Pick a format

The data itself is available in two formats. Which one you want will depend on what you’re trying to do with it.

CSV data for straightforward data for a given term
Popolo JSON data for full structured data

CSV data

We make the data available as CSV (comma-separated values) because it’s just so useful. If you just want the basic data, it might also be the simplest way for you to absorb it. For any given legislature we slice the data up into separate CSV files, one for each legislative period, or “term”.
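
For instance, reading one term’s file and counting members per group takes only a few lines with Python’s csv module. The sample rows below are invented, and the column names ("id", "name", "group", "gender") are assumptions for the sake of the example; look at the header row of a real file to see what is actually there.

```python
import csv
import io
from collections import Counter

# Invented stand-in for one term's CSV file.
SAMPLE_CSV = """id,name,group,gender
p1,Ada Example,Red Party,female
p2,Ben Sample,Blue Party,male
p3,Cy Placeholder,Red Party,
"""


def load_term(csv_file):
    """Read a term's CSV from a file-like object into a list of dicts."""
    return list(csv.DictReader(csv_file))


members = load_term(io.StringIO(SAMPLE_CSV))

# Count how many members each group has in this term.
by_group = Counter(row["group"] for row in members)
print(by_group.most_common())
```

Note that a missing value comes through as an empty string, not as None, so your code needs to treat blanks as “unknown”.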

The disadvantage of the CSV format is that it can’t be as rich: a flat table of rows and columns can’t easily represent nested structure. So that’s what the JSON is for.

Popolo JSON data

Popolo is an open standard for expressing political data, which is exactly the kind of thing we’re doing here. So we provide our data in JSON format too, complying with the Popolo standard. These are the same politicians as in the CSV of course (we generate both the CSV and JSON anew every time anything changes, keeping them perfectly in sync), but the data is richer. So, unlike the CSV which is sliced by terms, if you pull down the JSON data you’ll be getting all the data for all the politicians across all the terms.

If you’re doing anything which needs this sort of richness, grab the JSON data and use your favourite JSON library to jump right into having structured data in your application.
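
Here is a small sketch of what “structured” buys you: linking people to organizations through memberships. The top-level keys ("persons", "organizations", "memberships") and the "person_id" and "organization_id" fields follow Popolo’s naming as we understand it, but treat them as assumptions and check them against the file you download. The sample data is invented.

```python
import json
from collections import defaultdict

# Invented sample in the general shape of a Popolo document.
SAMPLE_POPOLO = json.loads("""
{
  "persons": [
    {"id": "p1", "name": "Ada Example"},
    {"id": "p2", "name": "Ben Sample"}
  ],
  "organizations": [
    {"id": "o1", "name": "Red Party"},
    {"id": "o2", "name": "Blue Party"}
  ],
  "memberships": [
    {"person_id": "p1", "organization_id": "o1"},
    {"person_id": "p1", "organization_id": "o2"}
  ]
}
""")


def organizations_by_person(popolo):
    """Map each person's name to the names of organizations they belong to."""
    org_names = {org["id"]: org["name"] for org in popolo.get("organizations", [])}
    orgs_for = defaultdict(list)
    for membership in popolo.get("memberships", []):
        org_id = membership["organization_id"]
        # Fall back to the raw id if the organization is not listed.
        orgs_for[membership["person_id"]].append(org_names.get(org_id, org_id))
    return {
        person["name"]: orgs_for[person["id"]]
        for person in popolo.get("persons", [])
    }


print(organizations_by_person(SAMPLE_POPOLO))
```

One person with two memberships is exactly the sort of thing a single CSV row struggles to express.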

Where to get it from: cdn.rawgit.com

The EveryPolitician data is published on GitHub, which means the raw files are automatically available from raw.githubusercontent.com. But you can get them from cdn.rawgit.com instead. Here’s why that might be useful.

In order to discourage the use of its repos for static hosting, GitHub itself doesn’t serve the raw files with the right MIME type in the Content-Type header. This works passably well for humans (your web browser thinks it’s getting plain text), but won’t work properly with your application if it needs the correct Content-Type. This is likely to matter if you’re making client-side AJAX requests, for example.

So instead, we encourage you to get the data from RawGit. RawGit acts as a caching proxy. If you ask it for a file it will get it from GitHub on your behalf but, significantly, when it sends the file back to you it sets the correct Content-Type. It also caches the file, so if you or anyone else asks for it again, GitHub won’t be troubled.

This works because those files are static. And they’re static because every RawGit URL must have a commit SHA in it: that is, you’re explicitly referring to a specific version of a file in EveryPolitician’s repo. Remember countries.json? That’s why you should get the data files’ URLs from that file.
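
One practical consequence: because a URL with a commit SHA in it always points at the same content, the SHA makes a handy cache key on your side too. The helper below is a hypothetical sketch that pulls a SHA-looking segment out of a URL path. It is only a heuristic (any path segment of 7 to 40 lowercase hex characters is treated as a SHA), and the example URL is invented.

```python
import re
from urllib.parse import urlparse

# Heuristic: a full or abbreviated commit SHA is 7 to 40 hex characters.
SHA_PATTERN = re.compile(r"^[0-9a-f]{7,40}$")


def pinned_commit(url):
    """Return the first SHA-like path segment in the URL, or None."""
    for segment in urlparse(url).path.split("/"):
        if SHA_PATTERN.match(segment):
            return segment
    return None


# Invented URL, for illustration only.
example = "https://cdn.rawgit.com/example-owner/example-repo/0123abc/data/file.json"
print(pinned_commit(example))
```

If the SHA for a file you have already downloaded has not changed, there is nothing new to fetch.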