Our little bot works tirelessly collating the EveryPolitician data and keeping it up to date.
OK, so in fact it’s not really a robot, and it doesn’t look like this. Nonetheless, if you’re interested in some of the implementation and design details behind the project, we think the bot’s point-of-view might be a good way to explain things.
An introduction to the bot (on mySociety’s blog).
The bot’s high-level overview of our data-building process: from multiple sources to useful CSV and JSON files.
The bot makes lots of commits and pull requests, triggered by webhooks.
We use the bot to inspect pull requests and add summaries to them as comments.
names.csv: an example of an app being kept up-to-date by using EveryPolitician.
How the data gets from multiple scrapers into the EveryPolitician data repo.
When the bot generates a new data pull request, it automatically tidies up previous ones.
How we use GitHub webhooks to add SHAs to countries.json, keeping it timely.
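As a rough illustration of what “adding SHAs to countries.json” can look like: each legislature entry carries the commit SHA of the data it was built from, so an update walks the file and stamps the new SHA onto the entries that changed. This sketch uses Ruby’s standard `json` library; the field names mirror the published `countries.json`, but treat the helper itself as hypothetical, not the bot’s actual code.

```ruby
require 'json'

# Hypothetical sketch: stamp a new commit SHA onto the legislatures
# whose data changed, leaving the rest untouched.
def update_shas(countries_json, new_sha, changed_slugs)
  countries = JSON.parse(countries_json)
  countries.each do |country|
    country['legislatures'].each do |legislature|
      legislature['sha'] = new_sha if changed_slugs.include?(legislature['slug'])
    end
  end
  JSON.pretty_generate(countries)
end
```

Keeping the SHA in the index file means a consumer can fetch data at an exact, immutable revision rather than whatever happens to be on the default branch.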
We’ll make a POST request to your URL whenever the EveryPolitician data changes.
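If you register a URL for those POST requests, you will likely want to check that an incoming payload really came from the sender. GitHub-style webhooks do this by signing the request body with a shared secret and sending the result in an `X-Hub-Signature` header. A minimal verification sketch using only Ruby’s standard library — the secret and header format here are assumptions modelled on GitHub’s convention, not a statement about EveryPolitician’s own webhook:

```ruby
require 'openssl'

WEBHOOK_SECRET = 's3cret' # assumption: a secret agreed when registering the hook

# Compare two strings without leaking timing information,
# by hashing both before comparing.
def secure_equal?(a, b)
  digest = OpenSSL::Digest.new('SHA256')
  digest.digest(a) == digest.digest(b)
end

# Check a GitHub-style signature header of the form "sha1=<hmac-of-body>".
def valid_signature?(body, signature_header)
  hmac = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new('sha1'), WEBHOOK_SECRET, body)
  secure_equal?("sha1=#{hmac}", signature_header)
end
```

A receiver would reject any request whose signature fails this check before acting on the payload.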
The bot makes EveryPolitician’s (static) website by spinning up a dynamic one and spidering its pages into static files.
A simple benefit from using GitHub Pages to host the EveryPolitician website.
The bot detects each new data pull request, and spins up a preview site on Heroku showing how it will look.
The same mechanism for previewing and building the live site lets us look at old commits too.
Using GitHub’s library wrappers to access the GitHub API.
We reprogrammed the bot to stop it closing pull requests that humans are working on.
The bot really can make a lot of commits and pull requests.
Using git as a data store, and how it’s OK that we rebuild the data just to determine nothing’s changed.
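Part of why “rebuild just to determine nothing’s changed” is cheap: git content-addresses every file, so identical content always produces the identical blob ID, and comparing two hashes answers the question without committing anything. This sketch uses git’s actual blob-hash formula (SHA-1 over `"blob <size>\0<content>"`); the helper names are ours, purely for illustration:

```ruby
require 'digest'

# Git stores each file as a blob whose ID is the SHA-1 of
# "blob <byte-size>\0<content>". Computing that locally tells us
# whether a rebuilt file would be a new object in the repo at all.
def git_blob_sha(content)
  Digest::SHA1.hexdigest("blob #{content.bytesize}\0#{content}")
end

def unchanged?(old_content, new_content)
  git_blob_sha(old_content) == git_blob_sha(new_content)
end
```

So a rebuild that reproduces byte-identical output is a no-op as far as the repository is concerned: no new objects, no commit, nothing to push.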
EveryPolitician adds its own UUIDs, but also retains useful IDs from external sources.
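One way to picture “own UUIDs plus retained external IDs” as a data structure: the record’s primary `id` is a freshly minted UUID, and every source-supplied ID is kept alongside it as a scheme/identifier pair (the convention Popolo-style data uses). The helper below is hypothetical, just to make the shape concrete:

```ruby
require 'securerandom'

# Illustrative sketch: mint our own UUID, keep external IDs
# (e.g. a Wikidata item) as scheme/identifier pairs alongside it.
def new_person(name, external_ids = {})
  {
    id: SecureRandom.uuid,
    name: name,
    identifiers: external_ids.map do |scheme, value|
      { scheme: scheme, identifier: value }
    end
  }
end
```

Retaining the external IDs is what makes later reconciliation possible: a scraper run or a Wikidata update can be matched back to the same person even though each source numbers people differently.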
Although the data pull requests are usually prepared by the bot, a human decides whether or not to merge them.
Wikidata is a wonderful, always-updating source of international transliterations.
The bot merges multiple sources into a single set of data for every legislature.
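To make the merging idea concrete: match records across sources on a shared key, let the earlier (more trusted) source win where both supply a value, and let later sources fill in the gaps. A toy sketch — matching on name and the field names are illustrative; real reconciliation is considerably more involved:

```ruby
# Illustrative merge: sources are listed in priority order; later
# sources only fill fields the earlier ones left blank.
def merge_sources(*sources)
  merged = sources.flatten.each_with_object({}) do |row, acc|
    key = row[:name]
    acc[key] = (acc[key] || {}).merge(row) { |_field, old, new| old || new }
  end
  merged.values
end
```

The priority ordering matters: it lets a hand-curated source override a scraper while still benefiting from whatever extra fields the scraper found.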
A look at the event-driven nature of the EveryPolitician system.
We needed to import CSVs because people use spreadsheets, and we use the same simple format for more complex imports too.
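The “simple format” above is ordinary CSV with a header row, which Ruby’s standard `csv` library reads directly. The column names in this sketch are illustrative, not the project’s exact schema:

```ruby
require 'csv'

# Parse a header-row CSV into an array of hashes with symbol keys.
csv_data = <<~CSV
  id,name,group
  1,Alice,Red Party
  2,Bob,Blue Party
CSV

rows = CSV.parse(csv_data, headers: true, header_converters: :symbol).map(&:to_h)
```

Because the same reader handles any header-row CSV, a spreadsheet exported by a volunteer and a file generated by a scraper go through the identical import path.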
Everything the bot needs to know in order to combine data from different sources is in the file
The easy way to access and manipulate the data in Ruby: use the gem.