On April 23, MuckRock is joining BuzzFeed and dozens of other transparency enthusiasts to figure out what we can learn from a database of tens of thousands of FOIA and public records requests. The free event is currently full, but there is a waitlist available.
If you’re going and want a head start, we’re working to pull and clean up our data for access, with plans to open up four different data sets:
- A CSV of every agency in our database, with metadata on response times, average number of communications per request, and other bits of data (currently includes 5,570 agencies).
- A CSV of every jurisdiction in our database, with metadata similar to the agency CSV. This CSV will cover every jurisdiction (city, state, county) in America, though we don’t yet have request data for many of them.
- A CSV with metadata on every public request filed through MuckRock.
- A complete export of every public request filed through MuckRock. This is about 112 gigabytes of data, so it’s available through a .torrent. We’ll also have it available on USB sticks at the event.
We also have a script that allows programmatic filing of requests, although it is currently a little crusty.
We’ll update these data sets over the coming week on this page. If you have additional columns or ideas for data that would be helpful, email us or tweet us at @MuckRock, and we’ll see what we can provide.
If you can’t make it, we’d still love to hear what you do with the data!
Join us, in person or on the web, and let’s make FOIA a little more open.
- 3 p.m. 4/17/16: First pass at Request data is posted. Agencies and jurisdictions are listed by our internal IDs, which helps if you’re using the API but otherwise isn’t super useful. We’re working on an update that includes both our internal IDs and plaintext names for agencies and jurisdictions.
- 6 p.m. 4/16/16: Second passes at Agency and Jurisdiction data are posted, including detailed jurisdiction information and additional fields. Thanks Mitch!
- 10 a.m. 4/16/16: Currently, first passes at Agency and Jurisdiction data are posted. Both of these are missing state information, which we’re working to add now. We’re working on transferring the full request data set to a computer that can seed the torrent.
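If you’re working with a request CSV that only lists internal agency IDs, one way to attach plaintext names yourself is to join it against the agency CSV. The sketch below is only an illustration: the column names (`id`, `name`, `agency`, `title`) and the sample rows are assumptions made up for the example, not the actual layout of our files, so check the real headers before adapting it.

```python
import csv
import io

# Hypothetical sample data standing in for the agency and request CSVs.
# The real files may use different column names.
AGENCIES_CSV = """id,name
1,Example Police Department
2,Example Department of Health
"""

REQUESTS_CSV = """title,agency
Body camera policy,1
Restaurant inspections,2
Unknown agency request,99
"""


def load_agency_names(agency_file):
    """Build a lookup from internal agency ID to plaintext name."""
    return {row["id"]: row["name"] for row in csv.DictReader(agency_file)}


def add_agency_names(request_file, names):
    """Return request rows with an added 'agency_name' column.

    Requests whose agency ID is not in the lookup get an empty name.
    """
    rows = []
    for row in csv.DictReader(request_file):
        row["agency_name"] = names.get(row["agency"], "")
        rows.append(row)
    return rows


names = load_agency_names(io.StringIO(AGENCIES_CSV))
labeled = add_agency_names(io.StringIO(REQUESTS_CSV), names)

for row in labeled:
    print(row["title"], "->", row["agency_name"] or "(no match)")
```

To run this against downloaded files, replace the `io.StringIO(...)` calls with files opened via `open(path, newline="")`. The same approach would work for jurisdictions.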