DockIns: Machine Learning on Deadline for Journalists

As journalists dealing with data and document sets, we find that the most interesting information is usually hidden in large, unstructured, and incomplete sets of documents. This is especially true of public contracts: what the government is buying, how much money is being spent, and who the suppliers are. To answer these questions, four media organizations (La Nacion, CLIP, Ojo Público, and MuckRock) joined forces under the JournalismAI Collab and experimented with different machine learning tools and techniques to build a platform that helps investigative reporters understand and process unstructured documents and extract useful insights.

This work builds directly on pioneering interface and modelling research by Prof. Eli T. Brown and the Laboratory for Interactive Human-Computer Analytics. It also extends MuckRock’s prior work supported by the Ethics and Governance in AI Initiative and the Knight Foundation. We’re excited to roll these technologies out in preview as we continue to gather feedback and explore new ways to help newsrooms, researchers, and civic technologists stay on top of an ever-increasing flow of documents and data.

To get started, existing DocumentCloud users can read our tutorial for using the hosted version of SideKick, or you can run these technologies locally by using the standalone version of SideKick.

If you give SideKick a try — or you’re interested in putting it to work on a large document set you have — we’d love to hear from you! Ping us at michael@muckrock.com or slide into the MuckRock FOIA Slack and let us know how it goes.

8 Articles

DockIns: An Interface for End Users

In the last six months of our collaboration with LSE, we tested different tools and techniques to build a platform that helps investigative reporters understand and process loosely structured documents and gain useful insights.

How to Run Sidekick

Have you ever had a pile of documents and wanted to quickly start focusing on a particular portion of the material? Would you like help working only with the contracts, or perhaps the police reports that detail a certain type of encounter, or quickly separating the letters of support from the letters of opposition sent to a politician on a key issue?

Dockins: Machine Learning for Journalists

Access to public information plays a fundamental role in the enforceability of other rights and is one of the main tools civil society needs to oversee and influence governments.

