
documentcloud api

4 Articles


Release Notes: Introducing DocumentCloud searchable notes, advanced OCR and additional internationalization options


DocumentCloud has had many feature updates over the last couple of months. These include the ability to de-index documents from DocumentCloud’s public search and from search engines like Google, the ability to upload documents via email, several new Add-Ons, and pro features such as searching publicly accessible notes and using Amazon’s Textract OCR to get better text extraction from hard-to-OCR documents.


Upload large collections of documents to DocumentCloud with ease


Uploading large sets (hundreds, thousands, or even millions) of documents to DocumentCloud through the user interface can be laborious: it requires splitting the document set into smaller batches and carefully monitoring the uploads for processing errors.

DocumentCloud’s Batch Upload Script was initially written to upload the CIA Crest files, a collection of almost 1 million files. It uploads files in batches and keeps track of which files were uploaded successfully, so it can be stopped and restarted and will pick up where it left off, and failed uploads can be retried. It can be stopped gracefully by pressing CTRL+C (once) while it is running. A recent rewrite allows the script to run on any directory of documents.
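
The resume-and-retry behavior described above can be illustrated with a minimal Python sketch. This is not the Batch Upload Script itself: `run_uploads`, `load_state`, the JSON state file, and the `upload_batch` callable are all hypothetical names chosen for illustration, and the real script may track progress quite differently. The sketch only shows the general pattern of uploading in batches, saving progress after each batch, and letting a single CTRL+C finish the current batch before stopping.

```python
import json
import signal
import threading
from pathlib import Path


def load_state(state_path):
    """Return the saved upload state, or an empty one on the first run."""
    path = Path(state_path)
    if path.exists():
        return json.loads(path.read_text())
    return {"done": [], "failed": []}


def run_uploads(files, state_path, upload_batch, batch_size=25):
    """Upload files in batches, saving progress after every batch.

    upload_batch is a hypothetical callable (an assumption of this sketch):
    it receives a list of file names and returns those that succeeded.
    """
    state = load_state(state_path)
    done = set(state["done"])
    failed = set(state["failed"]) - done
    # Files already uploaded are skipped; earlier failures are retried.
    pending = [f for f in files if f not in done]

    stop_requested = False

    def request_stop(signum, frame):
        nonlocal stop_requested
        stop_requested = True  # let the current batch finish, then stop

    # Signal handlers can only be installed from the main thread.
    in_main = threading.current_thread() is threading.main_thread()
    previous = signal.signal(signal.SIGINT, request_stop) if in_main else None
    try:
        for start in range(0, len(pending), batch_size):
            if stop_requested:
                break
            batch = pending[start:start + batch_size]
            done |= set(upload_batch(batch))
            failed = (failed | set(batch)) - done
            # Persist after each batch so a restart resumes from here.
            Path(state_path).write_text(
                json.dumps({"done": sorted(done), "failed": sorted(failed)})
            )
    finally:
        if in_main:
            signal.signal(signal.SIGINT, previous)
    return {"done": sorted(done), "failed": sorted(failed)}
```

Calling `run_uploads` again with the same state file skips everything recorded as done and retries only what is left, which is the property that makes a stop-and-restart workflow safe for very large collections.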


Developing your first DocumentCloud Add-On


Learn how to use GitHub Actions and Python to help build tools to make DocumentCloud even more powerful.


DocumentCloud Add-Ons: Automate data extraction, alerts, ingestion, and much more with our simple, open source plugin system


Today, we’re launching Add-Ons, which make it easier to build, maintain, and share new capabilities right within DocumentCloud, ranging from exporting notes to applying machine learning techniques.
