Release Notes: Introducing DocumentCloud searchable notes, advanced OCR and additional internationalization options
DocumentCloud has shipped many feature updates over the last couple of months. They include the ability to de-index documents from both DocumentCloud’s public search and search engines like Google, the ability to upload documents via email, and several new Add-Ons. New pro features include the ability to search publicly accessible notes and to use Amazon’s Textract OCR for better text extraction from hard-to-OCR documents.
DocumentCloud has long offered several options for making documents easier to search, including support for optical character recognition (OCR), which “reads” text that’s saved as images to make it searchable. With the latest upgrades to the DocumentCloud Beta, we’re expanding OCR support from 22 languages to 103, so whether you’re analyzing a document in Afrikaans, Yiddish or anything in between, we’ve got you covered.
Longtime MuckRock readers might have noticed that there are two ways to link to responsive records on the site: a direct link to the .pdf on our server, or a link on the request page to the DocumentCloud document viewer. Wherever possible, we try to use the latter, as it has a few advantages over your standard browser viewer that we’d like to highlight today.