This is pretty rough, but I’m tired of it being a draft. Comment if something’s unclear and I’ll clean it up. This page is a tutorial on how to use OCR from MarkLogic. In particular, we’ll use MLJAM to call Tess4J, a Java wrapper around the open-source tesseract-ocr. The tutorial assumes you’re running MarkLogic […]