OCR Quickstart
Run OCR with the GoodMem CLI and choose the right output format.
Use the goodmem ocr CLI command to run layout-aware OCR on PDFs and images. This guide covers
input handling, page ranges, output formats, and how markdown is formatted on the client.
Availability
OCR is powered by the GoodMem OCR add-on service/image and is not included in the base install.
To use OCR, run the add-on service and configure the GoodMem server with GOODMEM_OCR_BASE_URL
or --ocr-base-url.
Before You Start
- GoodMem server running (default gRPC address: https://localhost:9090).
- goodmem CLI installed.
- GoodMem OCR add-on service/image running and reachable by the server.
- API key with the OCR_DOCUMENT permission (GOODMEM_API_KEY or --api-key).
- Input file in a supported format: PDF, TIFF, PNG, JPEG, or BMP.
By default the REST server listens on HTTPS with a self-signed certificate (typically on https://localhost:8080). For local development, add -k to cURL and --verify=no to HTTPie. If you configure REST to run without TLS, switch the URL to http:// and drop those flags.
If you want to call the REST endpoint directly, set:
export GOODMEM_REST_URL="https://localhost:8080"
export GOODMEM_API_KEY="gm_your_key"
Run OCR on a File
CLI:
goodmem ocr --file document.pdf --format json
cURL:
content=$(base64 -w 0 document.pdf)
curl -sS -k --json @- "$GOODMEM_REST_URL/v1/ocr:document" \
--header "x-api-key: $GOODMEM_API_KEY" <<JSON
{
"content": "$content",
"format": "PDF"
}
JSON
HTTPie:
content=$(base64 -w 0 document.pdf)
http --verify=no POST "$GOODMEM_REST_URL/v1/ocr:document" \
x-api-key:"$GOODMEM_API_KEY" \
content="$content" \
format="PDF"
For larger files, HTTPie can hit shell argument limits when passing base64 inline. In that case, write the JSON body to a file and pass it via stdin:
Python:
python3 - <<'PY' > request.json
import base64
import json

# Read the file as bytes and emit a JSON request body on stdout.
with open("document.pdf", "rb") as f:
    payload = {
        "content": base64.b64encode(f.read()).decode("ascii"),
        "format": "PDF",
    }
print(json.dumps(payload))
PY
jq (reads the base64 text from stdin so it never passes through an argument list):
base64 -w 0 document.pdf | jq --raw-input --slurp \
'{content: rtrimstr("\n"), format: "PDF"}' > request.json
Then send the request body:
http --verify=no POST "$GOODMEM_REST_URL/v1/ocr:document" \
x-api-key:"$GOODMEM_API_KEY" \
< request.json
The CLI sends the file to the OCR service and prints an OcrDocumentResponse JSON payload.
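For scripted use, the same REST request can be assembled without a shell. The following is a minimal Python sketch using only the standard library. The endpoint path, the x-api-key header, and the content and format fields are taken from the cURL examples above; the helper names build_ocr_request and run_ocr are hypothetical and not part of GoodMem. Certificate verification is disabled to mirror curl -k, which is only appropriate for local development with the self-signed certificate.

```python
import base64
import json
import os
import ssl
import urllib.request


def build_ocr_request(base_url, api_key, data, fmt):
    """Build (but do not send) a POST request for /v1/ocr:document.

    Hypothetical helper for illustration; not part of GoodMem.
    """
    body = json.dumps({
        "content": base64.b64encode(data).decode("ascii"),
        "format": fmt,
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/ocr:document",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


def run_ocr(path, fmt="PDF"):
    """Send a file to the REST endpoint and return the parsed JSON response."""
    with open(path, "rb") as f:
        request = build_ocr_request(
            os.environ["GOODMEM_REST_URL"],
            os.environ["GOODMEM_API_KEY"],
            f.read(),
            fmt,
        )
    # Mirrors curl -k: skip verification of the self-signed development certificate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)
```

Because the body is built in memory and sent over the socket, this approach is not subject to shell argument limits.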
Use --output (CLI) or a redirect (REST) to write the response to a file:
CLI:
goodmem ocr --file scans.tiff --format json --output ocr-output.json
cURL:
content=$(base64 -w 0 scans.tiff)
curl -sS -k --json @- "$GOODMEM_REST_URL/v1/ocr:document" \
--header "x-api-key: $GOODMEM_API_KEY" \
-o ocr-output.json <<JSON
{
"content": "$content",
"format": "TIFF"
}
JSON
HTTPie:
content=$(base64 -w 0 scans.tiff)
http --verify=no POST "$GOODMEM_REST_URL/v1/ocr:document" \
x-api-key:"$GOODMEM_API_KEY" \
content="$content" \
format="TIFF" \
> ocr-output.json
Stream Input via stdin
If you are piping bytes, specify the input format explicitly to avoid ambiguity:
CLI:
cat document.pdf | goodmem ocr --input-format pdf --format json
cURL:
content=$(cat document.pdf | base64 -w 0)
curl -sS -k --json @- "$GOODMEM_REST_URL/v1/ocr:document" \
--header "x-api-key: $GOODMEM_API_KEY" <<JSON
{
"content": "$content",
"format": "PDF"
}
JSON
HTTPie:
content=$(cat document.pdf | base64 -w 0)
http --verify=no POST "$GOODMEM_REST_URL/v1/ocr:document" \
x-api-key:"$GOODMEM_API_KEY" \
content="$content" \
format="PDF"
--input-format auto (the default) inspects file signatures and works for most inputs, but
explicit formats are safer when streaming.
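To illustrate what signature-based detection involves, here is a minimal Python sketch that checks the well-known leading bytes of the supported formats. It is an illustration only, not GoodMem's actual detection logic; the sniff_format name and the lowercase format labels it returns are assumptions.

```python
# Well-known leading byte signatures for the formats listed in this guide.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"II*\x00", "tiff"),  # little-endian TIFF
    (b"MM\x00*", "tiff"),  # big-endian TIFF
    (b"BM", "bmp"),
]


def sniff_format(data):
    """Return a format label for the leading bytes, or None if unrecognized."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return None
```

Short signatures such as the two-byte BMP marker can match unrelated data, which is one reason an explicit format is the safer choice for piped input.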
Choose Output Format
- --format json (default) returns the full OCR response structure: detected format, per-page layout, image metadata, and timings.
- --format markdown prints concatenated page markdown for human review.
REST always returns JSON. Use include flags to add markdown or raw OCR JSON fields:
CLI:
goodmem ocr \
--file document.pdf \
--format json \
--include-markdown \
--include-raw-json
cURL:
content=$(base64 -w 0 document.pdf)
curl -sS -k --json @- "$GOODMEM_REST_URL/v1/ocr:document" \
--header "x-api-key: $GOODMEM_API_KEY" <<JSON
{
"content": "$content",
"format": "PDF",
"includeMarkdown": true,
"includeRawJson": true
}
JSON
HTTPie:
content=$(base64 -w 0 document.pdf)
http --verify=no POST "$GOODMEM_REST_URL/v1/ocr:document" \
x-api-key:"$GOODMEM_API_KEY" \
content="$content" \
format="PDF" \
includeMarkdown:=true \
includeRawJson:=true
Page Ranges (0-based, Inclusive)
Use --start-page and --end-page to limit work:
CLI:
goodmem ocr --file document.pdf --start-page 0 --end-page 2
cURL:
content=$(base64 -w 0 document.pdf)
curl -sS -k --json @- "$GOODMEM_REST_URL/v1/ocr:document" \
--header "x-api-key: $GOODMEM_API_KEY" <<JSON
{
"content": "$content",
"format": "PDF",
"startPage": 0,
"endPage": 2
}
JSON
HTTPie:
content=$(base64 -w 0 document.pdf)
http --verify=no POST "$GOODMEM_REST_URL/v1/ocr:document" \
x-api-key:"$GOODMEM_API_KEY" \
content="$content" \
format="PDF" \
startPage:=0 \
endPage:=2
- Omitting --start-page means "start at page 0".
- Omitting --end-page means "process through the last page".
- Errors are returned if the range is negative, inverted, or outside document bounds.
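The range rules above can be sketched in a few lines of Python. This is an illustrative model of the documented semantics (0-based, inclusive, with the stated defaults), not the server's implementation; the resolve_page_range name and the error messages are hypothetical.

```python
def resolve_page_range(page_count, start_page=None, end_page=None):
    """Return the 0-based page indexes selected by an inclusive range."""
    # Omitted bounds fall back to the first and last page.
    start = 0 if start_page is None else start_page
    end = page_count - 1 if end_page is None else end_page
    if start < 0 or end < 0:
        raise ValueError("page numbers must not be negative")
    if start > end:
        raise ValueError("start page must not exceed end page")
    if end >= page_count:
        raise ValueError("page range is outside document bounds")
    return list(range(start, end + 1))
```

For example, --start-page 0 --end-page 2 selects three pages, because both ends of the range are included.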
Markdown Behavior (CLI)
The server returns raw markdown derived from the layout text. When you request --format markdown,
the CLI applies formatting by default:
- Wraps paragraphs to --markdown-width (default 80, use 0 to disable wrapping).
- Preserves code fences, lists, block quotes, tables, and indented blocks.
- Keeps math block delimiters on their own lines ($$, \[ and \]).
To emit the raw server markdown without any client-side formatting, add --markdown-raw.
REST responses include the raw markdown only (no client-side formatting).
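If you consume the raw markdown from REST and want similar wrapping, the general idea can be approximated as follows. This is a simplified Python sketch built on textwrap, not the CLI's actual formatter: it handles only a subset of markdown, and the names wrap_markdown and is_structural are hypothetical.

```python
import textwrap

FENCE = "`" * 3  # code fence marker
STRUCTURAL_PREFIXES = ("#", ">", "|", "- ", "* ", "+ ", "$$", "\\[", "\\]")


def is_structural(line):
    """True for lines that should pass through unwrapped."""
    if line.startswith(("    ", "\t")):  # indented block
        return True
    stripped = line.lstrip()
    if stripped.startswith(STRUCTURAL_PREFIXES):
        return True
    # Ordered list items such as "1." or "2)".
    head = stripped.split(" ", 1)[0]
    return head[:-1].isdigit() and head.endswith((".", ")"))


def wrap_markdown(text, width=80):
    """Wrap plain paragraphs to `width`; leave structured lines untouched."""
    if width == 0:
        return text  # 0 disables wrapping
    out, paragraph, in_fence = [], [], False

    def flush():
        # Emit the buffered paragraph as wrapped lines.
        if paragraph:
            out.extend(textwrap.wrap(" ".join(paragraph), width=width,
                                     break_long_words=False,
                                     break_on_hyphens=False))
            paragraph.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            flush()
            in_fence = not in_fence
            out.append(line)
        elif in_fence or not stripped or is_structural(line):
            flush()
            out.append(line)
        else:
            paragraph.append(stripped)
    flush()
    return "\n".join(out)
```

Long words such as URLs are kept whole rather than split, so an individual line can exceed the requested width.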
Error Handling at a Glance
If OCR fails on specific pages, the response still succeeds and includes per-page error status. The CLI prints warnings to stderr and leaves failed pages blank in markdown output. See the OCR reference pages for full details on output structure and limits. If the request fails outright, verify your API key and permissions and confirm the OCR add-on service is reachable.