AI prepress checker

How an email-in / email-out print-prep bot is wired together.

A print shop publishes one address. Customers email files in. A worker reads the inbox, checks each PDF for the things that ruin a print run, and replies in plain English. No human in the loop, no portal to log into.

Flow

Customer email: subject · body · PDF attached
Gmail IMAP: poll every 10s
Worker (Python): long-running loop · single process
    1 · Relevance filter: "Is this asking us to print business cards?" · Haiku 4.5
    2 · PDF analysis: poppler · zbar · pdfplumber · pypdf
        Checks: QR cross-ref · placeholders · bleed · colour space · image DPI · text · strokes
        Output: structured findings list → severity + category + detail
    3 · Reply drafting: plain-English summary of findings · Sonnet 4.6
Gmail SMTP: threaded reply
Customer inbox: "looks press-ready" · or list of issues
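The flow says the analysis step hands over a structured findings list with a severity, a category and a detail. A minimal sketch of what such a record could look like follows; the names Severity, Finding, is_press_ready and sort_for_reply, and the three severity levels, are illustrative assumptions rather than the project's actual types.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Assumed levels; the write-up only says findings carry "severity".
    NOTE = 1
    WARNING = 2
    ERROR = 3


@dataclass(frozen=True)
class Finding:
    severity: Severity
    category: str  # e.g. "bleed", "qr", "dpi"
    detail: str    # plain explanation passed on to the drafting step


def is_press_ready(findings: list[Finding]) -> bool:
    """Treat a file as press-ready when nothing is above NOTE level."""
    return all(f.severity <= Severity.NOTE for f in findings)


def sort_for_reply(findings: list[Finding]) -> list[Finding]:
    """Most severe first, so the reply can lead with what blocks the run."""
    return sorted(findings, key=lambda f: (-f.severity, f.category))
```

Keeping the record flat like this would let the drafting model receive the list as plain JSON without any knowledge of how each check works.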

Why these checks?

QR cross-reference

Rasterise each page, decode any QR codes, and compare each encoded URL to the URLs in the card's text. A misspelt domain in the QR (exarnple.com vs example.com) is easy for a human eye to miss but fatal once it is printed.
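The comparison step can be sketched on its own, assuming the QR payloads have already been decoded (for example by pyzbar) and the card text has already been extracted. The function names and the URL regex below are illustrative assumptions, and the sketch only compares hostnames, not full URLs.

```python
import re
from urllib.parse import urlsplit

# Loose matcher for bare domains and full URLs in card text (an assumption,
# not a complete URL grammar).
URL_RE = re.compile(
    r"(?:https?://)?(?:[a-z0-9-]+\.)+[a-z]{2,}(?:/[^\s]*)?", re.IGNORECASE
)


def host_of(url: str) -> str:
    """Lower-cased hostname without a leading 'www.'."""
    if "://" not in url:
        url = "http://" + url
    host = urlsplit(url).hostname or ""
    return host.lower().removeprefix("www.")


def qr_mismatches(qr_payloads: list[str], card_text: str) -> list[str]:
    """Return QR payloads whose host does not appear anywhere in the text.

    Assumes every payload is a URL; other payload types (vCards, plain
    text) would need to be filtered out first.
    """
    text_hosts = {host_of(m.group(0)) for m in URL_RE.finditer(card_text)}
    return [p for p in qr_payloads if host_of(p) not in text_hosts]
```

Because the regex also picks up the domain part of an email address, a card that prints only jane@example.com still gives the QR code something to be checked against.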

Placeholders & typos

Regex finds obvious leftovers (Lorem ipsum); a small LLM pass catches fuzzier mistakes — URL/email domain mismatches, near-miss spellings.
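A minimal sketch of the regex half of this check follows. Only the Lorem ipsum pattern comes from the description above; the other patterns and the name find_placeholders are assumptions added for illustration.

```python
import re

PLACEHOLDER_PATTERNS = [
    re.compile(r"lorem\s+ipsum", re.IGNORECASE),
    # Assumed extras: typical template leftovers.
    re.compile(r"\byour\s+(?:name|title|company|email)\b", re.IGNORECASE),
    re.compile(r"\b(?:xxx+|tbd|todo)\b", re.IGNORECASE),
]


def find_placeholders(text: str) -> list[str]:
    """Return every placeholder-looking fragment found in the text."""
    hits: list[str] = []
    for pattern in PLACEHOLDER_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits
```

Anything this cheap pass misses, such as a near-miss spelling of the company name, is left to the LLM pass.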

Bleed

Compares MediaBox against TrimBox / BleedBox. If the page sits at trim size and ink runs to the edge, the cut leaves a white sliver. If the edges are pure white, the finding is downgraded to a note.
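The box comparison is plain arithmetic on PDF rectangles, which are given as (x0, y0, x1, y1) in points. The sketch below assumes the boxes have already been read from the page (for example via pypdf) and assumes a 3 mm requirement, a common print default that the description above does not state. The white-edge downgrade needs a rasterised page and is not covered here.

```python
MM_PER_PT = 25.4 / 72  # one PDF point is 1/72 inch

Box = tuple[float, float, float, float]  # (x0, y0, x1, y1) in points


def bleed_margins_mm(outer: Box, trim: Box) -> tuple[float, float, float, float]:
    """Margins (left, bottom, right, top) between the outer box and the trim box.

    `outer` is the BleedBox when present, otherwise the MediaBox.
    """
    ox0, oy0, ox1, oy1 = outer
    tx0, ty0, tx1, ty1 = trim
    gaps_pt = (tx0 - ox0, ty0 - oy0, ox1 - tx1, oy1 - ty1)
    return tuple(g * MM_PER_PT for g in gaps_pt)


def has_bleed(outer: Box, trim: Box, required_mm: float = 3.0) -> bool:
    """True when every side has at least `required_mm` of bleed (assumed 3 mm)."""
    return all(m >= required_mm for m in bleed_margins_mm(outer, trim))
```

For a 3.5 x 2 inch card (252 x 144 pt), a page whose outer box equals the trim box has zero margin on every side and fails the check.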

Colour space

Press wants CMYK. Any RGB image embedded in the PDF is flagged so the shop can convert before they output plates.
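One way to decide whether an image is RGB is to look at its /ColorSpace entry. The sketch below classifies values that have already been read and resolved from the PDF into plain Python strings, lists and dicts; the function names and that input shape are assumptions, and less common spaces (Separation, DeviceN, Lab) are simply reported as "other".

```python
def colour_family(colour_space) -> str:
    """Map a resolved PDF /ColorSpace value to 'rgb', 'cmyk', 'gray' or 'other'."""
    if isinstance(colour_space, str):
        names = {"/DeviceRGB": "rgb", "/DeviceCMYK": "cmyk", "/DeviceGray": "gray"}
        return names.get(colour_space, "other")
    if isinstance(colour_space, (list, tuple)) and colour_space:
        kind = colour_space[0]
        if kind == "/CalRGB":
            return "rgb"
        if kind == "/CalGray":
            return "gray"
        if kind == "/ICCBased" and len(colour_space) > 1:
            # /N is the component count; 3 is usually RGB (occasionally Lab).
            n = colour_space[1].get("/N")
            return {1: "gray", 3: "rgb", 4: "cmyk"}.get(n, "other")
        if kind == "/Indexed" and len(colour_space) > 1:
            # A palette image inherits the family of its base space.
            return colour_family(colour_space[1])
    return "other"


def rgb_images(images: dict) -> list[str]:
    """Names of images whose colour space is in the RGB family."""
    return [name for name, cs in images.items() if colour_family(cs) == "rgb"]
```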

Image DPI

For each embedded raster, divide the source pixel dimensions by the placed size in inches. Anything under ~250 DPI will look soft on press.
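As a worked example of that division: PDF placement sizes are in points, and 72 points make one inch. The function names below are illustrative, and taking the lower of the two axes is an assumption about how a stretched image should be judged.

```python
MIN_DPI = 250  # threshold from the check description (approximate)


def effective_dpi(px_w: int, px_h: int, placed_w_pt: float, placed_h_pt: float) -> float:
    """Source pixels divided by placed size in inches, worst axis wins."""
    if placed_w_pt <= 0 or placed_h_pt <= 0:
        raise ValueError("placed size must be positive")
    dpi_x = px_w / (placed_w_pt / 72)
    dpi_y = px_h / (placed_h_pt / 72)
    return min(dpi_x, dpi_y)


def is_soft(px_w: int, px_h: int, placed_w_pt: float, placed_h_pt: float) -> bool:
    """True when the image falls below the DPI threshold."""
    return effective_dpi(px_w, px_h, placed_w_pt, placed_h_pt) < MIN_DPI
```

A 600 x 600 px logo placed at 144 x 144 pt (2 inches square) comes out at 300 DPI and passes; the same placement with a 300 x 300 px source gives 150 DPI and is flagged.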

Text size & strokes

Type under 6pt and strokes under 0.25pt sit at the edge of what offset can hold cleanly. Both get flagged.
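A minimal sketch of the two thresholds, assuming the page objects arrive as dicts with a "size" key for characters and a "linewidth" key for stroked shapes (the shape pdfplumber uses for its chars and lines). The function name is hypothetical, and fill-only shapes would need to be filtered out before this step.

```python
MIN_TEXT_PT = 6.0
MIN_STROKE_PT = 0.25


def small_text_and_thin_strokes(chars: list[dict], strokes: list[dict]):
    """Return (distinct font sizes under 6pt, distinct stroke widths under 0.25pt)."""
    small_sizes = sorted(
        {round(c["size"], 2) for c in chars if c["size"] < MIN_TEXT_PT}
    )
    thin_widths = sorted(
        {round(s["linewidth"], 3) for s in strokes if s["linewidth"] < MIN_STROKE_PT}
    )
    return small_sizes, thin_widths
```

Reporting distinct sizes rather than every offending character keeps the findings list short enough to summarise in a reply.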

What is and isn't in the loop

The bot writes and sends the reply itself, with no draft review and no human approval. The relevance filter ahead of the analysis is the safety net: anything that doesn't look like a printing request is read and ignored, so spam and newsletters never get an auto-reply.
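The gating described here can be sketched as a single function with the model calls injected as plain callables. All names below are hypothetical; in the real worker, is_relevant would wrap the Haiku call and draft_reply the Sonnet call.

```python
from typing import Any, Callable


def handle_message(
    message: Any,
    is_relevant: Callable[[Any], bool],
    analyse: Callable[[Any], list],
    draft_reply: Callable[[Any, list], str],
    send: Callable[[str], None],
) -> bool:
    """Run the three stages; return True only if a reply was sent."""
    if not is_relevant(message):
        # Read and ignored: nothing is analysed and nothing is sent.
        return False
    findings = analyse(message)
    send(draft_reply(message, findings))
    return True
```

Putting the filter first also means the more expensive analysis and drafting steps never run for mail that was never a print request.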

Replies are threaded under the original message, so the conversation looks like a normal back-and-forth with the print shop's address rather than a bot UI.
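Mail clients thread on the In-Reply-To and References headers, so a threaded reply mostly comes down to copying the original Message-ID into both. A minimal sketch using the standard library follows; build_reply is a hypothetical helper and assumes the original message has a From header.

```python
from email.message import EmailMessage


def build_reply(original: EmailMessage, body: str, from_addr: str) -> EmailMessage:
    """Build a reply that mail clients will thread under `original`."""
    reply = EmailMessage()
    reply["From"] = from_addr
    reply["To"] = str(original.get("Reply-To") or original["From"])

    subject = str(original.get("Subject", ""))
    reply["Subject"] = subject if subject.lower().startswith("re:") else f"Re: {subject}"

    msg_id = original["Message-ID"]
    if msg_id:
        reply["In-Reply-To"] = str(msg_id)
        # References is the earlier chain plus the message being answered.
        prior = str(original.get("References", ""))
        reply["References"] = f"{prior} {msg_id}".strip()

    reply.set_content(body)
    return reply
```

The resulting message can be handed to smtplib's send_message unchanged.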

Stack. Python 3.12 · Anthropic SDK (Haiku 4.5 for the filter, Sonnet 4.6 for drafting) · imap-tools / smtplib · pdfplumber + pypdf for parsing · pdf2image (poppler) + pyzbar for QR decode. Single-file loop, polls every 10s, in-memory dedup. A Dockerfile keeps the system deps (poppler-utils, libzbar0) reproducible.
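The single-file loop with in-memory dedup might look roughly like the sketch below. The fetch and handle callables are hypothetical stand-ins for the IMAP read and the three-stage pipeline; max_cycles and the injectable sleep exist only so the loop can be exercised in a test.

```python
import time
from typing import Any, Callable, Iterable, Optional


def poll(
    fetch: Callable[[], Iterable[tuple[str, Any]]],
    handle: Callable[[Any], None],
    interval_s: float = 10.0,
    max_cycles: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> set[str]:
    """Poll `fetch` every `interval_s` seconds, handling each message ID once."""
    seen: set[str] = set()  # in-memory only: lost when the process restarts
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        for msg_id, message in fetch():
            if msg_id in seen:
                continue
            seen.add(msg_id)
            handle(message)
        cycles += 1
        sleep(interval_s)
    return seen
```

Because the seen set lives in memory, a restart would reprocess anything fetch still returns; that is the trade-off of keeping the worker to one process with no database.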