04 / project
ClaudeJob
Agentic Resume Tailoring Pipeline · Personal



The Problem
AI-tailored resumes have a predictable failure mode: the model hallucinates statistics, generates clichéd language, and drifts away from the candidate's actual voice. The standard solution is "trust the LLM and proofread manually" — fine for a single application, untenable across fifty.
I needed a pipeline I could hand a job description and trust to produce a resume I'd be willing to send before reading it.
The Approach
Build a deterministic guardrail layer around the LLM. The model never produces free-form text — it produces a JSON document matching a strict schema, and a separate validator stack checks it before a single byte of PDF gets rendered.
Three components do the heavy lifting:
- JSON source of truth. A canonical RESUME_BASE_JSON is the only authoritative version. The LLM emits a modified copy of the same shape (bullets, dates, sections), never plain text, so the format can never drift.
- Validator stack. A banned-phrase regex (30+ AI-resume clichés like "leveraged" and "spearheaded"), source-fact validation against the pinned base (which catches fabricated metrics: if a number didn't exist in the base, it can't exist in the output), and a jargon-lead heuristic that rejects bullets starting with weak verbs.
- Deterministic adjacency-skill injection. A curated ADJACENCY_MAP (keyword → list of skills the candidate has that justify the claim) handles the JD's required-skills section. Never LLM-fabricated.
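The validator stack is the easiest piece to picture in code. A minimal sketch, assuming hypothetical names (validateResume, extractNumbers) and a truncated banned list — the real stack carries 30+ phrases:

```javascript
// Sketch of the validator stack. Names and list contents are illustrative.
const BANNED_PHRASES = /\b(leveraged|spearheaded|synergy|results-driven)\b/i;
const WEAK_LEADS = new Set(["utilizing", "responsible", "various"]);

// Pull every numeric token ("40%", "3", "1,200") out of a string.
function extractNumbers(text) {
  return text.match(/\d[\d,.%]*/g) || [];
}

// Source-fact validation: every number in the LLM output must already
// exist somewhere in the pinned base document.
function validateResume(output, base) {
  const errors = [];
  const baseNumbers = new Set(extractNumbers(JSON.stringify(base)));
  for (const bullet of output.bullets) {
    if (BANNED_PHRASES.test(bullet)) {
      errors.push(`banned phrase: "${bullet}"`);
    }
    const lead = bullet.split(/\s+/)[0].toLowerCase();
    if (WEAK_LEADS.has(lead)) {
      errors.push(`weak lead verb: "${bullet}"`);
    }
    for (const num of extractNumbers(bullet)) {
      if (!baseNumbers.has(num)) {
        errors.push(`fabricated metric ${num}: "${bullet}"`);
      }
    }
  }
  return errors;
}
```

A bullet like "Spearheaded a 90% cost reduction" fails twice against a base that never mentions 90%: once on the cliché, once on the unsourced metric.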
PDFs render straight from JSON via pdfkit — pure Node.js, pixel-matching the candidate's existing template, no subprocess calls.
Engineering Decisions That Mattered
- JSON, not text, as the LLM contract. Plain-text resume templates drift. The model decides to add an em-dash, change a section header, drop a date. JSON with a strict schema makes those failure modes impossible — the validator catches schema violations before any human-facing artifact is generated.
- Pinned-base source-fact validation over self-consistency checks. Asking the LLM "did you make up this stat?" is a research problem. Asking "is this stat in the source document?" is a string match. The constraint is the moat.
- Adjacency injection by curated map, not LLM judgment. The JD says "FastAPI", the candidate has "Flask" → the map appends FastAPI to the skills line with Flask as the justifier. No model in the loop = no fabrication. Edit the map; ship the change.
- Pure Node PDF rendering via pdfkit. Earlier iterations spawned a Python subprocess. On Apple Silicon the arch mismatches were a constant source of flaky failures. Switching to pdfkit eliminated an entire class of bugs — and made the whole pipeline portable.
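The adjacency-injection decision is worth a sketch, since it is the one place a fabrication-prone step was replaced with a lookup. Assuming a hypothetical injectAdjacentSkills helper and illustrative map contents:

```javascript
// Sketch of deterministic adjacency-skill injection (map contents illustrative).
// JD keyword → skills the candidate actually holds that justify listing it.
const ADJACENCY_MAP = {
  fastapi: ["Flask"],
  kubernetes: ["Docker"],
  pytorch: ["TensorFlow"],
};

function injectAdjacentSkills(jdKeywords, candidateSkills) {
  const skills = [...candidateSkills];
  for (const keyword of jdKeywords) {
    const justifiers = ADJACENCY_MAP[keyword.toLowerCase()] || [];
    // Only claim the JD keyword if the candidate holds a justifying skill.
    if (justifiers.some((j) => candidateSkills.includes(j)) && !skills.includes(keyword)) {
      skills.push(keyword);
    }
  }
  return skills;
}
```

A JD asking for "FastAPI" gets it appended because Flask is in the candidate's skill set; a JD asking for "Rust" gets nothing, because no entry justifies it. No model in the loop, so the only way to change behavior is to edit the map.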
Outcome
47 passing unit tests across the validator stack. Currently powers my own AI Engineer applications — every resume going out the door has been through this pipeline.
Tech
Node.js · Express · Anthropic SDK (Claude Sonnet) · SSE Streaming · pdfkit · Jest