sahil_mehta.


ClaudeJob

Agentic Resume Tailoring Pipeline · Personal

Node.js · Anthropic SDK · pdfkit · SSE Streaming · Express
Pipeline mid-stream: JD analysis, structured-output resume tailoring, and cover letter all generated and validated in one SSE flow.
Kanban tracker — every pipeline run lands here as an active application.
Dashboard with the application funnel + recent activity.

The Problem

AI-tailored resumes have a predictable failure mode: the model hallucinates statistics, generates clichéd language, and drifts away from the candidate's actual voice. The standard solution is "trust the LLM and proofread manually" — fine for a single application, untenable across fifty.

I needed a pipeline I could hand a job description and trust to produce a resume I'd be willing to send before reading it.

The Approach

Build a deterministic guardrail layer around the LLM. The model never produces free-form text — it produces a JSON document matching a strict schema, and a separate validator stack checks it before a single byte of PDF gets rendered.

Three components do the heavy lifting:

  1. JSON source of truth. A canonical RESUME_BASE_JSON is the only authoritative version. The LLM emits a modified copy of the same shape — bullets, dates, sections — never plain text. Format can never drift.
  2. Validator stack. Banned-phrase regex (30+ AI-resume clichés like "leveraged", "spearheaded"), source-fact validation against the pinned base (catches fabricated metrics — if a number didn't exist in the base, it can't exist in the output), and a jargon-lead heuristic that rejects bullets starting with weak verbs.
  3. Deterministic adjacency-skill injection. A curated ADJACENCY_MAP (keyword → list of skills the candidate has that justify the claim) handles the JD's required-skills section. Never LLM-fabricated.
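A minimal sketch of what such a validator stack can look like. The phrase list, weak-verb set, and number-extraction regex below are illustrative stand-ins, not the project's actual configuration:

```javascript
// Illustrative validator stack: each check returns a list of violations.
// An empty result means the tailored bullets are safe to render.
const BANNED_PHRASES = /\b(leveraged|spearheaded|synerg\w+|utilized)\b/i;

// 1. Banned-phrase check: reject AI-resume cliches outright.
function checkBannedPhrases(bullets) {
  return bullets
    .filter((b) => BANNED_PHRASES.test(b))
    .map((b) => `banned phrase in: "${b}"`);
}

// 2. Source-fact check: every number in the output must also appear in
//    the pinned base resume, otherwise it is treated as fabricated.
function checkSourceFacts(bullets, baseText) {
  const baseNumbers = new Set(baseText.match(/\d[\d,.%]*/g) || []);
  const errors = [];
  for (const b of bullets) {
    for (const n of b.match(/\d[\d,.%]*/g) || []) {
      if (!baseNumbers.has(n)) errors.push(`fabricated metric "${n}" in: "${b}"`);
    }
  }
  return errors;
}

// 3. Jargon-lead heuristic: bullets must open with a strong verb.
const WEAK_LEADS = new Set(['responsible', 'helped', 'worked', 'various']);
function checkJargonLead(bullets) {
  return bullets
    .filter((b) => WEAK_LEADS.has(b.split(/\s+/)[0].toLowerCase()))
    .map((b) => `weak lead verb in: "${b}"`);
}

function validate(bullets, baseText) {
  return [
    ...checkBannedPhrases(bullets),
    ...checkSourceFacts(bullets, baseText),
    ...checkJargonLead(bullets),
  ];
}
```

The key property: `validate(['Spearheaded a 40% cost reduction'], base)` fails on two independent grounds — a banned phrase and a metric absent from the base — without a model anywhere in the loop.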

PDFs render straight from JSON via pdfkit — pure Node.js, pixel-matching the candidate's existing template, no subprocess calls.
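The render step reduces to a pure transform from resume JSON into an ordered list of draw operations, which a pdfkit pass then consumes (`doc.text(...)` per op). The schema fields below are a hypothetical simplification, not the project's actual base JSON:

```javascript
// Hypothetical resume shape -> flat list of render operations.
// Keeping this transform pure makes the PDF layout unit-testable
// without rendering a single byte.
function toRenderOps(resume) {
  const ops = [{ kind: 'name', text: resume.name }];
  for (const section of resume.sections) {
    ops.push({ kind: 'header', text: section.title });
    for (const entry of section.entries) {
      ops.push({ kind: 'entry', text: `${entry.role} · ${entry.dates}` });
      for (const bullet of entry.bullets) {
        ops.push({ kind: 'bullet', text: bullet });
      }
    }
  }
  return ops;
}
```

Because the op list is deterministic for a given JSON input, schema violations surface in tests long before pdfkit ever runs.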

Engineering Decisions That Mattered

  • JSON, not text, as the LLM contract. Plain-text resume templates drift. The model decides to add an em-dash, change a section header, drop a date. JSON with a strict schema makes those failure modes impossible — the validator catches schema violations before any human-facing artifact is generated.
  • Pinned-base source-fact validation over self-consistency checks. Asking the LLM "did you make up this stat?" is a research problem. Asking "is this stat in the source document?" is a string match. The constraint is the moat.
  • Adjacency injection by curated map, not LLM judgment. The JD says "FastAPI", the candidate has "Flask" → the map appends FastAPI to the skills line with Flask as the justifier. No model in the loop = no fabrication. Edit the map; ship the change.
  • Pure Node PDF rendering via pdfkit. Earlier iterations spawned a Python subprocess. On Apple Silicon, x86_64/arm64 architecture mismatches made that subprocess a constant source of flaky failures. Switching to pdfkit eliminated an entire class of bugs — and made the whole pipeline portable.
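The adjacency injection is small enough to sketch in full. The map entries and function names here are examples, not the project's curated list:

```javascript
// Illustrative adjacency map: JD keyword -> skills the candidate
// actually has that justify listing it. Entries are examples only.
const ADJACENCY_MAP = {
  fastapi: ['Flask'],
  kubernetes: ['Docker'],
};

// Deterministic injection: a JD keyword is added to the skills line
// only when the candidate's real skills contain a justifying
// adjacent skill. No model call, no fabrication surface.
function injectAdjacentSkills(jdKeywords, candidateSkills) {
  const have = new Set(candidateSkills.map((s) => s.toLowerCase()));
  const injected = [];
  for (const kw of jdKeywords) {
    const justifiers = ADJACENCY_MAP[kw.toLowerCase()] || [];
    if (justifiers.some((j) => have.has(j.toLowerCase()))) {
      injected.push(kw);
    }
  }
  return injected;
}
```

So a JD asking for FastAPI, Kubernetes, and Rust against a candidate who knows Flask and Node.js injects only FastAPI — Kubernetes lacks a justifier in the candidate's skills, and Rust has no map entry at all.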

Outcome

47 passing unit tests across the validator stack. Currently powers my own AI Engineer applications — every resume going out the door has been through this pipeline.

Tech

Node.js · Express · Anthropic SDK (Claude Sonnet) · SSE Streaming · pdfkit · Jest

github.com/sahilmehta17/claudejob →

