I build backend and AI-infrastructure systems and ship them to production. Right now I'm solo on homesty.ai, a commission-driven real-estate advisor running on Next.js, pgvector, and a refusal-first retrieval layer.
The reliability patterns I needed there became open source: Anchor (RAG that refuses instead of hallucinating), Tripwire (mid-stream LLM guardrails), and Anvil (an idempotent webhook → BullMQ pipeline with dead-letter replay). Same engine, made public.
I care about the unglamorous parts — constant-time HMAC verification, backoff schedules, golden-dataset evals that block a merge on regression. I'm open to backend / AI-infra roles and contract work on RAG, streaming LLMs, and queue/webhook reliability.
Anvil v0.0.1 — webhook → BullMQ pipeline. 5 design contracts green in CI. HMAC-SHA256 constant-time verify, backoff [1s, 5s, 30s, 5m], dead-letter replay.
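A minimal sketch of the two mechanisms named above, not Anvil's actual code. It assumes the signature arrives as a hex string and that the raw request body is available as text; `sign`, `verifySignature`, and `backoffDelay` are hypothetical names chosen for illustration.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Retry delays from the text: 1s, 5s, 30s, 5m (in milliseconds).
export const BACKOFF_MS: readonly number[] = [1_000, 5_000, 30_000, 300_000];

// Hypothetical helper: delay before retry number `attempt` (1-based), or null
// once the schedule is exhausted and the job should go to the dead-letter queue.
export function backoffDelay(attempt: number): number | null {
  if (!Number.isInteger(attempt) || attempt < 1) return null;
  return BACKOFF_MS[attempt - 1] ?? null;
}

// HMAC-SHA256 of the raw body, hex-encoded (the encoding is an assumption).
export function sign(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

export function verifySignature(
  secret: string,
  rawBody: string,
  signatureHex: string,
): boolean {
  // Reject malformed input early; this check only looks at the format.
  if (!/^[0-9a-f]{64}$/i.test(signatureHex)) return false;
  const expected = Buffer.from(sign(secret, rawBody), "hex");
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so guard before comparing.
  if (received.length !== expected.length) return false;
  // Compares every byte regardless of where the first mismatch occurs.
  return timingSafeEqual(expected, received);
}
```

The comparison runs over the digests rather than with `===` on strings, so the time taken does not reveal how many leading bytes of a forged signature were correct.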
- anvil v0.0.1: webhook→BullMQ pipeline, 5 contracts green in CI
- tripwire v1.0.1: mid-stream LLM guardrail
- quickdraw v1.0.1: LLM streaming benchmark
- codecraft-ai live at codecraft-ai-tau.vercel.app
- anvil: examples/stripe + examples/github, publish @ykstormsorg/anvil to npm
- backend / AI infra roles at Mumbai or Bangalore startups
- YC seed founding-engineer roles
- contract work on RAG, streaming LLM, queue/webhook reliability
Mumbai, India · Open to Bangalore / Remote
Homesty.ai
Live commission-driven AI real-estate advisor.
Production AI advisor on Next.js 15 + pgvector + GPT-4o + Claude. Solo-built. Refusal-first retrieval and mid-stream guardrails were extracted from this work into Anchor and Tripwire — same engine, both public.
Anchor
Provenance-first RAG that refuses to hallucinate.
Cosine-floor retrieval returns chunks only when their cosine similarity to the query clears a fixed floor, and refuses to answer when none do. Postgres + pgvector + Next.js. Live playground available.
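The cosine-floor idea can be sketched in memory as follows. This is an illustration under stated assumptions, not Anchor's code: the floor of 0.75, the `k` of 3, and the names `retrieve`, `Chunk`, and `Retrieval` are all hypothetical. In a pgvector setup the similarity would more likely come from the database, where the `<=>` operator gives cosine distance and similarity is one minus that distance.

```typescript
type Chunk = { id: string; text: string; embedding: number[] };
type Scored = { id: string; text: string; similarity: number };
type Retrieval =
  | { kind: "answer"; chunks: Scored[] }
  | { kind: "refusal"; bestSimilarity: number };

export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Return the top-k chunks at or above the floor, or an explicit refusal.
// The floor (0.75) and k (3) are assumed defaults for this sketch.
export function retrieve(
  query: number[],
  chunks: Chunk[],
  floor = 0.75,
  k = 3,
): Retrieval {
  const scored: Scored[] = chunks
    .map((c) => ({
      id: c.id,
      text: c.text,
      similarity: cosineSimilarity(query, c.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity);
  const passing = scored.filter((s) => s.similarity >= floor).slice(0, k);
  if (passing.length === 0) {
    // Nothing is close enough: refuse rather than hand weak context to the LLM.
    return { kind: "refusal", bestSimilarity: scored[0]?.similarity ?? 0 };
  }
  return { kind: "answer", chunks: passing };
}
```

Making the refusal a distinct variant of the return type forces the caller to handle it, instead of treating an empty chunk list as "answer anyway".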
Codecraft AI
In-browser AI IDE. WebContainer-backed live coding playground.
Boots a real Vite + React dev server in the browser tab via WebContainers. An editable Monaco editor is wired to the WebContainer filesystem with debounced writes and sub-2s hot reload; an xterm terminal runs real shell commands; an IndexedDB snapshot cache brings repeat-visit boot time under 20s. COOP/COEP cross-origin isolation is what makes SharedArrayBuffer, and with it the in-tab runtime, available.
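For readers unfamiliar with cross-origin isolation, the two response headers involved look roughly like this. The sketch assumes a Next.js-style host, where a `headers()` function in the config can return rules of this shape; whether Codecraft AI is configured this way is not stated here, and `isolationRules` and `isCrossOriginIsolated` are hypothetical names.

```typescript
type Header = { key: string; value: string };
type HeaderRule = { source: string; headers: Header[] };

// The header pair that opts a page into cross-origin isolation.
export const crossOriginIsolationHeaders: Header[] = [
  { key: "Cross-Origin-Opener-Policy", value: "same-origin" },
  { key: "Cross-Origin-Embedder-Policy", value: "require-corp" },
];

// Hypothetical helper: apply the headers to every route. In a Next.js config
// this array could be returned from `async headers()`.
export function isolationRules(): HeaderRule[] {
  return [{ source: "/:path*", headers: crossOriginIsolationHeaders }];
}

// Runtime check: browsers expose `crossOriginIsolated` as a global boolean.
// Outside an isolated browser context this returns false.
export function isCrossOriginIsolated(): boolean {
  const g = globalThis as { crossOriginIsolated?: boolean };
  return g.crossOriginIsolated === true;
}
```

With `require-corp`, every cross-origin subresource must itself opt in (for example through CORS or a `Cross-Origin-Resource-Policy` header), which is the usual source of breakage when turning isolation on.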
Tripwire
Mid-stream LLM safety. Catch the lie before the user finishes reading it.
A pattern engine watches LLM streams token by token and aborts when a rule trips. Decision latency is sub-millisecond. Ships an OpenAI-compatible sidecar proxy (POST /v1/chat/completions) that streams responses through the guard and cuts them off mid-stream on a trip.
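One way a token-by-token guard could work is sketched below, assuming regex rules matched against a rolling window of recent text. This is not Tripwire's engine; `createGuard`, `guardTokens`, the `Rule` shape, and the 256-character window are assumptions made for the example.

```typescript
type Rule = { name: string; pattern: RegExp };
type Decision = { allow: true } | { allow: false; rule: string };

// Hypothetical guard: call `push` once per streamed token. Once a rule trips,
// every later token is refused as well.
export function createGuard(rules: Rule[], windowChars = 256) {
  let recent = "";
  let trippedRule: string | null = null;
  return {
    push(token: string): Decision {
      if (trippedRule !== null) return { allow: false, rule: trippedRule };
      // Keep only the tail of the stream so each check stays cheap.
      recent = (recent + token).slice(-windowChars);
      for (const rule of rules) {
        rule.pattern.lastIndex = 0; // neutralise state on /g or /y patterns
        if (rule.pattern.test(recent)) {
          trippedRule = rule.name;
          return { allow: false, rule: rule.name };
        }
      }
      return { allow: true };
    },
  };
}

// Synchronous stand-in for a stream: forward tokens until a rule trips.
export function guardTokens(
  tokens: Iterable<string>,
  rules: Rule[],
): { emitted: string[]; trippedRule: string | null } {
  const guard = createGuard(rules);
  const emitted: string[] = [];
  for (const token of tokens) {
    const decision = guard.push(token);
    if (!decision.allow) return { emitted, trippedRule: decision.rule };
    emitted.push(token);
  }
  return { emitted, trippedRule: null };
}
```

In a proxy, the refused branch is where the upstream request would be aborted, for instance through an `AbortController`. Note that tokens forming the start of a multi-token match have already been forwarded by the time the rule trips; only the completing token and everything after it are withheld.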
Goldset
Golden datasets + LLM-as-judge + structural assertions, as a GitHub Action.
Three eval runners (golden dataset, LLM-as-judge with a pluggable OpenAI/Anthropic provider, structural assertions), packaged as a GitHub Action that runs on every PR, posts a delta-vs-base comment, and blocks the merge on regression.
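The delta-vs-base gate can be pictured as a comparison of per-metric scores between the base branch and the PR head. The sketch below is an assumption about the shape of such a check, not Goldset's implementation: the metric names, the 0.01 tolerance, and the functions `compareToBase` and `renderComment` are hypothetical.

```typescript
type Scores = Record<string, number>; // metric name -> score, higher is better
type Row = {
  metric: string;
  base: number;
  head: number;
  delta: number;
  regressed: boolean;
};

// Compare head scores against base. A metric regresses when it drops by more
// than `tolerance`; a metric missing from head is treated as a score of 0.
export function compareToBase(
  base: Scores,
  head: Scores,
  tolerance = 0.01,
): { rows: Row[]; blocked: boolean } {
  const rows: Row[] = Object.keys(base)
    .sort()
    .map((metric) => {
      const b = base[metric] ?? 0;
      const h = head[metric] ?? 0;
      const delta = h - b;
      return { metric, base: b, head: h, delta, regressed: delta < -tolerance };
    });
  return { rows, blocked: rows.some((r) => r.regressed) };
}

// Render the rows as a Markdown table suitable for a PR comment.
export function renderComment(rows: Row[]): string {
  const lines = ["| metric | base | head | delta |", "| --- | --- | --- | --- |"];
  for (const r of rows) {
    const sign = r.delta >= 0 ? "+" : "";
    const flag = r.regressed ? " (regressed)" : "";
    lines.push(
      `| ${r.metric} | ${r.base.toFixed(3)} | ${r.head.toFixed(3)} | ${sign}${r.delta.toFixed(3)}${flag} |`,
    );
  }
  return lines.join("\n");
}
```

In a CI step, a `blocked` result would typically translate to a non-zero exit code, which fails the check; marking that check as required on the repository is what actually prevents the merge.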
@ykstormsorg/tripwire
…
@ykstormsorg/goldset
…
@ykstormsorg/quickdraw
…
Timezone
IST (UTC+5:30)
Status
Available