Dmitry · @medonomator · Batumi, GE
Booking · Q3 2026

AI engineer.
I build LLM systems that survive real users.

The plumbing under products with paying users: token-level billing, cognitive memory, multi-stage evals, voice.

Three years on AI infra. Twelve on backend.

12y · TypeScript
3y · LLM systems in production
15K+ · LLM calls per day, live
01 What I build

Four areas. All shipped to production more than once.

The cases that break under load: cost runaway, hallucinated entities, provider outages, prompt drift, rate-limit storms. These are what I design around, not what I patch after.

LLM infrastructure

Token-level billing across text, TTS, and Whisper. Provider fallback chains. Prompt versioning with hash-based drift detection. Full Langfuse traces from request to model output.

billing · evals · observability
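A minimal sketch of hash-based drift detection, not the production code. It assumes a hypothetical `PromptVersion` registry entry that stores the hash recorded when a prompt was approved, and it uses a dependency-free FNV-1a hash as a stand-in; a real system would more likely use SHA-256.

```typescript
// Hypothetical registry entry: the hash recorded when a prompt version was approved.
interface PromptVersion {
  name: string;
  version: number;
  approvedHash: string;
}

// FNV-1a (32-bit) over UTF-16 code units. A stand-in for a cryptographic hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Normalize line endings and trailing whitespace so cosmetic edits do not count as drift.
function normalizePrompt(template: string): string {
  return template
    .replace(/\r\n/g, "\n")
    .split("\n")
    .map((line) => line.trimEnd())
    .join("\n")
    .trim();
}

function hashPrompt(template: string): string {
  return fnv1a(normalizePrompt(template));
}

// True when the template in the codebase no longer matches the approved hash.
function hasDrifted(entry: PromptVersion, currentTemplate: string): boolean {
  return hashPrompt(currentTemplate) !== entry.approvedHash;
}
```

Run on every change, a check like this turns a silent prompt edit into a failing build rather than a production surprise.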

Cognitive memory

Episodic and semantic memory over Qdrant or pgvector. Bayesian Knowledge Tracing for skill mastery. Hidden Markov Models for emotional state. Thompson sampling for adaptive content.

memory · BKT · bandits
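For readers new to Bayesian Knowledge Tracing, here is the standard single-skill update as a short sketch. The parameter names (`pLearn`, `pSlip`, `pGuess`) are illustrative, and the values are assumed to lie strictly between 0 and 1.

```typescript
// Standard BKT parameters for one skill. Names are illustrative.
interface BktParams {
  pLearn: number; // chance of learning the skill after one practice opportunity
  pSlip: number;  // chance of answering wrong despite knowing the skill
  pGuess: number; // chance of answering right without knowing the skill
}

// Returns the updated mastery probability after observing one answer.
function bktUpdate(pKnown: number, correct: boolean, p: BktParams): number {
  // Bayes step: posterior probability the skill was known, given the observation.
  const posterior = correct
    ? (pKnown * (1 - p.pSlip)) /
      (pKnown * (1 - p.pSlip) + (1 - pKnown) * p.pGuess)
    : (pKnown * p.pSlip) /
      (pKnown * p.pSlip + (1 - pKnown) * (1 - p.pGuess));
  // Learning step: the learner may have acquired the skill during this attempt.
  return posterior + (1 - posterior) * p.pLearn;
}
```

A correct answer pulls the estimate up and a wrong one pulls it down, with slip and guess rates controlling how much weight a single observation gets.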

AI pipelines

Multi-stage quality gates with Zod-validated outputs. Entity reference checks against retrieved context. Retry loops that pass rejection reasons back into the prompt instead of blind regeneration.

RAG · validation · agents
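The retry loop described above can be sketched roughly as follows. This is an illustration under assumptions, not the production pipeline: the `Draft` shape, the `generate` callback, and the prompt wording are hypothetical, and the hand-written `validate` function stands in for a Zod schema check plus an entity lookup.

```typescript
// Hypothetical output shape: the model lists the entities it referenced.
interface Draft {
  text: string;
  entities: string[];
}

// Any async function that sends a prompt to a model and returns raw text.
type Generate = (prompt: string) => Promise<string>;

// Returns rejection reasons; an empty array means the output passes.
function validate(raw: string, contextEntities: Set<string>): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return ["Output was not valid JSON."];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return ["Output must be a JSON object."];
  }
  const draft = parsed as Partial<Draft>;
  const reasons: string[] = [];
  if (typeof draft.text !== "string" || draft.text.length === 0) {
    reasons.push('Field "text" must be a non-empty string.');
  }
  if (!Array.isArray(draft.entities)) {
    reasons.push('Field "entities" must be an array of strings.');
  } else {
    for (const entity of draft.entities) {
      if (!contextEntities.has(entity)) {
        reasons.push(`Entity "${entity}" does not appear in the retrieved context.`);
      }
    }
  }
  return reasons;
}

// Regenerates with the rejection reasons appended, up to maxAttempts.
async function generateWithFeedback(
  basePrompt: string,
  generate: Generate,
  contextEntities: Set<string>,
  maxAttempts = 3,
): Promise<{ raw: string; attempts: number } | null> {
  let prompt = basePrompt;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await generate(prompt);
    const reasons = validate(raw, contextEntities);
    if (reasons.length === 0) return { raw, attempts: attempt };
    const feedback = reasons.map((reason) => `- ${reason}`).join("\n");
    prompt = `${basePrompt}\n\nYour previous answer was rejected:\n${feedback}\nFix these issues and answer again.`;
  }
  return null; // the caller decides on a fallback
}
```

The point of the design is that each retry carries new information, so the model is corrected rather than re-rolled.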

Multi-modal

Streaming chat over μWebSockets.js. TTS and Whisper with hallucination sanitization on both ends. Function calling with project-aware context. Real-time interrupts.

voice · streaming · WS
02 Selected work

Four projects in production. One open. Three under NDA.

Adaptive learning
Solo architect and engineer

MindForge. AI engineering curriculum.

64 lessons of AI engineering, taught by a Socratic tutor that grades exercises semantically and adapts coding tasks to the learner's actual GitHub history. End-to-end stack, mine.

  • Prompt composition framework. 15+ task prompts, hash-based drift detection on every change
  • Token-level billing across text, TTS, Whisper, and function calls
  • WebSocket streaming via μWebSockets.js for chat, hints, and live transcription
  • GS2 lesson schema validated with Zod, with misconception flags attached per concept
Visit MindForge
Behavior change
Lead AI engineer, NDA

Cognitive notification engine for a habit app.

Hyper-personalized push generation for ~2k MAU. A Thompson-sampling bandit learns per-user tone and angle. Four-stage LLM quality gate: generate, judge, entity-check against context, fallback. An HMM-driven override forces supportive tone when the model detects risk.

  • 8 notification types with tier-based budgets (ACTIVE, SLOWING, DORMANT, COLD)
  • Bayesian Knowledge Tracing for habit mastery probabilities
  • Hidden Markov Model for emotional trajectory and crisis prediction
  • Adaptive timing: Redis prefilter plus SQL verify, scales to millions of users
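As an illustration of the bandit idea, here is a minimal Beta-Bernoulli Thompson sampler. It is a sketch under assumptions: the arm names are hypothetical tones, the reward is a binary "user engaged" signal, and the Beta sampler only handles integer parameters, which is enough when the parameters are success and failure counts plus one.

```typescript
type Rng = () => number; // uniform in [0, 1)

// Per-user, per-arm counts. An arm could be a tone such as "playful" (hypothetical).
interface Arm {
  successes: number;
  failures: number;
}

// Gamma(k, 1) for integer k, as a sum of k exponential draws.
function sampleGammaInt(k: number, rng: Rng): number {
  let sum = 0;
  for (let i = 0; i < k; i++) sum -= Math.log(1 - rng());
  return sum;
}

// Beta(a, b) for integer a and b, built from two gamma draws.
function sampleBeta(a: number, b: number, rng: Rng): number {
  const x = sampleGammaInt(a, rng);
  const y = sampleGammaInt(b, rng);
  return x / (x + y);
}

// Draw one sample per arm from its posterior and pick the arm with the highest draw.
function chooseArm(arms: Record<string, Arm>, rng: Rng = Math.random): string {
  let best = "";
  let bestDraw = -1;
  for (const [name, arm] of Object.entries(arms)) {
    const draw = sampleBeta(arm.successes + 1, arm.failures + 1, rng);
    if (draw > bestDraw) {
      best = name;
      bestDraw = draw;
    }
  }
  return best;
}

// Update the chosen arm once the outcome is known.
function recordOutcome(arm: Arm, engaged: boolean): void {
  if (engaged) arm.successes += 1;
  else arm.failures += 1;
}
```

Arms with little data produce wide posteriors and still get picked occasionally, which is how exploration happens without a separate schedule.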
Long-context memory
Architect and engineer

Episodic and semantic memory layer.

A cognitive context service that enriches every LLM prompt with relevant past interactions, learned facts as subject-predicate-object triples, emotional trends, and behavioral predictions. Each module degrades gracefully on its own.

  • OpenAI text-embedding-3-small over Qdrant with filtered similarity
  • Survival analysis for churn prediction, injected as warnings into the prompt
  • Concept ingestion from PR reviews feeds automatic mastery tracking
  • Sub-100ms P95 enrichment with async-safe failure modes
Developer tools
Solo build

PR-aware AI tutor for real repos.

A GitHub App that generates personalized coding tasks against the learner's own repos, then reviews their PRs with inline comments. Difficulty calibrates from past review decisions: three rounds of requested changes make the next task easier, two clean approvals raise the bar.

  • Octokit-based diff fetching with line-number validation and SHA-dedup
  • Stack-aware persona for ML, distributed systems, or web, each with its own antipatterns
  • Concept extraction from PR review flows into the learner mastery model
  • BullMQ async pipeline, observability via Sentry and Langfuse
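The calibration rule can be read as a small pure function. This sketch is one interpretation, not the shipped logic: it assumes each past task records how many rounds of requested changes it took, treats zero rounds as a clean approval, and uses a hypothetical 1 to 5 difficulty scale.

```typescript
// Hypothetical record of one reviewed task.
interface ReviewedTask {
  changeRounds: number; // rounds of requested changes before approval
}

// Returns the difficulty level for the next task, clamped to [min, max].
function nextDifficulty(
  current: number,
  history: ReviewedTask[],
  min = 1,
  max = 5,
): number {
  const last = history[history.length - 1];
  if (!last) return current; // no history yet, keep the starting level

  // Three or more rounds of changes on the latest task: ease off.
  if (last.changeRounds >= 3) return Math.max(min, current - 1);

  // Two clean approvals in a row: raise the bar.
  const lastTwo = history.slice(-2);
  if (lastTwo.length === 2 && lastTwo.every((t) => t.changeRounds === 0)) {
    return Math.min(max, current + 1);
  }
  return current;
}
```

For example, `nextDifficulty(3, [{ changeRounds: 0 }, { changeRounds: 0 }])` returns 4, while `nextDifficulty(3, [{ changeRounds: 3 }])` returns 2.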
03 What clients say

Said by founders who shipped. Names anonymized at their request.

Shipped the cognitive memory layer in six weeks. Clean handoff, evals included from day one. The kind of engineer you wish was on payroll.

AK
VP Engineering · Series A · B2C habits app

Audited our LLM cost stack and cut spend 38% in two weeks. Found a prompt-drift bug we had been chasing for months.

MS
CTO · AI tutoring platform

Drops in, ships, leaves the codebase better than he found it. Three sprints, zero handholding, voice pipeline survived launch.

DR
Head of Engineering · Voice-first SaaS
04 Stack

What I reach for. Boring infra, sharp on the AI layer.

TypeScript across the stack, Python at the AI layer. Postgres, Docker, and Hetzner for plumbing. Specialized tools where the difference shows up in the bill.

Models & APIs
OpenAI (GPT-5, GPT-4o, o1) · Anthropic (Claude Opus, Sonnet) · Whisper · TTS · Embeddings
AI tooling
Langfuse · Zod schemas · Structured outputs · Function calling · ts-fsrs
Vector & memory
Qdrant · pgvector · Redis · KeyDB · PostgreSQL
Backend
TypeScript · Python · NestJS · BullMQ · μWebSockets.js · TypeORM · Sentry
Frontend
Next.js · React 19 · Tailwind v4 · Zustand · Framer Motion
Infra
Docker · Hetzner · Oracle Cloud · Cloudflare · GitHub Actions
05 How to engage

Three engagement modes. Every one ends with code in your repo.

Fixed scope. Fixed price. I don't sell hours. Each engagement closes with something deployed, something documented, or both.

4 to 12 weeks · from $12K
Build

One AI feature, spec to production.

I take a defined system from spec to deploy: cognitive memory, RAG, evals harness, voice mode, or a billing layer. Fixed scope. Fixed price. Weekly demos against a checklist you sign off.

  • Architecture doc and delivery plan up front
  • Deployed to your infra, not mine
  • Evals and observability shipped with the feature
1 to 2 weeks · from $4K
Audit

Architecture review of a live system.

Your AI system is in production but something feels off. Costs creeping. Hallucinations slipping past your checks. Latency spiking on hot paths. I find what's actually wrong and write a prioritized fix plan with effort estimates.

  • Code and infra walkthrough
  • Cost, risk, and drift report
  • Remediation roadmap with effort estimates
Monthly engagement · from $5K/month
Embed

Senior engineer on your team.

I drop into your team for a focused push. Pair on the hardest features, ship them, leave behind PRs and design docs your engineers can extend. No onboarding ramp on basics.

  • Delivery measured in merged PRs
  • Architecture pairing with your leads
  • Decisions captured in writing