- Sole engineer building a B2B SaaS platform from zero: 40+ REST endpoints, PostgreSQL, Stripe payments, real-time SSE client portals, and role-based access across designer / admin / client tiers, live at tradeliv.design.
- Built an AI extraction pipeline using Claude + Puppeteer / Browserless Chrome that parses any e-commerce product URL into structured data: a single universal extractor replacing brittle per-site scrapers.
- Designed a 3-stage CI/CD pipeline that offloads Next.js and TypeScript builds to GitHub Actions runners, deploying compiled artifacts to a 1GB OCI VM — no OOM kills, zero-downtime deploys.
Manish Botta — building production AI & backend systems.
Engineer with 3+ years shipping LLM fine-tuning, agentic pipelines, RLHF alignment, RAG architectures, and the high-scale APIs and event-driven services that hold them up. Comfortable owning the full stack from model training to cloud deployment.
I build production AI-focused backend systems — REST APIs serving enterprise clients, async AI pipelines, and a real-time simulation engine with live users. I've shipped models serving 100M+ daily conversations, cut inference latency by 40%, and reduced harmful outputs 28% in live deployments.
My experience spans LLM fine-tuning (LoRA / QLoRA), RLHF / RLAIF, agentic pipelines, and RAG architectures on one side; high-scale event-driven microservices, REST APIs, and real-time engines on the other. I don't wait for a spec to exist before I start working.
I'm also forward-deployed at heart — I shipped GenAI systems to 24+ enterprise clients (IHCL, Dr. Reddy's, Bajaj Finserv, Spotify), owning on-site integration, dual-sided InfoSec approvals, demos, and onboarding.
Looking to join a Series A/B/C company and build the backend from the ground up, from model training through cloud deployment.
- Built LLM-powered autonomous agents with tool-use, memory, and planning loops (FastAPI + LangChain + HubSpot OAuth 2.0); eliminated 70% of manual data-handling overhead.
- Designed real-time ingestion pipelines for continuous RLHF / RLAIF feedback loops, improving model data freshness and inference availability by 60%.
- Shipped Python ETL pipelines and REST API integrations replacing manual analyst workflows — cutting processing time 70%, reporting from weekly to same-day.
- Containerized AI microservices via Docker + GitHub Actions on Azure; prompt-level regression testing cut integration defects 40% and kept releases at near-zero downtime.
- Delivered GenAI chatbot systems for 24+ enterprise clients — IHCL, Dr. Reddy's, Bajaj Finserv, Spotify — embedding with each to scope use cases, check feasibility, and ship production agents.
- Led on-site integration of client CRM, ticketing, and data systems into deployed agents; drove InfoSec approvals on both sides to unblock launches stuck for weeks.
- Fine-tuned BERT and open-source LLMs (LoRA / QLoRA); RLHF reward checkpoints cut harmful responses 28% and lifted chatbot accuracy 60% across 100M+ daily conversations.
- Designed and operated event-driven microservices (Node.js, Python) processing 100M+ events / day at 99.9% uptime — the core messaging backbone for enterprise bot workflows.
- Cut API response time 40% by overhauling MySQL / MongoDB query plans, adding indexes, and introducing Redis caching, then used that headroom to onboard 50+ enterprise CRM integrations without a latency regression.
- Migrated legacy rule-based pipelines to LLM-powered systems (−50% maintenance overhead); automated end-to-end ML deployment on AWS (EC2, Lambda, S3) with Docker + Jenkins CI/CD — release cycles down 3× across 13 production chatbots.
- Decomposed a legacy monolith into Java Spring Boot microservices, improving team velocity 30%, cutting query response time 40%, and maintaining 80%+ test coverage throughout.
- Lowered p95 latency on core endpoints by building RESTful APIs in Spring Boot and tuning the underlying MySQL queries.
Role Collector — Job Sourcing Agent
Built a local-first sourcing agent on a 13-node LangGraph workflow with a local Ollama LLM (qwen3:8b) and Langfuse tracing with PII redaction, giving full observability on every run.
Enforced a hard-coded URL-allowlist guardrail in code, before any browser tool runs — the agent can never touch login, apply, shortener, or non-allowlisted URLs.
Beat Google's CDP-layer bot detection — which defeated every Playwright / Puppeteer stealth variant — with a nodriver WebSocket-debug approach, backed by 55 passing tests and a deterministic query generator.
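The URL-allowlist guardrail described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the host names, the blocked path segments, and the names `is_url_allowed` and `guarded_fetch` are all hypothetical. The point it shows is that the check runs in plain code before the browser tool is ever called, so the LLM cannot talk its way past it.

```python
from urllib.parse import urlsplit

# Hypothetical values for illustration only; the real allowlist is not shown here.
ALLOWED_HOSTS = frozenset({"jobs.example.com", "careers.example.org"})
SHORTENER_HOSTS = frozenset({"bit.ly", "t.co", "tinyurl.com"})
BLOCKED_PATH_SEGMENTS = frozenset({"login", "signin", "apply"})


def is_url_allowed(url: str) -> bool:
    """Return True only for https URLs on an allowlisted host
    whose path contains no blocked segment."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    if host in SHORTENER_HOSTS or host not in ALLOWED_HOSTS:
        return False
    segments = {s.lower() for s in parts.path.split("/") if s}
    return segments.isdisjoint(BLOCKED_PATH_SEGMENTS)


def guarded_fetch(url: str, browser_tool):
    """Refuse the call before the browser tool is invoked at all."""
    if not is_url_allowed(url):
        raise PermissionError(f"URL blocked by allowlist guardrail: {url}")
    return browser_tool(url)
```

Here `browser_tool` stands in for whatever callable actually drives the browser; a rejected URL raises before that callable runs.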
SimCricketX — Simulation Backend
Built a probabilistic cricket simulation engine serving 256+ users and 444+ matches in production on OCI — resolves 120 deliveries per match in under 5 seconds.
~21K lines of Python: a 126-endpoint REST API, a 13-model SQLAlchemy schema, and a 7-layer Game State Momentum Engine with exponential decay over a 13-dimensional state vector.
Kept concurrent match state correct under load with per-match threading locks; 80%+ test coverage across a 6-config CI matrix. Shipped solo with session auth, Nginx, and zero-downtime deploys.
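The per-match locking mentioned above can be illustrated with a small sketch. It is an assumption-laden example, not the engine's real code: `MatchStateStore`, `add_runs`, and the idea of tracking a single integer score per match are hypothetical stand-ins. What it shows is the pattern itself: one lock per match, so updates to the same match are serialized while different matches do not block each other.

```python
import threading


class MatchStateStore:
    """Hypothetical in-memory store that guards each match with its own lock."""

    def __init__(self):
        self._registry_lock = threading.Lock()  # protects the lock table itself
        self._locks = {}
        self._scores = {}

    def _lock_for(self, match_id):
        # Create the per-match lock at most once, even under contention.
        with self._registry_lock:
            lock = self._locks.get(match_id)
            if lock is None:
                lock = self._locks[match_id] = threading.Lock()
            return lock

    def add_runs(self, match_id, runs):
        # The read-modify-write below is atomic with respect to this match only.
        with self._lock_for(match_id):
            total = self._scores.get(match_id, 0) + runs
            self._scores[match_id] = total
            return total

    def score(self, match_id):
        with self._lock_for(match_id):
            return self._scores.get(match_id, 0)
```

A single global lock would also be correct, but it would serialize every match; keying the lock by match keeps contention local to the match being simulated.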
AI-Driven Resume Tailoring
Built a full RAG pipeline combining FAISS hybrid dense retrieval, cross-encoder re-ranking, and a QLoRA-fine-tuned Llama; improved resume-to-JD match scores 45%.
Added RLHF-style preference re-ranking driven by live user feedback; deployed as autoscaling FastAPI microservices with 4-bit quantization (60% less GPU memory).
Added Constitutional AI self-critique loops for output quality control.
Async Audio Intelligence Pipeline
Built a 3-worker async pipeline over Redis queues that processes 1-hour meetings in 7–11 minutes, down from a 45–75 minute baseline.
Integrated Whisper + Gemini / GPT-4 via fully decoupled workers with idempotent retry logic — horizontal scaling needs no API or state-layer changes.
Prompt-engineered tool-call chains for structured entity extraction (action items, owners, deadlines); cut manual post-meeting review 80%.
Applyd — Job Aggregation Platform
Built a two-service backend (FastAPI identity service + dashboard) with cookie-based Argon2id authentication, role-gated admin APIs, and runtime-tunable per-IP and per-email rate limiting.
Designed a three-signal expired-job lifecycle across 10+ ATS providers — a state machine gated by two kill switches and per-ATS circuit breakers that fail closed.
Engineered safe data operations: idempotent URL-keyed UPSERTs, partial indexes, FTS5 search, atomic SQLite backups behind a multi-gate restore, and a tamper-evident admin audit log.
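As a rough sketch of what an idempotent URL-keyed UPSERT with a partial index can look like in SQLite: the `jobs` table, its columns, and the `upsert_job` helper are hypothetical and chosen only for illustration, and the `ON CONFLICT ... DO UPDATE` syntax assumes SQLite 3.24 or newer. Re-ingesting the same URL updates the existing row instead of creating a duplicate, so a retried or repeated crawl leaves the table in the same state.

```python
import sqlite3

# Hypothetical schema: the job URL is the natural key.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    url    TEXT PRIMARY KEY,
    title  TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'open'
);
-- Partial index: only rows that are still open are indexed.
CREATE INDEX IF NOT EXISTS idx_jobs_open_title
    ON jobs(title) WHERE status = 'open';
"""

UPSERT = """
INSERT INTO jobs (url, title, status) VALUES (?, ?, 'open')
ON CONFLICT(url) DO UPDATE SET
    title  = excluded.title,
    status = excluded.status
"""


def upsert_job(conn: sqlite3.Connection, url: str, title: str) -> None:
    """Insert the job, or refresh the existing row with the same URL."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (url, title))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    upsert_job(conn, "https://jobs.example.com/1", "Backend Engineer")
    upsert_job(conn, "https://jobs.example.com/1", "Backend Engineer")  # no duplicate
    print(conn.execute("SELECT COUNT(*) FROM jobs").fetchone()[0])
```

Keying on the URL rather than a surrogate ID is what makes the write safe to repeat: the database, not the caller, decides whether the row already exists.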
Let's build the next one.
Open to Software Engineer, AI Engineer, GenAI Developer and Forward Deployed Engineer roles — and I'll talk to anyone building something interesting.
mabotta12@gmail.com