[ Index — 2026 ]

Manish Botta — building production AI & backend systems.

Engineer with 3+ years shipping LLM fine-tuning, agentic pipelines, RLHF alignment, RAG architectures, and the high-scale APIs and event-driven services that hold them up. Comfortable owning the full stack from model training to cloud deployment.

Roles: SWE · AI · GenAI · Forward Deployed
Based: Los Angeles, CA
Email: mabotta12@gmail.com
Links: GitHub · LinkedIn

I build production, AI-focused backend systems: REST APIs serving enterprise clients, async AI pipelines, and a real-time simulation engine with live users. I've shipped models serving 100M+ daily conversations, cut inference latency by 40%, and reduced harmful outputs by 28% in live deployments.

My experience spans LLM fine-tuning (LoRA / QLoRA), RLHF / RLAIF, agentic pipelines, and RAG architectures on one side; high-scale event-driven microservices, REST APIs, and real-time engines on the other. I don't wait for a spec to exist before I start working.

I'm also forward-deployed at heart — I shipped GenAI systems to 24+ enterprise clients (IHCL, Dr. Reddy's, Bajaj Finserv, Spotify), owning on-site integration, dual-sided InfoSec approvals, demos, and onboarding.

Looking to join a Series A/B/C company and build the backend from the ground up, from model training through cloud deployment.

3+ years in prod
100M+ events / day
24+ enterprise clients
99.9% uptime SLA
TradeLiv
Founding Software Engineer
Apr 2026 — Present
  • Sole engineer building a B2B SaaS platform from zero — 40+ REST endpoints, PostgreSQL, Stripe payments, real-time SSE client portals, and role-based access across designer / admin / client tiers, live at tradeliv.design.
  • Built an AI extraction pipeline using Claude + Puppeteer / Browserless Chrome that parses any e-commerce product URL into structured data — a single universal extractor replacing brittle per-site scrapers.
  • Designed a 3-stage CI/CD pipeline that offloads Next.js and TypeScript builds to GitHub Actions runners, deploying compiled artifacts to a 1GB OCI VM — no OOM kills, zero-downtime deploys.
Microsoft
Software Engineer
Aug 2025 — Mar 2026
  • Built LLM-powered autonomous agents with tool-use, memory, and planning loops (FastAPI + LangChain + HubSpot OAuth 2.0); eliminated 70% of manual data-handling overhead.
  • Designed real-time ingestion pipelines for continuous RLHF / RLAIF feedback loops, improving model data freshness and inference availability by 60%.
  • Shipped Python ETL pipelines and REST API integrations that replaced manual analyst workflows, cutting processing time 70% and moving reporting from weekly to same-day.
  • Containerized AI microservices via Docker + GitHub Actions on Azure; prompt-level regression testing cut integration defects 40% with near zero-downtime releases.
Yellow.AI
Software Engineer — Forward Deployed / Studio Solutions
Sep 2021 — Jul 2023
  • Delivered GenAI chatbot systems for 24+ enterprise clients — IHCL, Dr. Reddy's, Bajaj Finserv, Spotify — embedding with each to scope use cases, check feasibility, and ship production agents.
  • Led on-site integration of client CRM, ticketing, and data systems into deployed agents; drove InfoSec approvals on both sides to unblock launches stuck for weeks.
  • Fine-tuned BERT and open-source LLMs (LoRA / QLoRA); RLHF reward checkpoints cut harmful responses 28% and lifted chatbot accuracy 60% across 100M+ daily conversations.
  • Designed and operated event-driven microservices (Node.js, Python) processing 100M+ events / day at 99.9% uptime — the core messaging backbone for enterprise bot workflows.
  • Cut API response time 40% by overhauling MySQL / MongoDB query plans, adding indexes, and introducing Redis caching, then used that headroom to onboard 50+ enterprise CRM integrations without regressing latency.
  • Migrated legacy rule-based pipelines to LLM-powered systems (−50% maintenance overhead) and automated end-to-end ML deployment on AWS (EC2, Lambda, S3) with Docker + Jenkins CI/CD, shortening release cycles across 13 production chatbots.
Cognizant
Program Trainee Analyst — Full-Stack
Mar 2021 — Aug 2021
  • Decomposed a legacy monolith into Java Spring Boot microservices, improving team velocity 30%, cutting query response time 40%, and maintaining 80%+ test coverage throughout.
  • Lowered p95 latency on core endpoints by building RESTful APIs in Spring Boot and tuning the underlying MySQL queries.
/ 01 — Agent GitHub

Role Collector — Job Sourcing Agent

May 2026 → Present
LangGraph · Ollama · Qwen3 8B · Pydantic · SQLite · Langfuse · Streamlit

Built a local-first sourcing agent on a 13-node LangGraph workflow with a local Ollama LLM (qwen3:8b) and Langfuse tracing with PII redaction — full observability on every run.

Enforced a hard-coded URL-allowlist guardrail in code, before any browser tool runs — the agent can never touch login, apply, shortener, or non-allowlisted URLs.
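
A minimal sketch of how a guardrail like this can work, assuming a host allowlist plus a coarse check on path fragments. The hosts, blocked fragments, and function names below are hypothetical and not taken from the project.

```python
from urllib.parse import urlparse

# Hypothetical values, for illustration only.
ALLOWED_HOSTS = {"jobs.example.com", "careers.example.org"}
SHORTENER_HOSTS = {"bit.ly", "t.co", "tinyurl.com"}
BLOCKED_PATH_PARTS = ("login", "signin", "apply")


def is_url_allowed(url: str) -> bool:
    """Return True only for http(s) URLs on an allowlisted host with no blocked path fragment."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    if host in SHORTENER_HOSTS or host not in ALLOWED_HOSTS:
        return False
    path = parsed.path.lower()
    return not any(part in path for part in BLOCKED_PATH_PARTS)


def guarded_fetch(url: str, fetch):
    """Call the browser tool (`fetch`) only after the guardrail passes; otherwise raise."""
    if not is_url_allowed(url):
        raise PermissionError(f"blocked by allowlist: {url}")
    return fetch(url)
```

Because the check runs in plain code before the tool is invoked, the model's output cannot talk its way past it. The substring match on paths is deliberately coarse and would need tuning for real sites.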

Beat Google's CDP-layer bot detection — which defeated every Playwright / Puppeteer stealth variant — with a nodriver WebSocket-debug approach, backed by 55 passing tests and a deterministic query generator.

/ 02 — Real-Time Live

SimCricketX — Simulation Backend

Probabilistic engine · In production
Python · FastAPI · Nginx · Oracle Cloud · PostgreSQL

Built a probabilistic cricket simulation engine serving 256+ users and 444+ matches in production on OCI — resolves 120 deliveries per match in under 5 seconds.

~21K lines of Python: a 126-endpoint REST API, a 13-model SQLAlchemy schema, and a 7-layer Game State Momentum Engine with exponential decay over a 13-dimensional state vector.
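
As a rough illustration of exponential decay over a state vector, the sketch below fades each dimension by a shared factor and then adds the newest event. The half-life parameter, the update rule, and the function name are assumptions made for this example, not the engine's actual formula.

```python
import math


def decay_state(state, event, elapsed, half_life=6.0):
    """Decay every dimension by the same exponential factor, then add the new event.

    `elapsed` and `half_life` share a unit (for example, deliveries); both are
    hypothetical parameters chosen for this sketch.
    """
    factor = math.exp(-math.log(2) * elapsed / half_life)
    return [s * factor + e for s, e in zip(state, event)]
```

With this rule, an event's influence halves every `half_life` units, so recent deliveries dominate the momentum signal while older ones fade smoothly instead of dropping off a cliff.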

Kept concurrent match state correct under load with per-match threading locks; 80%+ test coverage across a 6-config CI matrix. Shipped solo with session auth, Nginx, and zero-downtime deploys.
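
One common way to implement per-match locking is a registry that hands out one lock per match ID, so updates to the same match serialize while different matches proceed in parallel. This is a simplified sketch; `MatchLocks`, `record_runs`, and the in-memory `scores` dict are hypothetical stand-ins.

```python
import threading


class MatchLocks:
    """Hands out one lock per match ID; the registry itself is guarded by its own lock."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def for_match(self, match_id):
        with self._guard:
            # setdefault returns the existing lock if one was already registered.
            return self._locks.setdefault(match_id, threading.Lock())


locks = MatchLocks()
scores = {"m1": 0}  # stand-in for real match state


def record_runs(match_id, runs):
    """Read-modify-write on one match, serialized by that match's lock only."""
    with locks.for_match(match_id):
        scores[match_id] = scores[match_id] + runs
```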

/ 03 — RAG GitHub

AI-Driven Resume Tailoring

Aug 2025 — Sep 2025
FAISS · QLoRA · Llama · RLHF · FastAPI · HuggingFace

Built a full RAG pipeline combining FAISS hybrid dense retrieval, cross-encoder re-ranking, and a QLoRA-fine-tuned Llama model, improving resume-to-JD match scores by 45%.

Added RLHF-style preference re-ranking driven by live user feedback; deployed as autoscaling FastAPI microservices with 4-bit quantization (60% less GPU memory).

Added Constitutional AI-style self-critique loops for output quality control.

/ 04 — Async GitHub

Async Audio Intelligence Pipeline

Nov 2025 — Dec 2025
FastAPI · Redis · Whisper · Gemini · GPT-4 · Async Workers

Built a 3-worker async pipeline over Redis queues that processes 1-hour meetings in 7–11 minutes, down from a 45–75 minute baseline.

Integrated Whisper + Gemini / GPT-4 via fully decoupled workers with idempotent retry logic — horizontal scaling needs no API or state-layer changes.

Prompt-engineered tool-call chains for structured entity extraction (action items, owners, deadlines); cut manual post-meeting review 80%.

/ 05 — Backend GitHub

Applyd — Job Aggregation Platform

Multi-service backend
FastAPI · SQLite · FTS5 · Argon2id

Built a two-service backend (FastAPI identity service + dashboard) with cookie-based Argon2id authentication, role-gated admin APIs, and runtime-tunable per-IP and per-email rate limiting.
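
Runtime-tunable rate limiting can be sketched as a sliding-window counter keyed by IP or email, with limits held in plain attributes so they can be changed without a restart. The class below is an illustrative assumption, not the service's implementation.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allows at most `limit` hits per key within the last `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit                    # may be changed at runtime
        self.window_seconds = window_seconds  # may be changed at runtime
        self._clock = clock
        self._hits = defaultdict(deque)

    def allow(self, key):
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Separate instances (or key prefixes such as `ip:` and `email:`) keep the per-IP and per-email budgets independent.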

Designed a three-signal expired-job lifecycle across 10+ ATS providers — a state machine gated by two kill switches and per-ATS circuit breakers that fail closed.
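
The fail-closed idea can be sketched as follows, assuming "closed" means that any doubt leaves a job un-expired. The breaker threshold, the single kill-switch flag, and the `signals_agree` input are hypothetical simplifications; the actual signals and switches are not described here.

```python
class CircuitBreaker:
    """Opens after a run of consecutive failures for one ATS provider."""

    def __init__(self, failure_threshold=3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    @property
    def is_open(self):
        return self.consecutive_failures >= self.failure_threshold

    def record(self, ok):
        # Any success resets the run; each failure extends it.
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1


def may_expire(kill_switch_on, breaker, signals_agree):
    """Fail closed: a kill switch or an open breaker always blocks expiry."""
    if kill_switch_on or breaker.is_open:
        return False
    return bool(signals_agree)
```

The design choice is that a flaky provider should never cause live jobs to be marked expired; the worst case is that a stale listing lingers until the provider recovers.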

Engineered safe data operations: idempotent URL-keyed UPSERTs, partial indexes, FTS5 search, atomic SQLite backups behind a multi-gate restore, and a tamper-evident admin audit log.
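
To illustrate two of these operations, here is a small sketch of a URL-keyed UPSERT and a partial index using Python's built-in `sqlite3` module. The table and column names are hypothetical, and the UPSERT syntax assumes SQLite 3.24 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE jobs (
        url    TEXT PRIMARY KEY,            -- the URL is the idempotency key
        title  TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'open'
    );
    -- Partial index: only rows that are still open are indexed.
    CREATE INDEX idx_jobs_open_title ON jobs(title) WHERE status = 'open';
    """
)


def upsert_job(url, title):
    """Insert a job, or update its title if the URL already exists."""
    conn.execute(
        "INSERT INTO jobs (url, title) VALUES (?, ?) "
        "ON CONFLICT(url) DO UPDATE SET title = excluded.title",
        (url, title),
    )
    conn.commit()
```

Re-running the same ingest is then harmless: a repeated URL updates the existing row instead of creating a duplicate.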

AI / ML
RLHF · RLAIF · Reward Modeling · Fine-Tuning (QLoRA, LoRA, PEFT) · RAG Pipelines · Constitutional AI · Claude Code
Agentic & NLP
LangChain · LangGraph · ReAct · Tool-Use Agents · Multi-Agent Orchestration · Memory & Planning · Langfuse · Transformers · BERT · Sentence Embeddings · Whisper · Gemini · OpenAI · Claude · Ollama · HuggingFace
Languages
Python · Node.js · TypeScript · Java · SQL
Frameworks
FastAPI · Flask · Django · Express.js · Spring Boot
Data & Vectors
PostgreSQL · MySQL · MongoDB · SQLite · FAISS · Chroma
Messaging & Caching
Redis · Kafka · Redis Queues · WebSockets
Cloud
AWS (EC2, S3, Lambda, CloudWatch) · Azure · Oracle Cloud
DevOps
Docker · GitHub Actions · Jenkins · Nginx · Gunicorn
APIs & Auth
REST · JWT · OAuth 2.0
Customer & Delivery
Client Onboarding · On-Site Integration · InfoSec / Security Reviews · Stakeholder Demos · Agile / Standups
California State University, San Bernardino
M.S. Computer Science
Distributed Systems · Advanced Algorithms · Machine Learning · Database Systems · Cloud Computing
Aug 2023 — May 2025
Amrita Vishwa Vidyapeetham
B.S. Computer Science · Coimbatore, India
Data Structures · Operating Systems · Computer Networks · Software Engineering
Jul 2017 — May 2021

Let's build the next one.

Open to Software Engineer, AI Engineer, GenAI Developer and Forward Deployed Engineer roles — and I'll talk to anyone building something interesting.

mabotta12@gmail.com