[ Index — 2026 ]

Manish Botta — building production AI & backend systems.

Engineer with 3+ years shipping LLM fine-tuning, agentic pipelines, RLHF alignment, RAG architectures, and the high-scale APIs and event-driven services that hold them up. Comfortable owning the full stack from model training to cloud deployment.

Roles: SWE · AI · GenAI · Forward Deployed

Based: Los Angeles, CA

Email: mabotta12@gmail.com

Links: GitHub · LinkedIn

01 About

I build production AI-focused backend systems — REST APIs serving enterprise clients, async AI pipelines, and a real-time simulation engine with live users. I've shipped models serving 100M+ daily conversations, cut inference latency by 40%, and reduced harmful outputs 28% in live deployments.

My experience spans LLM fine-tuning (LoRA / QLoRA), RLHF / RLAIF, agentic pipelines, and RAG architectures on one side; high-scale event-driven microservices, REST APIs, and real-time engines on the other. I don't wait for a spec to exist before I start working.

I'm also forward-deployed at heart — I shipped GenAI systems to 24+ enterprise clients (IHCL, Dr. Reddy's, Bajaj Finserv, Spotify), owning on-site integration, dual-sided InfoSec approvals, demos, and onboarding.

Looking to join a Series A/B/C company and build its backend from the ground up, from model training to cloud deployment.

3+

Years in prod

100M+

Events / day

24+

Enterprise clients

99.9%

Uptime SLA

02 Experience

TradeLiv

Founding Software Engineer

Apr 2026 — Present

Sole engineer building a B2B SaaS platform from zero — 40+ REST endpoints, PostgreSQL, Stripe payments, real-time SSE client portals, and role-based access across designer / admin / client tiers, live at tradeliv.design.
Built an AI extraction pipeline using Claude + Puppeteer / Browserless Chrome that parses any e-commerce product URL into structured data — a single universal extractor replacing brittle per-site scrapers.
Designed a 3-stage CI/CD pipeline that offloads Next.js and TypeScript builds to GitHub Actions runners, deploying compiled artifacts to a 1GB OCI VM — no OOM kills, zero-downtime deploys.

Microsoft

Software Engineer

Aug 2025 — Mar 2026

Built LLM-powered autonomous agents with tool-use, memory, and planning loops (FastAPI + LangChain + HubSpot OAuth 2.0); eliminated 70% of manual data-handling overhead.
Designed real-time ingestion pipelines for continuous RLHF / RLAIF feedback loops, improving model data freshness and inference availability by 60%.
Shipped Python ETL pipelines and REST API integrations that replaced manual analyst workflows, cutting processing time 70% and moving reporting from weekly to same-day.
Containerized AI microservices via Docker + GitHub Actions on Azure; prompt-level regression testing cut integration defects 40% with near-zero-downtime releases.

Yellow.AI

Software Engineer — Forward Deployed / Studio Solutions

Sep 2021 — Jul 2023

Delivered GenAI chatbot systems for 24+ enterprise clients — IHCL, Dr. Reddy's, Bajaj Finserv, Spotify — embedding with each to scope use cases, check feasibility, and ship production agents.
Led on-site integration of client CRM, ticketing, and data systems into deployed agents; drove InfoSec approvals on both sides to unblock launches stuck for weeks.
Fine-tuned BERT and open-source LLMs (LoRA / QLoRA); RLHF reward checkpoints cut harmful responses 28% and lifted chatbot accuracy 60% across 100M+ daily conversations.
Designed and operated event-driven microservices (Node.js, Python) processing 100M+ events / day at 99.9% uptime — the core messaging backbone for enterprise bot workflows.
Cut API response time 40% by overhauling MySQL / MongoDB query plans, adding indexes, and introducing Redis caching, then used that headroom to onboard 50+ enterprise CRM integrations without regressing latency.
Migrated legacy rule-based pipelines to LLM-powered systems (−50% maintenance overhead); automated end-to-end ML deployment on AWS (EC2, Lambda, S3) with Docker + Jenkins CI/CD — release cycles down 3× across 13 production chatbots.

Cognizant

Program Trainee Analyst — Full-Stack

Mar 2021 — Aug 2021

Decomposed a legacy monolith into Java Spring Boot microservices, improving team velocity 30%, cutting query response time 40%, and maintaining 80%+ test coverage throughout.
Lowered p95 latency on core endpoints by building RESTful APIs in Spring Boot and tuning the underlying MySQL queries.

03 Selected Work

/ 01 — Agent GitHub ↗

Role Collector — Job Sourcing Agent

May 2026 → Present

LangGraph · Ollama · Qwen3 8B · Pydantic · SQLite · Langfuse · Streamlit

Built a local-first sourcing agent on a 13-node LangGraph workflow with a local Ollama LLM (qwen3:8b) and Langfuse tracing with PII redaction — full observability on every run.

Enforced a hard-coded URL-allowlist guardrail in code, before any browser tool runs — the agent can never touch login, apply, shortener, or non-allowlisted URLs.
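A guardrail of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the host lists, the blocked path segments, and the names `is_url_allowed` and `guarded_fetch` are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical configuration for illustration; the real allowlist is not given here.
ALLOWED_HOSTS = {"boards.greenhouse.io", "jobs.lever.co"}
SHORTENER_HOSTS = {"bit.ly", "t.co", "tinyurl.com"}
BLOCKED_PATH_SEGMENTS = {"login", "signin", "apply"}


def is_url_allowed(url: str) -> bool:
    """Return True only for https URLs on an allowlisted host with no blocked path segment."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if parts.scheme != "https":
        return False
    if host in SHORTENER_HOSTS or host not in ALLOWED_HOSTS:
        return False
    segments = [s.lower() for s in parts.path.split("/") if s]
    return not any(s in BLOCKED_PATH_SEGMENTS for s in segments)


def guarded_fetch(url: str, fetch):
    """Call the browser tool `fetch` only after the allowlist check passes."""
    if not is_url_allowed(url):
        raise PermissionError(f"blocked by allowlist: {url}")
    return fetch(url)
```

Because the check runs in plain code before the tool is invoked, a model that proposes a disallowed URL gets an exception rather than a page load.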

Beat Google's CDP-layer bot detection — which defeated every Playwright / Puppeteer stealth variant — with a nodriver WebSocket-debug approach, backed by 55 passing tests and a deterministic query generator.

/ 02 — Real-Time Live ↗

SimCricketX — Simulation Backend

Probabilistic engine · In production

Python · FastAPI · Nginx · Oracle Cloud · PostgreSQL

Built a probabilistic cricket simulation engine serving 256+ users and 444+ matches in production on OCI — resolves 120 deliveries per match in under 5 seconds.

~21K lines of Python: a 126-endpoint REST API, a 13-model SQLAlchemy schema, and a 7-layer Game State Momentum Engine with exponential decay over a 13-dimensional state vector.

Kept concurrent match state correct under load with per-match threading locks; 80%+ test coverage across a 6-config CI matrix. Shipped solo with session auth, Nginx, and zero-downtime deploys.
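One common way to implement per-match locking is a registry that hands out one lock per match ID, so updates to the same match serialize while different matches proceed in parallel. The sketch below assumes an in-memory score table; `MatchStore`, `add_runs`, and `score` are hypothetical names, not the engine's real API.

```python
import threading


class MatchStore:
    """In-memory match state guarded by one lock per match (illustrative only)."""

    def __init__(self):
        self._registry_lock = threading.Lock()  # protects the lock table itself
        self._locks = {}
        self._scores = {}

    def _lock_for(self, match_id):
        # Create the per-match lock atomically so two threads never get different locks.
        with self._registry_lock:
            return self._locks.setdefault(match_id, threading.Lock())

    def add_runs(self, match_id, runs):
        # The read-modify-write below is safe because it holds the match's lock.
        with self._lock_for(match_id):
            current = self._scores.get(match_id, 0)
            self._scores[match_id] = current + runs

    def score(self, match_id):
        with self._lock_for(match_id):
            return self._scores.get(match_id, 0)
```

The registry lock is held only long enough to look up or create a lock, so it does not become a bottleneck across matches.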

/ 03 — RAG GitHub ↗

AI-Driven Resume Tailoring

Aug 2025 — Sep 2025

FAISS · QLoRA · Llama · RLHF · FastAPI · HuggingFace

Built a full RAG pipeline combining FAISS hybrid dense retrieval, cross-encoder re-ranking, and a QLoRA-fine-tuned Llama; improved resume-to-JD match scores 45%.

Added RLHF-style preference re-ranking driven by live user feedback; deployed as autoscaling FastAPI microservices with 4-bit quantization (60% less GPU memory).

Added Constitutional AI-style self-critique loops for output quality control.

/ 04 — Async GitHub ↗

Async Audio Intelligence Pipeline

Nov 2025 — Dec 2025

FastAPI · Redis · Whisper · Gemini · GPT-4 · Async Workers

Built a 3-worker async pipeline over Redis queues that processes 1-hour meetings in 7–11 minutes, down from a 45–75 minute baseline.

Integrated Whisper + Gemini / GPT-4 via fully decoupled workers with idempotent retry logic — horizontal scaling needs no API or state-layer changes.
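Idempotent retry can be sketched as follows, under loud assumptions: a plain dict stands in for the Redis-backed result store, and `run_idempotent`, `TransientError`, and the backoff parameters are hypothetical names chosen for the example rather than the pipeline's real interface.

```python
import time


class TransientError(Exception):
    """Raised by a handler when a retry is worthwhile (e.g. an upstream timeout)."""


def run_idempotent(job_id, handler, results, max_attempts=3, base_delay=0.0):
    """Run handler(job_id) at most once per job_id, retrying transient failures.

    `results` is any dict-like store keyed by job ID; a finished job is never redone.
    """
    if job_id in results:
        return results[job_id]  # duplicate delivery: reuse the stored result
    for attempt in range(1, max_attempts + 1):
        try:
            results[job_id] = handler(job_id)
            return results[job_id]
        except TransientError:
            if attempt == max_attempts:
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```

Keying results by job ID is what lets a queue redeliver the same message safely: a worker that sees a finished job returns the stored result instead of transcribing again.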

Prompt-engineered tool-call chains for structured entity extraction (action items, owners, deadlines); cut manual post-meeting review 80%.

/ 05 — Backend GitHub ↗

Applyd — Job Aggregation Platform

Multi-service backend

FastAPI · SQLite · FTS5 · Argon2id

Built a two-service backend (FastAPI identity service + dashboard) with cookie-based Argon2id authentication, role-gated admin APIs, and runtime-tunable per-IP and per-email rate limiting.

Designed a three-signal expired-job lifecycle across 10+ ATS providers — a state machine gated by two kill switches and per-ATS circuit breakers that fail closed.

Engineered safe data operations: idempotent URL-keyed UPSERTs, partial indexes, FTS5 search, atomic SQLite backups behind a multi-gate restore, and a tamper-evident admin audit log.
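As a rough illustration of a URL-keyed idempotent UPSERT with a partial index, here is a sketch using Python's built-in `sqlite3` module. The `jobs` table, its columns, and the helper names are assumptions for the example, not Applyd's real schema.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create a hypothetical jobs table keyed by URL, plus a partial index."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               url    TEXT PRIMARY KEY,
               title  TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'open'
           )"""
    )
    # Partial index: only rows with status = 'open' are indexed.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_jobs_open ON jobs(title) WHERE status = 'open'"
    )


def upsert_job(conn: sqlite3.Connection, url: str, title: str) -> None:
    """Insert the job, or update its title if the URL already exists."""
    conn.execute(
        """INSERT INTO jobs (url, title) VALUES (?, ?)
           ON CONFLICT(url) DO UPDATE SET title = excluded.title""",
        (url, title),
    )
    conn.commit()
```

Running `upsert_job` twice with the same arguments leaves the table in the same state as running it once, which is what makes re-ingesting a feed safe.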

04 Toolbox

AI / ML

RLHF · RLAIF · Reward Modeling · Fine-Tuning (QLoRA, LoRA, PEFT) · RAG Pipelines · Constitutional AI · Claude Code

Agentic & NLP

LangChain · LangGraph · ReAct · Tool-Use Agents · Multi-Agent Orchestration · Memory & Planning · Langfuse · Transformers · BERT · Sentence Embeddings · Whisper · Gemini · OpenAI · Claude · Ollama · HuggingFace

Languages

Python · Node.js · TypeScript · Java · SQL

Frameworks

FastAPI · Flask · Django · Express.js · Spring Boot

Data & Vectors

PostgreSQL · MySQL · MongoDB · SQLite · FAISS · Chroma

Messaging & Caching

Redis · Kafka · Redis Queues · WebSockets

Cloud

AWS (EC2, S3, Lambda, CloudWatch) · Azure · Oracle Cloud

DevOps

Docker · GitHub Actions · Jenkins · Nginx · Gunicorn

APIs & Auth

REST · JWT · OAuth 2.0

Customer & Delivery

Client Onboarding · On-Site Integration · InfoSec / Security Reviews · Stakeholder Demos · Agile / Standups

05 Education

California State University, San Bernardino

M.S. Computer Science

Distributed Systems · Advanced Algorithms · Machine Learning · Database Systems · Cloud Computing

Aug 2023 — May 2025

Amrita Vishwa Vidyapeetham

B.S. Computer Science · Coimbatore, India

Data Structures · Operating Systems · Computer Networks · Software Engineering

Jul 2017 — May 2021

06 Contact

Let's build the next one.

Open to Software Engineer, AI Engineer, GenAI Developer and Forward Deployed Engineer roles — and I'll talk to anyone building something interesting.

mabotta12@gmail.com →

GitHub: manishrdy

LinkedIn: manish-reddyb

Based: Los Angeles, CA