Available for opportunities

GenAI Engineer

Santhosh Kammari

Sole architect of AI systems deployed at the Reserve Bank of India.
Shipping LLM pipelines, hybrid retrieval engines, and multi-agent systems to production.

13.8K LOC: RBI Decision Support System
Sole architect · 57 modules · 5 GPU microservices

3 weeks: End-to-end delivery
Ingestion pipeline → query engine → production APIs

40+ rules: Trade Finance Engine
UCP 600 compliance · deployed at banking clients

85% accuracy: BrowseComp Benchmark
Multi-agent system · 1000+ web pages

Python · vLLM · Milvus · FastAPI · LangGraph · PyTorch · Qwen3 · RAG · Docker · HuggingFace

01. About

I'm a GenAI Engineer at Newgen Software (Number Theory Group) with 2+ years building production AI/ML systems. I specialize in LLM-powered document intelligence, hybrid retrieval systems (RAG + Text2SQL), and multi-agent pipelines for financial and regulatory document processing.

I've been the sole architect of AI systems deployed at the Reserve Bank of India and banking clients — handling everything from GPU-orchestrated ingestion pipelines to stateful multi-turn conversational AI.

I hold an Integrated Dual Degree (BTech + MTech) in Information Technology from IIITM Gwalior.

3+ Production Systems

17K+ Lines of Production Code

900+ GitHub Contributions

1 IEEE Publication

02. Experience

Data Scientist

Newgen Software — Number Theory Group

Jul 2023 — Present

Sole architect of RBI's AI Decision Support System — 114 commits, 57 modules, ~13.8K LOC built in 3 weeks. Covers ingestion pipeline, hybrid query engine, similarity rules engine, 5 FastAPI GPU microservices, and annotation UIs.
Built 8-stage GPU-orchestrated PDF ingestion with dual-model ensemble extraction (Qwen3-14B text + Qwen3-VL vision) and Pydantic-enforced JSON arbitration — handles stamps, handwriting, and tables that OCR alone misses.
Designed hybrid RAG + Text2SQL query engine: LLM classifier routing → GTE dense + BGE-M3 sparse Milvus search → Qwen3-Reranker re-scoring → majority-vote SQL generation with automatic SQL-to-vector fallback.
Developed multi-stage stateful chat memory with history-intent pre-classification, semantic retrieval, 4-variant query refinement, and sufficiency voting, cutting LLM calls from 14 to 2 on history-only paths.
Built 5-rule NBFC Similarity Engine running concurrent vector + SQL matching across 3 databases for regulatory compliance checking at application-receive time.
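The majority-vote SQL generation with SQL-to-vector fallback mentioned above can be illustrated with a minimal sketch. It is not the production code: `generate_sql`, `run_sql`, and `vector_search` are hypothetical callables standing in for the LLM, the database, and the Milvus search, and the vote threshold and sample count are assumed values.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence


def normalize_sql(sql: str) -> str:
    """Collapse case, whitespace, and a trailing semicolon (used only as a voting key)."""
    return " ".join(sql.strip().rstrip(";").lower().split())


def majority_vote(candidates: Sequence[str], min_votes: int = 2) -> Optional[str]:
    """Return the first original candidate of the most common query, or None."""
    keyed = [(normalize_sql(c), c) for c in candidates if c.strip()]
    counts = Counter(key for key, _ in keyed)
    if not counts:
        return None
    winner, votes = counts.most_common(1)[0]
    if votes < min_votes:  # no agreement between samples
        return None
    return next(original for key, original in keyed if key == winner)


def answer(
    question: str,
    generate_sql: Callable[[str], str],     # hypothetical LLM call
    run_sql: Callable[[str], List[tuple]],  # hypothetical database call
    vector_search: Callable[[str], list],   # hypothetical vector search
    n_samples: int = 3,                     # assumed sample count
) -> dict:
    """Try the voted SQL first; fall back to vector search on any failure."""
    candidates = [generate_sql(question) for _ in range(n_samples)]
    sql = majority_vote(candidates)
    if sql is not None:
        try:
            rows = run_sql(sql)
            if rows:
                return {"route": "sql", "result": rows}
        except Exception:
            # A sketch-level catch-all: any SQL error triggers the fallback.
            pass
    return {"route": "vector", "result": vector_search(question)}
```

Voting on a normalized key while returning an original candidate keeps string literals inside the query untouched.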

AI/ML Engineer

Number Theory

Jul 2022 — Jun 2023

Primary author of Trade Finance Rule Engine — 3,949 lines implementing 40+ UCP 600 compliance rules deployed at banking clients. Dynamic rule routing through configurable schema dispatch.
Built BERT-based signature verification combining Tesseract OCR coordinate extraction, sentence-transformer cosine similarity, and fuzzywuzzy fallback for real-world signature variants.
Designed spatial coordinate engine (1,845 lines) for structured document extraction — declarative geolocation schemas resolved at runtime against live OCR coordinates, eliminating retraining per bank form variant.
Co-authored Bundle API orchestrating 8 downstream microservices with race condition fixes, JWT auth, rate limiting, and per-service retry logic.
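Rule routing through a configurable schema, as described for the Trade Finance Rule Engine, might look roughly like the sketch below. The rule names, document fields, and `SCHEMA` mapping are hypothetical placeholders chosen for illustration; they are not the engine's actual rules and do not quote UCP 600.

```python
from datetime import date
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]
RULES: Dict[str, Rule] = {}  # registry: rule name -> check function


def rule(name: str) -> Callable[[Rule], Rule]:
    """Decorator that registers a check function under a rule name."""
    def register(fn: Rule) -> Rule:
        RULES[name] = fn
        return fn
    return register


@rule("amount_within_credit")  # hypothetical rule
def amount_within_credit(doc: dict) -> bool:
    return doc["invoice_amount"] <= doc["credit_amount"]


@rule("presented_before_expiry")  # hypothetical rule
def presented_before_expiry(doc: dict) -> bool:
    return doc["presentation_date"] <= doc["expiry_date"]


# Configuration: which rules apply to which document type.
# In a real system this could be loaded from JSON or a database.
SCHEMA: Dict[str, List[str]] = {
    "commercial_invoice": ["amount_within_credit", "presented_before_expiry"],
    "bill_of_lading": ["presented_before_expiry"],
}


def check(doc_type: str, doc: dict) -> Dict[str, bool]:
    """Run every rule the schema lists for this document type."""
    return {name: RULES[name](doc) for name in SCHEMA.get(doc_type, [])}


invoice = {
    "invoice_amount": 9_500,
    "credit_amount": 10_000,
    "presentation_date": date(2023, 3, 20),
    "expiry_date": date(2023, 3, 15),
}
print(check("commercial_invoice", invoice))
# {'amount_within_credit': True, 'presented_before_expiry': False}
```

Because the mapping lives in data rather than code, adding a rule to a document type is a schema change with no new branching logic.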

03. Featured Work

Production

RBI Decision Support System

Sole architect of RBI's AI-powered regulatory analysis system. 8-stage GPU pipeline, hybrid RAG+Text2SQL engine, stateful chat memory, 5 FastAPI microservices — delivered end-to-end in 3 weeks.

Python · FastAPI · Milvus · vLLM · Qwen3

Production

Trade Finance Rule Engine

Primary author. 40+ UCP 600 compliance rules with BERT signature verification, spatial coordinate extraction, and cross-document numeric reconciliation. Deployed at banking clients.

Python · FastAPI · MongoDB · BERT · Tesseract

Research

Multi-Agent Deep Research

Multi-agent system operating over 1,000+ web pages with parallel sub-agents and context management. Achieved 85% accuracy on the BrowseComp benchmark.

Python · LangGraph · Multi-Agent

Production

3-Tier Retrieval & Reranking Engine

Authority-ranked retrieval over 5 Milvus collections. Alpha/Beta/Gamma tier search with Qwen3-Reranker cross-encoder scoring, query decomposition, and parallel sub-query execution.
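One way to read "authority-ranked" tier search is that higher tiers are consulted first and lower tiers only fill the remaining slots. The sketch below implements that reading. It is an assumption about the design, not the engine's code: `search` is a hypothetical callable standing in for a per-collection Milvus query followed by reranker scoring, and `min_score` and `top_k` are illustrative defaults.

```python
from typing import Callable, List, Sequence, Tuple

Hit = Tuple[str, float]           # (document id, relevance score)
Ranked = Tuple[str, str, float]   # (tier, document id, relevance score)


def tiered_search(
    query: str,
    tiers: Sequence[str],                     # ordered from most to least authoritative
    search: Callable[[str, str], List[Hit]],  # hypothetical: (tier, query) -> scored hits
    min_score: float = 0.5,                   # assumed relevance cut-off
    top_k: int = 3,                           # assumed result budget
) -> List[Ranked]:
    """Fill the result list tier by tier, best-scoring hits first within a tier."""
    results: List[Ranked] = []
    for tier in tiers:
        hits = [h for h in search(tier, query) if h[1] >= min_score]
        hits.sort(key=lambda h: h[1], reverse=True)
        for doc_id, score in hits:
            results.append((tier, doc_id, score))
            if len(results) == top_k:
                return results  # budget filled; lower tiers are never queried
    return results


# Toy in-memory index standing in for the real collections.
index = {
    "alpha": [("a1", 0.9), ("a2", 0.3)],
    "beta": [("b1", 0.7), ("b2", 0.6)],
    "gamma": [("g1", 0.8)],
}
print(tiered_search("q", ["alpha", "beta", "gamma"], lambda tier, q: index[tier]))
# [('alpha', 'a1', 0.9), ('beta', 'b1', 0.7), ('beta', 'b2', 0.6)]
```

Note that the gamma hit scores higher than both beta hits but is never reached, which is the point of ranking by authority before relevance.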

FastAPI · Milvus · vLLM · Reranking

Production

DocVeda — Document Intelligence

Modular enterprise RAG platform: OmniDocs fetch → DOTS OCR → doclayout-YOLO parsing → semantic chunking → Milvus insert → LLM metadata filter → reranking → synthesis. CUDA 12.6 + ONNX backend.

FastAPI · Milvus · YOLO · Qwen3 · ONNX

Research

Embedding Fine-Tuning & Benchmarks

Fine-tuned domain embeddings with synthetic QA data — +4.2% Precision@1 over base GTE-large. Benchmarked 8 models across 20 metrics; findings drove production architecture decisions.
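For readers unfamiliar with the metric, Precision@1 is the fraction of queries whose top-ranked result is relevant. The sketch below computes it on toy data; the query and document ids are made up for illustration and are not taken from the benchmark.

```python
from typing import Dict, List, Set


def precision_at_1(rankings: Dict[str, List[str]], relevant: Dict[str, Set[str]]) -> float:
    """Fraction of queries whose first retrieved document is in the relevant set."""
    if not rankings:
        return 0.0
    hits = sum(
        1
        for query_id, ranked in rankings.items()
        if ranked and ranked[0] in relevant.get(query_id, set())
    )
    return hits / len(rankings)


# Toy example: retrieved ids per query, best first.
rankings = {
    "q1": ["d1", "d4"],
    "q2": ["d9", "d2"],
    "q3": ["d3"],
    "q4": [],  # nothing retrieved counts as a miss
}
relevant = {"q1": {"d1"}, "q2": {"d2"}, "q3": {"d3", "d7"}, "q4": {"d5"}}

print(precision_at_1(rankings, relevant))  # 0.5
```

Comparing a base and a fine-tuned model then amounts to running both over the same query set and differencing the two scores.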

SentenceTransformers · Milvus · HuggingFace

04. Skills

Languages

Python · SQL

ML & Deep Learning

PyTorch · Hugging Face · Scikit-learn · NumPy · Pandas · OpenCV

LLMs & NLP

RAG · Text2SQL · Multi-hop QA · Prompt Engineering · Embedding Fine-tuning · Qwen3 · LLaMA · vLLM · Ollama · QLoRA · NLTK

Frameworks

LangChain · LangGraph · LlamaIndex · DSPy · AutoGen · OpenAI

Databases & Vector Stores

Milvus · FAISS · SQLite · MongoDB · MySQL

Engineering

FastAPI · Gradio · Docker · Git · Langfuse · Tesseract OCR · MLflow

06. Publication

Fraud Detection on Bank Payments using Machine Learning

IEEE — International Conference for Advancement in Technology (ICAT 2022)

ML classification on Banksim dataset achieving 96.64% accuracy.

DOI: 10.1109/ICAT54021.2022.9726104

07. Education

Indian Institute of Information Technology and Management, Gwalior

Integrated Dual Degree (BTech + MTech) — Information Technology

Aug 2018 — June 2023

05. Open Source

zero-to-hero-ai

multi-agent-deepresearch

openai-agent-terminal

demystifying-ai