open to work · ms data science · nyu · new york, ny

DEEPALI
BALAKRISHNA KSHEERSAGAR

AI/ML engineer building systems that work in the real world. LLM pipelines, multi-agent architectures, NLP at scale. Research to production.

I'm an MS Data Science candidate at NYU (graduating May 2026) with 3+ years across ML engineering, NLP research, and data engineering. My work spans LLMs, agentic AI, healthcare applications, and production ML pipelines. I care about the full journey from messy data to something deployed and reliable.

Previously, I was a Machine Learning Engineer at CGI for 2 years, building NER pipelines on 2M+ documents for a Fortune 500 telecom client. I have also worked at The Global Consortium of Nursing & Midwifery Studies (multilingual NLP across 40+ languages) and Accenture (data engineering). Most recently, I have been doing research at NYU Rory Meyers, applying BERT and statistical modeling to global healthcare data.

Outside work, I'm a Women in Data Science Ambassador (2026). I'm drawn to the intersection of AI and healthcare: building systems that reduce friction in clinical workflows and improve patient outcomes. When I'm not coding, I love painting, reading, and gardening; you can find my art on Instagram ↗.

🏆 3rd place · GDG NYC × NYU 2026
🏅 Violet Internship & Research Award 2025
🌐 WiDS Ambassador 2026
📄 Published · IJERT
💰 ₹37,000 Govt. Funding · IEDC
🦈 GitHub Pull Shark & Pair Extraordinaire
01
TenantShield: AI Building Inspection Platform 🏆 3rd place · GDG NYC × NYU 2026
2026 github ↗

Multi-agent AI platform helping NYC tenants document housing violations and auto-generate formal complaints. 3-agent A2A pipeline (Interacting → Inspection → Filing) with Gemini Live API for real-time voice intake and multimodal image analysis per NYC Housing Maintenance Code. Built at NYC Open Data Week, judged by engineers from Meta, Bloomberg, Instagram & Google.

Gemini Live API · Vertex AI · A2A multi-agent · Kotlin · Jetpack Compose · CameraX · Firebase · Google Cloud · MVVM
02
CoffeeAgntcy: Multi-Agent System Optimization
2026 github ↗

Profiled and optimized a distributed multi-agent AI system. Cut latency by 17% and achieved a 12,000× throughput gain on cached queries using async Python, TCP connection pooling, and response caching.

Python asyncio · LangGraph · GPT-4o · Docker · cProfile · SLIM TCP
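
A minimal sketch of the response-caching idea mentioned above, not the project's actual code. The names `CachedClient` and `slow_agent` are hypothetical stand-ins; the real system's agents, transport, and cache policy are not described here. The per-query lock is one possible way to keep concurrent identical requests from triggering duplicate upstream calls.

```python
import asyncio


class CachedClient:
    """Hypothetical wrapper that caches responses by query string."""

    def __init__(self, fetch):
        self._fetch = fetch   # coroutine function that performs the slow call
        self._cache = {}      # query -> cached response
        self._locks = {}      # query -> asyncio.Lock guarding the first fetch
        self.misses = 0       # number of calls that actually reached fetch

    async def ask(self, query):
        # Fast path: serve repeated queries straight from memory.
        if query in self._cache:
            return self._cache[query]
        # Slow path: only one task per query performs the fetch.
        lock = self._locks.setdefault(query, asyncio.Lock())
        async with lock:
            if query not in self._cache:
                self.misses += 1
                self._cache[query] = await self._fetch(query)
        return self._cache[query]


async def slow_agent(query):
    # Stand-in for a remote agent call (assumed latency, for illustration).
    await asyncio.sleep(0.01)
    return f"answer:{query}"


async def demo():
    client = CachedClient(slow_agent)
    # Five concurrent identical queries should reach the agent only once.
    results = await asyncio.gather(*(client.ask("latte") for _ in range(5)))
    return results, client.misses


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this toy setup the throughput gain on repeated queries comes from skipping the awaited call entirely; it says nothing about the measured numbers quoted for the project.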
03
Article Ingestion Pipeline, Muck Rack · capstone
2025 github ↗

Production-grade HTML quality detection pipeline achieving 0.98 precision / 0.92 F1. Combined BeautifulSoup + XGBoost with rule-based heuristics. Built a GPT-4.1-mini few-shot labeling pipeline to expand an imbalanced dataset and fully eliminated manual QA bottlenecks.

Python · XGBoost · GPT-4.1 · BeautifulSoup · Few-shot prompting · ML Pipeline
04
Strategic Clarification in LLM Agents via CMDPs
2026 github ↗

Fine-tuned Qwen2.5-Coder-7B with LoRA + PPO-Lagrangian reinforcement learning to decide when to ask clarifying questions under a strict budget. Achieved +6.2pp improvement in code correctness over baseline on HumanEvalComm benchmark.

PyTorch · PPO-Lagrangian · LoRA · Qwen2.5-Coder-7B · Constrained MDP · HuggingFace
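
As a rough illustration of the Lagrangian part of PPO-Lagrangian (a generic sketch, not the project's training code): the constraint is folded into the reward through a multiplier that rises when the average cost exceeds the budget and falls otherwise. Here the "cost" is assumed to be the number of clarifying questions asked per episode; the function names, learning rate, and budget are all hypothetical.

```python
def update_multiplier(lam, avg_cost, budget, lr=0.05):
    """One projected dual-ascent step; the multiplier is kept non-negative."""
    return max(0.0, lam + lr * (avg_cost - budget))


def penalized_reward(task_reward, cost, lam):
    """Reward the policy optimizer sees: task reward minus weighted cost."""
    return task_reward - lam * cost


if __name__ == "__main__":
    lam = 0.0
    budget = 1.0  # assumed: at most one clarifying question per episode on average
    # Assumed per-batch average question counts, for illustration only.
    for avg_questions in [3.0, 2.0, 1.0, 0.5]:
        lam = update_multiplier(lam, avg_questions, budget, lr=0.5)
        print(avg_questions, lam, penalized_reward(1.0, avg_questions, lam))
```

The multiplier grows while the agent over-asks and shrinks once it is under budget, which is what lets the policy trade question-asking against task reward.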
05
Emotion Learning Evaluation for LLMs
2025 github ↗

Benchmarked emotional intelligence of Gemma, Qwen, and Llama using zero-shot and few-shot prompt engineering. All models exceeded chance performance. Evaluation orchestration via LangChain + HuggingFace.

LangChain · HuggingFace · Gemma · Qwen · Llama · LLM Eval
06
heart-maxxxxing: Cardiac Rehab Gamification · Pulse Foundry AI Healthcare Hackathon 2026

Mario-style browser game turning 36-session cardiac rehab into an interactive world-map adventure. Gemini API as in-game AI guide. High-contrast pixel-art UI with Framer Motion. Deployed on Vercel.

Gemini API · Next.js · React · Tailwind CSS · Framer Motion · Vercel
07
Voice-Based Disease Prediction (Interactive Health Bot)

End-to-end voice disease prediction deployed on Raspberry Pi. STT → NLP symptom extraction → Random Forest → TTS feedback. 95.02% accuracy. Received ₹37,000 in govt. funding. Published in IJERT.

Python · NLTK · Scikit-learn · Random Forest · STT/TTS · Raspberry Pi · Edge AI
08
VertiBuds: AI-Assisted Vertigo Support System

Digital health system for chronic vertigo combining prompt engineering with Figma UX. Visualizes balance stability via Internal Compass with adaptive audio and textual guidance.

Prompt Engineering · Figma · UX Design · Health Tech · Accessibility
09
Auto E-Commerce: Agentic Trend-to-Store Pipeline

Autonomously detects niche microtrends and launches a full e-commerce store in minutes. CEO Orchestrator Agent coordinates Nimble (trend scraping), ClickHouse (data storage), and Datadog-monitored sub-agents to go from trend signal to live storefront end-to-end.

Multi-agent orchestration · LangGraph · Nimble · ClickHouse · Datadog · Python
Sep 2025 – Present
NYU Rory Meyers College of Nursing
Graduate Research Assistant
  • Applied statistical modeling to survey data from 90+ countries to identify patterns in healthcare during COVID-19.
  • Built an interactive Tableau dashboard to visualize findings for the organization's public-facing website.
  • Built a BERT-based multi-class classifier that achieved 87% accuracy on 5,000+ survey responses about violence faced by medical professionals.
Python · BERT · Transformers · Statistical Modeling · Tableau
Jun 2025 – Sep 2025
The Global Consortium of Nursing & Midwifery Studies
Data Science Intern
  • Analyzed 21,000+ textual survey responses using multilingual RoBERTa (HuggingFace) across 40+ languages on PPE usage and funding during COVID-19.
  • Applied sentiment analysis and topic modeling to surface cross-country trends for public health reporting.
Python · Multilingual RoBERTa · Sentiment Analysis · Topic Modeling · HuggingFace
Feb 2022 – Jun 2023
CGI Inc.
Software Engineer (ML)
  • Led NER implementation as SME for a Fortune 500 telecom client, extracting text and tabular entities from 2M+ unstructured documents using SpaCy.
  • Designed a structured database from extracted entities, reducing manual review time by 71%.
  • Built an automated classification system (Flair + FastText) to categorize 10,000+ employee and client reviews for C-suite FY2023 strategic planning.
SpaCy · Flair · FastText · NER · ETL Pipelines · Database Design
2020 – 2021
Accenture
Data Engineer
  • Migrated relational database tables to the Hadoop Distributed File System (HDFS).
  • Generated 500+ config files, RunDate files, HQL query files, and AutoSys Job Boxes using the Spark engine for data ingestion pipelines.
SQL · Hadoop / HDFS · Apache Spark · HiveQL · AutoSys
LLMs & agentic AI
LangChain · LangGraph · Multi-agent (A2A) · RAG Systems · Prompt Engineering · Few-shot / n-shot · LLM Evaluation · GPT-4 / GPT-4o · Gemini API · Gemini Live API · Vertex AI
model training & fine-tuning
PyTorch · HuggingFace Transformers · LoRA / PEFT · Reinforcement Learning · PPO / PPO-Lagrangian · BERT / RoBERTa · Qwen / Llama / Gemma · TensorFlow · Scikit-learn · XGBoost
NLP
SpaCy · Named Entity Recognition · Sentiment Analysis · Topic Modeling · Multilingual NLP · Text Classification · Natural Language Understanding · Flair · FastText · NLTK
ML engineering & infra
Python · async Python · ML Pipelines · ETL Pipelines · Data Labeling Automation · Docker · Google Cloud · Firebase · SQL · Git
data & evaluation
Statistical Modeling · A/B Testing · Hypothesis Testing · Bayesian Methods · Pandas / NumPy · Matplotlib / Seaborn · Tableau · R
languages & tools
Python · SQL · R · C++ · Jupyter · VS Code · GitHub · Vercel
degrees
2024 – 2026
MS Data Science
New York University
Deep Learning · Machine Learning · Natural Language Understanding · Reinforcement Learning · Big Data · AI Applications in Business (NYU Stern School of Business)
2016 – 2020
BE Electronics & Communication
B.N.M Institute of Technology
Artificial Neural Networks · Python Application Programming
honors & recognition
🏆 3rd Place, GDG NYC × NYU Tandon "Build With AI" Hackathon 2026 for TenantShield, judged by engineers from Meta, Bloomberg, Instagram & Google.
🏅 Violet Internship & Research Award 2025. NYU recognition for research and internship excellence.
🌐 Women in Data Science Ambassador 2026. Representing NYU in the global WiDS network.
📄 Published Researcher. International Journal of Engineering Research & Technology (IJERT). Symptoms Extraction from Voice Input Using NLP ↗
💰 ₹37,000 Government Funding. Innovation and Entrepreneurship Development Centre (IEDC), Government of India, for the Voice-Based Disease Prediction project.
🦈 GitHub Pull Shark & Pair Extraordinaire. GitHub achievement badges for open source contributions.
11 public repositories · 13 stars given · 2 github achievements

I'm graduating from NYU in May 2026 and am actively looking for full-time roles in ML engineering, applied AI, NLP research, and data science.

Open to research collaborations, interesting problems, and conversations about building AI that actually ships. Response time: under 24 hours.

Based in New York, NY
Available from May 2026