open to work · ms data science · nyu · new york, ny

DEEPALI
BALAKRISHNA KSHEERSAGAR

AI/ML engineer building systems that work in the real world. LLM pipelines, multi-agent architectures, NLP at scale. Research to production.

I'm a Data Science graduate from NYU (May 2026) with 3+ years across ML engineering, NLP research, and data engineering. My work spans LLMs, agentic AI, healthcare applications, and production ML pipelines. I care about the full journey from messy data to something deployed and reliable.

Previously a Machine Learning Engineer at CGI for 2 years, building NER pipelines on 2M+ documents for a Fortune 500 telecom client. Also worked at The Global Consortium of Nursing & Midwifery Studies (multilingual NLP across 40+ languages) and Accenture (data engineering). Most recently, a Graduate Research Assistant at NYU Rory Meyers College of Nursing, applying BERT and statistical modeling to global healthcare data.

Outside work, I'm a Women in Data Science Ambassador (2026). I'm drawn to the intersection of AI and healthcare: building systems that reduce friction in clinical workflows and improve patient outcomes. When I'm not coding, I love painting, reading, and gardening; you can find my art on Instagram ↗.

🏆 3rd place · GDG NYC × NYU 2026
🏅 Violet Internship & Research Award 2025
🌐 WiDS Ambassador 2026
📄 Published · IJERT
💰 ₹37,000 Govt. Funding · IEDC
🦈 GitHub Pull Shark & Pair Extraordinaire

get in touch

email db5144@nyu.edu ↗
linkedin deepali-bk ↗
github Deepali-BK ↗
devpost Deepali-BK ↗
instagram artkive_by_dee ↗

01

TenantShield: AI Building Inspection Platform 🏆 3rd place · GDG NYC × NYU 2026

2026 github ↗

Multi-agent AI platform helping NYC tenants document housing violations and auto-generate formal complaints. 3-agent A2A pipeline (Interacting → Inspection → Filing) with Gemini Live API for real-time voice intake and multimodal image analysis per NYC Housing Maintenance Code. Built at NYC Open Data Week, judged by engineers from Meta, Bloomberg, Instagram & Google.

Gemini Live API · Vertex AI · A2A multi-agent · Kotlin · Jetpack Compose · CameraX · Firebase · Google Cloud · MVVM

02

CoffeeAgntcy: Multi-Agent System Optimization

2026 github ↗

Profiled and optimized a distributed multi-agent AI system. Cut latency by 17% and achieved a 12,000× throughput increase on cached queries using async Python, TCP connection pooling, and response caching.

Python asyncio · LangGraph · GPT-4o · Docker · cProfile · SLIM TCP

03

Article Ingestion Pipeline, Muck Rack · capstone

2025 github ↗

Production-grade HTML quality detection pipeline achieving 0.98 precision / 0.92 F1. Combined BeautifulSoup + XGBoost with rule-based heuristics. Built a GPT-4.1-mini few-shot labeling pipeline to expand an imbalanced dataset and fully eliminated manual QA bottlenecks.

Python · XGBoost · GPT-4.1 · BeautifulSoup · Few-shot prompting · ML Pipeline

04

Strategic Clarification in LLM Agents via CMDPs

2026 github ↗

Fine-tuned Qwen2.5-Coder-7B with LoRA + PPO-Lagrangian reinforcement learning to decide when to ask clarifying questions under a strict budget. Achieved +6.2pp improvement in code correctness over baseline on HumanEvalComm benchmark.

PyTorch · PPO-Lagrangian · LoRA · Qwen2.5-Coder-7B · Constrained MDP · HuggingFace

05

Emotion Learning Evaluation for LLMs

2025 github ↗

Benchmarked emotional intelligence of Gemma, Qwen, and Llama using zero-shot and few-shot prompt engineering. All models exceeded chance performance. Evaluation orchestration via LangChain + HuggingFace.

LangChain · HuggingFace · Gemma · Qwen · Llama · LLM Eval

06

heart-maxxxxing: Cardiac Rehab Gamification · Pulse Foundry AI Healthcare Hackathon 2026

2026 devpost ↗

Mario-style browser game turning 36-session cardiac rehab into an interactive world-map adventure. Gemini API as in-game AI guide. High-contrast pixel-art UI with Framer Motion. Deployed on Vercel.

Gemini API · Next.js · React · Tailwind CSS · Framer Motion · Vercel

07

Voice-Based Disease Prediction (Interactive Health Bot)

2020 github ↗ paper ↗

End-to-end voice disease prediction deployed on Raspberry Pi. STT → NLP symptom extraction → Random Forest → TTS feedback. 95.02% accuracy. Received ₹37,000 in govt. funding. Published in IJERT.

Python · NLTK · Scikit-learn · Random Forest · STT/TTS · Raspberry Pi · Edge AI

08

VertiBuds: AI-Assisted Vertigo Support System

2024 github ↗ devpost ↗

Digital health system for chronic vertigo combining prompt engineering with Figma UX. Visualizes balance stability via Internal Compass with adaptive audio and textual guidance.

Prompt Engineering · Figma · UX Design · Health Tech · Accessibility

09

Auto E-Commerce: Agentic Trend-to-Store Pipeline

2026 devpost ↗

Autonomously detects niche microtrends and launches a full e-commerce store in minutes. CEO Orchestrator Agent coordinates Nimble (trend scraping), ClickHouse (data storage), and Datadog-monitored sub-agents to go from trend signal to live storefront end-to-end.

Multi-agent orchestration · LangGraph · Nimble · ClickHouse · Datadog · Python

Sep 2025 – Present

NYU Rory Meyers College of Nursing

Graduate Research Assistant

Applied statistical modeling to survey data from 90+ countries to identify healthcare patterns during COVID-19.
Built an interactive Tableau dashboard to visualize findings for the organization's public-facing website.
Built a BERT-based multi-class classifier that achieved 87% accuracy on 5,000+ survey responses about violence faced by medical professionals.

Python · BERT · Transformers · Statistical Modeling · Tableau

Jun 2025 – Sep 2025

The Global Consortium of Nursing & Midwifery Studies

Data Science Intern

Analyzed 21,000+ textual survey responses on PPE usage and funding during COVID-19, spanning 40+ languages, using multilingual RoBERTa (HuggingFace).
Applied sentiment analysis and topic modeling to surface cross-country trends for public health reporting.

Python · Multilingual RoBERTa · Sentiment Analysis · Topic Modeling · HuggingFace

Feb 2022 – Jun 2023

CGI Inc.

Software Engineer (ML)

Led NER implementation as SME for a Fortune 500 telecom client, extracting text and tabular entities from 2M+ unstructured documents using SpaCy.
Designed a structured database from extracted entities, reducing manual review time by 71%.
Built an automated classification system (Flair + FastText) to categorize 10,000+ employee and client reviews for C-suite FY2023 strategic planning.

SpaCy · Flair · FastText · NER · ETL Pipelines · Database Design

2020 – 2021

Accenture

Data Engineer

Migrated relational database tables to the Hadoop Distributed File System (HDFS).
Generated 500+ config files, RunDate files, HQL query files, and AutoSys Job Boxes using the Spark engine for data ingestion pipelines.

SQL · Hadoop / HDFS · Apache Spark · HiveQL · AutoSys

LLMs & agentic AI

LangChain · LangGraph · Multi-agent (A2A) · RAG Systems · Prompt Engineering · Few-shot / n-shot · LLM Evaluation · GPT-4 / GPT-4o · Gemini API · Gemini Live API · Vertex AI

model training & fine-tuning

PyTorch · HuggingFace Transformers · LoRA / PEFT · Reinforcement Learning · PPO / PPO-Lagrangian · BERT / RoBERTa · Qwen / Llama / Gemma · TensorFlow · Scikit-learn · XGBoost

NLP

SpaCy · Named Entity Recognition · Sentiment Analysis · Topic Modeling · Multilingual NLP · Text Classification · Natural Language Understanding · Flair · FastText · NLTK

ML engineering & infra

Python · async Python · ML Pipelines · ETL Pipelines · Data Labeling Automation · Docker · Google Cloud · Firebase · SQL · Git

data & evaluation

Statistical Modeling · A/B Testing · Hypothesis Testing · Bayesian Methods · Pandas / NumPy · Matplotlib / Seaborn · Tableau · R

languages & tools

Python · SQL · R · C++ · Jupyter · VS Code · GitHub · Vercel

degrees

2024 – 2026

MS Data Science

New York University

Deep Learning · Machine Learning · Natural Language Understanding · Reinforcement Learning · Big Data · AI Applications in Business (NYU Stern School of Business)

2016 – 2020

BE Electronics & Communication

B.N.M Institute of Technology

Artificial Neural Networks · Python Application Programming

honors & recognition

🏆

3rd Place, GDG NYC × NYU Tandon "Build With AI" Hackathon 2026 for TenantShield, judged by engineers from Meta, Bloomberg, Instagram & Google.

🏅

Violet Internship & Research Award 2025. NYU recognition for research and internship excellence.

🌐

Women in Data Science Ambassador 2026. Representing NYU in the global WiDS network.

📄

Published Researcher. International Journal of Engineering Research & Technology (IJERT). Symptoms Extraction from Voice Input Using NLP ↗

💰

₹37,000 Government Funding. Innovation and Entrepreneurship Development Centre (IEDC), Government of India, for the Voice-Based Disease Prediction project.

🦈

GitHub Pull Shark & Pair Extraordinaire. GitHub achievement badges for open source contributions.

11 public repositories

13 stars given

2 github achievements

pinned repositories

Article-Ingestion-Pipeline-for-Muck-Rack

python · xgboost · gpt-4 · ml-pipeline

Production-grade HTML quality detection pipeline combining BeautifulSoup + XGBoost with GPT-4 few-shot labeling. 0.98 precision, 0.92 F1.

Python

Vertibuds

figma · health-tech · ai-prompting · ux

Wearable system designed to help people living with chronic vertigo. AI-assisted balance visualization and adaptive guidance.

HTML

TenantShield

kotlin · gemini · vertex-ai · a2a

Multi-agent AI platform for NYC housing complaints. 3rd place at GDG NYC × NYU Tandon "Build With AI" Hackathon 2026.

Kotlin

NLU-project: LLM Emotion Eval

langchain · huggingface · llm-eval · nlp

Benchmarked emotional intelligence of Gemma, Qwen, and Llama with zero-shot and few-shot prompt engineering.

Python

Voice-Based-Disease-Prediction

nlp · random-forest · raspberry-pi · edge-ai

STT → NLP → Random Forest → TTS disease prediction on Raspberry Pi. 95.02% accuracy. Published in IJERT.

Python


I'm graduating from NYU in May 2026 and actively looking for full-time roles in ML engineering, applied AI, NLP research, and data science.

Open to research collaborations, interesting problems, and conversations about building AI that actually ships. Response time: under 24 hours.

Based in New York, NY

Available from May 2026

email db5144@nyu.edu ↗
linkedin linkedin.com/in/deepali-bk ↗
github github.com/Deepali-BK ↗
portfolio datascienceportfol.io ↗


Selected work

3+ YOE across 4 roles in ML, NLP & data engineering

Built for AI/ML & agentic roles

Degrees, courses & honors

github.com/Deepali-BK ↗

Let's work on something real
