New York, NY · db5144@nyu.edu · linkedin.com/in/deepali-bk · Portfolio
I'm a Data Science grad student at NYU who builds AI systems that actually work in the real world. From placing at hackathons with multimodal Gemini agents to shipping production NLP pipelines that process millions of documents, I care about the full journey from idea to deployed system. My work spans NLP, LLMs, healthcare AI, and multi-agent architectures across research, enterprise, and hackathon stages.
- Graduate Research Assistant @ NYU Rory Meyers College of Nursing
- 3rd Place, GDG NYC × NYU Tandon "Build With AI" Hackathon 2026 (TenantShield)
- Violet Internship & Research Award 2025 & 2026
- Women in Data Science Ambassador 2026
- Published Researcher, International Journal of Engineering Research & Technology (IJERT)
Languages & Tools
Python R SQL C++ Kotlin TypeScript Git
Machine Learning & Deep Learning
Scikit-learn PyTorch TensorFlow XGBoost Flair FastText LoRA Reinforcement Learning Computer Vision
NLP & Generative AI
HuggingFace Transformers LangChain LangGraph SpaCy BERT RoBERTa GPT-4 RAG Systems
Gemini API Gemini Live API Google ADK Multi-agent AI (A2A) Vertex AI
RLHF PPO-Lagrangian Sentiment Analysis Topic Modeling Named Entity Recognition Few-shot Prompting
Cloud & Backend
Google Cloud Firebase (Auth, Firestore, Storage) Vertex AI FastAPI Temporal Render Vercel
Frontend & Mobile
Next.js React Tailwind CSS Framer Motion Jetpack Compose CameraX Android (Kotlin)
Data & Visualization
Pandas NumPy Matplotlib Seaborn Tableau ClickHouse Spark Hadoop (HDFS)
Statistics
Statistical Modeling Bayesian Methods A/B Testing Hypothesis Testing
TenantShield: AI Building Inspection & Complaint Platform (GDG NYC × NYU Tandon "Build With AI" Hackathon, 3rd Place)
Multi-agent AI platform helping tenants document housing violations and auto-generate formal NYC housing complaints, built in one day at NYC Open Data Week.
- Designed a 3-agent A2A pipeline (Interacting → Inspection → Filing) coordinated by a central state machine
- Interacting Agent uses Gemini Live API (real-time voice over WebSocket) for conversational tenant intake
- Inspection Agent performs multimodal image analysis to classify violations per NYC Housing Maintenance Code (Class A / B / C)
- Filing Agent generates structured, legally-formatted complaint documents from inspection results
- Full-stack: Android mobile app (Kotlin, Jetpack Compose, CameraX) + web application, backed by Firebase (Auth, Firestore, Storage)
- Team project with 3 NYU collaborators; judged by engineers from Meta, Bloomberg, Instagram & Google
Gemini Live API Vertex AI Firebase Google Cloud Kotlin Jetpack Compose CameraX Multi-agent AI A2A
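
The three-stage handoff described above could be coordinated by a small state machine along these lines. This is a minimal sketch, not the project's code: `Stage`, `Case`, `NEXT_STAGE`, the three handler functions, and their placeholder return values are all hypothetical, and the real agents (voice intake, multimodal inspection, document generation) are replaced with stubs.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    INTAKE = auto()      # Interacting Agent: conversational intake
    INSPECTION = auto()  # Inspection Agent: image analysis
    FILING = auto()      # Filing Agent: complaint generation
    DONE = auto()


# Each stage hands off to exactly one successor.
NEXT_STAGE = {
    Stage.INTAKE: Stage.INSPECTION,
    Stage.INSPECTION: Stage.FILING,
    Stage.FILING: Stage.DONE,
}


@dataclass
class Case:
    stage: Stage = Stage.INTAKE
    data: dict = field(default_factory=dict)


def intake_agent(case: Case) -> dict:
    # Stub: a real agent would collect this from the tenant.
    return {"description": "example issue"}


def inspection_agent(case: Case) -> dict:
    # Stub: placeholder class, not a real classification result.
    return {"violation_class": "B"}


def filing_agent(case: Case) -> dict:
    # Stub: formats the fields gathered by the earlier stages.
    text = f"Class {case.data['violation_class']}: {case.data['description']}"
    return {"complaint": text}


HANDLERS = {
    Stage.INTAKE: intake_agent,
    Stage.INSPECTION: inspection_agent,
    Stage.FILING: filing_agent,
}


def run(case: Case) -> Case:
    """Drive the case through every stage until it reaches DONE."""
    while case.stage is not Stage.DONE:
        case.data.update(HANDLERS[case.stage](case))
        case.stage = NEXT_STAGE[case.stage]
    return case
```

Keeping the transition table separate from the handlers means a stage can be retried or swapped without touching the control loop.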
RL agent trained to decide when to ask clarifying questions vs. answer directly, balancing answer quality against over-questioning via a Constrained MDP.
- Fine-tuned Qwen2.5-Coder-7B via LoRA + PPO-Lagrangian (RLHF) on the HumanEvalComm benchmark
- Used a GPT-4o-mini simulator to evaluate across clarification budgets; best policy achieves +6.2pp MT pass@1 over untrained baseline (p < 0.0001)
Reinforcement Learning PPO-Lagrangian LoRA RLHF Qwen2.5-Coder-7B LangChain HumanEvalComm
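
For readers unfamiliar with Lagrangian methods for constrained MDPs, the core idea can be sketched in a few lines: a multiplier is raised when the average cost exceeds the budget and lowered (never below zero) when it does not, and the policy is trained on a cost-penalized reward. The function names, the learning rate, and the reading of "cost" as the number of clarifying questions are assumptions for illustration, not details taken from the project.

```python
def update_multiplier(lam: float, avg_cost: float, budget: float, lr: float = 0.05) -> float:
    """Dual ascent step: grow lam while the constraint is violated, shrink otherwise."""
    return max(0.0, lam + lr * (avg_cost - budget))


def penalized_reward(reward: float, cost: float, lam: float) -> float:
    """Reward signal the policy optimizer sees once the constraint is priced in."""
    return reward - lam * cost
```

With this shaping, a policy that asks too many questions sees its effective reward shrink until it falls back within the budget.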
Profiled and optimized a distributed multi-agent AI system running across 10 Docker containers with LangGraph A2A architecture.
- Used cProfile to pinpoint OpenAI API I/O as the bottleneck
- Applied asyncio + aiohttp concurrency, MD5-keyed caching, and TCP pooling, achieving a 17% latency reduction and a 12,000× throughput gain
asyncio aiohttp LangChain LangGraph GPT-4o Docker cProfile
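
The combination of concurrent requests and MD5-keyed caching can be illustrated with the standard library alone. This is a hedged sketch rather than the project's implementation: `fake_llm_call` is a hypothetical stand-in for the network-bound API request (the real system used aiohttp against the OpenAI API), and `stats`, `cached_call`, and `run_batch` are names invented for the example.

```python
import asyncio
import hashlib

_cache: dict[str, str] = {}
stats = {"upstream_calls": 0}  # counts simulated API requests


async def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for an I/O-bound API request."""
    stats["upstream_calls"] += 1
    await asyncio.sleep(0.01)  # simulated network wait
    return prompt.upper()


async def cached_call(prompt: str) -> str:
    # MD5 serves only as a compact cache key here, not for security.
    key = hashlib.md5(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = await fake_llm_call(prompt)
    return _cache[key]


async def run_batch(prompts: list[str]) -> list[str]:
    # gather() overlaps the waits instead of running them one by one.
    # Note: duplicate prompts inside a single batch can still both miss
    # the cache, because neither has finished when the other checks it.
    return list(await asyncio.gather(*(cached_call(p) for p in prompts)))
```

Running the same batch twice, for example `asyncio.run(run_batch(["a", "b"]))`, issues upstream calls only the first time; the second pass is served from the cache.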
Muck Rack: HTML Quality Detection Pipeline (Capstone)
Production-grade ML pipeline combining BeautifulSoup + XGBoost with rule-based heuristics for HTML quality detection for PR firm Muck Rack.
- Achieved 0.98 precision / 0.920 F1 score
- Built a GPT-4.1-mini few-shot labeling pipeline to expand an imbalanced seed dataset
- Eliminated manual QA bottlenecks entirely
Python XGBoost BeautifulSoup GPT-4 Few-shot Learning ML Pipelines
Benchmarked emotional intelligence capabilities of Gemma, Qwen, and Llama using zero-shot and few-shot prompt engineering.
- Demonstrated above-chance performance across all three models
- Leveraged LangChain and HuggingFace for evaluation orchestration
LangChain HuggingFace Prompt Engineering Zero-shot Few-shot LLM Evaluation
Auto E-Commerce: Autonomous Trend-to-Storefront Pipeline (Agentic Engineering Hack)
Autonomous multi-agent system that detects niche microtrends and launches a full e-commerce store end-to-end, with no human required.
- Built a CEO Orchestrator Agent coordinating 5 specialized sub-agents: Research, Buyer, Legal/Risk, Advertising, and Store Creator
- Research Agent scores trend strength via Nimble API; Store Creator spins up a live storefront at a unique subdomain in minutes
- Every agent decision stored in ClickHouse as long-term memory, feeding back into future agent prompts for improved scoring
- Full observability via Datadog traces; durable agent orchestration via Temporal (retry-safe, LLM-flakiness-proof)
- Team project with 3 collaborators
Google ADK Gemini 2.5 Flash Temporal ClickHouse Nimble API Datadog FastAPI Next.js Multi-agent AI
heart-maxxxxing: Cardiac Rehab Gamification (Pulse Foundry AI Healthcare Hackathon 2026)
Mario-style browser game that turns the standard 36-session cardiac rehabilitation program into an interactive world-map adventure, making recovery feel like play, not a clinical chore.
- Reframed 36 rehab sessions as "Levels" on a platformer world map with Power-Ups, milestones, and safety pause prompts for over-exertion
- Integrated Gemini API as an in-game AI guide providing personalized, encouraging tips at each stage of recovery
- Built accessible, high-contrast pixel-art UI with Framer Motion animations targeting patients of all ages
- Deployed on Vercel with a Next.js + React frontend
- Team project with 3 collaborators
Gemini API Next.js React Tailwind CSS Framer Motion Vercel Healthcare AI
Voice-based disease prediction system using NLP & Random Forest deployed on Raspberry Pi, achieving 95.02% accuracy.
- Designed an end-to-end pipeline: Speech-to-Text → NLP symptom extraction → Random Forest classification → TTS feedback
- Received ₹37,000 in funding from the Innovative and Entrepreneurship Development Centres (IEDC), Indian Govt.
- Published in IJERT
Python NLTK Scikit-learn Speech Recognition Raspberry Pi
- Statistical modeling on survey data from 90+ countries for COVID-19 healthcare pattern recognition
- Built interactive Tableau dashboard for the organization's public-facing website
- BERT-based multi-class classifier achieving 87% accuracy on 5,000+ survey responses on violence against medical professionals
- Analyzed 21,000+ survey responses across 40+ languages using multilingual RoBERTa
- Applied sentiment analysis and topic modeling on PPE usage data during COVID-19
- Led NER implementation as SME for a Fortune 500 telecom client; extracted entities from 2M+ unstructured documents using SpaCy
- Reduced manual review time by 71% through automated document processing
- Built classification system with Flair + FastText to categorize 10,000+ reviews for C-suite strategic planning
| Degree | Institution | Year |
|---|---|---|
| M.S. Data Science | New York University | 2024–2026 |
| B.E. Electronics & Communication | B.N.M Institute of Technology | 2016–2020 |
Relevant coursework: Deep Learning · Machine Learning · Natural Language Understanding · Reinforcement Learning · Big Data · AI Applications in Business (NYU Stern)
Always open to research collaborations and data science opportunities. Feel free to reach out!


