Rohan Jain rohanjain11

Rohan Jain

MS Data Science candidate at CU Boulder (GPA 3.9/4.0), graduating May 2026. I build production ML pipelines, LLM agents, and data systems with a focus on reliability, reproducibility, and measurable outcomes.

📍 Vista, CA | 📧 jainrohanj@gmail.com | 🌐 LinkedIn | 🖥 Portfolio

🛠 Skills

🎓 Education

MS Data Science — University of Colorado Boulder (GPA 3.9/4.0) | Expected May 2026
BE Computer Engineering — Rajiv Gandhi Institute of Technology, Mumbai | May 2024

💼 Experience

Data Science & Engineering Intern — Nexus Weather and Climate (May 2025 - Present)

Replaced baseline SVM with tuned Random Forest; reduced temperature MAE by 47% (0.73 to 0.39) and humidity MAE by 49% (4.68 to 2.37)
Built probabilistic CNN-LSTM with custom CRPS loss for calibrated uncertainty quantification; best CRPS of 0.1787
Implemented automated inference quality gate rejecting models that fail to beat pre-ML baselines; reduced failed runs by 90%, saved 20 engineer-hours per month
Ran controlled loss-function ablation experiments (Hybrid CRPS+MSE, Huberized CRPS); hybrid trained 17% faster with consistent CRPS gains of 0.01 to 0.03
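
For readers unfamiliar with CRPS, the following is a minimal sketch of the closed-form score for a Gaussian predictive distribution, together with a baseline comparison in the spirit of the quality gate described above. The Gaussian assumption, the function names (`crps_gaussian`, `mean_crps`, `beats_baseline`), the `min_rel_improvement` parameter, and the sample numbers are all illustrative assumptions, not the code or data used at Nexus.

```python
import math


def crps_gaussian(mu: float, sigma: float, y: float) -> float:
    """Closed-form CRPS of a N(mu, sigma^2) forecast against observation y.

    Assumption: the forecast distribution is Gaussian; the profile does not
    state which distribution the original model predicts.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


def mean_crps(mus, sigmas, ys) -> float:
    """Average CRPS over paired forecasts and observations (lower is better)."""
    scores = [crps_gaussian(m, s, y) for m, s, y in zip(mus, sigmas, ys)]
    return sum(scores) / len(scores)


def beats_baseline(model_score: float, baseline_score: float,
                   min_rel_improvement: float = 0.0) -> bool:
    """Hypothetical gate: pass only if the model improves on the baseline
    score by at least the given relative margin."""
    return model_score < baseline_score * (1.0 - min_rel_improvement)


if __name__ == "__main__":
    # Made-up observations and forecasts, for illustration only.
    observed = [20.4, 18.2]
    model = mean_crps([20.1, 18.7], [0.5, 0.6], observed)
    baseline = mean_crps([19.0, 19.0], [1.5, 1.5], observed)
    print("model passes gate:", beats_baseline(model, baseline))
```

A gate of this shape is cheap to run after every training job: a candidate that does not improve on the pre-ML baseline is rejected before it reaches inference.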

Graduate Research Assistant — Kopf Lab, CU Boulder (Dec 2025 - Present)

Reverse-engineered Thermo Isodat binary formats (MFC CArchive C++ serialization); built deterministic R readers with inheritance-aware decoding and fail-fast validation for 7+ instrument file types
Standardized stable isotope data ingestion across vendor formats with consistent schemas, field naming, and unit rules
Built regression test suites with golden outputs and corruption fixtures; deployed multi-user ShinyProxy platform with Keycloak OIDC on AWS EC2

🚀 Projects

ClaimPilot AI — Guardrailed LLM Agent for Healthcare Claims Validation
Hybrid agent: a deterministic Pydantic layer validates ICD-10/CPT/NPI codes, and OpenAI function calling proposes structured fixes strictly scoped to the detected issues, so the LLM's suggestions stay within the tool outputs. Includes a full pytest suite, per-run artifact storage, and a batch eval script.
Python Pydantic v2 OpenAI Function Calling Streamlit pytest
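
As an illustration of what a deterministic validation layer can check before any LLM call, here is a minimal sketch of NPI check-digit validation. It applies the Luhn algorithm to the NPI prefixed with `80840`, which is the published NPI check-digit scheme. The function name `is_valid_npi` is hypothetical and the code is not taken from ClaimPilot AI; in a Pydantic v2 model the same function could be called from a field validator.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if `npi` is a 10-digit string with a valid check digit.

    The check digit is verified with the Luhn algorithm over the NPI
    prefixed with "80840" (the health-industry card issuer prefix).
    """
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False

    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk from the rightmost digit; double every second digit and
    # subtract 9 when the doubled value has two digits.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


if __name__ == "__main__":
    for candidate in ("1234567893", "1234567890", "12345"):
        print(candidate, is_valid_npi(candidate))
```

Because a check like this is pure and deterministic, it can run first and pass only concrete, detected issues to the LLM.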

AgentSquared — No-Code AI Agent Builder (HackCU 12)
Config-driven platform to build AI business agents in under 60 seconds. Two modes: RAG-powered customer support, and real-time Bluesky brand monitoring with Gemini sentiment classification and threaded replies. Every agent is a DB row plus a JSON config, with no per-agent code.
Python FastAPI Google Gemini API RAG Next.js SQLite Bluesky AT Protocol

SafeRide — Risk-Aware Geospatial Routing API
Crash-aware route ranking for driving, cycling, and walking. OSRM alternatives scored via PostGIS spatial joins against crash and 311 hazard datasets. Deployed serverlessly on AWS Lambda + API Gateway with AWS RDS.
Python FastAPI PostgreSQL PostGIS GeoPandas AWS Lambda Docker

AI PDF Chatbot — RAG Document Q&A
Full-stack RAG pipeline: OCR ingestion, LangChain chunking, FAISS similarity search, GPT-4 answers. Embedding caching, configurable retrieval parameters, usage logging. Deployed on Vercel.
Python FastAPI LangChain FAISS OpenAI GPT-4 React.js

Denver Airport Analytics Dashboard — Hackathon 3rd Place
Integrated ServiceNow and Azure DevOps data; built dual Power BI dashboards with drill-downs for SLA, queue times, and bottlenecks. Reduced manual reporting by 40%.
Python Power BI Tableau Power Query pandas

📄 Publication

Yoga Posture Detection and Correction — Peer-reviewed research paper; TensorFlow/Keras pose classification achieving 96.5% accuracy with real-time corrective feedback.
View Publication

📫 Open to full-time roles in Data Science, ML Engineering, Data Engineering, and AI Engineering starting May 2026.
jainrohanj@gmail.com
