rohanjain11/README.md

Rohan Jain

MS Data Science candidate at CU Boulder (GPA 3.9/4.0), graduating May 2026. I build production ML pipelines, LLM agents, and data systems with a focus on reliability, reproducibility, and measurable outcomes.

πŸ“ Vista, CA Β |Β  πŸ“§ jainrohanj@gmail.com Β |Β  🌐 LinkedIn Β |Β  πŸ–₯ Portfolio


🛠 Skills


🎓 Education

MS Data Science - University of Colorado Boulder (GPA 3.9/4.0) | Expected May 2026
BE Computer Engineering - Rajiv Gandhi Institute of Technology, Mumbai | May 2024


💼 Experience

Data Science & Engineering Intern - Nexus Weather and Climate (May 2025 - Present)

  • Replaced baseline SVM with tuned Random Forest; reduced temperature MAE by 47% (0.73 to 0.39) and humidity MAE by 54% (4.68 to 2.37)
  • Built probabilistic CNN-LSTM with custom CRPS loss for calibrated uncertainty quantification; best CRPS of 0.1787
  • Implemented automated inference quality gate rejecting models that fail to beat pre-ML baselines; reduced failed runs by 90%, saved 20 engineer-hours per month
  • Ran controlled loss-function ablation experiments (Hybrid CRPS+MSE, Huberized CRPS); hybrid trained 17% faster with consistent CRPS gains of 0.01 to 0.03
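
The inference quality gate described above can be sketched roughly as follows. This is a minimal illustration, not the production implementation: the names `quality_gate`, `GateResult`, and `min_improvement` are hypothetical, and MAE is assumed as the comparison metric because it is the metric quoted in the bullets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    model_mae: float
    baseline_mae: float
    improvement: float  # fractional MAE reduction relative to the baseline


def mean_absolute_error(y_true, y_pred):
    """Plain-Python MAE over two equal-length, non-empty sequences."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def quality_gate(y_true, model_pred, baseline_pred, min_improvement=0.0):
    """Reject a model unless it beats the pre-ML baseline on held-out data.

    min_improvement is a hypothetical knob: the minimum fractional MAE
    reduction required for the model to pass.
    """
    model_mae = mean_absolute_error(y_true, model_pred)
    baseline_mae = mean_absolute_error(y_true, baseline_pred)
    if baseline_mae == 0:
        improvement = 0.0  # a perfect baseline cannot be improved on
    else:
        improvement = (baseline_mae - model_mae) / baseline_mae
    passed = model_mae < baseline_mae and improvement >= min_improvement
    return GateResult(passed, model_mae, baseline_mae, improvement)
```

In a pipeline, a failed gate would typically stop the run before the model is promoted, for example by raising an exception or exiting with a non-zero status when `passed` is false.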

Graduate Research Assistant - Kopf Lab, CU Boulder (Dec 2025 - Present)

  • Reverse-engineered Thermo Isodat binary formats (MFC CArchive C++ serialization); built deterministic R readers with inheritance-aware decoding and fail-fast validation for 7+ instrument file types
  • Standardized stable isotope data ingestion across vendor formats with consistent schemas, field naming, and unit rules
  • Built regression test suites with golden outputs and corruption fixtures; deployed multi-user ShinyProxy platform with Keycloak OIDC on AWS EC2

🚀 Projects

ClaimPilot AI - Guardrailed LLM Agent for Healthcare Claims Validation
Hybrid agent: deterministic Pydantic layer validates ICD-10/CPT/NPI codes; OpenAI function calling proposes structured fixes strictly scoped to detected issues. LLM cannot hallucinate beyond tool outputs. Full pytest suite, per-run artifact storage, batch eval script.
Python | Pydantic v2 | OpenAI Function Calling | Streamlit | pytest
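
As an illustration of what a deterministic validation layer of this kind might check, here is a dependency-free sketch. It uses plain dataclasses rather than Pydantic so it runs on the standard library alone, and every name in it (`Issue`, `validate_claim`, the claim dictionary keys) is hypothetical. The ICD-10 and CPT patterns are simplified format checks only, not lookups against the official code sets; the NPI check uses the Luhn algorithm over the `80840` prefix plus the 10-digit NPI.

```python
import re
from dataclasses import dataclass

# Simplified format checks (assumption): shape only, no code-set lookup.
ICD10_RE = re.compile(r"[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?")
CPT_RE = re.compile(r"[0-9]{4}[0-9FTU]")
NPI_RE = re.compile(r"[0-9]{10}")


@dataclass(frozen=True)
class Issue:
    field_name: str
    value: str
    reason: str


def npi_is_valid(npi):
    """Luhn check over the '80840' prefix followed by the 10-digit NPI."""
    if not NPI_RE.fullmatch(npi):
        return False
    total = 0
    for i, ch in enumerate(reversed("80840" + npi)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def validate_claim(claim):
    """Return a list of detected issues; an empty list means the claim passed."""
    issues = []
    for code in claim.get("icd10", []):
        if not ICD10_RE.fullmatch(code):
            issues.append(Issue("icd10", code, "malformed ICD-10 code"))
    for code in claim.get("cpt", []):
        if not CPT_RE.fullmatch(code):
            issues.append(Issue("cpt", code, "malformed CPT code"))
    npi = claim.get("npi", "")
    if not npi_is_valid(npi):
        issues.append(Issue("npi", npi, "NPI failed format or check-digit test"))
    return issues
```

One way to keep the LLM's proposals scoped to detected issues is to pass it only the `Issue` list and discard any proposed fix whose field is not in that list.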

AgentSquared - No-Code AI Agent Builder (HackCU 12)
Config-driven platform to build AI business agents in under 60 seconds. Two modes: RAG-powered customer support and real-time Bluesky brand monitoring with Gemini sentiment classification and threaded replies. Every agent is a DB row + JSON config - no per-agent code.
Python | FastAPI | Google Gemini API | RAG | Next.js | SQLite | Bluesky AT Protocol

SafeRide - Risk-Aware Geospatial Routing API
Crash-aware route ranking for driving, cycling, and walking. OSRM alternatives scored via PostGIS spatial joins against crash and 311 hazard datasets. Deployed serverlessly on AWS Lambda + API Gateway with AWS RDS.
Python | FastAPI | PostgreSQL | PostGIS | GeoPandas | AWS Lambda | Docker
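
The ranking step can be pictured with a small sketch like the one below. It assumes the spatial joins have already produced per-route crash and hazard counts, and the scoring rule (weighted incidents per kilometre, shorter route wins ties) is an illustrative assumption rather than the project's actual formula. `RouteCandidate`, `risk_per_km`, `rank_routes`, and the weights are all hypothetical names and values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteCandidate:
    name: str
    length_km: float
    crash_count: int    # crashes near the route (assumed output of a spatial join)
    hazard_count: int   # 311 hazard reports near the route (assumed likewise)


def risk_per_km(route, crash_weight=1.0, hazard_weight=0.5):
    """Weighted incidents per kilometre; the weights are illustrative only."""
    if route.length_km <= 0:
        raise ValueError("route length must be positive")
    weighted = crash_weight * route.crash_count + hazard_weight * route.hazard_count
    return weighted / route.length_km


def rank_routes(routes, crash_weight=1.0, hazard_weight=0.5):
    """Order alternatives from lowest to highest risk, shorter route first on ties."""
    return sorted(
        routes,
        key=lambda r: (risk_per_km(r, crash_weight, hazard_weight), r.length_km),
    )
```

Normalising by length keeps a long route from being penalised simply for passing more recorded incidents in total.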

AI PDF Chatbot - RAG Document Q&A
Full-stack RAG pipeline: OCR ingestion, LangChain chunking, FAISS similarity search, GPT-4 answers. Embedding caching, configurable retrieval parameters, usage logging. Deployed on Vercel.
Python | FastAPI | LangChain | FAISS | OpenAI GPT-4 | React.js
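
To show how embedding caching and top-k retrieval fit together, here is a standard-library sketch. It is a stand-in, not the project's code: brute-force cosine similarity takes the place of FAISS, `toy_embed` is a letter-frequency placeholder for a real embedding model, and `EmbeddingCache` and `top_k` are hypothetical names.

```python
import hashlib
import math


class EmbeddingCache:
    """Memoise embeddings by content hash so repeated text is embedded once."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}
        self.misses = 0  # number of times embed_fn was actually called

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed_fn(text)
            self.misses += 1
        return self._store[key]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, chunks, cache, k=2):
    """Return the k chunks most similar to the query (k is a retrieval parameter)."""
    q = cache.get(query)
    scored = [(cosine(q, cache.get(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def toy_embed(text):
    """Placeholder embedding: a 26-dimensional letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts
```

With a real embedding API, the cache avoids paying for the same chunk twice when documents are re-ingested or queries repeat.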

Denver Airport Analytics Dashboard - Hackathon 3rd Place
Integrated ServiceNow and Azure DevOps data; built dual Power BI dashboards with drill-downs for SLA, queue times, and bottlenecks. Reduced manual reporting by 40%.
Python | Power BI | Tableau | Power Query | pandas


📄 Publication

Yoga Posture Detection and Correction - Peer-reviewed research paper; TensorFlow/Keras pose classification achieving 96.5% accuracy with real-time corrective feedback.
View Publication


📫 Open to full-time roles in Data Science, ML Engineering, Data Engineering, and AI Engineering starting May 2026.
jainrohanj@gmail.com

Pinned

  1. agent-research-assistant (Public)

    Multi-agent AI research assistant - LangChain + OpenAI pipeline with FastAPI backend, React frontend, and live SSE observability

    JavaScript

  2. AI-PDF-ChatBot (Public)

    Python

  3. llm-finetune-qlora (Public)

    Fine-tune Flan-T5 with QLoRA on a synthetic ML/Data Science QA dataset. OpenAI data gen + Colab training + Hugging Face Hub adapter (+25.7% ROUGE-L).

    Jupyter Notebook

  4. mlflow-weather-benchmark (Public)

    ML benchmarking pipeline for weather forecast bias correction with MLflow tracking and CI quality gates.

    Jupyter Notebook

  5. robotics-manual-rag (Public)

    RAG-powered Q&A over robotics manuals with source citations. FastAPI + Pinecone + React.

    JavaScript