Hi, I'm Harsh - I Build Distributed AI Systems, MCP Agents & High-Performance Inference Platforms
I'm a Senior Machine Learning & Distributed Systems Engineer with 10+ years of experience architecting large-scale AI infrastructure, GPU inference platforms, and multi-agent (MCP) systems. I specialize in mission-critical systems demanding tight control over latency, reliability, orchestration, and control-plane design.
I've shipped:
- Real-time distributed systems at Microsoft
- Identity & risk-scoring engines at AWS
- Multi-agent (MCP) platforms for automating developer workflows end-to-end
- Designed real-time platforms processing 10M+ GPU telemetry events/day on Azure-scale compute (TP99 < 120 ms, 99.99% uptime)
- Architected global, multi-region orchestration systems with Kubernetes, Synapse, Spark, and ADF, cutting pipeline latency by 2s
- Built low-level telemetry & diagnostics for Maia 100 AI accelerators (Redfish API integration)
- Developed multiple MCP servers (see the sketch after this list) powering:
  - Automated PR generation
  - Repo-wide code intelligence
  - Contextual retrieval from CI/CD pipelines and logs
  - Issue tracking & GitHub tool integration
- Implemented deterministic workflows, tool-calling chains, and developer automation pipelines
- Designed Mosaic-style agent frameworks (planning, reasoning, orchestration)
- Created end-to-end log-based RAG for diagnostics/investigation workflows
- Built a full restaurant recommendation RAG system with LlamaIndex + Elasticsearch (vector search, hybrid retrieval, embeddings, caching)
- Developed semantic search tools for design docs (Azure AI Foundry + Semantic Kernel), improving retrieval efficiency by 60%
- Architected microservice-based ML pipelines and anomaly detection frameworks
- Built scalable ETL pipelines (Spark, DynamoDB, Kafka), integrated distributed monitoring/alerting
- Designed REST APIs, CI/CD workflows, and containerized services for cloud platforms
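
For a flavor of what those MCP servers look like, here is a minimal sketch using the MCP Python SDK's FastMCP helper. The server name, tool names, and tool logic are illustrative placeholders, not the production implementation.

```python
# Minimal MCP server sketch (illustrative only, not production code).
# Assumes the official `mcp` Python SDK; tool names and logic are hypothetical.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("repo-intel")  # hypothetical server name

@mcp.tool()
def summarize_diff(diff_text: str, max_lines: int = 50) -> str:
    """Return a truncated view of a diff for downstream PR-description generation."""
    lines = diff_text.splitlines()
    if len(lines) <= max_lines:
        return diff_text
    head = "\n".join(lines[:max_lines])
    return f"{head}\n... ({len(lines)} lines total)"

@mcp.tool()
def extract_failing_tests(ci_log: str) -> list[str]:
    """List failing test ids from a CI log (naive pattern match, sketch only)."""
    return [line.split()[-1] for line in ci_log.splitlines() if line.startswith("FAILED ")]

if __name__ == "__main__":
    mcp.run(transport="stdio")  # expose the tools to any MCP-compatible client
```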
Infrastructure & Distributed Systems
- Kubernetes • Docker • Spark/Flink • Synapse • Redfish API
- Control Plane Design • Autoscaling • Routing
AI / ML / LLM Systems
- Inference Pipelines • Vector Search • RAG • Embeddings • Observability
- Feature Engineering • GPU Telemetry
Agents & MCP
- Tool Calling • Multi-agent Orchestration • PR/Repo Automation • Deterministic Workflows • MCP Servers
Cloud Platforms
- Azure (AI Foundry, Functions, Compute, AI Search)
- AWS (SageMaker, DynamoDB, CloudFormation)
- GCP (Familiar)
Languages
- Python • Go • Java • C++ • Bash • JavaScript/Node • SQL/NoSQL
Selected projects:

A complete LlamaIndex + Elasticsearch-based RAG system with multi-source ingestion, hybrid retrieval, embeddings, and chat-style personalization.
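
A minimal sketch of how the retrieval core of a system like this can be wired up with LlamaIndex and Elasticsearch; exact import paths depend on the LlamaIndex version, and the index name, data directory, and query are placeholders rather than the real configuration.

```python
# Sketch only: LlamaIndex wired to an Elasticsearch vector store for retrieval.
# Assumes llama-index >= 0.10 with the llama-index-vector-stores-elasticsearch
# package installed, and an embedding/LLM configured via Settings (OpenAI by default).
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores.elasticsearch import ElasticsearchStore

# Placeholder index name and local Elasticsearch endpoint.
vector_store = ElasticsearchStore(index_name="restaurants", es_url="http://localhost:9200")
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Multi-source ingestion is reduced here to a single local directory for brevity.
documents = SimpleDirectoryReader("./data/restaurants").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

query_engine = index.as_query_engine(similarity_top_k=5)
print(query_engine.query("quiet vegetarian-friendly spot for a weeknight dinner"))
```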
Multi-agent workflow automation for code review, PR generation, CI/CD understanding, and intelligent repo analysis.
Replicated Databricks Agent Bricks patterns: tool orchestration, structured reasoning, vector-based retrieval, agent messaging layers.
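
To make the tool-orchestration piece concrete, here is a deliberately framework-free sketch of a deterministic tool chain: a fixed registry, an ordered plan, and a shared context threaded through each step. The tools and data are hypothetical stand-ins, far simpler than the real agents.

```python
# Framework-free sketch of a deterministic tool-calling chain (hypothetical steps).
from typing import Any, Callable

ToolFn = Callable[[dict[str, Any]], dict[str, Any]]

def fetch_logs(ctx: dict[str, Any]) -> dict[str, Any]:
    ctx["logs"] = ["ERROR timeout in shard-3", "INFO retry succeeded"]  # stand-in data
    return ctx

def classify_failure(ctx: dict[str, Any]) -> dict[str, Any]:
    transient = any("retry succeeded" in line for line in ctx["logs"])
    ctx["failure_class"] = "transient" if transient else "persistent"
    return ctx

def draft_report(ctx: dict[str, Any]) -> dict[str, Any]:
    ctx["report"] = f"Failure classified as {ctx['failure_class']} from {len(ctx['logs'])} log lines."
    return ctx

TOOLS: dict[str, ToolFn] = {
    "fetch_logs": fetch_logs,
    "classify_failure": classify_failure,
    "draft_report": draft_report,
}
PLAN = ["fetch_logs", "classify_failure", "draft_report"]  # fixed order => reproducible runs

def run_chain(plan: list[str], ctx: dict[str, Any]) -> dict[str, Any]:
    for step in plan:
        ctx = TOOLS[step](ctx)  # each tool reads and writes the shared context
    return ctx

print(run_chain(PLAN, {})["report"])
```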
Right now, I'm focused on:
- Building distributed inference & scheduling systems
- Designing latency-aware routing, capacity planning, and control-plane components
- Creating MCP-enabled agent ecosystems for automation & reasoning
- Optimizing GPU utilization and system reliability at scale
- Scaling observability, health monitoring, and model versioning
- Architecting backend systems for mission-critical AI workloads
- Advancing agent orchestration with MCP
- Building deterministically reproducible agent workflows
- Improving inference through caching, batching, and routing
- Developing RAG systems grounded in operational logs & telemetry
- Exploring LLM safety, validation, and structured reasoning integrations
If you're working on high-performance AI infrastructure, next-gen inference, or agentic frameworks, let's connect!
I'm especially interested in collaborations where safety, reliability, and real-time performance are paramount.

