We built the system by integrating several cutting-edge technologies, guided by the principle of "Infrastructure as a Code" (IaaC) for deployment.
Infrastructure (IaaC): We used the
awslabs/data-on-eksTerraform blueprint as a massive accelerator. This single command provisioned our entire foundation:- An AWS EKS cluster to host all our microservices.
- Karpenter for auto-scaling GPU nodes (like
g5.xlarge) on-demand. - The NVIDIA GPU Operator to automatically configure drivers on new nodes.
Core Models (NVIDIA NIMs): Once the cluster was up, we deployed the hackathon-mandated NVIDIA NIMs using their Helm charts. This gave us internal, high-performance endpoints for:
- Reasoning:
llama-3.1-nemotron-nano-8b-v1 - Embeddings:
text-embedding-nim(usingarctic-embed-l)
- Reasoning:
Agent Backend (Python):
- We started with the NVIDIA AI-Q Research Assistant blueprint, which provided a containerized FastAPI app.
- We used LangGraph (from the NeMo Agent Toolkit) to define the agent's stateful flow (
AgentState) and nodes (planner, tool execution, final report). - Our key innovation was wrapping the core logic of the NVIDIA UDR prototype into a single Python function (
execute_dynamic_strategy) and registering it as a tool within the LangGraph. - We used the
copilotkitPython SDK to add a single/copilotkitendpoint to our FastAPI app, which automatically handles streaming theAgentStateto the frontend.
Agent Frontend (React):
- We built a simple React/Next.js frontend.
- We used the
@copilotkit/react-corelibrary, specifically theuseCoAgentStateRenderhook. This hook subscribes to the backend's state stream. - We wrote a simple render function to map the
logsarray from ourAgentStateobject into a list on the UI, creating the real-time visualization of the agent's internal flow.
Deployment: We containerized our custom FastAPI/LangGraph agent, pushed it to ECR, and deployed it to our EKS cluster using a standard Kubernetes Deployment YAML. We configured it to use Kubernetes' internal DNS (e.g.,
http://nemotron-nano-service.nim.svc.cluster.local) to communicate with the NIMs with zero latency.
Built With
- ag-ui
- ai-q
- amazon-web-services
- claude
- copilpotkit
- deep-research
- deepresearch
- eks
- gemini
- helm
- langchain
- langgraph
- nemotron
- nim
- nvidia
- rag
- terraform
- udr

Log in or sign up for Devpost to join the conversation.