Agent Builder - Conversational AI Workflow Platform
🎯 Inspiration
The inspiration for Agent Builder came from a simple observation: building AI agent workflows is incredibly powerful but unnecessarily complex. Data scientists and developers spend countless hours writing boilerplate code, configuring integrations, and manually wiring together agents and tools—time that could be better spent on solving actual business problems. We envisioned a world where anyone could describe what they want in plain English and watch a production-ready AI workflow materialize before their eyes.
We were particularly inspired by the democratization potential of NVIDIA's NIM ecosystem. The combination of powerful reasoning models like llama-3.1-nemotron-nano-8B-v1 and efficient retrieval embeddings like nv-embedqa-e5-v5 opened up possibilities for truly intelligent workflow generation—not just template filling, but genuine understanding of user intent and intelligent component selection. We wanted to prove that with the right architecture, AI could not only understand what you want to build but actually build it for you.
💡 What it does
Agent Builder is a conversational AI workflow platform that transforms natural language descriptions into fully deployable agent workflows. Instead of dragging, dropping, and configuring nodes manually, users simply chat with an AI assistant that understands their requirements and generates complete workflows automatically.
Core Features:
- AI Chat Interface: Describe workflows in natural language (e.g., "Create a code review workflow with Git integration")
- Intelligent Workflow Generation: Automatically selects appropriate agents, tools, and connections based on semantic understanding
- Visual Flow Builder: Drag-and-drop canvas for manual workflow creation and AI-generated workflow refinement
- Smart Catalog Management: Semantic search over agents and tools using natural language queries
- One-Click Deployment: Deploy workflows as containerized microservices to AWS ECS/Lambda
- Real-time Testing: Test workflows with live input and see step-by-step execution
- Dual Mode Operation: Works in dummy mode for demos or connects to backend services for production
Example Use Case: A user types "I want to analyze customer feedback and generate reports." The AI Chat immediately generates a workflow with a Sentiment Analyzer agent, Data Analyzer agent, and Text Summarizer, all properly connected with the appropriate tools (Database Connector, PDF Generator) attached—ready to deploy in seconds.
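For a sense of what the AI Chat actually produces, here is a purely illustrative sketch of a generated flow graph for that prompt (the field names are assumptions for readability, not the exact production schema):

```python
# Illustrative only: a flow graph the chat might emit for the feedback-analysis request.
# Node/edge field names here are assumptions, not the exact production schema.
generated_flow = {
    "nodes": [
        {"id": "agent-1", "type": "agent", "label": "Sentiment Analyzer"},
        {"id": "agent-2", "type": "agent", "label": "Data Analyzer"},
        {"id": "agent-3", "type": "agent", "label": "Text Summarizer"},
        {"id": "tool-1",  "type": "tool",  "label": "Database Connector"},
        {"id": "tool-2",  "type": "tool",  "label": "PDF Generator"},
    ],
    "edges": [
        {"source": "agent-1", "target": "agent-2"},  # main flow
        {"source": "agent-2", "target": "agent-3"},
        {"source": "agent-2", "target": "tool-1"},   # tool attachments
        {"source": "agent-3", "target": "tool-2"},
    ],
}
```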
🔨 How we built it
Architecture Overview
We built Agent Builder on a microservices architecture designed for scalability and intelligent automation:
Frontend Layer (React + Vite + TypeScript)
- Visual flow canvas powered by ReactFlow
- AI Chat interface with streaming responses
- Zustand for state management
- Real-time workflow visualization
Orchestration Service (Node.js + Express)
- RESTful API for flow management
- WebSocket support for deployment updates
- PostgreSQL for persistent storage
- Redis for caching and session management
LLM Service (Python + FastAPI)
- llama-3.1-nemotron-nano-8B-v1 NIM for workflow reasoning and JSON generation
- nvidia/nv-embedqa-e5-v5 for semantic retrieval and catalog search
- LangChain for prompt engineering and schema validation
- Vector database (Pinecone/Weaviate) for agent/tool indexing
Deployment Service (Node.js + AWS SDK)
- Docker containerization for generated agents
- ECS and Lambda deployment orchestration
- S3 for artifact storage
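The deployment service itself is written in Node.js with the AWS SDK; purely as a sketch of the call sequence, the ECS path looks roughly like this in Python/boto3 (cluster, role, image, and subnet values below are placeholders):

```python
import boto3  # assumes AWS credentials are configured; all resource names below are placeholders

ecs = boto3.client("ecs")

# Register a task definition for the generated workflow's container image
task_def = ecs.register_task_definition(
    family="agent-builder-workflow",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # placeholder
    containerDefinitions=[{
        "name": "workflow",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/agent-builder/workflow:latest",  # placeholder
        "essential": True,
        "portMappings": [{"containerPort": 8080}],
    }],
)

# Run the workflow as an independent service on Fargate
ecs.create_service(
    cluster="agent-builder",  # placeholder cluster name
    serviceName="customer-feedback-workflow",
    taskDefinition=task_def["taskDefinition"]["taskDefinitionArn"],
    desiredCount=1,
    launchType="FARGATE",
    networkConfiguration={"awsvpcConfiguration": {
        "subnets": ["subnet-0123456789abcdef0"],  # placeholder
        "assignPublicIp": "ENABLED",
    }},
)
```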
NVIDIA NIM Integration
Reasoning with llama-3.1-nemotron-nano-8B-v1:
```python
# Augmented prompt with retrieved context
system_prompt = """You are an AI workflow architect.
Generate valid workflow JSON with agents, tools, and connections.
Available components: {retrieved_components}"""

response = llm_nim.invoke({
    "model": "llama-3.1-nemotron-nano-8B-v1",
    "messages": [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_intent}
    ],
    "json_schema": flow_graph_schema  # Constrained decoding
})
```
Retrieval with nv-embedqa-e5-v5:
```python
# Semantic search for relevant components
query_embedding = embedding_nim.embed(
    model="nvidia/nv-embedqa-e5-v5",
    input="code review with git integration"
)
relevant_agents = vector_db.search(
    query_embedding,
    collection="catalog_agents",
    top_k=5
)
```
Key Technical Decisions
- Constrained JSON Generation: Used schema-guided decoding to ensure llama-3.1-nemotron outputs valid workflow structures
- Pre-computed Embeddings: All catalog items indexed offline for sub-millisecond retrieval (see the indexing sketch after this list)
- Streaming Responses: LLM outputs streamed to frontend for better UX
- Marker-based Edge Resolution: Used __INPUT__ and __OUTPUT__ markers to connect generated nodes to existing flow boundaries
- Handle-based Tool Connections: Implemented specific connection handles (e.g., tool-bottom) for proper agent-to-tool relationships
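A minimal sketch of that offline indexing step, reusing the illustrative embedding_nim and vector_db clients from the snippets above (load_catalog and the upsert signature are stand-ins; the exact client API depends on whether Pinecone or Weaviate is used):

```python
# Offline indexing sketch: embed each catalog entry once so queries only pay for a lookup.
# load_catalog and vector_db.upsert are illustrative stand-ins for the real helpers.
catalog_agents = load_catalog("catalog_agents.json")  # [{"id", "name", "description"}, ...]

for item in catalog_agents:
    vector = embedding_nim.embed(
        model="nvidia/nv-embedqa-e5-v5",
        input=item["description"],
    )
    vector_db.upsert(
        collection="catalog_agents",
        id=item["id"],
        vector=vector,
        metadata={"name": item["name"], "description": item["description"]},
    )
```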
🚧 Challenges we ran into
1. JSON Schema Reliability
Getting llama-3.1-nemotron to consistently generate valid workflow JSON was initially challenging. The model would sometimes hallucinate node IDs or create invalid edge connections. We solved this by implementing constrained decoding with strict JSON schemas and adding post-validation layers that catch and repair common structural errors.
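A minimal sketch of the kind of repair pass we mean, assuming the nodes/edges structure shown earlier (the production validator handles more cases):

```python
def repair_flow(flow: dict) -> dict:
    """Drop edges that reference hallucinated node IDs and de-duplicate node IDs."""
    seen, nodes = set(), []
    for node in flow.get("nodes", []):
        if node["id"] not in seen:  # keep the first occurrence of a duplicated ID
            seen.add(node["id"])
            nodes.append(node)
    valid_ids = seen | {"__INPUT__", "__OUTPUT__"}  # boundary markers are always valid endpoints
    edges = [
        e for e in flow.get("edges", [])
        if e["source"] in valid_ids and e["target"] in valid_ids
    ]
    return {**flow, "nodes": nodes, "edges": edges}
```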
2. Edge Connection Complexity
Managing different types of connections (main flow vs. tool attachments) required careful design. We implemented a handle-based system where agents have multiple connection points (tool-top, tool-bottom) and used sourceHandle/targetHandle fields in edges to specify exact connection points. This required updates to our type definitions and careful mapping in the FlowCanvas component.
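For illustration, a generated agent-to-tool edge ends up carrying the handle IDs alongside the node IDs, roughly like this (the handle name on the tool side is an assumption):

```python
# Illustrative edge record; exact IDs depend on the generated layout.
tool_edge = {
    "id": "edge-agent-2-tool-1",
    "source": "agent-2",            # the agent node
    "sourceHandle": "tool-bottom",  # the agent's tool connection point
    "target": "tool-1",             # the attached tool node
    "targetHandle": "tool-top",     # assumed handle ID on the tool node
}
```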
3. Real-time Synchronization
Keeping the visual canvas in sync with the underlying flow state when AI Chat generates new nodes proved tricky. ReactFlow's state management conflicted with our Zustand store updates. We solved this by implementing dual useEffect hooks that watch for changes in both nodes and edges separately, with careful change detection to avoid infinite update loops.
4. LocalStorage Cache Invalidation
When we added new default items (like the Git Tool), they wouldn't appear because localStorage was overwriting the defaults. We implemented a merge strategy that combines stored data with new defaults, ensuring backwards compatibility while allowing new features to appear automatically.
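The real merge lives in the TypeScript store; the idea, sketched in Python for brevity, is just a union of stored items and unseen defaults keyed by ID:

```python
# Sketch of the defaults-merge idea (the actual implementation is TypeScript in the Zustand store).
def merge_catalog(stored: list[dict], defaults: list[dict]) -> list[dict]:
    stored_ids = {item["id"] for item in stored}
    # Keep everything the user already has, then append any defaults they haven't seen yet.
    return stored + [item for item in defaults if item["id"] not in stored_ids]
```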
5. Context Window Management
Balancing the amount of context (conversation history + catalog data) sent to llama-3.1-nemotron within token limits required careful prompt engineering. We implemented a sliding window approach for conversation history and used the embedding model to retrieve only the most relevant catalog items rather than sending the entire catalog.
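A minimal sketch of the sliding-window trimming, assuming a crude characters-per-token estimate (the real budget accounting is more careful):

```python
# Keep only the most recent messages that fit within a rough token budget.
def sliding_window(messages: list[dict], max_tokens: int = 2000) -> list[dict]:
    kept, used = [], 0
    for msg in reversed(messages):           # walk from newest to oldest
        cost = len(msg["content"]) // 4 + 4  # crude ~4 chars/token heuristic
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))              # restore chronological order
```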
🏆 Accomplishments that we're proud of
1. True Natural Language to Deployment
We achieved our core vision: users can literally speak a workflow into existence and deploy it to production infrastructure—all in under 30 seconds. The combination of llama-3.1-nemotron's reasoning and nv-embedqa-e5-v5's retrieval creates genuinely intelligent workflow generation, not just template filling.
2. Semantic Intelligence
The embedding-based catalog search is remarkably intuitive. Users can search for "something to validate data" and get Data Analyzer, Schema Validator, and Content Moderator—semantically related results that would never match with keyword search.
3. Visual + Conversational UX
We successfully merged two interaction paradigms: traditional drag-and-drop flow building and conversational AI. Users can start with AI Chat to get 80% of the way there, then fine-tune manually on the canvas—the best of both worlds.
4. Production-Ready Architecture
This isn't just a demo—the architecture supports real deployments with Docker containerization, automatic scaling, and proper error handling. Generated workflows become independent microservices with their own resource allocation.
5. Dual-Mode Flexibility
The dummy mode allows instant demos without backend dependencies, while real mode provides full production capabilities. This made development faster and demos more reliable.
📚 What we learned
Technical Learnings:
- Constrained Decoding is Essential: Free-form LLM output is too unreliable for structured generation. Schema-guided decoding dramatically improved consistency.
- Embeddings Enable Magic: Semantic search with nv-embedqa-e5-v5 creates surprisingly intelligent behavior—the model "understands" user intent in ways keyword matching never could.
- Context is King: The quality of llama-3.1-nemotron's reasoning heavily depends on the retrieved context. Good retrieval makes good generation.
- Handle Complexity Matters: ReactFlow's handle system is powerful but requires careful state management and type definitions to work correctly with dynamic graphs.
AI/ML Insights:
- Small Models Can Be Powerful: llama-3.1-nemotron-nano-8B-v1 proved that you don't need massive models for structured reasoning tasks—the "nano" variant handles JSON generation beautifully with much lower latency.
- Retrieval-Augmented Generation Works: Combining embedding-based retrieval with LLM reasoning is incredibly effective—the model performs better with 5 retrieved examples than with 100 examples in the prompt.
- Streaming Improves UX: Even though total generation time is the same, streaming tokens make the system feel 3x faster to users.
Product Insights:
- Conversation as Interface: Users intuitively understand how to describe workflows—no training needed. Natural language truly democratizes AI development.
- Visual Feedback is Critical: Seeing nodes appear on the canvas in real-time creates a magical experience that text-based tools can't match.
- Fallbacks Build Trust: Having pattern-matching fallbacks when the LLM is unavailable ensures the system always responds, building user confidence.
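That fallback is conceptually just a guarded call; a sketch with illustrative helper names:

```python
# Illustrative fallback: if the NIM-backed generation fails, fall back to pattern matching.
def generate_flow(user_intent: str) -> dict:
    try:
        return llm_generate_flow(user_intent)   # normal path (assumed helper)
    except Exception:
        return pattern_match_flow(user_intent)  # rule-based template selection (assumed helper)
```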
🚀 What's next for Agent Builder
1. Enhanced LLM Reasoning
- Multi-turn workflow refinement: "Actually, add validation before the analyzer"
- Automatic error handling node insertion based on failure patterns
- Cost optimization suggestions: "This could be 30% cheaper with batching"
2. Expanded Component Library
- Integration with NVIDIA NIM catalog for automatic agent discovery
- Community marketplace for sharing custom agents and workflows
- Pre-built templates for common patterns (ETL, customer service, content moderation)
3. Advanced Retrieval
- Hybrid search combining embeddings + metadata filters
- Workflow similarity search: "Find workflows like this one"
- Automatic component recommendations based on flow context
🎬 Conclusion
Agent Builder represents a fundamental shift in how we build AI systems: from code-first to conversation-first. By leveraging NVIDIA's llama-3.1-nemotron-nano-8B-v1 for intelligent reasoning and nv-embedqa-e5-v5 for semantic understanding, we've created a platform where anyone can build production-grade AI workflows through natural conversation. This is just the beginning—we're excited to see how the community uses this foundation to democratize AI development and unlock new possibilities we haven't even imagined yet.
Talk to Build. Deploy to Scale. 🚀