Agent Builder - Conversational AI Workflow Platform
🎯 Inspiration
The inspiration for Agent Builder came from a simple observation: building AI agent workflows is incredibly powerful but unnecessarily complex. Data scientists and developers spend countless hours writing boilerplate code, configuring integrations, and manually wiring together agents and tools—time that could be better spent on solving actual business problems. We envisioned a world where anyone could describe what they want in plain English and watch a production-ready AI workflow materialize before their eyes.
We were particularly inspired by the democratization potential of NVIDIA's NIM ecosystem. The combination of powerful reasoning models like llama-3.1-nemotron-nano-8B-v1 and efficient retrieval embeddings like nv-embedqa-e5-v5 opened up possibilities for truly intelligent workflow generation—not just template filling, but genuine understanding of user intent and intelligent component selection. We wanted to prove that with the right architecture, AI could not only understand what you want to build but actually build it for you.
💡 What it does
Agent Builder is a conversational AI workflow platform that transforms natural language descriptions into fully deployable agent workflows. Instead of dragging, dropping, and configuring nodes manually, users simply chat with an AI assistant that understands their requirements and generates complete workflows automatically.
Core Features:
- AI Chat Interface: Describe workflows in natural language (e.g., "Create a code review workflow with Git integration")
- Intelligent Workflow Generation: Automatically selects appropriate agents, tools, and connections based on semantic understanding
- Visual Flow Builder: Drag-and-drop canvas for manual workflow creation and AI-generated workflow refinement
- Smart Catalog Management: Semantic search over agents and tools using natural language queries
- One-Click Deployment: Deploy workflows as containerized microservices to AWS ECS/Lambda
- Real-time Testing: Test workflows with live input and see step-by-step execution
- Dual Mode Operation: Works in dummy mode for demos or connects to backend services for production
Example Use Case: A user types "I want to analyze customer feedback and generate reports." The AI Chat immediately generates a workflow with a Sentiment Analyzer agent, Data Analyzer agent, and Text Summarizer, all properly connected with the appropriate tools (Database Connector, PDF Generator) attached—ready to deploy in seconds.
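For a sense of what the AI Chat actually produces, here is a purely illustrative sketch of a generated flow graph for that prompt (the field names are assumptions for readability, not the exact production schema):

```python
# Illustrative only: a flow graph the chat might emit for the feedback-analysis request.
# Node/edge field names here are assumptions, not the exact production schema.
generated_flow = {
    "nodes": [
        {"id": "agent-1", "type": "agent", "label": "Sentiment Analyzer"},
        {"id": "agent-2", "type": "agent", "label": "Data Analyzer"},
        {"id": "agent-3", "type": "agent", "label": "Text Summarizer"},
        {"id": "tool-1",  "type": "tool",  "label": "Database Connector"},
        {"id": "tool-2",  "type": "tool",  "label": "PDF Generator"},
    ],
    "edges": [
        {"source": "agent-1", "target": "agent-2"},  # main flow
        {"source": "agent-2", "target": "agent-3"},
        {"source": "agent-2", "target": "tool-1"},   # tool attachments
        {"source": "agent-3", "target": "tool-2"},
    ],
}
```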
🔨 How we built it
Architecture Overview
We built Agent Builder on a microservices architecture designed for scalability and intelligent automation:
Frontend Layer (React + Vite + TypeScript)
- Visual flow canvas powered by ReactFlow
- AI Chat interface with streaming responses
- Zustand for state management
- Real-time workflow visualization
Orchestration Service (Node.js + Express)
- RESTful API for flow management
- WebSocket support for deployment updates
- PostgreSQL for persistent storage
- Redis for caching and session management
LLM Service (Python + FastAPI)
- llama-3.1-nemotron-nano-8B-v1 NIM for workflow reasoning and JSON generation
- nvidia/nv-embedqa-e5-v5 for semantic retrieval and catalog search
- LangChain for prompt engineering and schema validation
- Vector database (Pinecone/Weaviate) for agent/tool indexing
Deployment Service (Node.js + AWS SDK)
- Docker containerization for generated agents
- ECS and Lambda deployment orchestration
- S3 for artifact storage
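The deployment service itself is written in Node.js with the AWS SDK; purely as a sketch of the call sequence, the ECS path looks roughly like this in Python/boto3 (cluster, role, image, and subnet values below are placeholders):

```python
import boto3  # assumes AWS credentials are configured; all resource names below are placeholders

ecs = boto3.client("ecs")

# Register a task definition for the generated workflow's container image
task_def = ecs.register_task_definition(
    family="agent-builder-workflow",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # placeholder
    containerDefinitions=[{
        "name": "workflow",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/agent-builder/workflow:latest",  # placeholder
        "essential": True,
        "portMappings": [{"containerPort": 8080}],
    }],
)

# Run the workflow as an independent service on Fargate
ecs.create_service(
    cluster="agent-builder",  # placeholder cluster name
    serviceName="customer-feedback-workflow",
    taskDefinition=task_def["taskDefinition"]["taskDefinitionArn"],
    desiredCount=1,
    launchType="FARGATE",
    networkConfiguration={"awsvpcConfiguration": {
        "subnets": ["subnet-0123456789abcdef0"],  # placeholder
        "assignPublicIp": "ENABLED",
    }},
)
```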
NVIDIA NIM Integration
Reasoning with llama-3.1-nemotron-nano-8B-v1:
```python
# Augmented prompt with retrieved context
system_prompt = """You are an AI workflow architect.
Generate valid workflow JSON with agents, tools, and connections.
Available components: {retrieved_components}"""

response = llm_nim.invoke({
    "model": "llama-3.1-nemotron-nano-8B-v1",
    "messages": [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_intent}
    ],
    "json_schema": flow_graph_schema  # Constrained decoding
})
```
Retrieval with nv-embedqa-e5-v5:
```python
# Semantic search for relevant components
query_embedding = embedding_nim.embed(
    model="nvidia/nv-embedqa-e5-v5",
    input="code review with git integration"
)
relevant_agents = vector_db.search(
    query_embedding,
    collection="catalog_agents",
    top_k=5
)
```
Key Technical Decisions
- Constrained JSON Generation: Used schema-guided decoding to ensure llama-3.1-nemotron outputs valid workflow structures
- Pre-computed Embeddings: All catalog items indexed offline for sub-millisecond retrieval (see the indexing sketch after this list)
- Streaming Responses: LLM outputs streamed to frontend for better UX
- Marker-based Edge Resolution: Used __INPUT__ and __OUTPUT__ markers to connect generated nodes to existing flow boundaries
- Handle-based Tool Connections: Implemented specific connection handles (e.g., tool-bottom) for proper agent-to-tool relationships
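A minimal sketch of that offline indexing step, reusing the illustrative embedding_nim and vector_db clients from the snippets above (load_catalog and the upsert signature are stand-ins; the exact client API depends on whether Pinecone or Weaviate is used):

```python
# Offline indexing sketch: embed each catalog entry once so queries only pay for a lookup.
# load_catalog and vector_db.upsert are illustrative stand-ins for the real helpers.
catalog_agents = load_catalog("catalog_agents.json")  # [{"id", "name", "description"}, ...]

for item in catalog_agents:
    vector = embedding_nim.embed(
        model="nvidia/nv-embedqa-e5-v5",
        input=item["description"],
    )
    vector_db.upsert(
        collection="catalog_agents",
        id=item["id"],
        vector=vector,
        metadata={"name": item["name"], "description": item["description"]},
    )
```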
🚧 Challenges we ran into
1. JSON Schema Reliability
Getting llama-3.1-nemotron to consistently generate valid workflow JSON was initially challenging. The model would sometimes hallucinate node IDs or create invalid edge connections. We solved this by implementing constrained decoding with strict JSON schemas and adding post-validation layers that catch and repair common structural errors.
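A minimal sketch of the kind of repair pass we mean, assuming the nodes/edges structure shown earlier (the production validator handles more cases):

```python
def repair_flow(flow: dict) -> dict:
    """Drop edges that reference hallucinated node IDs and de-duplicate node IDs."""
    seen, nodes = set(), []
    for node in flow.get("nodes", []):
        if node["id"] not in seen:  # keep the first occurrence of a duplicated ID
            seen.add(node["id"])
            nodes.append(node)
    valid_ids = seen | {"__INPUT__", "__OUTPUT__"}  # boundary markers are always valid endpoints
    edges = [
        e for e in flow.get("edges", [])
        if e["source"] in valid_ids and e["target"] in valid_ids
    ]
    return {**flow, "nodes": nodes, "edges": edges}
```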
2. Edge Connection Complexity
Managing different types of connections (main flow vs. tool attachments) required careful design. We implemented a handle-based system where agents have multiple connection points (tool-top, tool-bottom) and used sourceHandle/targetHandle fields in edges to specify exact connection points. This required updates to our type definitions and careful mapping in the FlowCanvas component.
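For illustration, a generated agent-to-tool edge ends up carrying the handle IDs alongside the node IDs, roughly like this (the handle name on the tool side is an assumption):

```python
# Illustrative edge record; exact IDs depend on the generated layout.
tool_edge = {
    "id": "edge-agent-2-tool-1",
    "source": "agent-2",            # the agent node
    "sourceHandle": "tool-bottom",  # the agent's tool connection point
    "target": "tool-1",             # the attached tool node
    "targetHandle": "tool-top",     # assumed handle ID on the tool node
}
```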
3. Real-time Synchronization
Keeping the visual canvas in sync with the underlying flow state when AI Chat generates new nodes proved tricky. ReactFlow's state management conflicted with our Zustand store updates. We solved this by implementing dual useEffect hooks that watch for changes in both nodes and edges separately, with careful change detection to avoid infinite update loops.
4. LocalStorage Cache Invalidation
When we added new default items (like the Git Tool), they wouldn't appear because localStorage was overwriting the defaults. We implemented a merge strategy that combines stored data with new defaults, ensuring backwards compatibility while allowing new features to appear automatically.
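The real merge lives in the TypeScript store; the idea, sketched in Python for brevity, is just a union of stored items and unseen defaults keyed by ID:

```python
# Sketch of the defaults-merge idea (the actual implementation is TypeScript in the Zustand store).
def merge_catalog(stored: list[dict], defaults: list[dict]) -> list[dict]:
    stored_ids = {item["id"] for item in stored}
    # Keep everything the user already has, then append any defaults they haven't seen yet.
    return stored + [item for item in defaults if item["id"] not in stored_ids]
```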
5. Context Window Management
Balancing the amount of context (conversation history + catalog data) sent to llama-3.1-nemotron within token limits required careful prompt engineering. We implemented a sliding window approach for conversation history and used the embedding model to retrieve only the most relevant catalog items rather than sending the entire catalog.
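A minimal sketch of the sliding-window trimming, assuming a crude characters-per-token estimate (the real budget accounting is more careful):

```python
# Keep only the most recent messages that fit within a rough token budget.
def sliding_window(messages: list[dict], max_tokens: int = 2000) -> list[dict]:
    kept, used = [], 0
    for msg in reversed(messages):           # walk from newest to oldest
        cost = len(msg["content"]) // 4 + 4  # crude ~4 chars/token heuristic
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))              # restore chronological order
```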
🏆 Accomplishments that we're proud of
1. True Natural Language to Deployment
We achieved our core vision: users can literally speak a workflow into existence and deploy it to production infrastructure—all in under 30 seconds. The combination of llama-3.1-nemotron's reasoning and nv-embedqa-e5-v5's retrieval creates genuinely intelligent workflow generation, not just template filling.
2. Semantic Intelligence
The embedding-based catalog search is remarkably intuitive. Users can search for "something to validate data" and get Data Analyzer, Schema Validator, and Content Moderator—semantically related results that would never match with keyword search.
3. Visual + Conversational UX
We successfully merged two interaction paradigms: traditional drag-and-drop flow building and conversational AI. Users can start with AI Chat to get 80% of the way there, then fine-tune manually on the canvas—the best of both worlds.
4. Production-Ready Architecture
This isn't just a demo—the architecture supports real deployments with Docker containerization, automatic scaling, and proper error handling. Generated workflows become independent microservices with their own resource allocation.
5. Dual-Mode Flexibility
The dummy mode allows instant demos without backend dependencies, while real mode provides full production capabilities. This made development faster and demos more reliable.
📚 What we learned
Technical Learnings:
- Constrained Decoding is Essential: Free-form LLM output is too unreliable for structured generation. Schema-guided decoding dramatically improved consistency.
- Embeddings Enable Magic: Semantic search with nv-embedqa-e5-v5 creates surprisingly intelligent behavior—the model "understands" user intent in ways keyword matching never could.
- Context is King: The quality of llama-3.1-nemotron's reasoning heavily depends on the retrieved context. Good retrieval makes good generation.
- Handle Complexity Matters: ReactFlow's handle system is powerful but requires careful state management and type definitions to work correctly with dynamic graphs.
AI/ML Insights:
- Small Models Can Be Powerful: llama-3.1-nemotron-nano-8B-v1 proved that you don't need massive models for structured reasoning tasks—the "nano" variant handles JSON generation beautifully with much lower latency.
- Retrieval-Augmented Generation Works: Combining embedding-based retrieval with LLM reasoning is incredibly effective—the model performs better with 5 retrieved examples than with 100 examples in the prompt.
- Streaming Improves UX: Even though total generation time is the same, streaming tokens make the system feel 3x faster to users.
Product Insights:
- Conversation as Interface: Users intuitively understand how to describe workflows—no training needed. Natural language truly democratizes AI development.
- Visual Feedback is Critical: Seeing nodes appear on the canvas in real-time creates a magical experience that text-based tools can't match.
- Fallbacks Build Trust: Having pattern-matching fallbacks when the LLM is unavailable ensures the system always responds, building user confidence.
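That fallback is conceptually just a guarded call; a sketch with illustrative helper names:

```python
# Illustrative fallback: if the NIM-backed generation fails, fall back to pattern matching.
def generate_flow(user_intent: str) -> dict:
    try:
        return llm_generate_flow(user_intent)   # normal path (assumed helper)
    except Exception:
        return pattern_match_flow(user_intent)  # rule-based template selection (assumed helper)
```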
🚀 What's next for Agent Builder
1. Enhanced LLM Reasoning
- Multi-turn workflow refinement: "Actually, add validation before the analyzer"
- Automatic error handling node insertion based on failure patterns
- Cost optimization suggestions: "This could be 30% cheaper with batching"
2. Expanded Component Library
- Integration with NVIDIA NIM catalog for automatic agent discovery
- Community marketplace for sharing custom agents and workflows
- Pre-built templates for common patterns (ETL, customer service, content moderation)
3. Advanced Retrieval
- Hybrid search combining embeddings + metadata filters
- Workflow similarity search: "Find workflows like this one"
- Automatic component recommendations based on flow context
🎬 Conclusion
Agent Builder represents a fundamental shift in how we build AI systems: from code-first to conversation-first. By leveraging NVIDIA's llama-3.1-nemotron-nano-8B-v1 for intelligent reasoning and nv-embedqa-e5-v5 for semantic understanding, we've created a platform where anyone can build production-grade AI workflows through natural conversation. This is just the beginning—we're excited to see how the community uses this foundation to democratize AI development and unlock new possibilities we haven't even imagined yet.
Talk to Build. Deploy to Scale. 🚀