⚠️ **IMPORTANT – Active Development Branch**

You are currently viewing the `develop` branch for the pre-release version of AI-Q v2.0. This branch contains the latest features and experimental updates and may contain breaking changes.

For production use, switch to the v1.2.1 stable release on the `main` branch.
- Overview
- Software Components
- Target Audience
- Prerequisites
- Architecture
- Getting Started
- Ways to Run the Agents
- Evaluating the Workflow
- Development
- License
## Overview

The NVIDIA AI-Q Blueprint is an enterprise-grade research agent built on the NVIDIA NeMo Agent Toolkit. It gives you both quick, cited answers and in-depth, report-style research in one system, with benchmarks and evaluation harnesses so you can measure quality and improve over time.
Key features:
- Orchestration node — One node classifies intent (meta vs. research), produces meta responses (for example, greetings, capabilities), and sets research depth (shallow vs. deep).
- Shallow research — Bounded, faster researcher with tool-calling and source citation.
- Deep research — Long-running multi-step planning and research to generate a long-form citation-backed report.
- Workflow configuration — YAML configs define agents, tools, LLMs, and routing behavior so you can tune workflows without code changes.
- Modular workflows — All agents (orchestration node, shallow researcher, deep researcher, clarifier) are composable; each can run standalone or as part of the full pipeline.
- Evaluation harnesses — Built-in benchmarks (for example, FreshQA, DeepResearch) and evaluation scripts to measure quality and iterate on prompts and agent architecture.
- Frontend options — Run through the CLI, web UI, or async jobs; refer to the Getting Started and Ways to Run the Agents sections for details.
- Deployment options — Deployment assets for both Docker Compose and Helm.
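To give a feel for the workflow configuration mentioned above, here is an illustrative sketch of a NeMo Agent Toolkit-style YAML config. The exact keys, `_type` values, and tool names in AI-Q's shipped configs may differ; treat this as a shape, not a shipped file:

```yaml
# Illustrative sketch only; not an actual AI-Q config file.
llms:
  agent_llm:
    _type: nim
    model_name: nvidia/nemotron-3-nano-30b-a3b   # agent model named in this README

functions:
  web_search:
    _type: tavily_internet_search                # hypothetical tool registration

workflow:
  _type: react_agent                             # workflow type is an assumption
  llm_name: agent_llm
  tool_names: [web_search]
```

Because agents, tools, LLMs, and routing live in YAML like this, you can swap models or disable tools without touching code.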
## Software Components

The following are used by this project:
- NVIDIA NeMo Agent Toolkit
- NVIDIA nemotron-3-nano-30b-a3b (agents)
- NVIDIA nemotron-mini-4b-instruct (document summary, if used)
- NIM of nvidia/llama-nemotron-embed-vl-1b-v2 (embedding model for llamaindex knowledge layer implementation, if used)
- NIM of nvidia/nemotron-nano-12b-v2-vl (vision-language model for llamaindex knowledge layer implementation, if used)
- Tavily Search API for web search
- Serper Search API for paper search (Google Scholar)
## Target Audience

This project is for:
- AI researchers and developers: People building or extending agentic research workflows
- Enterprise teams: Organizations needing tool-augmented, citation-backed research
- NeMo Agent Toolkit users: Developers looking to understand advanced multi-agent patterns
## Prerequisites

- Python 3.11–3.13
- uv package manager
- NVIDIA API key from NVIDIA AI (for NIM models)
- Node.js 22+ and npm (optional, for web UI mode)
Optional requirements:
- Tavily API key (for web search functionality)
- Serper API key (for academic paper search functionality)
Note: Configure at least one data source (Tavily web search, Serper search tool, or knowledge layer) to enable research functionality.
If these optional API keys are not provided, the agent continues to operate without the corresponding search capabilities. Refer to Obtain API Keys for details.
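As a quick sanity check before starting the agents, a small helper can report which capabilities will be enabled. The helper itself is illustrative (not part of this repository); the environment-variable names come from this README:

```python
import os

# Key names as documented in this README.
REQUIRED = ["NVIDIA_API_KEY"]
OPTIONAL = ["TAVILY_API_KEY", "SERPER_API_KEY"]  # missing keys just disable features

def check_keys(env=None):
    """Return (missing required keys, optional keys whose feature is disabled)."""
    env = os.environ if env is None else env
    missing = [k for k in REQUIRED if not env.get(k)]
    disabled = [k for k in OPTIONAL if not env.get(k)]
    return missing, disabled

missing, disabled = check_keys()
print("missing required keys:", missing)
print("search features disabled for:", disabled)
```

A missing optional key is not an error; the corresponding search tool is simply unavailable at runtime.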
The following are generalized minimum requirements.

### Local Development
- Typical developer machine for AI-Q workflow (no GPU required)
- LlamaIndex (no GPU required)
- Self / Remote Hosted Models
### Self Hosted
- Typical server for AI-Q workflow (no GPU required)
- NVIDIA nemotron-3-nano-30b-a3b (agents)
- NVIDIA nemotron-mini-4b-instruct (document summary, if used)
- NIM of nvidia/llama-nemotron-embed-vl-1b-v2 (embedding model for llamaindex knowledge layer implementation, if used)
- NIM of nvidia/nemotron-nano-12b-v2-vl (vision-language model for llamaindex knowledge layer implementation, if used)
- NVIDIA RAG Blueprint Requirements (if used)
### Remote Hosted
- Typical server for workflow (no GPU required)
- Provider LLM API keys (if used)
- NVIDIA RAG Blueprint Requirements (if used)
## Architecture

AI-Q uses a LangGraph-based state machine with the following key components:
- Orchestration node: Classifies intent (meta vs. research), produces meta responses when needed, and sets depth (shallow vs. deep) in one step
- Shallow research agent: Bounded tool-augmented research optimized for speed
- Deep research agent: Multi-phase research with planning, iteration, and citation management
Each agent can be run individually or as part of the orchestrated workflow. For detailed architecture documentation, refer to Architecture.
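The routing described above can be sketched in plain Python. This is a simplified toy, not the actual LangGraph implementation, and the keyword rules are invented purely for illustration:

```python
def classify_intent(query: str) -> str:
    """Toy orchestration step: meta vs. research, and research depth."""
    q = query.lower().strip()
    # Meta intents (greetings, capability questions) get a direct response.
    if q in ("hello", "hi") or "what can you do" in q:
        return "meta"
    # Report-style requests get deep research; everything else stays shallow.
    if "report" in q or "in-depth" in q:
        return "deep"
    return "shallow"

def run(query: str) -> str:
    """Dispatch to the handler selected by the orchestration step."""
    handlers = {
        "meta": lambda q: "I am a research assistant.",
        "shallow": lambda q: f"[shallow research with citations] {q}",
        "deep": lambda q: f"[multi-step deep research report] {q}",
    }
    return handlers[classify_intent(query)](query)

print(run("hello"))                                  # meta response
print(run("Write a report on GPU memory hierarchies"))  # deep route
```

The real system makes this decision with an LLM in a single orchestration node, then hands the query to the shallow or deep research agent as a graph transition.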
## Getting Started

Clone the repository:

```bash
git clone https://github.com/NVIDIA-AI-Blueprints/aiq.git && cd aiq
```

Run the setup script to initialize the environment:

```bash
./scripts/setup.sh
```

This script:
- Creates a Python virtual environment with uv
- Installs all Python dependencies (core, frontends, benchmarks, data sources)
- Installs UI dependencies (if Node.js is available)
For selective installation, install packages individually:
```bash
# Create and activate virtual environment
uv venv --python 3.13 .venv
source .venv/bin/activate

# Install core with development dependencies
uv pip install -e ".[dev]"

# Install frontends (pick what you need)
uv pip install -e ./frontends/cli       # CLI frontend
uv pip install -e ./frontends/debug     # Debug console
uv pip install -e ./frontends/aiq_api   # Unified API (includes debug)

# Install benchmarks (pick what you need)
uv pip install -e ./frontends/benchmarks/deepresearch_bench
uv pip install -e ./frontends/benchmarks/freshqa

# Install data sources (pick what you need)
uv pip install -e ./sources/tavily_web_search
uv pip install -e ./sources/google_scholar_paper_search
uv pip install -e "./sources/knowledge_layer[llamaindex,foundational_rag]"
```

| API | Environment Variable | Purpose | Required |
|---|---|---|---|
| NVIDIA API | `NVIDIA_API_KEY` | LLM inference through NIM | Yes |
| Tavily | `TAVILY_API_KEY` | Web search | No (if not specified, agent continues without web search) |
| Serper | `SERPER_API_KEY` | Academic paper search | No (if not specified, agent continues without paper search) |
### Obtain API Keys

**NVIDIA API key**

1. Sign in to NVIDIA Build
2. Click on any model, then select "Deploy" > "Get API Key" > "Generate Key"

**Tavily API key**

1. Sign in to Tavily
2. Navigate to your dashboard
3. Generate an API key

**Serper API key**

1. Sign in to Serper
2. Generate an API key from your dashboard
Create a `.env` file in the `deploy/` directory:

```bash
cp deploy/.env.example deploy/.env
```

Then replace the placeholder values with your API keys.
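The resulting `deploy/.env` might look like the following (placeholder values shown; the variable names are the ones documented above, the value formats are illustrative):

```bash
NVIDIA_API_KEY=nvapi-your-key-here   # required: NIM inference
TAVILY_API_KEY=tvly-your-key-here    # optional: web search
SERPER_API_KEY=your-serper-key-here  # optional: paper search
```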
Note: If you do not want to use paper search, follow the steps in the Customization guide to disable it.
## Ways to Run the Agents

The frontends/ directory contains different interfaces for interacting with the agents. You can also run agents directly through the NeMo Agent Toolkit CLI.
The CLI provides an interactive research assistant in your terminal:
```bash
# Activate the virtual environment
source .venv/bin/activate

# Run with the convenience script
./scripts/start_cli.sh

# Verbose logging
./scripts/start_cli.sh --verbose

# Or run directly with the NeMo Agent Toolkit CLI
nat run --config_file configs/config_cli_default.yml --input "How do I install CUDA?"
```

The CLI frontend source is in frontends/cli/.
For a full web-based experience:
```bash
./scripts/start_e2e.sh
```

This starts:

- Backend API server at http://localhost:8000
- Frontend UI at http://localhost:3000
The web UI source is in frontends/ui/. Refer to frontends/ui/README.md for more details.
You can also run the backend and UI with Docker Compose:
```bash
cd deploy/compose

# No-auth local setup (LlamaIndex default)
docker compose --env-file ../.env -f docker-compose.yaml up -d --build

# To select a different backend config, set BACKEND_CONFIG in deploy/.env, for example:
# BACKEND_CONFIG=/app/configs/config_web_frag.yml
```

For more details, refer to deploy/compose/README.md.
Endpoints, SSE streaming, and debug console: refer to frontends/aiq_api/README.md.
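SSE responses are newline-delimited `data:` events. The following is a minimal parser sketch for standard SSE framing; it is generic and not tied to the AI-Q API's specific endpoints or event schema:

```python
def parse_sse(stream_text: str):
    """Yield the payload of each `data:` line in a raw SSE stream."""
    for line in stream_text.splitlines():
        if line.startswith("data:"):
            yield line[len("data:"):].strip()

raw = "data: first token\n\ndata: second token\n\n"
print(list(parse_sse(raw)))  # ['first token', 'second token']
```

In practice you would feed this from an HTTP client reading the streaming response line by line; refer to frontends/aiq_api/README.md for the actual event payloads.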
To run agents in evaluation mode, refer to the Evaluating the Workflow section.
The docs/notebooks/ directory contains a three-part series that walks through the blueprint from first run to full customization. Run them in order:
| # | Notebook | What it covers | Prerequisites |
|---|---|---|---|
| 0 | Getting Started with AI-Q | Full blueprint overview — environment setup, orchestrated workflow (intent routing, shallow and deep research), and Docker Compose deployment | NVIDIA_API_KEY; optionally TAVILY_API_KEY, SERPER_API_KEY |
| 1 | Deep Researcher — Web Search | Deep researcher in depth — Python API, nat run, and end-to-end evaluation against the DeepResearch Bench with nat eval | Notebook 0 completed; NVIDIA_API_KEY, TAVILY_API_KEY, SERPER_API_KEY; OpenAI or Gemini key for the judge model |
| 2 | Deep Researcher — Customization | Extending the deep researcher — adding paper search, assigning different LLMs per agent role, editing prompts, and enabling the knowledge layer | Notebooks 0 and 1 completed; NVIDIA_API_KEY, TAVILY_API_KEY, SERPER_API_KEY |
## Evaluating the Workflow

The frontends/benchmarks/ directory contains evaluation pipelines for assessing agent performance.
| Benchmark | Description | Location |
|---|---|---|
| Deep Research Bench | RACE and FACT evaluation for research quality | frontends/benchmarks/deepresearch_bench/ |
| FreshQA | Factuality evaluation on time-sensitive questions | frontends/benchmarks/freshqa/ |
First, install the benchmark package:

```bash
uv pip install -e ./frontends/benchmarks/deepresearch_bench
```

Download the dataset files:

```bash
python frontends/benchmarks/deepresearch_bench/scripts/download_drb_dataset.py
```

Then run the evaluation with one of the available configurations:

```bash
dotenv -f deploy/.env run nat eval --config_file frontends/benchmarks/deepresearch_bench/configs/config_deep_research_bench.yml
```

For detailed benchmark documentation, refer to the README in each benchmark directory.
## Development

For development, contribution, and documentation, refer to:
- Development and Contributing: Setup, testing, PR workflow, sign-off/DCO
- Architecture: Component details and data flow
- Customization: Configuration and customization options
- Knowledge Layer Setup: RAG backends and document ingestion
- Docs index: Full documentation list and component docs
- Changelog: Version history and changes
## License

This project will download and install additional third-party open source software projects. Review the license terms of these projects, listed in LICENSE-THIRD-PARTY, before use.
GOVERNING TERMS: AIQ blueprint software and materials are governed by the Apache License, Version 2.0.
