Cisco AI BOM scans codebases, container images, and cloud environments to produce an AI Bill of Materials — a structured inventory of models, agents, tools, MCP servers/clients, datasets, prompts, guardrails, secrets, and other AI assets used in your software. It supports Python, JavaScript/TypeScript, Java, Go, Rust, Ruby, and C#, with deterministic candidate detection, cross-reference resolution, and LLM-powered agentic classification.
- Features
- Repository Layout
- Installation
- Quick Start
- Commands
- Agentic Enrichment
- Container Scanning
- Cross-Repo and Org Scanning
- Output Formats
- Custom Catalog
- Policy Engine
- Knowledge Base
- Environment Variables
- Docker
- Testing
- Further Reading
- Troubleshooting
- Multi-language analysis — Python (LibCST), JavaScript/TypeScript, Java, Go, Rust, Ruby, C# (tree-sitter).
- 23 built-in scanners — model detection, dependency analysis, secret detection, vulnerability scanning (OSV.dev), MCP server/client detection, A2A/remote agent resolution, structural agent detection, ML lifecycle detection, cloud resource scanning, CI/CD pipeline analysis, deployment detection, container scanning, data-file scanning, environment variable resolution, KB enrichment, and more.
- 30 AI component types — `model`, `llm_endpoint`, `model_endpoint`, `agent`, `agent_proxy`, `tool`, `mcp_server`, `mcp_client`, `mcp_gateway`, `embedding`, `vector_store`, `dataset`, `retriever`, `knowledge_base`, `feature_store`, `memory`, `prompt`, `training_run`, `hyperparameter`, `model_artifact`, `experiment_tracker`, `model_registry`, `data_versioning`, `ml_pipeline`, `guardrail`, `skill`, `observability`, `secret`, `dependency`, `other`.
- Three-tier detection — Tier 1 (deterministic high-confidence), Tier 2 (cross-reference resolution), Tier 3 (agentic LLM reasoning). Tier 1 code-level detection is deepest for Python (LibCST); other languages extract imports and literal patterns via tree-sitter and lean more on Tier 3 for confirmation.
- 10 output formats — Plaintext, JSON, CycloneDX, SARIF, SPDX, HTML dashboard, Markdown, CSV, JUnit, and a live API server.
- Container image scanning — Extract and analyze application source code from Docker, Podman, nerdctl, Buildah, Skopeo, or Crane images, with Anchore Syft for SBOM metadata.
- Cross-repo and org-level scanning — Scan multiple local repos, GitHub orgs, GitLab groups, or Bitbucket projects, with incremental caching.
- Agentic classification — LLM agent (via Deep Agents + LangChain) classifies every scanner candidate, eliminating false positives and enriching confirmed components with concrete identifiers.
- Policy engine — YAML-driven pass/fail gates for CI/CD integration (max-risk, required fields, blocked/required component types).
- Compliance checks — EU AI Act, OWASP Agentic Top 10, NIST AI RMF advisory mappings.
- Watch mode — Real-time file-system monitoring with debounced re-scan and delta reporting.
- Diff command — Compare two AIBOM JSON snapshots side-by-side.
- Benchmark command — Measure precision/recall/F1 against a labelled ground-truth file.
- Secret detection — Integrated Yelp `detect-secrets` for hardcoded API keys, tokens, and credentials.
- Vulnerability scanning — OSV.dev API lookups for known CVEs in detected dependencies.
- Plugin system — Extend with custom scanners and reporters via Python entry points.
- Custom catalog — Register custom AI components, base-class rules, excludes, and relationships via `.aibom.yaml`.
- Knowledge base — Curated DuckDB catalog of AI framework symbols with download, verification, and versioned updates.
| Language | Dep manifests | Code-level detection | Env-var resolver | Structural / KB / MCP / A2A scanners |
|---|---|---|---|---|
| Python | pip, Poetry, uv, setuptools | LibCST (full) | yes | yes |
| JavaScript / TypeScript | npm, yarn, pnpm | tree-sitter (imports + literals) | yes | no |
| Java | Maven, Gradle | tree-sitter (imports + literals) | yes | no |
| Go | go.mod | tree-sitter (imports + literals) | yes | no |
| Rust | Cargo | tree-sitter (imports + literals) | no | no |
| Ruby | Gemfile | tree-sitter (imports + literals) | yes | no |
| C# | *.csproj | tree-sitter (imports + literals) | no | no |
Non-Python code-level detection is focused on dependency imports (matched against a curated allowlist) and inline model: "..." literals. Deeper structural signals — agent instantiations, MCP/A2A detection, ReAct-style loops, KB symbol enrichment — are Python-only today. For non-Python code, Tier 3 agentic classification fills most of the gap. See docs/TECHNICAL_OVERVIEW.md §12 for details.
aibom/ # Python analyzer package + CLI
docs/ # Documentation (CLI reference, guides, API docs)
- Python 3.11+
- uv (Python package manager)
- Docker / Podman (optional, for container image analysis)
- LLM provider credentials (required for `--llm-model`; see Agentic Enrichment)
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Analyze with OpenAI / Azure OpenAI
uv tool install --python 3.13 "cisco-aibom[agentic,llm-openai]"
# Analyze with AWS Bedrock
uv tool install --python 3.13 "cisco-aibom[agentic,llm-aws]"
# Core CLI only (report rendering, cache inspection, KB commands, etc.)
uv tool install --python 3.13 cisco-aibom
# Everything
uv tool install --python 3.13 "cisco-aibom[all]"
# Verify
cisco-aibom --help

`cisco-aibom analyze` always requires `--llm-model`. If the required agentic or provider extras are missing, the CLI fails fast with the exact `uv tool install ...` hint for the missing runtime.
uv tool install --python 3.13 --from git+https://github.com/cisco-ai-defense/aibom cisco-aibom

git clone https://github.com/cisco-ai-defense/aibom.git
cd aibom/aibom
uv sync
source .venv/bin/activate
cisco-aibom --help

When working from source, you can also use `uv run cisco-aibom ...` or `uv run python -m aibom ...`.
The analyze command always runs the agentic pipeline, so it requires the agentic extra plus at least one LLM provider extra. The simplest install is cisco-aibom[all]; otherwise pick the provider you plan to use.
Required for analyze:
| Extra | Installs | Purpose |
|---|---|---|
| `agentic` | Deep Agents, LangChain | LLM-powered agentic enrichment (mandatory for `analyze`) |
| `llm-openai` | `langchain-openai` | OpenAI / Azure OpenAI provider |
| `llm-aws` | `langchain-aws` | AWS Bedrock provider |
| `llm-anthropic` | `langchain-anthropic` | Anthropic Claude provider |
| `llm-google` | `langchain-google-genai` | Google Gemini provider |
Optional (degrade gracefully if missing):
| Extra | Installs | Purpose |
|---|---|---|
| `analysis` | `detect-secrets`, `tree-sitter` | Secret detection, multi-language parsing |
| `security` | `cisco-ai-mcp-scanner`, `cisco-ai-skill-scanner` | Cisco security tool integration |
| `cloud` | `boto3`, `google-cloud-aiplatform`, `azure-*` | Cloud resource scanning |
| `all` | All of the above | Full feature set |
# Scan a local project (--llm-model is required)
cisco-aibom analyze /path/to/project -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Scan a container image
cisco-aibom analyze my-app:latest -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Scan multiple repos under a directory
cisco-aibom analyze /path/to/repos --discover-repos -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# HTML dashboard
cisco-aibom analyze /path/to/project -o html -O dashboard.html \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Policy gate for CI
cisco-aibom analyze /path/to/project -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY --policy policy.yaml

All LLM options can be set via environment variables (AIBOM_LLM_MODEL, AIBOM_LLM_API_KEY, etc.) for cleaner commands.
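For example, a minimal sketch of the same scan driven entirely by environment variables (values are illustrative):

```bash
# Bind the LLM options via their documented env vars instead of flags
export AIBOM_LLM_MODEL=gpt-5.4
export AIBOM_LLM_API_KEY="$OPENAI_API_KEY"

cisco-aibom analyze /path/to/project -o json -O report.json
```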
| Command | Description |
|---|---|
| `analyze` | Scan source code, container images, or repos and produce an AI BOM. |
| `report` | Render or upload a previously generated JSON report. |
| `watch` | Poll directories for changes and re-scan with delta reporting (deterministic pipeline only). |
| `diff run` | Compare two AIBOM JSON reports side-by-side. |
| `benchmark run` | Measure precision/recall/F1 against ground-truth YAML. |
| `kb download` | Download the latest knowledge base. |
| `kb check` | Check if a newer KB version is available. |
| `kb info` | Display info about the locally installed KB. |
| `kb verify` | Verify KB integrity (SHA-256 checksum). |
| `kb request` | Request a KB build for a specific SDK version. |
| `kb request-status` | Check the status of a KB build request. |
| `kb list-requests` | List all pending KB build requests. |
| `cache clear` | Remove cached scan results and agentic cache. |
| `cache list` | List cached entries by cache type. |
| `cache get` | Inspect a specific cache entry. |
| `plugin list` | List discovered plugins (entry points, MCP servers). |
See docs/CLI_REFERENCE.md for complete option details.
| Option | Env Var | Description |
|---|---|---|
| `--log-level` | `AIBOM_LOG_LEVEL` | Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL (default INFO). |
The --llm-model option (or AIBOM_LLM_MODEL env var) is required. The LLM agent acts as the final classifier for every scanner candidate and requires the agentic extra plus any provider-specific integration extra (for example llm-openai or llm-aws):
- Confirms or removes every scanner candidate (no unverified findings)
- Classifies and enriches components with concrete identifiers
- Verifies dependencies against package registries (PyPI, npm, Go)
- Discovers components missed by static analysis
# OpenAI
cisco-aibom analyze ./my-app -o json -O report.json \
--llm-model gpt-5.4 --llm-provider openai --llm-api-key $OPENAI_API_KEY
# Azure OpenAI
cisco-aibom analyze ./my-app -o json -O report.json \
--llm-model gpt-5.4 --llm-provider azure_openai \
--llm-api-base https://my-endpoint.openai.azure.com \
--llm-api-key $AZURE_OPENAI_API_KEY --llm-api-version 2024-12-01-preview
# AWS Bedrock
cisco-aibom analyze ./my-app -o json -O report.json \
--llm-model us.anthropic.claude-sonnet-4-20250514-v1:0 --llm-provider bedrock
# Local Ollama
cisco-aibom analyze ./my-app -o json -O report.json \
--llm-model gemma3:12b --llm-provider ollama \
--llm-api-base http://localhost:11434

All LLM options can also be set via environment variables or a .env file. See docs/AGENTIC_MODE.md for the full guide.
| Option | Default | Description |
|---|---|---|
| `--agentic-batch-size` | `5` | Max components per LLM invocation. |
| `--agentic-concurrency` | `1` | Max parallel LLM batches. |
| `--agentic-timeout` | `120` | Wall-clock seconds per batch before timeout. |
| `--agentic-fast-model` | — | Cheaper model for simple confirmations (model lookups, dependency checks). |
| `--progress` | `auto` | Show live per-stage and per-scanner progress in interactive terminals. |
| `--include-code-snippets` | off | Include raw code snippets inside per-finding decision annotations. |
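For instance, a large-repo scan might raise the batch size and concurrency and route simple confirmations to a cheaper model. All flags come from the table above; the tuning values and the fast-model name (gpt-5.4-mini) are illustrative:

```bash
# Larger batches, more parallel LLM calls, longer timeout, cheaper model for easy checks
cisco-aibom analyze /path/to/monorepo -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY \
  --agentic-batch-size 10 --agentic-concurrency 4 \
  --agentic-timeout 300 --agentic-fast-model gpt-5.4-mini
```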
# Render a saved report
cisco-aibom report report.json
# Explicit show form
cisco-aibom report show report.json --raw-json
# Upload an existing JSON report
cisco-aibom report upload report.json --format json \
--post-url https://example.invalid/aibom/reports \
--ai-defense-api-key $AI_DEFENSE_API_KEY
# Inspect cache families under the shared cache root
cisco-aibom cache list --type scan
cisco-aibom cache list --type agentic
cisco-aibom cache get scan 0123456789ab

All cache families now default under ~/.aibom/cache, including the deterministic scan cache, agentic cache, org cache, model cache, and package metadata cache.
The CLI auto-detects container image references and extracts application source code for analysis.
# Auto-detect extraction method
cisco-aibom analyze my-app:latest -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Force a specific extraction tier
cisco-aibom analyze my-app:latest -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY \
--container-extraction-tier podman

Supported tiers: `auto`, `syft`, `docker`, `podman`, `nerdctl`, `buildah`, `crane`, `skopeo`, `tarball`.
See docs/CONTAINER_SCANNING.md for details.
# Discover and scan all git repos under a directory
cisco-aibom analyze /path/to/repos --discover-repos -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Scan a GitHub org (requires GITHUB_TOKEN)
cisco-aibom analyze --github-org my-org --platform-token $GITHUB_TOKEN -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Scan a GitLab group
cisco-aibom analyze --gitlab-group my-group --platform-token $GITLAB_TOKEN -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Scan repos from a file (JSON array or newline-delimited)
cisco-aibom analyze --repos-file repos.txt -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Incremental scan (skip repos with unchanged HEAD)
cisco-aibom analyze /path/to/repos --discover-repos --skip-unchanged -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Limit and filter
cisco-aibom analyze --github-org my-org --platform-token $GITHUB_TOKEN \
--max-repos 50 --repo-filter "ml-" --parallel-repos 4 -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

| Format | Flag | Description |
|---|---|---|
| Plaintext | `-o plaintext` | Human-readable text report. |
| JSON | `-o json` | Structured JSON with full component details. |
| CycloneDX | `-o cyclonedx` | CycloneDX 1.6 BOM (ML-BOM profile). |
| SARIF | `-o sarif` | SARIF v2.1.0 for IDE/CI integration. |
| SPDX | `-o spdx` | SPDX 3.0 with AI and Dataset profiles. |
| HTML | `-o html` | Interactive dashboard with dependency graph and risk heatmap. |
| Markdown | `-o markdown` | Markdown table report. |
| CSV | `-o csv` | Flat CSV for spreadsheet analysis. |
| JUnit | `-o junit` | JUnit XML for CI test result reporting. |
| API | `-o api` | Live FastAPI server at http://127.0.0.1:8000. |
All file-based formats require --output-file / -O.
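For CI pipelines, the SARIF and JUnit outputs are produced the same way as any other file-based format, for example:

```bash
# SARIF for code-scanning / IDE integration
cisco-aibom analyze ./my-app -o sarif -O findings.sarif \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# JUnit XML for CI test result reporting
cisco-aibom analyze ./my-app -o junit -O aibom-junit.xml \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
```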
The built-in DuckDB catalog covers popular AI frameworks, but you can extend it with a .aibom.yaml configuration file for custom components, base-class detection rules, exclude patterns, and relationship hints.
# .aibom.yaml
components:
  - id: MyLLMWrapper
    concept: model
    label: My Custom LLM
    framework: internal
base_classes:
  - class: BaseTool
    concept: tool
excludes:
  - some_noisy_helper_function
relationship_hints:
  tool_arguments:
    - custom_tools

Place .aibom.yaml in your project root (auto-discovered) or pass --custom-catalog /path/to/.aibom.yaml.
Supported keys: components, base_classes, excludes, relationship_hints, custom_relationships. See the full reference in aibom/examples/.aibom.yaml.
Tag classes and functions directly in source code:
# aibom: concept=guardrail framework=internal
class SafetyFilter:
    ...

class MyRouter:  # aibom: concept=router
    ...

Define pass/fail gates in a YAML policy file for CI/CD integration:
# policy.yaml
max_risk_score: 70
required_fields:
  - model_name
blocked_types:
  - secret
required_types:
  - guardrail
rules:
  - name: no-hardcoded-keys
    field: metadata.secret_type
    operator: not_exists

cisco-aibom analyze ./my-app -o json -O report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY --policy policy.yaml
# Exit code 1 if policy fails

The analyzer uses a versioned DuckDB catalog of AI framework symbols.
# Download the latest KB
cisco-aibom kb download
# Check for updates
cisco-aibom kb check
# Verify integrity
cisco-aibom kb verify
# View info
cisco-aibom kb info

Manual download from GitHub Releases (replace <VERSION> with the desired KB version, e.g. the latest tag from Releases):
VERSION="<VERSION>"
mkdir -p "${HOME}/.aibom/catalogs"
gh release download "${VERSION}" \
--repo cisco-ai-defense/aibom \
--pattern "aibom_catalog-${VERSION}.duckdb" \
--dir "${HOME}/.aibom/catalogs"
export AIBOM_DB_PATH="${HOME}/.aibom/catalogs/aibom_catalog-${VERSION}.duckdb"

All CLI options with an envvar binding can be set via environment variables or a .env file. The CLI auto-loads .env from the current directory, or you can specify a custom path with AIBOM_ENV_FILE.
| Variable | CLI Option | Description |
|---|---|---|
| `AIBOM_LOG_LEVEL` | `--log-level` | Logging level (default INFO). |
| `AIBOM_LLM_MODEL` | `--llm-model` | LLM model name. |
| `AIBOM_LLM_PROVIDER` | `--llm-provider` | LangChain provider (openai, azure_openai, bedrock, ollama, etc.). |
| `AIBOM_LLM_API_KEY` | `--llm-api-key` | LLM API key. |
| `AIBOM_LLM_API_BASE` | `--llm-api-base` | LLM API base URL. |
| `AIBOM_LLM_API_VERSION` | `--llm-api-version` | API version (Azure OpenAI). |
| `AIBOM_POST_URL` | `--post-url` | HTTP endpoint to POST the report to. |
| `AIBOM_POST_TIMEOUT` | `--post-timeout` | Timeout in seconds for POSTing the report (default 30). |
| `AIBOM_POST_VERIFY_TLS` | `--post-verify-tls` | Verify TLS certificates when POSTing (true/false). |
| `AI_DEFENSE_API_KEY` | `--ai-defense-api-key` | Cisco AI Defense tenant API key (sent as x-cisco-ai-defense-tenant-api-key). |
| `CISCO_AI_DEFENSE_API_KEY` | `--api-key` (kb commands) | Cisco AI Defense tenant API key for kb request, kb request-status, and kb list-requests. |
| `CISCO_AI_DEFENSE_API_BASE` | `--api-base` (kb commands) | Regional Cisco AI Defense API host for kb request* commands (e.g. https://api.security.cisco.com, https://api.eu.security.cisco.com). No default. |
| `CISCO_AIBOM_MANIFEST_URL` | `--url` (kb download) | KB manifest URL for kb download / kb check. No default. |
| `AIBOM_GITHUB_ORG` | `--github-org` | GitHub org for repo discovery. |
| `AIBOM_GITLAB_GROUP` | `--gitlab-group` | GitLab group for repo discovery. |
| `AIBOM_BITBUCKET_PROJECT` | `--bitbucket-project` | Bitbucket project for repo discovery. |
| `AIBOM_PLATFORM_TOKEN` | `--platform-token` | Auth token for GitHub/GitLab/Bitbucket. |
| `AIBOM_DB_PATH` | — | Override path to the DuckDB catalog file. |
| `AIBOM_DB_SHA256` | — | Expected SHA-256 checksum for the catalog. |
| `AIBOM_MANIFEST_PATH` | — | Override path to manifest.json. |
| `AIBOM_ENV_FILE` | — | Path to a custom .env file. |
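A minimal .env sketch using the bindings above (values are illustrative):

```bash
# .env — auto-loaded from the current working directory
AIBOM_LLM_MODEL=gpt-5.4
AIBOM_LLM_PROVIDER=openai
AIBOM_LLM_API_KEY=...
AIBOM_LOG_LEVEL=INFO
AIBOM_POST_URL=https://example.invalid/aibom/reports
```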
A single multi-stage Dockerfile is provided. It installs the all extra (which expands to analysis, security, agentic, llm-openai, llm-aws, llm-anthropic, llm-google, cloud) because the analyze command requires the agentic pipeline plus at least one LLM provider. Image size is ~800 MB.
cd aibom
# Build
docker build -t cisco-aibom .
# Run
docker run --rm -v /path/to/project:/workspace cisco-aibom \
analyze /workspace -o json -O /workspace/report.json \
--llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY
# Or via docker compose (see aibom/docker-compose.yml for env var forwarding)
SCAN_DIR=/path/to/project docker compose run aibom \
analyze /workspace -o json -O /workspace/report.json \
--llm-model gpt-5.4

cd aibom
uv run pytest tests -v

- CLI Reference — Complete command and option reference.
- Agentic Mode Guide — LLM enrichment setup, providers, and tuning.
- Container Scanning Guide — Extraction tiers, Syft, and runtime support.
- API Server — FastAPI endpoint details for `--output-format api`.
- Technical Overview — Architecture, pipeline stages, and scanner design.
- DuckDB catalog errors: Run `cisco-aibom kb download` to fetch the latest catalog, or set `AIBOM_DB_PATH` to point at an existing file. Use `cisco-aibom kb verify` to check integrity.
- Container extraction fails: Ensure Docker or an alternative runtime is installed and running. Use `--container-extraction-tier` to force a specific tool. See docs/CONTAINER_SCANNING.md.
- Missing `--llm-model`: The LLM agent is required. Install the agentic extra (`uv tool install "cisco-aibom[agentic,llm-openai]"`) and supply `--llm-model` or set `AIBOM_LLM_MODEL`. See docs/AGENTIC_MODE.md.
- LLM provider errors: Ensure `--llm-provider` matches the installed LangChain integration package. For Azure OpenAI, `--llm-api-version` is required.
- Slow scans on large repos: Use `--timing` to identify bottlenecks. Use `--agentic-fast-model` for a cheaper model on simple confirmations, or increase `--agentic-concurrency` for parallel batches.
- Missing output files: `--output-file` / `-O` is required for all file-based formats.
- Report submission: Set `AIBOM_POST_URL` and `AI_DEFENSE_API_KEY`. Regional endpoints: US (api.security.cisco.com), APJ (api.apj.security.cisco.com), EU (api.eu.security.cisco.com), UAE (api.uae.security.cisco.com).
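For example, a sketch of report submission to the US region using the environment bindings above — the exact report path under api.security.cisco.com is a placeholder, so check your AI Defense tenant configuration:

```bash
# Placeholder URL path and key; substitute your tenant's values
export AIBOM_POST_URL="https://api.security.cisco.com/<report-endpoint>"
export AI_DEFENSE_API_KEY="<tenant-api-key>"

# Upload a previously generated JSON report
cisco-aibom report upload report.json --format json
```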