AI BOM

Cisco AI BOM scans codebases, container images, and cloud environments to produce an AI Bill of Materials — a structured inventory of models, agents, tools, MCP servers/clients, datasets, prompts, guardrails, secrets, and other AI assets used in your software. It supports Python, JavaScript/TypeScript, Java, Go, Rust, Ruby, and C#, with deterministic candidate detection, cross-reference resolution, and LLM-powered agentic classification.

Table of Contents

Features

  • Multi-language analysis — Python (LibCST), JavaScript/TypeScript, Java, Go, Rust, Ruby, C# (tree-sitter).
  • 23 built-in scanners — model detection, dependency analysis, secret detection, vulnerability scanning (OSV.dev), MCP server/client detection, A2A/remote agent resolution, structural agent detection, ML lifecycle detection, cloud resource scanning, CI/CD pipeline analysis, deployment detection, container scanning, data-file scanning, environment variable resolution, KB enrichment, and more.
  • 30 AI component types — model, llm_endpoint, model_endpoint, agent, agent_proxy, tool, mcp_server, mcp_client, mcp_gateway, embedding, vector_store, dataset, retriever, knowledge_base, feature_store, memory, prompt, training_run, hyperparameter, model_artifact, experiment_tracker, model_registry, data_versioning, ml_pipeline, guardrail, skill, observability, secret, dependency, other.
  • Three-tier detection — Tier 1 (deterministic high-confidence), Tier 2 (cross-reference resolution), Tier 3 (agentic LLM reasoning). Tier 1 code-level detection is deepest for Python (LibCST); other languages extract imports and literal patterns via tree-sitter and lean more on Tier 3 for confirmation.
  • 10 output formats — Plaintext, JSON, CycloneDX, SARIF, SPDX, HTML dashboard, Markdown, CSV, JUnit, and a live API server.
  • Container image scanning — Extract and analyze application source code from Docker, Podman, nerdctl, Buildah, Skopeo, or Crane images, with Anchore Syft for SBOM metadata.
  • Cross-repo and org-level scanning — Scan multiple local repos, GitHub orgs, GitLab groups, or Bitbucket projects, with incremental caching.
  • Agentic classification — LLM agent (via Deep Agents + LangChain) classifies every scanner candidate, eliminating false positives and enriching confirmed components with concrete identifiers.
  • Policy engine — YAML-driven pass/fail gates for CI/CD integration (max-risk, required fields, blocked/required component types).
  • Compliance checks — EU AI Act, OWASP Agentic Top 10, NIST AI RMF advisory mappings.
  • Watch mode — Real-time file-system monitoring with debounced re-scan and delta reporting.
  • Diff command — Compare two AIBOM JSON snapshots side-by-side.
  • Benchmark command — Measure precision/recall/F1 against a labelled ground-truth file.
  • Secret detection — Integrated Yelp detect-secrets for hardcoded API keys, tokens, and credentials.
  • Vulnerability scanning — OSV.dev API lookups for known CVEs in detected dependencies.
  • Plugin system — Extend with custom scanners and reporters via Python entry points.
  • Custom catalog — Register custom AI components, base-class rules, excludes, and relationships via .aibom.yaml.
  • Knowledge base — Curated DuckDB catalog of AI framework symbols with download, verification, and versioned updates.

Language coverage at a glance

Language Dep manifests Code-level detection Env-var resolver Structural / KB / MCP / A2A scanners
Python pip, Poetry, uv, setuptools LibCST (full) yes yes
JavaScript / TypeScript npm, yarn, pnpm tree-sitter (imports + literals) yes no
Java Maven, Gradle tree-sitter (imports + literals) yes no
Go go.mod tree-sitter (imports + literals) yes no
Rust Cargo tree-sitter (imports + literals) no no
Ruby Gemfile tree-sitter (imports + literals) yes no
C# *.csproj tree-sitter (imports + literals) no no

Non-Python code-level detection is focused on dependency imports (matched against a curated allowlist) and inline model: "..." literals. Deeper structural signals — agent instantiations, MCP/A2A detection, ReAct-style loops, KB symbol enrichment — are Python-only today. For non-Python code, Tier 3 agentic classification fills most of the gap. See docs/TECHNICAL_OVERVIEW.md §12 for details.

Repository Layout

aibom/   # Python analyzer package + CLI
docs/    # Documentation (CLI reference, guides, API docs)

Installation

Prerequisites

  • Python 3.11+
  • uv (Python package manager)
  • Docker / Podman (optional, for container image analysis)
  • LLM provider credentials (required for --llm-model; see Agentic Enrichment)

Install from PyPI

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Analyze with OpenAI / Azure OpenAI
uv tool install --python 3.13 "cisco-aibom[agentic,llm-openai]"

# Analyze with AWS Bedrock
uv tool install --python 3.13 "cisco-aibom[agentic,llm-aws]"

# Core CLI only (report rendering, cache inspection, KB commands, etc.)
uv tool install --python 3.13 cisco-aibom

# Everything
uv tool install --python 3.13 "cisco-aibom[all]"

# Verify
cisco-aibom --help

cisco-aibom analyze always requires --llm-model. If the required agentic or provider extras are missing, the CLI fails fast and prints the exact uv tool install ... hint for the missing extras.

Install from source

uv tool install --python 3.13 --from git+https://github.com/cisco-ai-defense/aibom cisco-aibom

Local development

git clone https://github.com/cisco-ai-defense/aibom.git
cd aibom/aibom

uv sync
source .venv/bin/activate

cisco-aibom --help

When working from source, you can also use uv run cisco-aibom ... or uv run python -m aibom ....

Extras

The analyze command always runs the agentic pipeline, so it requires the agentic extra plus at least one LLM provider extra. The simplest install is cisco-aibom[all]; otherwise pick the provider you plan to use.

Required for analyze:

Extra Installs Purpose
agentic Deep Agents, LangChain LLM-powered agentic enrichment (mandatory for analyze)
llm-openai langchain-openai OpenAI / Azure OpenAI provider
llm-aws langchain-aws AWS Bedrock provider
llm-anthropic langchain-anthropic Anthropic Claude provider
llm-google langchain-google-genai Google Gemini provider

Optional (degrade gracefully if missing):

Extra Installs Purpose
analysis detect-secrets, tree-sitter Secret detection, multi-language parsing
security cisco-ai-mcp-scanner, cisco-ai-skill-scanner Cisco security tool integration
cloud boto3, google-cloud-aiplatform, azure-* Cloud resource scanning
all All of the above Full feature set
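
For example, to pair the mandatory agentic extra with a single provider and the optional analysis extra, an install along these lines should work (swap the provider extra to match your setup):

# OpenAI provider plus secret detection and multi-language parsing
uv tool install --python 3.13 "cisco-aibom[agentic,llm-openai,analysis]"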

Quick Start

# Scan a local project (--llm-model is required)
cisco-aibom analyze /path/to/project -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Scan a container image
cisco-aibom analyze my-app:latest -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Scan multiple repos under a directory
cisco-aibom analyze /path/to/repos --discover-repos -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# HTML dashboard
cisco-aibom analyze /path/to/project -o html -O dashboard.html \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Policy gate for CI
cisco-aibom analyze /path/to/project -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY --policy policy.yaml

All LLM options can be set via environment variables (AIBOM_LLM_MODEL, AIBOM_LLM_API_KEY, etc.) for cleaner commands.
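
For instance, the Quick Start commands above can be shortened by exporting the LLM settings once (same OpenAI setup assumed; the variable names are listed under Environment Variables below):

export AIBOM_LLM_MODEL=gpt-5.4
export AIBOM_LLM_API_KEY=$OPENAI_API_KEY

cisco-aibom analyze /path/to/project -o json -O report.json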

Commands

Command Description
analyze Scan source code, container images, or repos and produce an AI BOM.
report Render or upload a previously generated JSON report.
watch Poll directories for changes and re-scan with delta reporting (deterministic pipeline only).
diff run Compare two AIBOM JSON reports side-by-side.
benchmark run Measure precision/recall/F1 against ground-truth YAML.
kb download Download the latest knowledge base.
kb check Check if a newer KB version is available.
kb info Display info about the locally installed KB.
kb verify Verify KB integrity (SHA-256 checksum).
kb request Request a KB build for a specific SDK version.
kb request-status Check the status of a KB build request.
kb list-requests List all pending KB build requests.
cache clear Remove cached scan results and agentic cache.
cache list List cached entries by cache type.
cache get Inspect a specific cache entry.
plugin list List discovered plugins (entry points, MCP servers).

See docs/CLI_REFERENCE.md for complete option details.
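
Illustrative invocations for the non-analyze commands (the positional argument shapes below are sketches, not verified; check docs/CLI_REFERENCE.md for the exact syntax):

# Sketches only: positional arguments are assumed
cisco-aibom watch /path/to/project                      # deterministic pipeline only
cisco-aibom diff run baseline-report.json new-report.json
cisco-aibom benchmark run report.json ground-truth.yaml
cisco-aibom plugin list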

Global options

Option Env Var Description
--log-level AIBOM_LOG_LEVEL Logging level: DEBUG, INFO, WARNING, ERROR, CRITICAL (default INFO).

Agentic Enrichment

The --llm-model option (or AIBOM_LLM_MODEL env var) is required. The LLM agent acts as the final classifier for every scanner candidate and requires the agentic extra plus any provider-specific integration extra (for example llm-openai or llm-aws):

  • Confirms or removes every scanner candidate (no unverified findings)
  • Classifies and enriches components with concrete identifiers
  • Verifies dependencies against package registries (PyPI, npm, Go)
  • Discovers components missed by static analysis

# OpenAI
cisco-aibom analyze ./my-app -o json -O report.json \
  --llm-model gpt-5.4 --llm-provider openai --llm-api-key $OPENAI_API_KEY

# Azure OpenAI
cisco-aibom analyze ./my-app -o json -O report.json \
  --llm-model gpt-5.4 --llm-provider azure_openai \
  --llm-api-base https://my-endpoint.openai.azure.com \
  --llm-api-key $AZURE_OPENAI_API_KEY --llm-api-version 2024-12-01-preview

# AWS Bedrock
cisco-aibom analyze ./my-app -o json -O report.json \
  --llm-model us.anthropic.claude-sonnet-4-20250514-v1:0 --llm-provider bedrock

# Local Ollama
cisco-aibom analyze ./my-app -o json -O report.json \
  --llm-model gemma3:12b --llm-provider ollama \
  --llm-api-base http://localhost:11434

All LLM options can also be set via environment variables or a .env file. See docs/AGENTIC_MODE.md for the full guide.

Agentic tuning

Option Default Description
--agentic-batch-size 5 Max components per LLM invocation.
--agentic-concurrency 1 Max parallel LLM batches.
--agentic-timeout 120 Wall-clock seconds per batch before timeout.
--agentic-fast-model Cheaper model for simple confirmations (model lookups, dependency checks).
--progress auto Show live per-stage and per-scanner progress in interactive terminals.
--include-code-snippets off Include raw code snippets inside per-finding decision annotations.
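
A hedged example combining these knobs for a large repository (the fast-model name is a placeholder; use any model your provider supports):

cisco-aibom analyze /path/to/project -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY \
  --agentic-batch-size 10 --agentic-concurrency 4 --agentic-timeout 180 \
  --agentic-fast-model gpt-5-mini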

Report and cache utilities

# Render a saved report
cisco-aibom report report.json

# Explicit show form
cisco-aibom report show report.json --raw-json

# Upload an existing JSON report
cisco-aibom report upload report.json --format json \
  --post-url https://example.invalid/aibom/reports \
  --ai-defense-api-key $AI_DEFENSE_API_KEY

# Inspect cache families under the shared cache root
cisco-aibom cache list --type scan
cisco-aibom cache list --type agentic
cisco-aibom cache get scan 0123456789ab

All cache families live under ~/.aibom/cache by default, including the deterministic scan cache, agentic cache, org cache, model cache, and package metadata cache.

Container Scanning

The CLI auto-detects container image references and extracts application source code for analysis.

# Auto-detect extraction method
cisco-aibom analyze my-app:latest -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Force a specific extraction tier
cisco-aibom analyze my-app:latest -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY \
  --container-extraction-tier podman

Supported tiers: auto, syft, docker, podman, nerdctl, buildah, crane, skopeo, tarball.

See docs/CONTAINER_SCANNING.md for details.

Cross-Repo and Org Scanning

# Discover and scan all git repos under a directory
cisco-aibom analyze /path/to/repos --discover-repos -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Scan a GitHub org (requires GITHUB_TOKEN)
cisco-aibom analyze --github-org my-org --platform-token $GITHUB_TOKEN -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Scan a GitLab group
cisco-aibom analyze --gitlab-group my-group --platform-token $GITLAB_TOKEN -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Scan repos from a file (JSON array or newline-delimited)
cisco-aibom analyze --repos-file repos.txt -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Incremental scan (skip repos with unchanged HEAD)
cisco-aibom analyze /path/to/repos --discover-repos --skip-unchanged -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Limit and filter
cisco-aibom analyze --github-org my-org --platform-token $GITHUB_TOKEN \
  --max-repos 50 --repo-filter "ml-" --parallel-repos 4 -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

Output Formats

Format Flag Description
Plaintext -o plaintext Human-readable text report.
JSON -o json Structured JSON with full component details.
CycloneDX -o cyclonedx CycloneDX 1.6 BOM (ML-BOM profile).
SARIF -o sarif SARIF v2.1.0 for IDE/CI integration.
SPDX -o spdx SPDX 3.0 with AI and Dataset profiles.
HTML -o html Interactive dashboard with dependency graph and risk heatmap.
Markdown -o markdown Markdown table report.
CSV -o csv Flat CSV for spreadsheet analysis.
JUnit -o junit JUnit XML for CI test result reporting.
API -o api Live FastAPI server at http://127.0.0.1:8000.

All file-based formats require --output-file / -O.
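
For example, to emit a CycloneDX ML-BOM or a SARIF file for code-scanning integration (output file names are arbitrary):

cisco-aibom analyze /path/to/project -o cyclonedx -O aibom.cdx.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

cisco-aibom analyze /path/to/project -o sarif -O aibom.sarif \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY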

Custom Catalog

The built-in DuckDB catalog covers popular AI frameworks, but you can extend it with a .aibom.yaml configuration file for custom components, base-class detection rules, exclude patterns, and relationship hints.

# .aibom.yaml
components:
  - id: MyLLMWrapper
    concept: model
    label: My Custom LLM
    framework: internal

base_classes:
  - class: BaseTool
    concept: tool

excludes:
  - some_noisy_helper_function

relationship_hints:
  tool_arguments:
    - custom_tools

Place .aibom.yaml in your project root (auto-discovered) or pass --custom-catalog /path/to/.aibom.yaml.

Supported keys: components, base_classes, excludes, relationship_hints, custom_relationships. See the full reference in aibom/examples/.aibom.yaml.
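
To use a catalog stored outside the project root, pass it explicitly (the path below is illustrative):

cisco-aibom analyze /path/to/project -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY \
  --custom-catalog /path/to/configs/.aibom.yaml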

Inline annotations

Tag classes and functions directly in source code:

# aibom: concept=guardrail framework=internal
class SafetyFilter:
    ...

class MyRouter:  # aibom: concept=router
    ...

Policy Engine

Define pass/fail gates in a YAML policy file for CI/CD integration:

# policy.yaml
max_risk_score: 70
required_fields:
  - model_name
blocked_types:
  - secret
required_types:
  - guardrail
rules:
  - name: no-hardcoded-keys
    field: metadata.secret_type
    operator: not_exists

cisco-aibom analyze ./my-app -o json -O report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY --policy policy.yaml
# Exit code 1 if policy fails

Knowledge Base

The analyzer uses a versioned DuckDB catalog of AI framework symbols.

# Download the latest KB
cisco-aibom kb download

# Check for updates
cisco-aibom kb check

# Verify integrity
cisco-aibom kb verify

# View info
cisco-aibom kb info

Manual download from GitHub Releases (replace <VERSION> with the desired KB version, e.g. the latest tag from Releases):

VERSION="<VERSION>"
mkdir -p "${HOME}/.aibom/catalogs"
gh release download "${VERSION}" \
  --repo cisco-ai-defense/aibom \
  --pattern "aibom_catalog-${VERSION}.duckdb" \
  --dir "${HOME}/.aibom/catalogs"

export AIBOM_DB_PATH="${HOME}/.aibom/catalogs/aibom_catalog-${VERSION}.duckdb"

Environment Variables

All CLI options with an envvar binding can be set via environment variables or a .env file. The CLI auto-loads .env from the current directory, or you can specify a custom path with AIBOM_ENV_FILE.

Variable CLI Option Description
AIBOM_LOG_LEVEL --log-level Logging level (default INFO).
AIBOM_LLM_MODEL --llm-model LLM model name.
AIBOM_LLM_PROVIDER --llm-provider LangChain provider (openai, azure_openai, bedrock, ollama, etc.).
AIBOM_LLM_API_KEY --llm-api-key LLM API key.
AIBOM_LLM_API_BASE --llm-api-base LLM API base URL.
AIBOM_LLM_API_VERSION --llm-api-version API version (Azure OpenAI).
AIBOM_POST_URL --post-url HTTP endpoint to POST the report to.
AIBOM_POST_TIMEOUT --post-timeout Timeout in seconds for POSTing the report (default 30).
AIBOM_POST_VERIFY_TLS --post-verify-tls Verify TLS certificates when POSTing (true/false).
AI_DEFENSE_API_KEY --ai-defense-api-key Cisco AI Defense tenant API key (sent as x-cisco-ai-defense-tenant-api-key).
CISCO_AI_DEFENSE_API_KEY --api-key (kb commands) Cisco AI Defense tenant API key for kb request, kb request-status, and kb list-requests.
CISCO_AI_DEFENSE_API_BASE --api-base (kb commands) Regional Cisco AI Defense API host for kb request* commands (e.g. https://api.security.cisco.com, https://api.eu.security.cisco.com). No default.
CISCO_AIBOM_MANIFEST_URL --url (kb download) KB manifest URL for kb download / kb check. No default.
AIBOM_GITHUB_ORG --github-org GitHub org for repo discovery.
AIBOM_GITLAB_GROUP --gitlab-group GitLab group for repo discovery.
AIBOM_BITBUCKET_PROJECT --bitbucket-project Bitbucket project for repo discovery.
AIBOM_PLATFORM_TOKEN --platform-token Auth token for GitHub/GitLab/Bitbucket.
AIBOM_DB_PATH Override path to the DuckDB catalog file.
AIBOM_DB_SHA256 Expected SHA-256 checksum for the catalog.
AIBOM_MANIFEST_PATH Override path to manifest.json.
AIBOM_ENV_FILE Path to a custom .env file.
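
A minimal .env sketch using the variables above (values are placeholders):

# .env (auto-loaded from the current directory)
AIBOM_LLM_MODEL=gpt-5.4
AIBOM_LLM_PROVIDER=openai
AIBOM_LLM_API_KEY=sk-...
AIBOM_LOG_LEVEL=INFO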

Docker

A single multi-stage Dockerfile is provided. It installs the all extra (which expands to analysis, security, agentic, llm-openai, llm-aws, llm-anthropic, llm-google, cloud) because the analyze command requires the agentic pipeline plus at least one LLM provider. Image size is ~800 MB.

cd aibom

# Build
docker build -t cisco-aibom .

# Run
docker run --rm -v /path/to/project:/workspace cisco-aibom \
  analyze /workspace -o json -O /workspace/report.json \
  --llm-model gpt-5.4 --llm-api-key $OPENAI_API_KEY

# Or via docker compose (see aibom/docker-compose.yml for env var forwarding)
SCAN_DIR=/path/to/project docker compose run aibom \
  analyze /workspace -o json -O /workspace/report.json \
  --llm-model gpt-5.4

Testing

cd aibom
uv run pytest tests -v

Further Reading

  • docs/CLI_REFERENCE.md — complete CLI option and command reference
  • docs/AGENTIC_MODE.md — agentic enrichment and LLM provider setup
  • docs/CONTAINER_SCANNING.md — container image extraction and scanning
  • docs/TECHNICAL_OVERVIEW.md — architecture and language-coverage details

Troubleshooting

  • DuckDB catalog errors: Run cisco-aibom kb download to fetch the latest catalog, or set AIBOM_DB_PATH to point at an existing file. Use cisco-aibom kb verify to check integrity.
  • Container extraction fails: Ensure Docker or an alternative runtime is installed and running. Use --container-extraction-tier to force a specific tool. See docs/CONTAINER_SCANNING.md.
  • Missing --llm-model: The LLM agent is required. Install the agentic extra (uv tool install "cisco-aibom[agentic,llm-openai]") and supply --llm-model or set AIBOM_LLM_MODEL. See docs/AGENTIC_MODE.md.
  • LLM provider errors: Ensure --llm-provider matches the installed LangChain integration package. For Azure OpenAI, --llm-api-version is required.
  • Slow scans on large repos: Use --timing to identify bottlenecks. Use --agentic-fast-model for a cheaper model on simple confirmations, or increase --agentic-concurrency for parallel batches.
  • Missing output files: --output-file / -O is required for all file-based formats.
  • Report submission: Set AIBOM_POST_URL and AI_DEFENSE_API_KEY. Regional endpoints: US (api.security.cisco.com), APJ (api.apj.security.cisco.com), EU (api.eu.security.cisco.com), UAE (api.uae.security.cisco.com).
