cpp

Foundry Local C++ SDK

The Foundry Local C++ SDK provides a C++17 static library for running AI models locally via Foundry Local. Discover, download, load, and run inference entirely on your own machine — no cloud required.

Windows-only — requires MSVC or clang-cl (MSVC-compatible toolchain).

Features

Model catalog — browse and search all available models; filter by cached or loaded state
Lifecycle management — download, load, unload, and remove models programmatically
Chat completions — synchronous and streaming via OpenAI-compatible types
Audio transcription — transcribe audio files with streaming support
Tool calling — define tools and handle tool-call responses in chat completions
Download progress — wire up a callback for real-time download percentage
Model variants — select specific hardware/quantization variants per model alias
Optional web service — start an OpenAI-compatible REST endpoint
Execution providers — discover, download, and register EPs with per-EP progress reporting
Auto NuGet download — CMake auto-downloads native runtime DLLs at configure time

Prerequisites

Requirement	Notes
CMake >= 3.20	Ships with Visual Studio 2022
Ninja	Ships with Visual Studio 2022
vcpkg	Set the `VCPKG_ROOT` environment variable to your vcpkg installation
MSVC (or clang-cl)	Visual Studio 2022 Build Tools or full IDE
NuGet CLI	Required for auto-downloading native runtime DLLs. Install via `winget install Microsoft.NuGet`

Building from Source

0. Open an x64 developer environment

All commands below must run in a shell that has the x64 MSVC toolchain on the PATH. Choose one of the following:

Method	How to open
Developer Command Prompt	Start Menu → "x64 Native Tools Command Prompt for VS 2022"
Developer PowerShell	Start Menu → "Developer PowerShell for VS 2022"
Inside an existing cmd	Run `"<VS_INSTALL>\VC\Auxiliary\Build\vcvars64.bat"` where `<VS_INSTALL>` is your Visual Studio installation path (e.g. `C:\Program Files\Microsoft Visual Studio\2022\Enterprise`)
VS Code terminal	Open the project folder in VS Code with the CMake Tools extension; it configures the environment automatically

Verify by running cl.exe — the banner should say x64.

1. Clone & navigate

git clone https://github.com/microsoft/Foundry-Local.git
cd Foundry-Local/sdk/cpp

2. Configure (CMake + vcpkg)

cmake --preset x64-debug

This uses the x64-debug preset which:

Uses the Ninja generator
Resolves C++ dependencies via vcpkg (nlohmann-json, ms-gsl, gtest)
Builds with the x64-windows-static-md triplet
Auto-downloads native runtime DLLs via NuGet:
- Microsoft.AI.Foundry.Local.Core (1.2.1) — Foundry Local core runtime
- Microsoft.ML.OnnxRuntime.Foundry (1.26.0) — ONNX Runtime
- Microsoft.ML.OnnxRuntimeGenAI.Foundry (0.14.1) — ONNX Runtime GenAI

NuGet packages are cached in out/build/<preset>/_native_deps/ and only downloaded on first configure. Runtime DLLs are automatically copied next to executables via post-build steps.

3. Build

cmake --build --preset x64-debug

Release build

cmake --preset x64-release
cmake --build --preset x64-release
ctest --preset x64-release

Quick Start

#include "foundry_local.h"
#include <iostream>

using namespace foundry_local;

int main() {
    // 1. Create the manager
    Manager::Create({"MyApp"});
    auto& manager = Manager::Instance();

    // 2. Get a model from the catalog
    auto& catalog = manager.GetCatalog();
    auto* model = catalog.GetModel("phi-3.5-mini");
    if (!model) return 1;

    // 3. Download (if needed) and load
    model->Download();
    model->Load();

    // 4. Chat
    OpenAIChatClient chat(*model);
    std::vector<ChatMessage> messages = {{"user", "Hello!"}};
    ChatSettings settings;
    settings.max_tokens = 128;

    auto response = chat.CompleteChat(messages, settings);
    if (!response.choices.empty() && response.choices[0].message) {
        std::cout << response.choices[0].message->content << "\n";
    }

    // 5. Cleanup
    model->Unload();
    Manager::Destroy();
}

Vision Sample (Responses API)

A complete vision sample is included at sample/web-server-responses-vision/. It demonstrates image understanding using the Responses API with streaming via cURL.

Build and run from the SDK root:

cmake --preset x64-debug
cmake --build --preset x64-debug --target WebServerResponsesVision
.\out\build\x64-debug\WebServerResponsesVision.exe qwen3.5-0.8b

Or build standalone from the sample directory:

cd sample/web-server-responses-vision
cmake --preset x64-debug
cmake --build --preset x64-debug
.\out\build\x64-debug\web-server-responses-vision.exe qwen3.5-0.8b

See sample/web-server-responses-vision/README.md for full details.

Usage

Initialization

Manager is a singleton. Call Create once at startup:

Manager::Create(Configuration{"MyApp"}, &myLogger);

Access it anywhere afterward via Manager::Instance(). Check Manager::IsInitialized() to verify creation.

Call Manager::Destroy() to perform deterministic cleanup when done.

Catalog

The catalog lists all models known to the Foundry Local Core:

auto& catalog = Manager::Instance().GetCatalog();

// List all available models
auto models = catalog.GetModels();
for (auto* m : models)
    std::cout << m->GetAlias() << " — " << m->GetId() << "\n";

// Get a specific model by alias
auto* model = catalog.GetModel("phi-3.5-mini");

// Get a specific variant by its unique model ID
auto* variant = catalog.GetModelVariant("phi-3.5-mini-generic-gpu-4");

// List models already downloaded to the local cache
auto cached = catalog.GetCachedModels();

// List models currently loaded in memory
auto loaded = catalog.GetLoadedModels();

Model Lifecycle

Each model may have multiple variants (different quantizations, hardware targets). The SDK auto-selects the best variant, or you can pick one.

// Check and select variants
if (auto* concrete = dynamic_cast<Model*>(model)) {
    for (const auto& v : concrete->GetVariants()) {
        std::cout << v.GetId() << " (cached: " << v.IsCached() << ")\n";
    }
    // Switch to a specific variant (e.g., CPU)
    for (const auto& variant : concrete->GetVariants()) {
        if (variant.GetInfo().runtime &&
            variant.GetInfo().runtime->device_type == DeviceType::CPU) {
            concrete->SelectVariant(variant);
            break;
        }
    }
}

Download, load, and unload:

// Download with progress reporting
model->Download([](float progress) {
    std::cout << "Download: " << progress << "%\n";
    return true;
});

// Load into memory
model->Load();

// Unload when done
model->Unload();

// Remove from local cache entirely
model->RemoveFromCache();

Chat Completions

OpenAIChatClient chat(*model);

std::vector<ChatMessage> messages = {
    {"system", "You are a helpful assistant."},
    {"user", "Explain async/await in C#."}
};
ChatSettings settings;

auto response = chat.CompleteChat(messages, settings);
if (!response.choices.empty() && response.choices[0].message) {
    std::cout << response.choices[0].message->content << "\n";
}

Streaming

Use a callback for token-by-token output:

chat.CompleteChatStreaming(messages, settings, [](const ChatCompletionCreateResponse& chunk) {
    if (!chunk.choices.empty() && chunk.choices[0].delta) {
        std::cout << chunk.choices[0].delta->content << std::flush;
    }
});

Chat Settings

Tune generation parameters per request:

ChatSettings settings;
settings.temperature = 0.7f;
settings.max_tokens = 256;
settings.top_p = 0.9f;
settings.frequency_penalty = 0.5f;

Audio Transcription

OpenAIAudioClient audio(*model);

// One-shot transcription
auto result = audio.TranscribeAudio(R"(C:\path\to\audio.wav)");
std::cout << result.text << "\n";

// Streaming transcription
audio.TranscribeAudioStreaming(R"(C:\path\to\audio.wav)", [](const AudioCreateTranscriptionResponse& chunk) {
    std::cout << chunk.text;
});

Tool Calling

See sample/main.cpp (Example 5) for a full tool-calling walkthrough.

Web Service

Start an OpenAI-compatible REST endpoint for use by external tools or processes:

Configuration config{"MyApp"};
config.web = WebServiceConfig{ "http://127.0.0.1:5000" };
Manager::Create(std::move(config));

Manager::Instance().StartWebService();
auto urls = Manager::Instance().GetWebServiceEndpoints();
for (const auto& url : urls)
    std::cout << "Listening on: " << url << "\n";

// ... use the service ...

Manager::Instance().StopWebService();

Execution Providers

Discover and download execution providers for hardware acceleration:

// Discover available EPs
auto eps = manager.DiscoverEps();
for (const auto& ep : eps) {
    std::cout << ep.name << " — registered: " << (ep.is_registered ? "yes" : "no") << "\n";
}

// Download and register all EPs with progress
std::string currentEp;
auto result = manager.DownloadAndRegisterEps([&](const std::string& epName, double percent) {
    if (epName != currentEp) {
        if (!currentEp.empty()) std::cout << "\n";
        currentEp = epName;
    }
    std::cout << "\r  " << epName << "  " << percent << "%" << std::flush;
});
std::cout << "\n";

// Or download specific EPs only
auto result2 = manager.DownloadAndRegisterEps({"WebGpuExecutionProvider"});

// Check results
if (result.success) {
    for (const auto& ep : result.registered_eps)
        std::cout << "Registered: " << ep << "\n";
}

Using the Prebuilt SDK (Zip)

Creating the zip

Build and install the SDK to produce the redistributable layout:

cd sdk/cpp

# Release build
cmake --preset x64-release
cmake --build --preset x64-release
cmake --install out/build/x64-release --prefix out/foundry-local-cpp-sdk

# Optional: also install Debug lib for consumers who need Debug builds
cmake --preset x64-debug
cmake --build --preset x64-debug
cmake --install out/build/x64-debug --prefix out/foundry-local-cpp-sdk

This creates:

out/foundry-local-cpp-sdk/
├── include/          # Public headers
├── lib/CppSdk.lib   # Prebuilt static library
├── bin/              # Runtime DLLs (Core, OnnxRuntime, OnnxRuntimeGenAI)
├── cmake/            # FoundryLocalConfig.cmake
└── README.md

Zip the out/foundry-local-cpp-sdk/ folder and distribute.

Using the zip in your project

Unzip to a folder (e.g. foundry-local-cpp-sdk/)
In your CMakeLists.txt:

cmake_minimum_required(VERSION 3.20)
project(my-app)

set(CMAKE_CXX_STANDARD 17)
set(VCPKG_TARGET_TRIPLET "x64-windows-static-md" CACHE STRING "")

list(APPEND CMAKE_PREFIX_PATH "${CMAKE_CURRENT_SOURCE_DIR}/foundry-local-cpp-sdk")
find_package(FoundryLocal REQUIRED)

add_executable(my-app main.cpp)
target_link_libraries(my-app PRIVATE FoundryLocal::FoundryLocal)

# Auto-copies Core DLL, ORT DLLs next to the exe
fl_copy_runtime_dlls(my-app)

Create a vcpkg.json with the required transitive dependencies:

{
  "dependencies": ["nlohmann-json", "ms-gsl"]
}

Build:

cmake -G Ninja -B build -DCMAKE_TOOLCHAIN_FILE="%VCPKG_ROOT%/scripts/buildsystems/vcpkg.cmake"
cmake --build build

Note: Match your build type to the SDK's. If the zip only contains a Release lib, build your project in Release (-DCMAKE_BUILD_TYPE=Release) to avoid MSVC runtime-library mismatches. If both Debug and Release libs are included, CMake selects the correct one automatically.

Using the SDK from Source

Include the SDK via add_subdirectory (e.g. from the repo):

add_subdirectory(path/to/sdk/cpp ${CMAKE_CURRENT_BINARY_DIR}/CppSdk)

add_executable(my_app main.cpp)
target_link_libraries(my_app PRIVATE CppSdk)
fl_copy_runtime_dlls(my_app)

Configuration

Property	Type	Default	Description
`app_name`	`std::string`	(required)	Your application name
`app_data_dir`	`optional<path>`	`~/.{app_name}`	Application data directory
`model_cache_dir`	`optional<path>`	`{app_data_dir}/cache/models`	Where models are stored locally
`logs_dir`	`optional<path>`	`{app_data_dir}/logs`	Log output directory
`log_level`	`LogLevel`	`Warning`	Verbose, Debug, Information, Warning, Error, Fatal
`web`	`optional<WebServiceConfig>`	`nullopt`	Web service configuration (see below)
`additional_settings`	`optional<unordered_map>`	`nullopt`	Extra key-value settings passed to Core

WebServiceConfig

Property	Type	Default	Description
`urls`	`optional<string>`	`127.0.0.1:0`	Bind address; semicolon-separated for multiple
`external_url`	`optional<string>`	`nullopt`	URI for accessing the web service in a separate process

API Reference

Key types:

Type	Description
`Manager`	Singleton entry point — create, catalog, web service, EP management
`Configuration`	Initialization settings
`Catalog`	Model catalog — list, search, filter
`IModel`	Model interface — identity, metadata, lifecycle
`Model`	Model with variant selection (implements `IModel`)
`ModelVariant`	A specific variant of a model (implements `IModel`)
`OpenAIChatClient`	Chat completions (sync + streaming)
`OpenAIAudioClient`	Audio transcription (sync + streaming)
`EpInfo`	Execution provider discovery info (name, registration status)
`EpDownloadResult`	Result of EP download/registration (success, registered/failed EPs)
`ChatSettings`	Chat generation parameters
`ModelInfo`	Full model metadata record

Tests

ctest --preset x64-debug

Or run the test executable directly:

.\out\build\x64-debug\CppSdkTests.exe

Project Structure

sdk/cpp/
├── include/                  # Public headers
│   ├── foundry_local.h       # Umbrella header (include this)
│   ├── configuration.h       # Configuration struct
│   ├── foundry_local_manager.h  # Manager singleton + EP types
│   ├── catalog.h             # Model catalog
│   ├── model.h               # Model & ModelVariant
│   ├── logger.h              # ILogger interface
│   └── openai/
│       ├── chat_client.h     # Chat completion client
│       ├── audio_client.h    # Audio transcription client
│       └── tool_types.h      # Tool calling types
├── src/                      # Private implementation
├── sample/
│   ├── main.cpp              # Sample application
│   └── web-server-responses-vision/  # Vision sample (Responses API)
├── test/                     # Unit & E2E tests (GTest)
├── CMakeLists.txt
├── CMakePresets.json
├── vcpkg.json                # vcpkg dependencies
└── vcpkg-configuration.json

Troubleshooting

Error	Cause	Fix
`DML provider requested, but GenAI has not been built with DML support`	GPU variant selected but ONNX Runtime GenAI lacks DML	Select a CPU variant or update Foundry Local
`OgaGenerator_TokenCount not found in onnxruntime-genai`	Version mismatch between Foundry Local components	Update NuGet package versions in CMakeLists.txt
`API version [N] is not available`	ONNX Runtime version too old for the Foundry Local service	Update NuGet package versions in CMakeLists.txt
`nuget.exe not found on PATH`	NuGet CLI not installed	Install via `winget install Microsoft.NuGet`
`Failed to load shared library: Microsoft.AI.Foundry.Local.Core.dll`	Runtime DLLs not next to executable	Reconfigure with `cmake --preset x64-debug` to re-download NuGet packages, then rebuild
NuGet packages not installed or DLLs not copied correctly	Stale or corrupted build cache	Delete the `out` folder (`rmdir /s /q out`) and reconfigure from scratch: `cmake --preset x64-debug && cmake --build --preset x64-debug`

License

Licensed under the MIT License.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Foundry Local C++ SDK

Features

Prerequisites

Building from Source

0. Open an x64 developer environment

1. Clone & navigate

2. Configure (CMake + vcpkg)

3. Build

Release build

Quick Start

Vision Sample (Responses API)

Usage

Initialization

Catalog

Model Lifecycle

Chat Completions

Streaming

Chat Settings

Audio Transcription

Tool Calling

Web Service

Execution Providers

Using the Prebuilt SDK (Zip)

Creating the zip

Using the zip in your project

Using the SDK from Source

Configuration

API Reference

Tests

Project Structure

Troubleshooting

License

Name		Name	Last commit message	Last commit date
parent directory ..
cmake		cmake
include		include
sample		sample
src		src
test		test
triplets		triplets
.clang-format		.clang-format
CMakeLists.txt		CMakeLists.txt
CMakePresets.json		CMakePresets.json
README.md		README.md
vcpkg-configuration.json		vcpkg-configuration.json
vcpkg.json		vcpkg.json

FilesExpand file tree

cpp

Directory actions

More options

Directory actions

More options

Latest commit

History

cpp

Folders and files

parent directory

README.md

Foundry Local C++ SDK

Features

Prerequisites

Building from Source

0. Open an x64 developer environment

1. Clone & navigate

2. Configure (CMake + vcpkg)

3. Build

Release build

Quick Start

Vision Sample (Responses API)

Usage

Initialization

Catalog

Model Lifecycle

Chat Completions

Streaming

Chat Settings

Audio Transcription

Tool Calling

Web Service

Execution Providers

Using the Prebuilt SDK (Zip)

Creating the zip

Using the zip in your project

Using the SDK from Source

Configuration

API Reference

Tests

Project Structure

Troubleshooting

License