Using Cohere models via the OpenAI SDK

The Compatibility API allows developers to use Cohere’s models through OpenAI’s SDK.

It lets you switch existing OpenAI-based applications to Cohere’s models while continuing to use the OpenAI SDK, with no major refactoring needed.

The supported libraries are:

TypeScript / JavaScript
Python
.NET
Java (beta)
Go (beta)

This is a quickstart guide to help you get started with the Compatibility API.

Installation

First, install the OpenAI SDK and import the package.

Then, create a client and configure it with the compatibility API base URL and your Cohere API key.

$ pip install openai

from openai import OpenAI

# Point the OpenAI client at Cohere's Compatibility API.
client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",  # Replace with your Cohere API key
)
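In practice you would likely load the key from the environment rather than hard-coding it. The following is a minimal sketch, assuming the key is stored in an environment variable named COHERE_API_KEY (a name chosen here for illustration). The helpers client_settings and make_client are hypothetical and not part of either SDK.

```python
import os

COMPAT_BASE_URL = "https://api.cohere.ai/compatibility/v1"


def client_settings(env=None):
    """Return keyword arguments for OpenAI(...), reading the key from env.

    env defaults to os.environ; COHERE_API_KEY is an assumed variable name.
    """
    env = os.environ if env is None else env
    api_key = env.get("COHERE_API_KEY")
    if not api_key:
        raise RuntimeError("Set the COHERE_API_KEY environment variable first.")
    return {"base_url": COMPAT_BASE_URL, "api_key": api_key}


def make_client():
    """Build the client; requires the openai package to be installed."""
    from openai import OpenAI

    return OpenAI(**client_settings())
```

Calling make_client() then gives you a client configured the same way as in the snippet above, without the key appearing in source code.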

Basic chat completions

Here’s a basic example of using the Chat Completions API.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

# Send a single user message to the model.
completion = client.chat.completions.create(
    model="command-a-03-2025",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about recursion in programming.",
        },
    ],
)

# The reply is the message of the first choice.
print(completion.choices[0].message)

Example response (via the Python SDK):

ChatCompletionMessage(content="Recursive loops,\nUnraveling code's depths,\nEndless, yet complete.", refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)

Chat with streaming

To stream the response, set the stream parameter to True.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

# With stream=True the call returns an iterator of chunks.
stream = client.chat.completions.create(
    model="command-a-03-2025",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about recursion in programming.",
        },
    ],
    stream=True,
)

# Print each text fragment as soon as it arrives.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Example response (via the Python SDK):

Recursive call,
Unraveling, line by line,
Solving, then again.
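If you also need the full text once streaming finishes, you can accumulate the fragments as they arrive. Below is a minimal sketch of that idea. The helper collect_stream is hypothetical, and fake_chunk only builds stand-in objects shaped like streaming chunks so the example runs without a network call. Skipping chunks that have no choices is a defensive assumption, not documented behavior.

```python
from types import SimpleNamespace


def collect_stream(stream, on_text=None):
    """Join the text deltas of a streamed chat completion into one string.

    Chunks without choices or without text content are skipped.
    """
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            parts.append(text)
            if on_text is not None:
                on_text(text)  # e.g. print the fragment immediately
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in object shaped like a streaming chunk (demo only)."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Recursive call,\n"), fake_chunk(None), fake_chunk("Unraveling")]
print(collect_stream(demo))
```

With a real response you would pass the stream object returned by client.chat.completions.create(..., stream=True) in place of demo.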

State management

For state management, use the messages parameter to build the conversation history.

You can include a system message via the developer role, followed by multiple chat turns between the user and the assistant.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

completion = client.chat.completions.create(
    messages=[
        # System-level instruction, sent with the developer role.
        {
            "role": "developer",
            "content": "You must respond in the style of a pirate.",
        },
        # Earlier turns of the conversation.
        {
            "role": "user",
            "content": "What's 2 + 2.",
        },
        {
            "role": "assistant",
            "content": "Arrr, matey! 2 + 2 be 4, just like a doubloon in the sea!",
        },
        # The new user turn the model should answer.
        {
            "role": "user",
            "content": "Add 30 to that.",
        },
    ],
    model="command-a-03-2025",
)

print(completion.choices[0].message)

Example response (via the Python SDK):

ChatCompletionMessage(content='Aye aye, captain! 4 + 30 be 34, a treasure to behold!', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)
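Because the API itself is stateless, your application has to carry the history forward between calls. One way to do that is sketched below, assuming a small hypothetical helper named add_turn that appends one plain-text turn at a time. It covers only the developer, user, and assistant roles used in this section.

```python
def add_turn(history, role, content):
    """Return a new message list with one more turn appended."""
    allowed = {"developer", "user", "assistant"}
    if role not in allowed:
        raise ValueError(f"unexpected role: {role!r}")
    return history + [{"role": role, "content": content}]


history = [
    {"role": "developer", "content": "You must respond in the style of a pirate."}
]
history = add_turn(history, "user", "What's 2 + 2.")
# After each API call, store the reply before adding the next user turn:
#   reply = client.chat.completions.create(model="command-a-03-2025", messages=history)
#   history = add_turn(history, "assistant", reply.choices[0].message.content)
history = add_turn(history, "assistant", "Arrr, matey! 2 + 2 be 4!")
history = add_turn(history, "user", "Add 30 to that.")
print([message["role"] for message in history])
```

Returning a new list instead of mutating the old one keeps earlier snapshots of the conversation intact, which can help when retrying a failed call.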

Structured outputs

The Structured Outputs feature allows you to specify the schema of the model response. It guarantees that the response will strictly follow the schema.

To use it, set the response_format parameter to an object whose type is json_object and whose schema field holds the JSON Schema of the desired output.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

completion = client.beta.chat.completions.parse(
    model="command-a-03-2025",
    messages=[
        {
            "role": "user",
            "content": "Generate a JSON describing a book.",
        }
    ],
    # JSON Schema that the response must follow.
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "author": {"type": "string"},
                "publication_year": {"type": "integer"},
            },
            "required": ["title", "author", "publication_year"],
        },
    },
)

# The content is a JSON string that matches the schema.
print(completion.choices[0].message.content)

Example response (via the Python SDK):

{
    "title": "The Great Gatsby",
    "author": "F. Scott Fitzgerald",
    "publication_year": 1925
}
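The message content arrives as a JSON string, so you will usually decode it before use. Here is a minimal sketch of that step using only the standard library. The helper parse_book and the extra key check are illustrative assumptions, and the sample string is the response shown above.

```python
import json

REQUIRED_KEYS = ("title", "author", "publication_year")


def parse_book(content):
    """Decode the JSON string from message.content and check the required keys."""
    book = json.loads(content)
    missing = [key for key in REQUIRED_KEYS if key not in book]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return book


sample = (
    '{"title": "The Great Gatsby", '
    '"author": "F. Scott Fitzgerald", '
    '"publication_year": 1925}'
)
book = parse_book(sample)
print(book["title"], book["publication_year"])
```

With a live response you would pass completion.choices[0].message.content instead of sample.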

Tool use (function calling)

You can use the tool use feature by passing a list of tools to the tools parameter in the API call.

Setting the strict parameter to True in the tool calling step guarantees that every generated tool call follows the specified tool schema.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

# Tool definitions: a name, a description, and a JSON Schema for the arguments.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_flight_info",
            "description": "Get flight information between two cities or airports",
            "parameters": {
                "type": "object",
                "properties": {
                    "loc_origin": {
                        "type": "string",
                        "description": "The departure airport, e.g. MIA",
                    },
                    "loc_destination": {
                        "type": "string",
                        "description": "The destination airport, e.g. NYC",
                    },
                },
                "required": ["loc_origin", "loc_destination"],
            },
        },
    }
]

# Conversation so far: the assistant has requested a tool call,
# and the tool result has been added with the tool role.
messages = [
    {"role": "developer", "content": "Today is April 30th"},
    {
        "role": "user",
        "content": "When is the next flight from Miami to Seattle?",
    },
    {
        "role": "assistant",
        "tool_calls": [
            {
                "function": {
                    "arguments": '{ "loc_destination": "Seattle", "loc_origin": "Miami" }',
                    "name": "get_flight_info",
                },
                "id": "get_flight_info0",
                "type": "function",
            }
        ],
    },
    {
        "role": "tool",
        "name": "get_flight_info",
        "tool_call_id": "get_flight_info0",
        "content": "Miami to Seattle, May 1st, 10 AM.",
    },
]

# Ask the model to answer using the tool result.
completion = client.chat.completions.create(
    model="command-a-03-2025",
    messages=messages,
    tools=tools,
    temperature=0.7,
)

print(completion.choices[0].message)

Example response (via the Python SDK):

ChatCompletionMessage(content='The next flight from Miami to Seattle is on May 1st, 10 AM.', refusal=None, role='assistant', audio=None, function_call=None, tool_calls=None)
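The example above starts from a history that already contains the tool result. The step in between, running the requested function and building the tool message, could look roughly like the sketch below. It works on the dictionary form of a tool call as used in messages, while SDK response objects expose the same fields as attributes. The get_flight_info body, TOOL_REGISTRY, and run_tool_call are all hypothetical stand-ins.

```python
import json


def get_flight_info(loc_origin, loc_destination):
    """Stand-in implementation; a real one would query a flight service."""
    return f"{loc_origin} to {loc_destination}, May 1st, 10 AM."


# Maps tool names to local Python callables (hypothetical registry).
TOOL_REGISTRY = {"get_flight_info": get_flight_info}


def run_tool_call(tool_call):
    """Execute one tool call dict and return the matching tool message."""
    function = tool_call["function"]
    arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
    result = TOOL_REGISTRY[function["name"]](**arguments)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],  # links the result to the request
        "content": result,
    }


call = {
    "id": "get_flight_info0",
    "type": "function",
    "function": {
        "name": "get_flight_info",
        "arguments": '{ "loc_destination": "Seattle", "loc_origin": "Miami" }',
    },
}
print(run_tool_call(call))
```

The returned dictionary can be appended to messages before the follow-up call that produces the final answer.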

Embeddings

You can generate text embeddings with the Embeddings API by passing a list of strings as the input parameter. You can also use encoding_format to specify the format of the generated embeddings, which can be either float or base64.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

# Embed a list of strings; one embedding is returned per input.
response = client.embeddings.create(
    input=["Hello world!"],
    model="embed-v4.0",
    encoding_format="float",
)

# Display the first 5 dimensions of the first embedding.
print(response.data[0].embedding[:5])

Example response (via the Python SDK):

[0.0045051575, 0.046905518, 0.025543213, 0.009651184, -0.024993896]
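A common next step with float embeddings is comparing two of them. The following is a minimal sketch of cosine similarity in plain Python, using toy two-dimensional vectors in place of real embeddings. The function name is chosen here for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal
```

With real data you would pass response.data[i].embedding for two different inputs.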

Audio transcriptions

You can pass an audio file of up to 25MB to the Audio Transcriptions API to create a transcription of the audio.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cohere.ai/compatibility/v1",
    api_key="COHERE_API_KEY",
)

# Open the file in binary mode; the context manager closes it afterwards.
with open("./sample.wav", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        model="cohere-transcribe-03-2026",
        language="en",
        file=audio_file,
    )

print(response)

Supported parameters

The following is the list of supported parameters in the Compatibility API, including those that are not explicitly demonstrated in the examples above:

Chat completions

model
messages
stream
reasoning_effort (Only “none” and “high” are currently supported.)
response_format
tools
temperature
max_tokens
stop
seed
top_p
frequency_penalty
presence_penalty

Note

Currently, reasoning_effort accepts only none and high.
These correspond to disabling and enabling thinking in the Cohere Chat API, respectively.
Passing medium or low is not supported at this time.
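To catch an unsupported reasoning_effort value before the request is sent, you could validate it client-side. The sketch below assumes a hypothetical helper, chat_request, that assembles the keyword arguments for client.chat.completions.create and passes through sampling parameters from the list above.

```python
SUPPORTED_REASONING_EFFORT = ("none", "high")


def chat_request(model, messages, reasoning_effort=None, **sampling):
    """Build keyword arguments for client.chat.completions.create(...).

    sampling holds optional parameters such as temperature, max_tokens, or seed.
    reasoning_effort is only included when it is given explicitly.
    """
    request = {"model": model, "messages": messages, **sampling}
    if reasoning_effort is not None:
        if reasoning_effort not in SUPPORTED_REASONING_EFFORT:
            raise ValueError(
                f"reasoning_effort must be one of {SUPPORTED_REASONING_EFFORT}"
            )
        request["reasoning_effort"] = reasoning_effort
    return request


request = chat_request(
    "command-a-03-2025",
    [{"role": "user", "content": "Hello"}],
    temperature=0.3,
    max_tokens=100,
    seed=42,
)
print(sorted(request))
```

The resulting dictionary would then be unpacked into the call, as in client.chat.completions.create(**request).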

Embeddings

input
model
encoding_format

Audio transcriptions

model (required)
language (required)
file (required, must be the last parameter in the HTTP form-data request)
response_format (only “json” is supported)
temperature

Note

Please note the following:

language is required in the Cohere Audio Transcriptions API but optional in the OpenAI Audio Transcriptions API.
file must be the last parameter in the cURL call.
The Cohere Audio Transcriptions API supports FLAC, MP3, MPEG, MPGA, OGG, and WAV files but does not support the following file types that the OpenAI Audio Transcriptions API does support: MP4, M4A, and WEBM.
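A quick local check can catch files that are likely to be rejected before you upload them. The sketch below is built on two assumptions: that 25MB means 25 × 1024 × 1024 bytes, and that the file extension is a good enough proxy for the format. The helper audio_problems is hypothetical; the API itself remains the authority on what it accepts.

```python
from pathlib import Path

MAX_BYTES = 25 * 1024 * 1024  # assumption: "25MB" read as 25 MiB
SUPPORTED_SUFFIXES = {".flac", ".mp3", ".mpeg", ".mpga", ".ogg", ".wav"}


def audio_problems(path, size_bytes=None):
    """Return a list of reasons the file would likely be rejected (empty if none).

    size_bytes lets callers supply the size directly; otherwise it is read from disk.
    """
    path = Path(path)
    problems = []
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported file type: {path.suffix or '(none)'}")
    size = path.stat().st_size if size_bytes is None else size_bytes
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, above the {MAX_BYTES}-byte limit")
    return problems


print(audio_problems("sample.wav", size_bytes=1_000_000))
print(audio_problems("talk.m4a", size_bytes=30 * 1024 * 1024))
```

An empty list means neither check found an issue. It does not guarantee that the upload will succeed.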

Unsupported parameters

The following parameters are not supported in the Compatibility API:

Chat completions

store
metadata
logit_bias
top_logprobs
n
modalities
prediction
audio
service_tier
parallel_tool_calls

Embeddings

dimensions
user

Audio transcriptions

stream
prompt
timestamp_granularities
chunking_strategy
include
known_speaker_names
known_speaker_references
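When migrating existing OpenAI-based code, it can help to detect these parameters before a request goes out. One possible approach is sketched below, with the lists above copied into a lookup table. The endpoint keys (chat, embeddings, audio) and the helper split_supported are naming choices made for this example.

```python
UNSUPPORTED = {
    "chat": {
        "store", "metadata", "logit_bias", "top_logprobs", "n", "modalities",
        "prediction", "audio", "service_tier", "parallel_tool_calls",
    },
    "embeddings": {"dimensions", "user"},
    "audio": {
        "stream", "prompt", "timestamp_granularities", "chunking_strategy",
        "include", "known_speaker_names", "known_speaker_references",
    },
}


def split_supported(endpoint, params):
    """Split params into (kept, dropped) for the given endpoint."""
    blocked = UNSUPPORTED[endpoint]
    kept = {key: value for key, value in params.items() if key not in blocked}
    dropped = sorted(key for key in params if key in blocked)
    return kept, dropped


kept, dropped = split_supported(
    "chat",
    {"model": "command-a-03-2025", "n": 2, "temperature": 0.7, "store": True},
)
print(kept)
print(dropped)
```

Whether to drop such parameters silently, log them, or raise an error is a design choice. Returning both parts leaves that decision to the caller.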

Cohere-specific parameters

Parameters that are available only on the Cohere API, with no counterpart in the OpenAI SDK, are not supported.

Chat endpoint:

connectors
documents
citation_options
…more here

Embed endpoint:

input_type
images
truncate
…more here