Commit 768a9d6

Backport: feat (provider/gateway): add get-generation support (#13870)

This is an automated backport of #13842 to the release-v6.0 branch. FYI @shaper

Co-authored-by: Walter Korman <shaper@vercel.com>

1 parent f07a378 · commit 768a9d6

File tree: 7 files changed, +566 -0 lines

.changeset/brown-coats-obey.md

Lines changed: 5 additions & 0 deletions

---
"@ai-sdk/gateway": patch
---

feat (provider/gateway): add get-generation support

content/providers/01-ai-sdk-providers/00-ai-gateway.mdx

Lines changed: 80 additions & 0 deletions

@@ -238,6 +238,86 @@ The `getCredits()` method returns your team's credit information based on the au

- **balance** _number_ - Your team's current available credit balance
- **total_used** _number_ - Total credits consumed by your team

## Generation Lookup

Look up detailed information about a specific generation by its ID, including cost, token usage, latency, and provider details. Generation IDs are available in `providerMetadata.gateway.generationId` on both `generateText` and `streamText` responses.

When streaming, the generation ID is injected on the first content chunk, so you can capture it early in the stream without waiting for completion. This is especially useful when a network interruption or mid-stream error could prevent you from receiving the final response: since the gateway records the final status server-side, you can use the generation ID to look up the results (including cost, token usage, and finish reason) later via `getGenerationInfo()`.
```ts
import { gateway, generateText } from 'ai';

// Make a request
const result = await generateText({
  model: gateway('anthropic/claude-sonnet-4'),
  prompt: 'Explain quantum entanglement briefly',
});

// Get the generation ID from provider metadata
const generationId = result.providerMetadata?.gateway?.generationId;

// Look up detailed generation info
const generation = await gateway.getGenerationInfo({ id: generationId });

console.log(`Model: ${generation.model}`);
console.log(`Cost: $${generation.totalCost.toFixed(6)}`);
console.log(`Latency: ${generation.latency}ms`);
console.log(`Prompt tokens: ${generation.promptTokens}`);
console.log(`Completion tokens: ${generation.completionTokens}`);
```
With `streamText`, you can capture the generation ID from the first chunk via `fullStream`:

```ts
import { gateway, streamText } from 'ai';

const result = streamText({
  model: gateway('anthropic/claude-sonnet-4'),
  prompt: 'Explain quantum entanglement briefly',
});

let generationId: string | undefined;

for await (const part of result.fullStream) {
  if (!generationId && part.providerMetadata?.gateway?.generationId) {
    generationId = part.providerMetadata.gateway.generationId as string;
    console.log(`Generation ID (early): ${generationId}`);
  }
}

// Look up cost and usage after the stream completes
if (generationId) {
  const generation = await gateway.getGenerationInfo({ id: generationId });
  console.log(`Cost: $${generation.totalCost.toFixed(6)}`);
  console.log(`Finish reason: ${generation.finishReason}`);
}
```
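The capture-then-survive-an-interruption pattern can be factored into a small helper. This is a hypothetical sketch, not part of the SDK: it consumes any `fullStream`-like async iterable, records the first generation ID it sees, and keeps that ID even if the stream errors mid-way, so the lookup can still happen later.

```typescript
// Hypothetical helper (not part of the AI SDK): capture the generation ID
// from the earliest chunk that carries it, tolerating mid-stream failures.
type StreamPart = {
  providerMetadata?: { gateway?: { generationId?: string } };
};

async function captureGenerationId(
  stream: AsyncIterable<StreamPart>,
): Promise<{ generationId?: string; streamError?: unknown }> {
  let generationId: string | undefined;
  try {
    for await (const part of stream) {
      // The gateway injects the ID on the first content chunk.
      generationId ??= part.providerMetadata?.gateway?.generationId;
    }
  } catch (streamError) {
    // Mid-stream failure: the gateway records the final status
    // server-side, so the captured ID remains usable for a later lookup.
    return { generationId, streamError };
  }
  return { generationId };
}
```

Even when `streamError` is set, the returned `generationId` can be passed to `getGenerationInfo()` to recover cost, usage, and finish reason.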
The `getGenerationInfo()` method accepts:

- **id** _string_ - The generation ID to look up (format: `gen_<ulid>`, required)

It returns a `GatewayGenerationInfo` object with the following fields:

- **id** _string_ - The generation ID
- **totalCost** _number_ - Total cost in USD
- **upstreamInferenceCost** _number_ - Upstream inference cost in USD (relevant for BYOK)
- **usage** _number_ - Usage cost in USD (same as totalCost)
- **createdAt** _string_ - ISO 8601 timestamp when the generation was created
- **model** _string_ - Model identifier used
- **isByok** _boolean_ - Whether Bring Your Own Key credentials were used
- **providerName** _string_ - The provider that served this generation
- **streamed** _boolean_ - Whether streaming was used
- **finishReason** _string_ - Finish reason (e.g. `'stop'`)
- **latency** _number_ - Time to first token in milliseconds
- **generationTime** _number_ - Total generation time in milliseconds
- **promptTokens** _number_ - Number of prompt tokens
- **completionTokens** _number_ - Number of completion tokens
- **reasoningTokens** _number_ - Reasoning tokens used (if applicable)
- **cachedTokens** _number_ - Cached tokens used (if applicable)
- **cacheCreationTokens** _number_ - Cache creation input tokens
- **billableWebSearchCalls** _number_ - Number of billable web search calls
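For reference, the field list above can be collected into a TypeScript interface. This is a sketch inferred from the documented fields, not the SDK's actual exported declaration, but it is useful for typing your own wrappers around the lookup:

```typescript
// Sketch of the documented return shape (inferred, not the SDK's own type).
interface GatewayGenerationInfo {
  id: string;
  totalCost: number;
  upstreamInferenceCost: number;
  usage: number;
  createdAt: string; // ISO 8601 timestamp
  model: string;
  isByok: boolean;
  providerName: string;
  streamed: boolean;
  finishReason: string;
  latency: number; // time to first token, ms
  generationTime: number; // total generation time, ms
  promptTokens: number;
  completionTokens: number;
  reasoningTokens: number;
  cachedTokens: number;
  cacheCreationTokens: number;
  billableWebSearchCalls: number;
}
```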
## Examples

### Basic Text Generation
Lines changed: 33 additions & 0 deletions

import { gateway, streamText } from 'ai';
import { run } from '../lib/run';

run(async () => {
  const result = streamText({
    model: gateway('anthropic/claude-haiku-4.5'),
    prompt: 'What animals are relatives of the tenrec?',
  });

  result.consumeStream();
  console.log('Response:', await result.text);
  console.log('Token usage:', await result.usage);
  const providerMetadata = await result.providerMetadata;
  console.log('Provider metadata:', JSON.stringify(providerMetadata, null, 2));

  const generationId = (
    providerMetadata?.gateway as { generationId?: string } | undefined
  )?.generationId;

  if (!generationId) {
    console.log('No generation ID found in provider metadata.');
    return;
  }

  console.log(`\nGeneration ID: ${generationId}`);

  console.log('\nWaiting briefly for generation data to become available...');
  await new Promise(resolve => setTimeout(resolve, 30_000));

  console.log('\n--- Generation Details ---\n');
  const generation = await gateway.getGenerationInfo({ id: generationId });
  console.log(JSON.stringify(generation, null, 2));
});
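The fixed 30-second sleep above is simple but either over- or under-waits. A hypothetical retry helper (not part of the SDK; it assumes the lookup rejects until the generation record becomes queryable) could poll with bounded attempts instead:

```typescript
// Hypothetical helper: retry an async lookup until it succeeds or the
// attempt budget is exhausted, instead of sleeping a fixed interval.
async function pollUntilAvailable<T>(
  lookup: () => Promise<T>,
  { attempts = 10, delayMs = 3_000 }: { attempts?: number; delayMs?: number } = {},
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await lookup();
    } catch (error) {
      lastError = error;
      // Wait before retrying; the record may not be queryable immediately.
      await new Promise(resolve => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

Usage would look like `await pollUntilAvailable(() => gateway.getGenerationInfo({ id: generationId }))`, returning as soon as the record is available.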
Lines changed: 268 additions & 0 deletions

import { createTestServer } from '@ai-sdk/test-server/with-vitest';
import { describe, expect, it, vi } from 'vitest';
import { GatewayGenerationInfoFetcher } from './gateway-generation-info';
import type { FetchFunction } from '@ai-sdk/provider-utils';
import {
  GatewayAuthenticationError,
  GatewayInternalServerError,
  GatewayResponseError,
} from './errors';

function createFetcher({
  headers,
  fetch,
}: {
  headers?: () => Record<string, string>;
  fetch?: FetchFunction;
} = {}) {
  return new GatewayGenerationInfoFetcher({
    baseURL: 'https://api.example.com',
    headers: headers ?? (() => ({ Authorization: 'Bearer test-token' })),
    fetch,
  });
}

const mockGenerationResponse = {
  data: {
    id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
    total_cost: 0.00123,
    upstream_inference_cost: 0.0011,
    usage: 0.00123,
    created_at: '2024-01-01T00:00:00.000Z',
    model: 'gpt-4',
    is_byok: false,
    provider_name: 'openai',
    streamed: true,
    finish_reason: 'stop',
    latency: 200,
    generation_time: 1500,
    native_tokens_prompt: 100,
    native_tokens_completion: 50,
    native_tokens_reasoning: 0,
    native_tokens_cached: 0,
    native_tokens_cache_creation: 0,
    billable_web_search_calls: 0,
  },
};

describe('GatewayGenerationInfoFetcher', () => {
  const server = createTestServer({
    'https://api.example.com/*': {
      response: {
        type: 'json-value',
        body: mockGenerationResponse,
      },
    },
  });

  describe('getGenerationInfo', () => {
    it('should fetch from the correct endpoint with generation ID', async () => {
      const fetcher = createFetcher();

      await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      expect(server.calls[0].requestMethod).toBe('GET');
      const url = new URL(server.calls[0].requestUrl);
      expect(url.pathname).toBe('/v1/generation');
      expect(url.searchParams.get('id')).toBe('gen_01ARZ3NDEKTSV4RRFFQ69G5FAV');
    });

    it('should transform snake_case response fields to camelCase', async () => {
      const fetcher = createFetcher();

      const result = await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      expect(result).toEqual({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
        totalCost: 0.00123,
        upstreamInferenceCost: 0.0011,
        usage: 0.00123,
        createdAt: '2024-01-01T00:00:00.000Z',
        model: 'gpt-4',
        isByok: false,
        providerName: 'openai',
        streamed: true,
        finishReason: 'stop',
        latency: 200,
        generationTime: 1500,
        promptTokens: 100,
        completionTokens: 50,
        reasoningTokens: 0,
        cachedTokens: 0,
        cacheCreationTokens: 0,
        billableWebSearchCalls: 0,
      });
    });

    it('should unwrap the data envelope', async () => {
      const fetcher = createFetcher();

      const result = await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      // Result should be the data object directly, not { data: ... }
      expect('data' in result).toBe(false);
      expect(result.id).toBe('gen_01ARZ3NDEKTSV4RRFFQ69G5FAV');
    });

    it('should not have snake_case fields in result', async () => {
      const fetcher = createFetcher();

      const result = await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      expect('total_cost' in result).toBe(false);
      expect('is_byok' in result).toBe(false);
      expect('provider_name' in result).toBe(false);
      expect('created_at' in result).toBe(false);
      expect('generation_time' in result).toBe(false);
      expect('finish_reason' in result).toBe(false);
    });

    it('should pass headers correctly', async () => {
      const fetcher = createFetcher({
        headers: () => ({
          Authorization: 'Bearer custom-token',
          'Custom-Header': 'custom-value',
        }),
      });

      await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      expect(server.calls[0].requestHeaders).toEqual({
        authorization: 'Bearer custom-token',
        'custom-header': 'custom-value',
      });
    });

    it('should handle 401 authentication errors', async () => {
      server.urls['https://api.example.com/*'].response = {
        type: 'error',
        status: 401,
        body: JSON.stringify({
          error: {
            message: 'Unauthorized',
            type: 'authentication_error',
          },
        }),
      };

      const fetcher = createFetcher();

      try {
        await fetcher.getGenerationInfo({
          id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
        });
        expect.fail('Should have thrown an error');
      } catch (error) {
        expect(GatewayAuthenticationError.isInstance(error)).toBe(true);
        const authError = error as GatewayAuthenticationError;
        expect(authError.statusCode).toBe(401);
      }
    });

    it('should handle 500 internal server errors', async () => {
      server.urls['https://api.example.com/*'].response = {
        type: 'error',
        status: 500,
        body: JSON.stringify({
          error: {
            message: 'Failed to retrieve usage data',
            type: 'internal_server_error',
          },
        }),
      };

      const fetcher = createFetcher();

      await expect(
        fetcher.getGenerationInfo({
          id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
        }),
      ).rejects.toThrow(GatewayInternalServerError);
    });

    it('should handle malformed JSON error responses', async () => {
      server.urls['https://api.example.com/*'].response = {
        type: 'error',
        status: 500,
        body: '{ invalid json',
      };

      const fetcher = createFetcher();

      try {
        await fetcher.getGenerationInfo({
          id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
        });
        expect.fail('Should have thrown an error');
      } catch (error) {
        expect(GatewayResponseError.isInstance(error)).toBe(true);
        const responseError = error as GatewayResponseError;
        expect(responseError.statusCode).toBe(500);
      }
    });

    it('should use custom fetch function when provided', async () => {
      const mockFetch = vi.fn().mockResolvedValue(
        new Response(JSON.stringify(mockGenerationResponse), {
          status: 200,
          headers: { 'Content-Type': 'application/json' },
        }),
      );

      const fetcher = createFetcher({ fetch: mockFetch });

      const result = await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      expect(mockFetch).toHaveBeenCalled();
      expect(result.totalCost).toBe(0.00123);
      expect(result.model).toBe('gpt-4');
    });

    it('should encode special characters in generation ID', async () => {
      const fetcher = createFetcher();

      await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      const url = new URL(server.calls[0].requestUrl);
      expect(url.searchParams.get('id')).toBe('gen_01ARZ3NDEKTSV4RRFFQ69G5FAV');
    });

    it('should handle BYOK generation response', async () => {
      server.urls['https://api.example.com/*'].response = {
        type: 'json-value',
        body: {
          data: {
            ...mockGenerationResponse.data,
            is_byok: true,
            upstream_inference_cost: 0.0009,
            provider_name: 'anthropic',
            model: 'claude-sonnet-4',
          },
        },
      };

      const fetcher = createFetcher();
      const result = await fetcher.getGenerationInfo({
        id: 'gen_01ARZ3NDEKTSV4RRFFQ69G5FAV',
      });

      expect(result.isByok).toBe(true);
      expect(result.upstreamInferenceCost).toBe(0.0009);
      expect(result.providerName).toBe('anthropic');
    });
  });
});
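Note that the transformation these tests verify is not a mechanical snake_case-to-camelCase rename: `native_tokens_prompt` becomes `promptTokens`, not `nativeTokensPrompt`, so the fetcher presumably maps fields explicitly. A minimal sketch of such a mapping, with hypothetical names and only a few fields shown:

```typescript
// Hypothetical sketch (names are illustrative, not the SDK's internals):
// an explicit field-by-field mapping from the wire format to the
// camelCase result shape the tests above assert on.
type RawGeneration = {
  id: string;
  total_cost: number;
  finish_reason: string;
  native_tokens_prompt: number;
  native_tokens_completion: number;
  // ...remaining snake_case fields omitted for brevity
};

function transformGeneration(raw: RawGeneration) {
  return {
    id: raw.id,
    totalCost: raw.total_cost,
    finishReason: raw.finish_reason,
    // Token fields are renamed outright, not just re-cased:
    promptTokens: raw.native_tokens_prompt,
    completionTokens: raw.native_tokens_completion,
  };
}
```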
