# Using the node package
## Installation

promptfoo is available as a node package on npm:

```sh
npm install promptfoo
```
## Usage

Use promptfoo as a library in your project by importing the `evaluate` function and other utilities:

```js
import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate(testSuite, options);
```
The `evaluate` function takes the following parameters:

- `testSuite`: the JavaScript equivalent of `promptfooconfig.yaml`, as a `TestSuiteConfiguration` object.
- `options`: misc options related to how the test harness runs, as an `EvaluateOptions` object.

The results of the evaluation are returned as an `EvaluateSummary` object.
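Because the summary is a plain object, you can post-process it directly. A minimal sketch that computes a pass rate from the `stats` field (whose shape matches the example output later on this page); the helper function is our own, not part of the promptfoo API:

```js
// Sketch: compute a pass rate from an EvaluateSummary-like object.
// `stats.successes` and `stats.failures` match the example output below.
function passRate(summary) {
  const { successes, failures } = summary.stats;
  const total = successes + failures;
  return total === 0 ? 0 : successes / total;
}

console.log(passRate({ stats: { successes: 4, failures: 0 } })); // 1
```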
## Provider functions
A `ProviderFunction` is a JavaScript function that implements an LLM API call. It takes a prompt string and a context, and returns the LLM response or an error. See the `ProviderFunction` type.
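A hand-rolled provider can be as small as a function that returns an object with an `output` field. A minimal sketch with a canned reply; the echo behavior is illustrative only, a real implementation would call an LLM API:

```js
// Sketch of a ProviderFunction: receives the rendered prompt (and an
// optional context) and resolves to a response object with `output`.
const echoProvider = async (prompt, context) => {
  // A real provider would make an LLM API call here.
  return { output: `echo: ${prompt}` };
};
```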
You can load providers using the `loadApiProvider` function:

```js
import { loadApiProvider } from 'promptfoo';

// Load a provider with default options
const provider = await loadApiProvider('openai:o3-mini');

// Load a provider with custom options
const providerWithOptions = await loadApiProvider('azure:chat:test', {
  options: {
    apiHost: 'test-host',
    apiKey: 'test-key',
  },
});
```
## Assertion functions
An `Assertion` can take an `AssertionValueFunction` as its value. The function receives:

- `output`: the LLM output string
- `context`: execution context, including `prompt`, `vars`, `test`, `logProbs`, `config`, `provider`, `providerResponse`, and optional `trace` data for debugging
### Type definition

```ts
type AssertionValueFunction = (
  output: string,
  context: AssertionValueFunctionContext,
) => AssertionValueFunctionResult | Promise<AssertionValueFunctionResult>;

interface AssertionValueFunctionContext {
  prompt: string | undefined;
  vars: Record<string, unknown>;
  test: AtomicTestCase;
  logProbs: number[] | undefined;
  config?: Record<string, any>;
  provider: ApiProvider | undefined;
  providerResponse: ProviderResponse | undefined;
  trace?: TraceData;
}

type AssertionValueFunctionResult = boolean | number | GradingResult;

interface GradingResult {
  // Whether the test passed or failed
  pass: boolean;

  // Test score, typically between 0 and 1
  score: number;

  // Plain text reason for the result
  reason: string;

  // Map of labeled metrics to values
  namedScores?: Record<string, number>;

  // Weighted denominator for namedScores when assertion weights are used
  namedScoreWeights?: Record<string, number>;

  // Record of tokens usage for this assertion
  tokensUsed?: Partial<{
    total: number;
    prompt: number;
    completion: number;
    cached?: number;
  }>;

  // Additional matcher/provider metadata
  metadata?: Record<string, unknown>;

  // List of results for each component of the assertion
  componentResults?: GradingResult[];

  // The assertion that was evaluated
  assertion?: Assertion;
}
```
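As a concrete illustration of these types, an assertion function can return a full `GradingResult` rather than a bare boolean. A minimal sketch that grades output brevity; the 100-character threshold and the `brevity` metric name are invented for the example:

```js
// Sketch of an AssertionValueFunction returning a GradingResult.
// The length limit is arbitrary and used only for illustration.
const maxLengthAssertion = (output, context) => {
  const pass = output.length <= 100;
  return {
    pass,
    score: pass ? 1 : 0,
    reason: pass
      ? `Output is ${output.length} chars (limit 100)`
      : `Output is too long: ${output.length} chars`,
    namedScores: { brevity: pass ? 1 : 0 },
  };
};
```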
For more info on different assertion types, see assertions & metrics.
## Transform functions
When using the node package, you can pass JavaScript functions directly as `transform`, `transformVars`, or `contextTransform` values instead of string expressions or `file://` references.
This enables better IDE support, type checking, and debugging:
```js
import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate({
  prompts: ['What tools did you use to answer: {{question}}'],
  providers: ['openai:gpt-5-mini'],
  tests: [
    {
      vars: { question: 'What is 2+2?' },
      options: {
        // Transform the output before assertions
        transform: (output, context) => {
          return output.toUpperCase();
        },
      },
      assert: [
        {
          type: 'contains',
          value: 'calculator',
          // Transform just for this assertion
          transform: (output, context) => {
            const tools = context.metadata?.toolCalls ?? [];
            return tools.map((t) => t.name).join(', ');
          },
        },
      ],
    },
  ],
});
```
Transform functions receive:

- `output`: the LLM output (string or object)
- `context`: an object containing `vars`, `prompt`, and optionally `metadata` from the provider response
Function transforms are not serializable. If you use `writeLatestResults: true`, function transforms will not be persisted in the stored config. Use string expressions or `file://` references if you need results to be fully reproducible from the stored eval.
For more on transforms, see Transforming Outputs.
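Since a transform is just a plain function, it can be unit-tested on its own before being wired into a test suite. A minimal sketch that pulls the first number out of an output string so numeric assertions can run against it; the parsing logic is illustrative, not part of promptfoo:

```js
// Sketch of a transform function: extracts the first number found in the
// LLM output, or null if there is none.
const extractNumber = (output, context) => {
  const match = String(output).match(/-?\d+(\.\d+)?/);
  return match ? Number(match[0]) : null;
};

// extractNumber('The answer is 4.', {}) -> 4
```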
## Example
promptfoo exports an `evaluate` function that you can use to run prompt evaluations.

```js
import promptfoo from 'promptfoo';

const results = await promptfoo.evaluate(
  {
    prompts: ['Rephrase this in French: {{body}}', 'Rephrase this like a pirate: {{body}}'],
    providers: ['openai:gpt-5-mini'],
    tests: [
      {
        vars: {
          body: 'Hello world',
        },
      },
      {
        vars: {
          body: "I'm hungry",
        },
      },
    ],
    writeLatestResults: true, // write results to disk so they can be viewed in web viewer
  },
  {
    maxConcurrency: 2,
  },
);

console.log(results);
```
This code imports the promptfoo library, defines the evaluation options, and then calls the evaluate function with these options.
You can also supply functions as prompts, providers, or asserts:
```js
import promptfoo from 'promptfoo';

(async () => {
  const results = await promptfoo.evaluate({
    prompts: [
      'Rephrase this in French: {{body}}',
      (vars) => {
        return `Rephrase this like a pirate: ${vars.body}`;
      },
    ],
    providers: [
      'openai:gpt-5-mini',
      (prompt, context) => {
        // Call LLM here...
        console.log(`Prompt: ${prompt}, vars: ${JSON.stringify(context.vars)}`);
        return {
          output: '<LLM output>',
        };
      },
    ],
    tests: [
      {
        vars: {
          body: 'Hello world',
        },
      },
      {
        vars: {
          body: "I'm hungry",
        },
        assert: [
          {
            type: 'javascript',
            value: (output) => {
              const pass = output.includes("J'ai faim");
              return {
                pass,
                score: pass ? 1.0 : 0.0,
                reason: pass ? 'Output contained substring' : 'Output did not contain substring',
              };
            },
          },
        ],
      },
    ],
  });

  console.log('RESULTS:');
  console.log(results);
})();
```
There's a full example on GitHub here.
Here's the example output in JSON format:
```json
{
  "results": [
    {
      "prompt": {
        "raw": "Rephrase this in French: Hello world",
        "display": "Rephrase this in French: {{body}}"
      },
      "vars": {
        "body": "Hello world"
      },
      "response": {
        "output": "Bonjour le monde",
        "tokenUsage": {
          "total": 19,
          "prompt": 16,
          "completion": 3
        }
      }
    },
    {
      "prompt": {
        "raw": "Rephrase this in French: I'm hungry",
        "display": "Rephrase this in French: {{body}}"
      },
      "vars": {
        "body": "I'm hungry"
      },
      "response": {
        "output": "J'ai faim.",
        "tokenUsage": {
          "total": 24,
          "prompt": 19,
          "completion": 5
        }
      }
    }
    // ...
  ],
  "stats": {
    "successes": 4,
    "failures": 0,
    "tokenUsage": {
      "total": 120,
      "prompt": 72,
      "completion": 48
    }
  },
  "table": [
    ["Rephrase this in French: {{body}}", "Rephrase this like a pirate: {{body}}", "body"],
    ["Bonjour le monde", "Ahoy thar, me hearties! Avast ye, world!", "Hello world"],
    [
      "J'ai faim.",
      "Arrr, me belly be empty and me throat be parched! I be needin' some grub, matey!",
      "I'm hungry"
    ]
  ]
}
```
## Sharing Results
To get a shareable URL, set `sharing: true` along with `writeLatestResults: true`:

```js
const results = await promptfoo.evaluate({
  prompts: ['Your prompt here'],
  providers: ['openai:gpt-5-mini'],
  tests: [{ vars: { input: 'test' } }],
  writeLatestResults: true,
  sharing: true,
});

console.log(results.shareableUrl); // https://app.promptfoo.dev/eval/abc123
```
Requires a Promptfoo Cloud account or self-hosted server. For self-hosted, pass `sharing: { apiBaseUrl, appBaseUrl }` instead of `true`.
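For a self-hosted server, the object form described above might look like the following sketch; the URLs are placeholders for your own deployment:

```js
// Sketch: self-hosted sharing configuration. Replace the URLs with your
// own server's API and app endpoints.
const results = await promptfoo.evaluate({
  prompts: ['Your prompt here'],
  providers: ['openai:gpt-5-mini'],
  tests: [{ vars: { input: 'test' } }],
  writeLatestResults: true,
  sharing: {
    apiBaseUrl: 'https://promptfoo-api.example.com',
    appBaseUrl: 'https://promptfoo.example.com',
  },
});
```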