Summary
docker agent eval does not preserve the "id" field from input eval JSON files in the results output. Instead, it generates a new UUID for each session, discarding any "id" value supplied by the caller. The "title" field is preserved correctly — only "id" is dropped.
Steps to Reproduce
mkdir -p /tmp/probe-eval /tmp/probe-out
cat > /tmp/probe-eval/test.json <<'EOF'
{
"id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"title": "Probe eval",
"evals": { "relevance": ["Answers the question"] },
"messages": [{ "message": { "agentName": "", "message": { "role": "user", "content": "What is Docker?" } } }]
}
EOF
docker agent eval ~/Workspace/gordon/assistant/gordon_dev.yaml /tmp/probe-eval --output /tmp/probe-out --concurrency 1
jq '.sessions[0] | {id, title}' /tmp/probe-out/*.json
Current Behaviour
The output session receives a freshly generated UUID, ignoring the "id" present in the input file:
{
"id": "91907fe1-cd72-4e88-b1a5-1b439675f7c5",
"title": "Probe eval"
}
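The manual jq check above can be automated when several eval files are involved. The following is a minimal sketch, not part of docker agent eval: the helper name `dropped_ids` is hypothetical, and it assumes each results file is a JSON object with a top-level "sessions" list whose entries carry an "id", as the jq query in the reproduction steps implies.

```python
import json
from pathlib import Path


def dropped_ids(eval_dir, results_dir):
    """Return input "id" values that do not appear on any output session.

    Hypothetical helper: assumes every results file is a JSON object with a
    top-level "sessions" list, matching the jq query in the reproduction steps.
    """
    # Collect every caller-supplied ID from the input eval files.
    wanted = set()
    for path in Path(eval_dir).glob("*.json"):
        doc = json.loads(path.read_text())
        if doc.get("id"):
            wanted.add(doc["id"])

    # Collect every session ID that ended up in the results output.
    seen = set()
    for path in Path(results_dir).glob("*.json"):
        result = json.loads(path.read_text())
        for session in result.get("sessions", []):
            seen.add(session.get("id"))

    # Anything supplied but never seen was dropped along the way.
    return sorted(wanted - seen)
```

Run against the probe directories from the reproduction steps, `dropped_ids("/tmp/probe-eval", "/tmp/probe-out")` would list the probe "id" for the output shown above, and an empty list once the "id" is carried through.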
Expected Behaviour
When an eval JSON file contains an "id" field, docker agent eval should carry that value through to the corresponding session entry in the results output:
{
"id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
"title": "Probe eval"
}
If no "id" is present in the input file, the current behaviour (auto-generating a UUID) is acceptable.
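The requested rule can be stated as a small fallback. This is an illustrative sketch of the expected semantics only, not the tool's actual implementation: the function name `resolve_session_id` is hypothetical, and treating an empty or non-string "id" the same as a missing one is an assumption this report does not specify.

```python
import uuid


def resolve_session_id(eval_doc):
    """Pick the session ID for one eval document (hypothetical helper).

    Assumption: an empty or non-string "id" is treated like a missing one.
    """
    supplied = eval_doc.get("id")
    if isinstance(supplied, str) and supplied.strip():
        # Caller supplied an ID: carry it through unchanged.
        return supplied
    # No usable ID in the input: keep today's behaviour and generate one.
    return str(uuid.uuid4())
```

With the probe file from the reproduction steps, this rule would yield the supplied "id"; for a file without an "id", it would still produce a fresh UUID as happens today.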
Motivation / Impact
Any downstream system that writes eval JSON files with a known "id" — for example to correlate results back to a record in a source database — is broken by this behaviour. Because the output session ID is an unrelated UUID, there is no reliable way to map a result back to the originating eval record without resorting to fragile heuristics (e.g. matching on "title", which is not guaranteed to be unique).
Preserving the caller-supplied "id" is a minimal, non-breaking change: it only affects sessions whose input file already carries an "id", and leaves auto-generation in place for all other cases.
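The correlation problem described in this section can be made concrete with a short sketch. The record shapes and the helper name `correlate` are hypothetical and chosen only for illustration; the point is that joining on "title" can return several candidates, while joining on "id" is unambiguous only if the "id" survives the run.

```python
from collections import defaultdict


def correlate(source_records, sessions, key):
    """Pair each session with the source records sharing its value for `key`.

    Hypothetical helper: both inputs are lists of dicts that may carry `key`.
    """
    by_key = defaultdict(list)
    for record in source_records:
        by_key[record.get(key)].append(record)
    # A session with zero or several candidates cannot be mapped back reliably.
    return [(session, by_key.get(session.get(key), [])) for session in sessions]


# Two source records that happen to share a title (illustrative data).
source = [
    {"id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee", "title": "Probe eval"},
    {"id": "ffffffff-0000-1111-2222-333333333333", "title": "Probe eval"},
]
# A session as emitted today: unrelated UUID, title preserved.
current = [{"id": "91907fe1-cd72-4e88-b1a5-1b439675f7c5", "title": "Probe eval"}]

by_title = correlate(source, current, "title")  # two candidates: ambiguous
by_id = correlate(source, current, "id")        # no candidates: ID was replaced
```

If the session carried the caller-supplied "id" instead, the same join on "id" would return exactly one source record.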