fix: audio transcription sends [object FormData] when HTTP proxy is configured (Node 22+ FormData/undici mismatch) #48554

@dco5

Bug Type

Behavior bug

Summary

When HTTPS_PROXY (or HTTP_PROXY) is set in the environment, audio transcription requests (and batch file uploads) silently fail. The request body is sent as the literal string [object FormData] with Content-Type: text/plain;charset=UTF-8 instead of proper multipart/form-data with the audio file.

Root cause: resolveProxyFetchFromEnv() and makeProxyFetch() in src/infra/net/proxy-fetch.ts wrap undici.fetch as a drop-in for globalThis.fetch. In Node 22+, however, globalThis.FormData and require("undici").FormData are different classes: the global comes from the undici copy bundled into Node, while require("undici") loads the npm package. When callers (e.g. transcribeOpenAiCompatibleAudio in audio.ts) create a new FormData() (global) and pass it as the request body, undici.fetch doesn't recognize it as form data and calls .toString() on it, which yields the string [object FormData].

// Node 22+
globalThis.FormData === require("undici").FormData  // false
globalThis.fetch === require("undici").fetch          // false

Steps to Reproduce

  1. Set HTTPS_PROXY in the OpenClaw container/environment
  2. Configure tools.media.audio with an OpenAI-compatible STT provider (e.g. local Whisper via Speaches)
  3. Send a Telegram voice message

Expected Behavior

Audio transcribed via multipart/form-data POST to the configured STT endpoint. Agent responds based on the transcript.

Actual Behavior

Request body is the literal string [object FormData] (17 bytes). Content-Type is text/plain;charset=UTF-8. The STT endpoint returns HTTP 422 (missing file and model fields). Transcription silently fails.
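The 17-byte body and the text/plain Content-Type both follow from string coercion. The sketch below illustrates this with Node's built-in globals only (no undici import), so it demonstrates the coercion itself rather than the proxy code path; the localhost URL is a placeholder and no request is actually sent.

```javascript
// Illustration only: shows what a FormData looks like once it is treated as a string.
const form = new FormData();
form.append("model", "whisper-1");

// An unrecognized body object falls back to string conversion.
const coerced = String(form);
const byteLength = Buffer.byteLength(coerced, "utf8");

// A string body gets the default Content-Type for text.
const request = new Request("http://localhost/", { method: "POST", body: coerced });
const contentType = request.headers.get("content-type");

console.log(coerced, byteLength, contentType);
// prints: [object FormData] 17 text/plain;charset=UTF-8
```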

Reproduction Script

Run inside the OpenClaw container:

const { fetch: undiciFetch, FormData: UndiciFormData } = require("undici");

// Wrapped in an async function because top-level await is not available in CommonJS.
(async () => {
  const blob = new Blob([new Uint8Array(10)], { type: "audio/ogg" });

  // BUG: global FormData + undici fetch → [object FormData]
  const form1 = new globalThis.FormData();
  form1.append("file", blob, "audio.ogg");
  form1.append("model", "whisper-1");
  const res1 = await undiciFetch("http://httpbin.org/post", { method: "POST", body: form1 });
  // → Content-Type: text/plain;charset=UTF-8, body: "[object FormData]"

  // WORKS: undici FormData + undici fetch → proper multipart
  const form2 = new UndiciFormData();
  form2.append("file", blob, "audio.ogg");
  form2.append("model", "whisper-1");
  const res2 = await undiciFetch("http://httpbin.org/post", { method: "POST", body: form2 });
  // → Content-Type: multipart/form-data;boundary=..., body: proper multipart
})().catch(console.error);

Affected Code

The bug is in the proxy fetch wrappers, not the callers:

  • src/infra/net/proxy-fetch.ts: makeProxyFetch() and resolveProxyFetchFromEnv() pass init.body (a globalThis.FormData) directly to undici.fetch, which doesn't recognize it
  • Affected callers: src/media-understanding/providers/openai/audio.ts (audio transcription), src/memory/batch-upload.ts (batch file uploads)

Proposed Fix

Add a normalizeInitForUndici() helper in proxy-fetch.ts that detects globalThis.FormData bodies and converts them to undici's FormData before dispatch. This keeps the proxy fetch wrappers drop-in compatible with globalThis.fetch. No changes needed in callers.

I have a working fix with tests ready — happy to open a PR.
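The actual patch is not included in this report. As a rough sketch of what a helper like normalizeInitForUndici() might look like: the signature, the injected TargetFormData parameter, and the entry-copying strategy below are all assumptions for illustration, not the real implementation. Injecting the target class keeps the sketch runnable without undici installed; in proxy-fetch.ts it would presumably be require("undici").FormData.

```javascript
// Hypothetical sketch of the proposed helper; the real fix may differ.
// TargetFormData is an assumed parameter: the FormData class that the
// wrapped fetch implementation recognizes (e.g. undici's).
function normalizeInitForUndici(init, TargetFormData) {
  const body = init?.body;

  // Leave everything alone unless the body is a global FormData
  // that is not already an instance of the target class.
  if (!(body instanceof globalThis.FormData) || body instanceof TargetFormData) {
    return init;
  }

  // Copy every entry into a fresh instance of the target class.
  const converted = new TargetFormData();
  for (const [name, value] of body.entries()) {
    if (typeof value === "string") {
      converted.append(name, value);
    } else {
      // Non-string entries are File objects, which carry their own filename.
      converted.append(name, value, value.name);
    }
  }

  // Return a shallow copy so the caller's init object is not mutated.
  return { ...init, body: converted };
}
```

Inside the wrappers, the call would then look something like undiciFetch(input, normalizeInitForUndici(init, UndiciFormData)), alongside whatever proxy dispatcher options the wrappers already pass. Bodies that are strings, buffers, or streams pass through unchanged.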

Environment

  • OpenClaw version: v2026.3.13
  • Node.js: v24.14.0
  • undici: 7.24.1
  • OS: Ubuntu 24.04 (Docker)
  • Integration: Telegram voice messages
  • STT provider: Speaches (local Whisper, OpenAI-compatible API)

Impact

  • Severity: High — completely breaks audio transcription and batch file uploads for any user with an HTTP proxy configured
  • Workaround: Node.js preload script that replaces globalThis.FormData with undici's FormData (NODE_OPTIONS="--require /path/to/fix.cjs")
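The preload script itself is not shown in this report. A minimal sketch of what fix.cjs could contain follows; the function name patchGlobalFormData and the injectable loadUndici parameter are hypothetical and exist only so the function can be exercised without undici installed. One assumption worth checking: require("undici") resolves relative to the preload file's location, so it should resolve to the same undici copy that the application itself loads.

```javascript
// fix.cjs: hypothetical contents for the preload workaround.
// loadUndici is an assumed, injectable loader; by default it requires the npm package.
function patchGlobalFormData(loadUndici = () => require("undici")) {
  const { FormData: UndiciFormData } = loadUndici();
  // After this, `new FormData()` anywhere in the app creates undici's class.
  globalThis.FormData = UndiciFormData;
  return UndiciFormData;
}

try {
  patchGlobalFormData();
} catch (err) {
  // If undici cannot be resolved from here, leave the global untouched
  // rather than crash the application at startup.
  console.warn("FormData preload patch skipped:", err.message);
}
```

Because this replaces a global for the whole process, it is best treated as a stopgap until the wrappers in proxy-fetch.ts handle the conversion themselves.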
