
Encode non-ASCII characters in cache tags at construction (revalidateTag, cacheTag, fetch tags) #1138

@github-actions

Track Next.js fix for non-ASCII cache tags crashing ISR responses with ERR_INVALID_CHAR.

Upstream commit: vercel/next.js@9e18303 (#93601)
Fixes: vercel/next.js#93142
Closes: #93139, #93167

Problem

When a cache tag contains a non-ASCII character (Hebrew, Arabic, CJK, emoji, …), it gets written into the internal x-next-cache-tags HTTP header on ISR responses. Node's validateHeaderValue rejects any byte outside \t\x20-\x7e, so the response crashes with ERR_INVALID_CHAR. On platforms with stale-if-error (Vercel), the 500 is masked from clients but revalidation itself keeps failing and the cache stops refreshing for affected routes.
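The Node-side rejection can be reproduced in isolation with `validateHeaderValue` from `node:http`. This is a minimal sketch: the `headerErrorCode` helper is hypothetical and exists only for illustration, and the header is validated directly rather than through an actual ISR response.

```typescript
import { validateHeaderValue } from 'node:http'

// Hypothetical helper: returns the Node error code raised for a header
// value, or null when Node accepts the value.
function headerErrorCode(value: string): string | null {
  try {
    validateHeaderValue('x-next-cache-tags', value)
    return null
  } catch (err) {
    return (err as { code?: string }).code ?? 'UNKNOWN'
  }
}

// Plain ASCII tags pass validation.
console.log(headerErrorCode('posts,_N_T_/blog')) // null

// A raw Hebrew tag is rejected.
console.log(headerErrorCode('שלום-עולם')) // ERR_INVALID_CHAR

// The percent-encoded form of the same tag is accepted.
console.log(headerErrorCode(encodeURIComponent('שלום-עולם'))) // null
```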

On Cloudflare Workers, Headers setters are more permissive, but if vinext writes an x-next-cache-tags header anywhere that is later parsed by Node-compatible code (e.g. KV cache handler post-processing, or a downstream Worker that uses validateHeaderValue semantics), the same class of crash applies.

Fix shape

New helper packages/next/src/server/lib/encode-cache-tag.ts:

const OUT_OF_CLASS_CHAR = /[^\t\x20-\x7e]/
const OUT_OF_CLASS_RUN = /[^\t\x20-\x7e]+/g

export function encodeCacheTag(tag: string): string {
  return OUT_OF_CLASS_CHAR.test(tag)
    ? tag.replace(OUT_OF_CLASS_RUN, (run) => encodeURIComponent(run))
    : tag
}

Properties:

  • Applied at every public boundary so storage, comparison, and the wire all see the same canonical ASCII-safe form.
  • Idempotent on already-encoded %xx input (the fast-path returns unchanged input).
  • Matches runs of out-of-class code units so surrogate pairs (emoji) are handed to encodeURIComponent as a complete code point — a per-code-unit regex would split the pair and throw URIError.
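The properties above can be checked directly. The sketch below repeats the helper from this issue so that it runs standalone; the sample tags are illustrative assumptions, not taken from the upstream tests.

```typescript
const OUT_OF_CLASS_CHAR = /[^\t\x20-\x7e]/
const OUT_OF_CLASS_RUN = /[^\t\x20-\x7e]+/g

function encodeCacheTag(tag: string): string {
  return OUT_OF_CLASS_CHAR.test(tag)
    ? tag.replace(OUT_OF_CLASS_RUN, (run) => encodeURIComponent(run))
    : tag
}

// In-class characters (tab, space, '%', '/') are left alone, unlike a
// blanket encodeURIComponent call on the whole tag.
console.log(encodeCacheTag('a b\t100%') === 'a b\t100%') // true

// Only the out-of-class runs are encoded; the ASCII '-' between them stays.
const encoded = encodeCacheTag('שלום-עולם')
console.log(encoded) // %D7%A9%D7%9C%D7%95%D7%9D-%D7%A2%D7%95%D7%9C%D7%9D

// Idempotent: the encoded form is pure ASCII, so the fast path returns it as-is.
console.log(encodeCacheTag(encoded) === encoded) // true

// The surrogate pair of an emoji is matched as one run and encoded as a
// single code point.
console.log(encodeCacheTag('tag-😀')) // tag-%F0%9F%98%80
```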

Call sites updated

  • server/lib/implicit-tags.ts: getImplicitTags (path-derived _N_T_… tags)
  • server/lib/patch-fetch.ts: validateTags (funnels cacheTag(), unstable_cache(), fetch tags)
  • server/web/spec-extension/revalidate.ts: revalidatePath, revalidateTag, updateTag

PR #93139 attempted to encode at construction but missed user-supplied tag entry points and used a decodeURIComponent round-trip that mangled literal %xx characters. PR #93167 encoded only at setHeader sites, which left storage and invalidation diverging. The canonical-form-at-the-boundary approach in #93601 covers all entry points uniformly.
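To illustrate why encoding at every entry point keeps storage, invalidation, and the header in agreement, here is a minimal sketch. `TagIndex` is a hypothetical in-memory stand-in for a tag-indexed cache, not the actual Next.js or vinext cache handler; only `encodeCacheTag` comes from this issue.

```typescript
const OUT_OF_CLASS_CHAR = /[^\t\x20-\x7e]/
const OUT_OF_CLASS_RUN = /[^\t\x20-\x7e]+/g

function encodeCacheTag(tag: string): string {
  return OUT_OF_CLASS_CHAR.test(tag)
    ? tag.replace(OUT_OF_CLASS_RUN, (run) => encodeURIComponent(run))
    : tag
}

// Hypothetical tag index: maps each canonical (encoded) tag to cache keys.
class TagIndex {
  private readonly keysByTag = new Map<string, Set<string>>()

  // Tagging entry point: encode before anything is stored.
  tag(key: string, rawTags: string[]): void {
    rawTags.forEach((raw) => {
      const canonical = encodeCacheTag(raw)
      const keys = this.keysByTag.get(canonical) ?? new Set<string>()
      keys.add(key)
      this.keysByTag.set(canonical, keys)
    })
  }

  // Invalidation entry point: encode with the same helper so the lookup
  // matches what was stored.
  revalidate(rawTag: string): string[] {
    const canonical = encodeCacheTag(rawTag)
    const keys = Array.from(this.keysByTag.get(canonical) ?? [])
    this.keysByTag.delete(canonical)
    return keys
  }

  // Value that would be written to a header such as x-next-cache-tags.
  headerValue(key: string): string {
    const tags: string[] = []
    this.keysByTag.forEach((keys, tag) => {
      if (keys.has(key)) tags.push(tag)
    })
    return tags.join(',')
  }
}

const index = new TagIndex()
index.tag('/blog', ['שלום-עולם', 'posts'])

// The header value is already ASCII-safe, with no encoding at the setHeader site.
console.log(/^[\t\x20-\x7e]*$/.test(index.headerValue('/blog'))) // true

// Invalidating with the raw tag finds the entry stored under the encoded form.
console.log(index.revalidate('שלום-עולם')) // [ '/blog' ]
```

Because both entry points funnel through the same helper, nothing downstream has to know whether a tag was originally non-ASCII.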

Action for vinext

  1. Port encode-cache-tag.ts into packages/vinext/src/server/ (or shims/ if shared with the runtime).
  2. Apply it in vinext's equivalents:
    • shims/cache.ts: cacheTag(...tags) (currently at shims/cache.ts:733)
    • shims/cache.ts / shims/next-cache.ts: revalidateTag, revalidatePath, updateTag (whichever exist)
    • server/* — wherever fetch tags are validated and stored
    • Any path-derived tag construction (_N_T_/[slug]/page shape)
  3. Audit any Headers.set("x-next-cache-tags", …) (or equivalent) call site for places where un-encoded tags could land in a header.
  4. Port the e2e test at test/e2e/app-dir/non-ascii-cache-tags/ — at minimum: a page that calls cacheTag('שלום-עולם') and a route that calls revalidateTag with the same tag, asserting both ISR storage and revalidation work end to end.

Existing related issues (different concern): #708 tracks the new two-argument revalidateTag(tag, profile) signature and is independent of encoding.

Metadata

Labels: nextjs-tracking (Tracking issue for a Next.js canary change relevant to vinext)
