chore: Updated content-type header parsing by jsumners · Pull Request #6414 · fastify/fastify

jsumners · 2025-12-14T15:49:15Z

Resolves #6332.

This is a follow-on from #6303. This introduces a more robust content-type header value parsing algorithm.

IMPORTANT: this results in much stricter content-type handling. Prior to this PR, we were accepting, and handling, headers like content-type: a-simple-string. That is not valid. There must be a type and a subtype present. So a "valid" form would be content-type: a-simple-string/foo.

This change also enforces normalization of the content-type value. For example, if the request has application/json; charset=utf-8 as the value, it will be normalized to application/json; charset="utf-8".

gurgunday

I think this is hard to land in v5

Also, we should check how much it affects the performance

Unfortunately we removed the benchmark CI, someone should run it locally (I'll do it when I review)

climba03003 · 2025-12-15T02:39:50Z

It like a light weight version of content-type parser?
Removed the parameters parts, it is ok on our case because we never use it.

How the performance after the change?

jsumners · 2025-12-15T11:40:38Z

It like a light weight version of content-type parser? Removed the parameters parts, it is ok on our case because we never use it.

In our code base, "content-type parser" refers to the thing that collects and manages the functions used to parse HTTP POST bodies. Prior to this PR, the indicator used to denote what format those POST bodies is in was inspected in various ways that are very fragile. This PR introduces a dedicated identifier parser.

How the performance after the change?

I don't know. Likely slower than the previous fragile checks, but likely not the slowest possible. Unfortunately, proper handling of the content-type header value is an imperative. We cannot continue using the previous fragile inspections.

mcollina

Are you sure that there are no content types without a /? I think a user could create one like that, and I feel we should kind of support it.

lib/content-type.js

jsumners · 2025-12-15T14:14:37Z

Are you sure that there are no content types without a /? I think a user could create one like that, and I feel we should kind of support it.

Please review https://httpwg.org/specs/rfc9110.html#media.type

The required format is type/subtype. Anything written as type is not spec compliant. We opt to adhere to the specs: https://github.com/fastify/fastify/blob/main/docs/Reference/Principles.md#specification-adherence

jsumners · 2025-12-15T21:48:40Z

@mcollina I have written the benchmark below to test the performance of splitting out only the type and subtype parts. I agree that the IIFE is too expensive. It looks like the regular expression version becomes worth it as we start dealing with the parameters.

Benchmark

'use strict'

const assert = require('node:assert')
const { Suite } = require('bench-node')
const suite = new Suite({
  reporter
})

const basicMediaType = 'application/json'
const paramsMediaType = 'application/json; charset=utf-8; foo=bar'

suite.add('iife split (basic)', () => {
  let parts = (() => {
    const [first, ...rest] = basicMediaType.split('/')
    if (rest.length === 0) return [first]
    return [first, rest.join('/')]
  })()
  assert.equal(parts[0], 'application')
  assert.equal(parts[1], 'json')
})

suite.add('iife split (params)', () => {
  let parts = (() => {
    const [first, ...rest] = paramsMediaType.split('/')
    if (rest.length === 0) return [first]
    return [first, rest.join('/')]
  })()

  const type = parts[0]
  parts = parts[1].split(';')
  const subtype = parts[0]
  
  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

suite.add('long split (basic)', () => {
  let parts = basicMediaType.split('/')
  assert.equal(parts[0], 'application')
  assert.equal(parts[1], 'json')
})

suite.add('long split (params)', () => {
  let parts = paramsMediaType.split('/')
  const type = parts[0]

  parts = parts[1].split(';')
  const subtype = parts[0]

  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

const tokensReg = /^([\w!#$%&'*+.^`|~-]+)\/([\w!#$%&'*+.^`|~-]+)\s*;?/
suite.add('regex split (basic)', () => {
  const matches = tokensReg.exec(basicMediaType)
  const type = matches[1]
  const subtype = matches[2]
  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

suite.add('regex split (params)', () => {
  const matches = tokensReg.exec(paramsMediaType)
  const type = matches[1]
  const subtype = matches[2]
  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

suite.run()

const { format } = new Intl.NumberFormat('en-US', {
  useGrouping: true
})
function reporter (results) {

  for (const result of results) {
    const ops = format(result.opsSec)
    console.log(`${result.name}: ${ops} ops/sec`)
  }
}

❯ node --allow-natives-syntax index.js
iife split (basic): 9,655,587.18 ops/sec
iife split (params): 6,261,203.487 ops/sec
long split (basic): 15,462,605.433 ops/sec
long split (params): 8,142,902.648 ops/sec
regex split (basic): 15,020,238.273 ops/sec
regex split (params): 14,836,393.537 ops/sec

Eomm

Closes #6077

Minor suggestions

lib/content-type-parser.js

lib/content-type.js

jsumners · 2025-12-23T11:39:06Z

I'll be getting back to this soon.

test/schema-validation.test.js

maneeshaking · 2026-01-16T13:11:55Z

Hi @jsumners and @mcollina ,

I've analyzed the blocker regarding the performance vs. accuracy of the header parsing. Since Regex is proving difficult for strictness and split has overhead, the most performant way to handle this in V8 is usually a direct indexOf slice.

Here is a proposed solution that fixes the vulnerability (adding the missing else check) and includes a high-performance parser that passes the tests without Regex complexity.

Optimized Parser (Replaces getEssenceMediaType) This handles application/json; charset=utf-8 correctly without allocating arrays.

function getEssenceMediaType (header) {
  if (!header) return ''
  let end = header.indexOf(';')
  if (end === -1) end = header.length
  return header.slice(0, end).trim().toLowerCase()
}

The Logic Fix (Inside validate) This ensures we don't "Fail Open" if the content type is missing from the schema map.

// ... inside validate() ...
    } else if (context[bodySchema]) {
      const contentType = getEssenceMediaType(request.headers['content-type'])
      const contentSchema = context[bodySchema][contentType]
      
      if (contentSchema) {
        validatorFunction = contentSchema
      } else {
        // Fix: Fail closed if Content-Type is present but not defined in schema
        const error = new Error('Content-Type not supported')
        error.statusCode = 415
        error.code = 'FST_ERR_VALIDATION'
        return error
      }
    }

This approach should satisfy the 415 test expectation and resolve the performance regression.

Co-authored-by: Manuel Spigolon <behemoth89@gmail.com> Signed-off-by: James Sumners <321201+jsumners@users.noreply.github.com>

jsumners · 2026-01-21T11:47:15Z

Here's some benchmark results for the algorithm in commit 7bbaad3:

❯ node --allow-natives-syntax index.js
iife split (basic): 9,570,374.43 ops/sec
iife split (params): 6,220,149.478 ops/sec
long split (basic): 15,476,672.638 ops/sec
long split (params): 8,291,173.241 ops/sec
regex split (basic): 15,846,585.338 ops/sec
regex split (params): 15,680,046.717 ops/sec
indexOf based (basic): 10,012,107.29 ops/sec
indexOf based (params): 2,958,473.828 ops/sec

Note that the long split (params) only benchmarks the split operations between the media type field and the parameters list. It does not cover the parsing of those parameters. Feel free to update and run locally to get a better approximation.

I await the group's decision.

Benchmark script:

'use strict'

const assert = require('node:assert')
const { Suite } = require('bench-node')
const suite = new Suite({
  reporter
})

const basicMediaType = 'application/json'
const paramsMediaType = 'application/json; charset=utf-8; foo="bar"'

const typeNameReg = /^[\w!#$%&'*+.^`|~-]+$/
const subtypeNameReg = /^[\w!#$%&'*+.^`|~-]+\s*/
const keyValuePairsReg = /([\w!#$%&'*+.^`|~-]+)=([^;]*)/gm

suite.add('iife split (basic)', () => {
  let parts = (() => {
    const [first, ...rest] = basicMediaType.split('/')
    if (rest.length === 0) return [first]
    return [first, rest.join('/')]
  })()
  assert.equal(parts[0], 'application')
  assert.equal(parts[1], 'json')
})

suite.add('iife split (params)', () => {
  let parts = (() => {
    const [first, ...rest] = paramsMediaType.split('/')
    if (rest.length === 0) return [first]
    return [first, rest.join('/')]
  })()

  const type = parts[0]
  parts = parts[1].split(';')
  const subtype = parts[0]
  
  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

suite.add('long split (basic)', () => {
  let parts = basicMediaType.split('/')
  assert.equal(parts[0], 'application')
  assert.equal(parts[1], 'json')
})

suite.add('long split (params)', () => {
  let parts = paramsMediaType.split('/')
  const type = parts[0]

  parts = parts[1].split(';')
  const subtype = parts[0]

  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

const tokensReg = /^([\w!#$%&'*+.^`|~-]+)\/([\w!#$%&'*+.^`|~-]+)\s*;?/
suite.add('regex split (basic)', () => {
  const matches = tokensReg.exec(basicMediaType)
  const type = matches[1]
  const subtype = matches[2]
  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

suite.add('regex split (params)', () => {
  const matches = tokensReg.exec(paramsMediaType)
  const type = matches[1]
  const subtype = matches[2]
  assert.equal(type, 'application')
  assert.equal(subtype, 'json')
})

suite.add('indexOf based (basic)', () => {
  let sepIdx = basicMediaType.indexOf(';')
  if (sepIdx === -1) {
    sepIdx = basicMediaType.indexOf('/')
    if (sepIdx === -1) return
    const type = basicMediaType.slice(0, sepIdx).trimStart().toLowerCase()
    const subtype = basicMediaType.slice(sepIdx + 1).trimEnd().toLowerCase()

    if (typeNameReg.test(type) === true && subtypeNameReg.test(subtype) === true) {
      assert.equal(type, 'application')
      assert.equal(subtype, 'json')
    }

    return
  }
})

suite.add('indexOf based (params)', () => {
  let sepIdx = paramsMediaType.indexOf(';')
  if (sepIdx === -1) {
    sepIdx = paramsMediaType.indexOf('/')
    if (sepIdx === -1) return
    const type = paramsMediaType.slice(0, sepIdx).trimStart().toLowerCase()
    const subtype = paramsMediaType.slice(sepIdx + 1).trimEnd().toLowerCase()

    if (typeNameReg.test(type) === true && subtypeNameReg.test(subtype) === true) {
      assert.equal(type, 'application')
      assert.equal(subtype, 'json')
    }

    return
  }

  const mediaType = paramsMediaType.slice(0, sepIdx).toLowerCase()
  const paramsList = paramsMediaType.slice(sepIdx + 1).trim()

  sepIdx = mediaType.indexOf('/')
  if (sepIdx === -1) return
  const type = mediaType.slice(0, sepIdx).trimStart()
  const subtype = mediaType.slice(sepIdx + 1).trimEnd()

  if (typeNameReg.test(type) === false || subtypeNameReg.test(subtype) === false) {
    return
  }
  assert.equal(type, 'application')
  assert.equal(subtype, 'json')

  const parameters = new Map()
  let matches = keyValuePairsReg.exec(paramsList)
  while (matches) {
    const key = matches[1]
    const value = matches[2]
    if (value[0] === '"') {
      if (value.at(-1) !== '"') {
        parameters.set(key, 'invalid quoted string')
        matches = keyValuePairsReg.exec(paramsList)
        continue
      }
      parameters.set(key, value.slice(1, value.length - 1))
    } else {
      parameters.set(key, value)
    }
    matches = keyValuePairsReg.exec(paramsList)
  }
  assert.equal(parameters.size, 2)
})

suite.run()

const { format } = new Intl.NumberFormat('en-US', {
  useGrouping: true
})
function reporter (results) {

  for (const result of results) {
    const ops = format(result.opsSec)
    console.log(`${result.name}: ${ops} ops/sec`)
  }
}

mcollina

lgtm

jsumners · 2026-01-23T11:35:32Z

@Eomm do you have any further concern?

jean-michelet

Nice!

Let's close this after merge:
#6332

gurgunday

Are we landing this as a bug fix? Wouldn't SemVer major be more appropriate?

jsumners · 2026-01-25T12:15:58Z

Are we landing this as a bug fix? Wouldn't SemVer major be more appropriate?

This is a bug fix. The current content-type parser algorithm allows several invalid values, many of which allow for schema validation to be incorrectly applied. You can see this in the tests I added, updated, and removed.

I'll merge and release this on 2026-01-26 if we don't have any objections by then.

attn: @fastify/leads

gurgunday · 2026-01-26T11:48:45Z

I personally don't have objections, let's see users' reaction

jsumners force-pushed the parse-content-type branch 3 times, most recently from c6e17ba to 817bc34 Compare December 14, 2025 18:41

jsumners marked this pull request as ready for review December 14, 2025 18:43

jsumners requested review from a team and mcollina December 14, 2025 18:47

gurgunday reviewed Dec 14, 2025

View reviewed changes

jsumners marked this pull request as draft December 14, 2025 22:42

jsumners marked this pull request as ready for review December 15, 2025 00:20

jsumners force-pushed the parse-content-type branch 2 times, most recently from 93f288a to b620bfc Compare December 15, 2025 11:35

chore: Updated content-type header parsing

bd67b41

jsumners force-pushed the parse-content-type branch from b620bfc to bd67b41 Compare December 15, 2025 11:43

jsumners added the bugfix Issue or PR that should land as semver patch label Dec 15, 2025

mcollina requested changes Dec 15, 2025

View reviewed changes

lib/content-type.js Outdated Show resolved Hide resolved

lib/content-type.js Outdated Show resolved Hide resolved

lib/content-type.js Outdated Show resolved Hide resolved

lib/content-type.js Outdated Show resolved Hide resolved

Eomm reviewed Dec 18, 2025

View reviewed changes

lib/content-type-parser.js Outdated Show resolved Hide resolved

lib/content-type.js Outdated Show resolved Hide resolved

lib/content-type.js Outdated Show resolved Hide resolved

lib/content-type.js Outdated Show resolved Hide resolved

address feedback

126463b

jsumners force-pushed the parse-content-type branch from 4523a2b to 126463b Compare January 1, 2026 18:58

jsumners commented Jan 1, 2026

View reviewed changes

test/schema-validation.test.js Show resolved Hide resolved

refactor algorithm

7bbaad3

jsumners force-pushed the parse-content-type branch from 23c9425 to 7bbaad3 Compare January 19, 2026 13:35

jsumners requested a review from mcollina January 19, 2026 13:41

Update lib/content-type-parser.js

fcdeb15

Co-authored-by: Manuel Spigolon <behemoth89@gmail.com> Signed-off-by: James Sumners <321201+jsumners@users.noreply.github.com>

jsumners requested a review from Eomm January 19, 2026 13:42

appease coverage

68d8c65

mcollina approved these changes Jan 21, 2026

View reviewed changes

metcoder95 approved these changes Jan 22, 2026

View reviewed changes

jean-michelet approved these changes Jan 24, 2026

View reviewed changes

gurgunday reviewed Jan 24, 2026

View reviewed changes

jsumners merged commit 32d7b6a into main Jan 26, 2026
35 checks passed

jsumners deleted the parse-content-type branch January 26, 2026 11:08

claude bot mentioned this pull request Jan 26, 2026

chore(deps): update dependencies maiis/quickynab#85

Merged

1 task

dimbslmh mentioned this pull request Feb 6, 2026

Fastify v5 fails with HTTP 415 FST_ERR_CTP_INVALID_MEDIA_TYPE fox1t/fastify-multer#35

Open

Uh oh!

Conversation

jsumners commented Dec 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gurgunday left a comment

Choose a reason for hiding this comment

Uh oh!

climba03003 commented Dec 15, 2025

Uh oh!

jsumners commented Dec 15, 2025

Uh oh!

mcollina left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jsumners commented Dec 15, 2025

Uh oh!

jsumners commented Dec 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Eomm left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jsumners commented Dec 23, 2025

Uh oh!

Uh oh!

maneeshaking commented Jan 16, 2026

Uh oh!

jsumners commented Jan 21, 2026

Uh oh!

mcollina left a comment

Choose a reason for hiding this comment

Uh oh!

jsumners commented Jan 23, 2026

Uh oh!

jean-michelet left a comment

Choose a reason for hiding this comment

Uh oh!

gurgunday left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jsumners commented Jan 25, 2026

Uh oh!

Uh oh!

gurgunday commented Jan 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

jsumners commented Dec 14, 2025 •

edited

Loading

jsumners commented Dec 15, 2025 •

edited

Loading

gurgunday left a comment •

edited

Loading