fix: make content-type case-insensitive by gurgunday · Pull Request #5742 · fastify/fastify

gurgunday · 2024-10-15T20:47:01Z

This correctly fixes #5740.

The standards say Content-Type is case-insensitive, which is why we prevent the registration of text/html and text/HTML at the same time, but we currently don't treat text/html and text/HTML as the same.
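To illustrate the mismatch described above, here is a minimal sketch. It is not Fastify's code: `TinyParserRegistry`, `add`, `lookupExact`, and `lookupInsensitive` are hypothetical names, and the sketch assumes only what the comment states, namely that registration treats the two spellings as one while lookup does not.

```javascript
// Hypothetical registry, not Fastify internals: keys are lowercased on
// registration, which is why text/html and text/HTML cannot coexist.
class TinyParserRegistry {
  constructor () {
    this.parsers = new Map()
  }

  add (contentType, parser) {
    const key = contentType.toLowerCase()
    if (this.parsers.has(key)) {
      throw new Error(`Content type parser '${key}' already present.`)
    }
    this.parsers.set(key, parser)
  }

  // Behaviour described as the problem: the header is used as received.
  lookupExact (header) {
    return this.parsers.get(header)
  }

  // Behaviour the fix aims for: the header is lowercased before matching.
  lookupInsensitive (header) {
    return this.parsers.get(header.toLowerCase())
  }
}

const registry = new TinyParserRegistry()
const htmlParser = (body) => body
registry.add('text/html', htmlParser)

let duplicateRejected = false
try {
  registry.add('text/HTML', htmlParser)
} catch (err) {
  duplicateRejected = true
}

console.log(duplicateRejected) // the second registration is refused
console.log(registry.lookupExact('text/HTML') === undefined) // exact lookup misses
console.log(registry.lookupInsensitive('text/HTML') === htmlParser) // lowercased lookup hits
```

The point of the sketch is only the asymmetry: a spelling that is refused at registration time should also resolve at lookup time.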

github-actions · 2024-10-15T20:51:09Z

Node: 20
PR: [1] 1565k requests in 30.05s, 294 MB read
MAIN: [1] 1546k requests in 30.04s, 290 MB read

Node: 22
PR: [1] 1546k requests in 30.04s, 290 MB read
MAIN: [1] 1560k requests in 30.04s, 293 MB read

github-actions · 2024-10-15T20:51:13Z

Node: 20
PR: [1] 1367k requests in 30.05s, 257 MB read
MAIN: [1] 1357k requests in 30.04s, 255 MB read

Node: 22
PR: [1] 1380k requests in 30.04s, 259 MB read
MAIN: [1] 1392k requests in 30.05s, 261 MB read

gurgunday · 2024-10-15T20:55:07Z

We currently don't have this:

For example, the following media types are equivalent in describing HTML text data encoded in the UTF-8 character encoding scheme, but the first is preferred for consistency (the "charset" parameter value is defined as being case-insensitive in [RFC2046], Section 4.1.2):

text/html;charset=utf-8
Text/HTML;Charset="utf-8"
text/html; charset="utf-8"
text/html;charset=UTF-8
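As a rough illustration of what treating these four forms as equivalent could involve, here is a sketch of a normalizer. `normalizeMediaType` is a hypothetical helper, not something this PR adds, and it only covers the shapes shown in the quoted examples (it does not handle escaped quotes or a `;` inside a quoted value).

```javascript
// Hypothetical helper: normalizes only the forms shown in the quoted examples.
// It is not a complete media-type parser.
function normalizeMediaType (header) {
  const [essence, ...params] = header.split(';')
  const normalizedParams = params
    .map((param) => param.trim())
    .filter((param) => param.length > 0)
    .map((param) => {
      const eq = param.indexOf('=')
      if (eq === -1) return param.toLowerCase()
      const name = param.slice(0, eq).trim().toLowerCase()
      let value = param.slice(eq + 1).trim()
      // Strip one pair of surrounding double quotes, if present.
      if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
        value = value.slice(1, -1)
      }
      // Only the charset value is lowercased; other values may be case-sensitive.
      if (name === 'charset') value = value.toLowerCase()
      return `${name}=${value}`
    })
  return [essence.trim().toLowerCase(), ...normalizedParams].join(';')
}

const equivalents = [
  'text/html;charset=utf-8',
  'Text/HTML;Charset="utf-8"',
  'text/html; charset="utf-8"',
  'text/html;charset=UTF-8'
]
const normalized = equivalents.map(normalizeMediaType)

// All four inputs collapse to a single normalized string.
console.log(new Set(normalized).size === 1)
```

Whether normalization of this kind belongs on the hot path is a separate question; the discussion below is mostly about avoiding repeated string work per request.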

Uzlopak · 2024-10-15T21:47:58Z

Can't we somehow avoid toLowerCase?

L2jLiga · 2024-10-16T02:07:16Z

Can't we somehow avoid toLowerCase?

We already call toLowerCase in add and hasParser, so I think it's OK to have it in run as well.

climba03003 · 2024-10-16T03:50:06Z

I would suggest something like the following. The intention is to cache both the header as received (in any case) and its lowercase form.
That way, most requests hit the cache with the original header, and the lowercase check runs only once per distinct header.

Please be aware that multipart/form-data will always pollute the cache with useless records, since the boundary part may differ from request to request.
In most other cases, the full lookup should run only once.

ContentTypeParser.prototype.getParser = function (contentType) {
  // fast-path for remembered content-type (incoming as lowercase)
  let parser = this.customParsers.get(contentType)
  if (parser !== undefined) return parser
  parser = this.cache.get(contentType)
  if (parser !== undefined) return parser
  // case-insensitive cache check
  const lowercaseContentType = contentType.toLowerCase()
  parser = this.customParsers.get(lowercaseContentType)
  if (parser !== undefined) return parser
  parser = this.cache.get(lowercaseContentType)
  if (parser !== undefined) return parser


  // eslint-disable-next-line no-var
  for (var i = 0; i !== this.parserList.length; ++i) {
    const parserListItem = this.parserList[i]
    if (
      lowercaseContentType.slice(0, parserListItem.length) === parserListItem &&
      (lowercaseContentType.length === parserListItem.length || lowercaseContentType.charCodeAt(parserListItem.length) === 59 /* `;` */ || lowercaseContentType.charCodeAt(parserListItem.length) === 32 /* ` ` */)
    ) {
      parser = this.customParsers.get(parserListItem)
      this.cache.set(contentType, parser)
      // cache for both lowercase and original header
      this.cache.set(lowercaseContentType, parser)
      return parser
    }
  }
  // eslint-disable-next-line no-var
  for (var j = 0; j !== this.parserRegExpList.length; ++j) {
    const parserRegExp = this.parserRegExpList[j]
    // regexp should use the original header in all case
    // it allows the user to choose whether they want case-sensitive or case-insensitive
    if (parserRegExp.test(contentType)) {
      parser = this.customParsers.get(parserRegExp.toString())
      this.cache.set(contentType, parser)
      // cache for both lowercase and original header
      this.cache.set(lowercaseContentType, parser)
      return parser
    }
  }
  return this.customParsers.get('')
}
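The cache behaviour and the multipart/form-data caveat can be observed with a standalone sketch. This is a simplified stand-in for the function above, not Fastify's implementation: `createLookup` and `stats` are hypothetical names, only string parsers are modelled, and plain unbounded `Map`s are assumed for the cache.

```javascript
// Hypothetical stand-in for the lookup above: plain Maps, string parsers only.
function createLookup (registered) {
  const customParsers = new Map(registered) // keys are assumed to be lowercase
  const parserList = [...customParsers.keys()]
  const cache = new Map()
  let fullScans = 0

  function getParser (contentType) {
    // Fast path: header exactly as received.
    let parser = customParsers.get(contentType) ?? cache.get(contentType)
    if (parser !== undefined) return parser

    // Second chance: lowercase form of the header.
    const lower = contentType.toLowerCase()
    parser = customParsers.get(lower) ?? cache.get(lower)
    if (parser !== undefined) return parser

    // Full scan: prefix match followed by end of string, `;` (59) or space (32).
    fullScans++
    for (const item of parserList) {
      const next = lower.charCodeAt(item.length)
      if (lower.startsWith(item) && (lower.length === item.length || next === 59 || next === 32)) {
        parser = customParsers.get(item)
        cache.set(contentType, parser) // original header
        cache.set(lower, parser) // lowercase header
        return parser
      }
    }
    return undefined
  }

  return { getParser, stats: () => ({ fullScans, cacheSize: cache.size }) }
}

const multipartParser = () => 'multipart'
const lookup = createLookup([['multipart/form-data', multipartParser]])

// Every request carries a different boundary, so every header is a new cache key.
for (let i = 0; i < 100; i++) {
  lookup.getParser(`Multipart/Form-Data; boundary=----demo${i}`)
}
const afterUniqueHeaders = lookup.stats()

// Repeating a header that was already seen hits the cache instead of scanning.
lookup.getParser('Multipart/Form-Data; boundary=----demo0')
const afterRepeat = lookup.stats()

console.log(afterUniqueHeaders, afterRepeat)
```

In this sketch each mixed-case multipart header adds two cache entries (original and lowercase) and triggers one full scan, which is the pollution the comment warns about; a bounded cache would cap the growth but not the scans.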

metcoder95

This should be fine; the benchmarks don't show a big difference.

Nevertheless, the suggestion of caching from @climba03003 seems quite interesting; adding a disclosure of the multipart/form-data limitations might be sufficient.

gurgunday · 2024-10-16T09:48:49Z

@climba03003 wdyt? I like this approach!

Every string Content-Type parser is stored in lowercase, so:

If we miss the original cache checks, we lowercase the header and try to match manually, as before.

If we match there, we save the original version to the cache for future matches.

gurgunday · 2024-10-16T09:49:47Z

Note that this only applies to string content-type parsers

Like you said, RegExp offers more freedom in how a user would want to match
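A small sketch of that freedom, using illustrative patterns that are not taken from the PR: because RegExp parsers are tested against the header as received, the `i` flag is what decides whether a mixed-case header matches.

```javascript
// Illustrative patterns only; neither is taken from the PR.
const caseSensitive = /^text\/html/
const caseInsensitive = /^text\/html/i

const header = 'Text/HTML; charset=utf-8'

console.log(caseSensitive.test(header)) // false
console.log(caseInsensitive.test(header)) // true
```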

mcollina

lgtm

metcoder95

lgtm

github-actions · 2025-10-17T00:30:56Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

fix: make content-type case-insensitive

53d06be

gurgunday added the benchmark label (label to run benchmark against PR and main branch) Oct 15, 2024

github-actions Bot removed the benchmark label (label to run benchmark against PR and main branch) Oct 15, 2024

gurgunday requested review from a team and mcollina October 15, 2024 20:55

metcoder95 reviewed Oct 16, 2024

cache missed case insensitive content types

9803a65

move test to getParser

a1bc5a6

gurgunday requested a review from a team October 16, 2024 10:04

climba03003 approved these changes Oct 16, 2024

gurgunday requested a review from metcoder95 October 16, 2024 13:11

mcollina approved these changes Oct 16, 2024

mcollina merged commit c1789df into main Oct 16, 2024

mcollina deleted the lowercase branch October 16, 2024 14:57

metcoder95 reviewed Oct 16, 2024

Fdawgs mentioned this pull request Jan 12, 2025

addContentTypeParser matches the header charset in a case-sensitive fashion #4583

Closed

2 tasks

github-actions Bot locked as resolved and limited conversation to collaborators Oct 17, 2025

6 participants
