fix: official IANA MIME type for Apache Parquet files #748

Merged
sindresorhus merged 2 commits into sindresorhus:main from JoeCap08055:fix/apache-parquet-official-mime-type
May 18, 2025

@JoeCap08055
Contributor

The MIME type 'application/x-parquet' is deprecated and unofficial. The official IANA-registered MIME type is 'application/vnd.apache.parquet'.
This PR also adds a missing magic number for this type.

https://www.iana.org/assignments/media-types/application/vnd.apache.parquet

If you're adding support for a new file type, please follow the below steps:

  • One PR per file type.
  • Add the file's MIME type to the types array in supported.js.
  • Add the file type detection logic to the core.js file.

@Borewit
Collaborator

Borewit commented May 8, 2025

Hi @JoeCap08055, thanks for correcting the MIME type.

I failed to find any trace of documentation mentioning PARE is used as a magic byte signature for Apache Parquet files.
Can you clarify where this comes from and why we should include that signature?
Do you have an original file using that signature you can share?

@JoeCap08055
Contributor Author

JoeCap08055 commented May 9, 2025

> Hi @JoeCap08055, thanks for correcting the MIME type.
>
> I failed to find any trace of documentation mentioning PARE is used as a magic byte signature for Apache Parquet files. Can you clarify where this comes from and why we should include that signature? Do you have an original file using that signature you can share?

@Borewit I saw it in the official IANA descriptor here

Additional information:

  1. Deprecated alias names for this type: x-parquet
  2. Magic number(s): PAR1, PARE
  3. File extension(s): .parquet
  4. Macintosh file type code: not available
  5. Object Identifiers: not available

Also, see this GH issue that calls out "PARE" as being a magic number for Parquet files with encryption:
apache/parquet-format#381
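
For illustration, a check against the two magic numbers from the IANA entry quoted above could look roughly like this. It is only a sketch: `detectParquet` is a hypothetical standalone helper, not the code in core.js, and it assumes the first bytes of the file are already available as a `Uint8Array`.

```javascript
// Hypothetical helper, not file-type's actual implementation.
// Magic numbers as listed in the IANA registration quoted above.
const PARQUET_MAGIC_NUMBERS = ['PAR1', 'PARE'];

function detectParquet(bytes) {
	// Four bytes are needed to hold either magic number.
	if (bytes.length < 4) {
		return undefined;
	}

	// Decode the first four bytes as ASCII and compare with the registered values.
	const magic = String.fromCharCode(bytes[0], bytes[1], bytes[2], bytes[3]);
	if (PARQUET_MAGIC_NUMBERS.includes(magic)) {
		return {ext: 'parquet', mime: 'application/vnd.apache.parquet'};
	}

	return undefined;
}

// Usage: a buffer starting with "PAR1" is reported as Parquet, anything else is not.
const sample = new TextEncoder().encode('PAR1 rest of file');
const result = detectParquet(sample);
```

Keeping both magic numbers in one array makes it easy to drop 'PARE' again if it turns out not to be wanted.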

@JoeCap08055
Contributor Author

Side note: the Parquet docs only specifically refer to "PARE" being used for an encrypted file footer, not the header, so technically it wouldn't apply to file type determination, but it also can't hurt--and as it's a registered IANA magic number, it can't be misinterpreted as some other type.
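
To make the header/footer distinction concrete, here is a rough sketch that reports which registered magic number, if any, sits at each end of a buffer. `parquetMagicPositions` is a hypothetical name, and the sketch assumes the whole file is in memory as a `Uint8Array`, which is not necessarily how the library reads its input.

```javascript
// Hypothetical helper: look for a registered magic number at the start and end of the data.
function parquetMagicPositions(bytes) {
	const known = ['PAR1', 'PARE'];
	const ascii = slice => String.fromCharCode(...slice);

	// Both positions need at least four bytes to compare.
	const head = bytes.length >= 4 ? ascii(bytes.subarray(0, 4)) : '';
	const tail = bytes.length >= 4 ? ascii(bytes.subarray(bytes.length - 4)) : '';

	return {
		header: known.includes(head) ? head : undefined,
		footer: known.includes(tail) ? tail : undefined,
	};
}

// Usage with synthetic bytes (not a real Parquet file).
const positions = parquetMagicPositions(new TextEncoder().encode('PAR1....PARE'));
```

A detector that only inspects the first bytes corresponds to the `header` field here, which is why a footer-only marker would not affect file type determination.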

@Borewit Your call on whether to leave the check in the code for "PARE" or not.

@Borewit
Collaborator

Borewit commented May 9, 2025

Thanks for providing the background, @JoeCap08055. It makes total sense to include that one.

Collaborator

@Borewit Borewit left a comment

LGTM

@Borewit Borewit requested a review from sindresorhus May 9, 2025 15:14
@sindresorhus
Owner

Looks good to me, but this is a breaking change and must wait for the next major version.
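
As a sketch of why dependents may need to change their code: a consumer that compares the detected MIME string against 'application/x-parquet' would stop matching after this change. One possible way to bridge the transition, assuming downstream code compares MIME strings directly, is to accept both values. `isParquetMime` is a hypothetical helper, not part of this library.

```javascript
// Assumption: downstream code checks the detected MIME string directly.
const PARQUET_MIME_TYPES = new Set([
	'application/vnd.apache.parquet', // official IANA type introduced by this PR
	'application/x-parquet', // deprecated alias reported before this change
]);

// Hypothetical helper for consumers: true for either the old or the new MIME type.
function isParquetMime(mime) {
	return typeof mime === 'string' && PARQUET_MIME_TYPES.has(mime.toLowerCase());
}
```

Once every caller is on the new major version, the deprecated entry can be removed from the set.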

@JoeCap08055
Contributor Author

> Looks good to me, but this is a breaking change and must wait for the next major version.

@sindresorhus that's unfortunate, but makes sense--is there a rough estimate of when that's expected?

@Borewit Borewit added the API change Major change, dependents may need to update their code label May 17, 2025
@Borewit
Collaborator

Borewit commented May 17, 2025

@sindresorhus, we now have a number of PRs with some minor API changes. I think this is a good moment to start merging those and work towards a major release.

@sindresorhus sindresorhus merged commit 98e3f8e into sindresorhus:main May 18, 2025
3 checks passed