Prefer extensions over loose filename tags #2092

Merged
j178 merged 1 commit into master from identify-extension-precedence on May 17, 2026

Conversation

@j178

@j178 j178 commented May 17, 2026

Owner

Closes #2087

Prefer recognized extensions over loose filename-prefix matches during file identification, while preserving exact filename tags and unknown-extension fallbacks.

Copilot AI review requested due to automatic review settings May 17, 2026 13:18
@codecov

codecov Bot commented May 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.46%. Comparing base (580caee) to head (e931848).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2092      +/-   ##
==========================================
- Coverage   92.47%   92.46%   -0.01%     
==========================================
  Files         119      119              
  Lines       24478    24493      +15     
==========================================
+ Hits        22635    22647      +12     
- Misses       1843     1846       +3     
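The percentages in the diff follow from the hit and line counts. A minimal sketch of that arithmetic, assuming coverage is simply hits divided by lines (the helper name `coverage_percent` is illustrative and not part of Codecov or prek):

```rust
/// Coverage as a percentage, assuming coverage = hits / lines.
/// Illustrative helper; not an API of Codecov or prek.
fn coverage_percent(hits: u32, lines: u32) -> f64 {
    100.0 * f64::from(hits) / f64::from(lines)
}

/// Formats the base and head coverage and their difference to two decimals,
/// using the counts from the coverage diff above.
fn coverage_summary() -> (String, String, String) {
    let base = coverage_percent(22_635, 24_478); // master
    let head = coverage_percent(22_647, 24_493); // #2092
    (
        format!("{base:.2}"),
        format!("{head:.2}"),
        format!("{:+.2}", head - base),
    )
}
```

With the counts above this gives 92.47 for the base, 92.46 for the head, and a difference of -0.01, matching the report: 15 lines were added and 12 of them are hit.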

☔ View full report in Codecov by Sentry.

Copilot AI left a comment

Contributor

Pull request overview

Updates prek-identify filename tagging to prefer known file extensions over loose filename-prefix matches (e.g., makefile.png no longer inherits makefile/text tags), preventing binary files from being misidentified as text while keeping exact-filename semantics intact.

Changes:

  • Prefer recognized extension tags over prefix-based filename tag matches, while still preserving exact filename tag matches.
  • Add focused unit tests covering extension-precedence, unknown-extension fallback, and exact-name + known-extension merging.
  • Document this intentional behavioral divergence from upstream in docs/diff.md.
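The precedence described in the bullets above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in crates/prek-identify/src/lib.rs: the function names (`name_tags`, `extension_tags`, `tags_from_filename`), the tiny lookup tables, and the tag values are hypothetical stand-ins chosen only to show the three cases.

```rust
use std::collections::BTreeSet;

type Tags = &'static [&'static str];

// Hypothetical stand-in tables; the real tables are much larger.
const MAKEFILE: Tags = &["makefile", "text"];
const DOCKERFILE: Tags = &["dockerfile", "text"];
const CMAKE: Tags = &["cmake", "text"];
const PNG: Tags = &["binary", "image", "png"];
const TXT: Tags = &["plain-text", "text"];

/// Tags for a known file name (hypothetical table).
fn name_tags(name: &str) -> Option<Tags> {
    match name {
        "makefile" | "Makefile" => Some(MAKEFILE),
        "Dockerfile" => Some(DOCKERFILE),
        "CMakeLists.txt" => Some(CMAKE),
        _ => None,
    }
}

/// Tags for a known, lowercased extension (hypothetical table).
fn extension_tags(ext: &str) -> Option<Tags> {
    match ext {
        "png" => Some(PNG),
        "txt" => Some(TXT),
        _ => None,
    }
}

fn tags_from_filename(filename: &str) -> BTreeSet<&'static str> {
    let mut tags = BTreeSet::new();

    // Extension = text after the last dot; a leading dot alone does not count.
    let ext_tags = filename
        .rsplit_once('.')
        .filter(|(stem, _)| !stem.is_empty())
        .and_then(|(_, ext)| extension_tags(&ext.to_ascii_lowercase()));

    // 1. An exact filename match is always kept and merged with a known extension.
    if let Some(exact) = name_tags(filename) {
        tags.extend(exact.iter().copied());
        if let Some(ext) = ext_tags {
            tags.extend(ext.iter().copied());
        }
        return tags;
    }

    // 2. A recognized extension wins over any loose prefix match.
    if let Some(ext) = ext_tags {
        tags.extend(ext.iter().copied());
        return tags;
    }

    // 3. Unknown extension: fall back to a loose match on dot-separated parts.
    if let Some(loose) = filename.split('.').find_map(|part| name_tags(part)) {
        tags.extend(loose.iter().copied());
    }
    tags
}
```

Under these assumptions, `makefile.png` receives only the png tags, `Dockerfile.xenial` still falls back to the Dockerfile tags because `xenial` is not a known extension, and `CMakeLists.txt` merges its exact-name tags with the txt tags.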

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| docs/diff.md | Documents the intentional divergence: extensions take precedence over loose filename-prefix matches. |
| crates/prek-identify/src/lib.rs | Adjusts filename tagging logic to prioritize known extensions and adds unit tests for the new behavior. |

@j178 j178 added the enhancement New feature or request label May 17, 2026
@prek-ci-bot

prek-ci-bot Bot commented May 17, 2026

📦 Cargo Bloat Comparison

Binary size change: +0.00% (26.1 MiB → 26.1 MiB)

Head Branch Results

 File  .text     Size             Crate Name
 1.2%   2.6% 332.0KiB        aws_lc_sys aws_lc_0_39_1_aes_gcm_encrypt_avx512
 1.2%   2.6% 332.0KiB        aws_lc_sys aws_lc_0_39_1_aes_gcm_decrypt_avx512
 0.3%   0.7%  91.9KiB              prek prek::languages::<impl prek::config::Language>::run::{{closure}}::{{closure}}
 0.3%   0.7%  83.8KiB              prek prek::languages::<impl prek::config::Language>::run::{{closure}}::{{closure}}
 0.2%   0.5%  65.5KiB              prek prek::languages::<impl prek::config::Language>::install::{{closure}}
 0.2%   0.5%  61.1KiB             prek? <prek::cli::Command as clap_builder::derive::Subcommand>::augment_subcommands
 0.2%   0.4%  53.3KiB annotate_snippets annotate_snippets::renderer::render::render
 0.2%   0.4%  48.6KiB              prek prek::run::{{closure}}
 0.2%   0.3%  43.3KiB              prek prek::cli::run::run::run::{{closure}}
 0.1%   0.3%  33.3KiB             prek? <prek::cli::RunArgs as clap_builder::derive::Args>::augment_args
 0.1%   0.2%  32.1KiB             prek? <prek::config::_::<impl serde_core::de::Deserialize for prek::config::Config>::deserialize::__Visitor as serde_core::de::Visitor>::visit_map
 0.1%   0.2%  28.7KiB               std core::ptr::drop_in_place<prek::languages::<impl prek::config::Language>::install::{{closure}}>
 0.1%   0.2%  28.0KiB        aws_lc_sys aws_lc_0_39_1_edwards25519_scalarmuldouble_alt
 0.1%   0.2%  27.6KiB      serde_saphyr saphyr_parser_bw::scanner::Scanner<T>::fetch_more_tokens
 0.1%   0.2%  27.5KiB        aws_lc_sys aws_lc_0_39_1_edwards25519_scalarmuldouble
 0.1%   0.2%  26.2KiB              prek prek::cli::try_repo::try_repo::{{closure}}
 0.1%   0.2%  23.2KiB              prek prek::hooks::meta_hooks::MetaHooks::run::{{closure}}
 0.1%   0.2%  22.5KiB      serde_saphyr saphyr_parser_bw::scanner::Scanner<T>::fetch_more_tokens
 0.1%   0.2%  22.3KiB         [Unknown] Lp384_montjscalarmul_alt_p384_montjadd
 0.1%   0.2%  21.5KiB      clap_builder clap_builder::parser::parser::Parser::get_matches_with
41.6%  86.3%  10.9MiB                   And 23856 smaller methods. Use -n N to show more.
48.1% 100.0%  12.6MiB                   .text section size, the file size is 26.1MiB

Base Branch Results

 File  .text     Size             Crate Name
 1.2%   2.6% 332.0KiB        aws_lc_sys aws_lc_0_39_1_aes_gcm_encrypt_avx512
 1.2%   2.6% 332.0KiB        aws_lc_sys aws_lc_0_39_1_aes_gcm_decrypt_avx512
 0.3%   0.7%  91.9KiB              prek prek::languages::<impl prek::config::Language>::run::{{closure}}::{{closure}}
 0.3%   0.7%  83.8KiB              prek prek::languages::<impl prek::config::Language>::run::{{closure}}::{{closure}}
 0.2%   0.5%  65.5KiB              prek prek::languages::<impl prek::config::Language>::install::{{closure}}
 0.2%   0.5%  61.1KiB             prek? <prek::cli::Command as clap_builder::derive::Subcommand>::augment_subcommands
 0.2%   0.4%  53.3KiB annotate_snippets annotate_snippets::renderer::render::render
 0.2%   0.4%  48.6KiB              prek prek::run::{{closure}}
 0.2%   0.3%  43.3KiB              prek prek::cli::run::run::run::{{closure}}
 0.1%   0.3%  33.3KiB             prek? <prek::cli::RunArgs as clap_builder::derive::Args>::augment_args
 0.1%   0.2%  32.1KiB             prek? <prek::config::_::<impl serde_core::de::Deserialize for prek::config::Config>::deserialize::__Visitor as serde_core::de::Visitor>::visit_map
 0.1%   0.2%  28.7KiB               std core::ptr::drop_in_place<prek::languages::<impl prek::config::Language>::install::{{closure}}>
 0.1%   0.2%  28.0KiB        aws_lc_sys aws_lc_0_39_1_edwards25519_scalarmuldouble_alt
 0.1%   0.2%  27.6KiB      serde_saphyr saphyr_parser_bw::scanner::Scanner<T>::fetch_more_tokens
 0.1%   0.2%  27.5KiB        aws_lc_sys aws_lc_0_39_1_edwards25519_scalarmuldouble
 0.1%   0.2%  26.2KiB              prek prek::cli::try_repo::try_repo::{{closure}}
 0.1%   0.2%  23.2KiB              prek prek::hooks::meta_hooks::MetaHooks::run::{{closure}}
 0.1%   0.2%  22.5KiB      serde_saphyr saphyr_parser_bw::scanner::Scanner<T>::fetch_more_tokens
 0.1%   0.2%  22.3KiB         [Unknown] Lp384_montjscalarmul_alt_p384_montjadd
 0.1%   0.2%  21.5KiB      clap_builder clap_builder::parser::parser::Parser::get_matches_with
41.6%  86.3%  10.9MiB                   And 23856 smaller methods. Use -n N to show more.
48.1% 100.0%  12.6MiB                   .text section size, the file size is 26.1MiB

@prek-ci-bot

prek-ci-bot Bot commented May 17, 2026

⚡️ Hyperfine Benchmarks

Summary: 0 regressions, 2 improvements above the 10% threshold.

Environment
  • OS: Linux 6.17.0-1013-azure
  • CPU: 4 cores
  • prek version: prek 0.4.0+9 (c0e9204 2026-05-17)
  • Rust version: rustc 1.95.0 (59807616e 2026-04-14)
  • Hyperfine version: hyperfine 1.20.0

CLI Commands

Benchmarking basic commands in the main repo:

prek --version

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base --version` | 2.2 ± 0.1 | 2.1 | 2.6 | 1.00 |
| `prek-head --version` | 2.4 ± 0.3 | 2.0 | 3.2 | 1.06 ± 0.15 |

prek list

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base list` | 9.0 ± 0.7 | 8.6 | 14.0 | 1.03 ± 0.08 |
| `prek-head list` | 8.7 ± 0.1 | 8.4 | 9.3 | 1.00 |

prek validate-config .pre-commit-config.yaml

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base validate-config .pre-commit-config.yaml` | 3.1 ± 0.1 | 2.9 | 3.2 | 1.07 ± 0.03 |
| `prek-head validate-config .pre-commit-config.yaml` | 2.9 ± 0.1 | 2.8 | 3.1 | 1.00 |

prek sample-config

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base sample-config` | 2.5 ± 0.1 | 2.5 | 2.9 | 1.04 ± 0.15 |
| `prek-head sample-config` | 2.4 ± 0.3 | 2.3 | 4.7 | 1.00 |

Cold vs Warm Runs

Comparing first run (cold) vs subsequent runs (warm cache):

prek run --all-files (cold - no cache)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run --all-files` | 135.3 ± 1.6 | 133.3 | 137.5 | 1.00 |
| `prek-head run --all-files` | 137.1 ± 1.9 | 134.0 | 139.9 | 1.01 ± 0.02 |

prek run --all-files (warm - with cache)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run --all-files` | 136.3 ± 3.0 | 132.9 | 144.5 | 1.01 ± 0.03 |
| `prek-head run --all-files` | 135.2 ± 2.6 | 132.3 | 141.0 | 1.00 |

Full Hook Suite

Running the builtin hook suite on the benchmark workspace:

prek run --all-files (full builtin hook suite)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run --all-files` | 137.3 ± 7.3 | 130.6 | 185.3 | 1.00 |
| `prek-head run --all-files` | 137.5 ± 2.6 | 132.9 | 143.8 | 1.00 ± 0.06 |

Individual Hook Performance

Benchmarking each hook individually on the test repo:

prek run trailing-whitespace --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run trailing-whitespace --all-files` | 20.5 ± 0.7 | 19.8 | 23.7 | 1.03 ± 0.05 |
| `prek-head run trailing-whitespace --all-files` | 19.9 ± 0.5 | 19.1 | 21.5 | 1.00 |

prek run end-of-file-fixer --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run end-of-file-fixer --all-files` | 36.9 ± 37.8 | 23.5 | 205.9 | 1.41 ± 1.45 |
| `prek-head run end-of-file-fixer --all-files` | 26.1 ± 2.3 | 23.5 | 31.8 | 1.00 |

✅ Performance improvement for prek run end-of-file-fixer --all-files: 29.1700% faster

prek run check-json --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run check-json --all-files` | 11.6 ± 0.3 | 10.9 | 12.2 | 1.07 ± 0.04 |
| `prek-head run check-json --all-files` | 10.9 ± 0.2 | 10.5 | 11.4 | 1.00 |

prek run check-yaml --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run check-yaml --all-files` | 10.9 ± 0.1 | 10.7 | 11.3 | 1.00 |
| `prek-head run check-yaml --all-files` | 10.9 ± 0.2 | 10.6 | 11.6 | 1.01 ± 0.02 |

prek run check-toml --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run check-toml --all-files` | 11.0 ± 0.3 | 10.5 | 11.6 | 1.01 ± 0.04 |
| `prek-head run check-toml --all-files` | 10.8 ± 0.2 | 10.5 | 11.6 | 1.00 |

prek run check-xml --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run check-xml --all-files` | 11.0 ± 0.3 | 10.5 | 11.6 | 1.01 ± 0.03 |
| `prek-head run check-xml --all-files` | 10.9 ± 0.2 | 10.6 | 11.4 | 1.00 |

prek run detect-private-key --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run detect-private-key --all-files` | 17.1 ± 1.4 | 15.4 | 19.7 | 1.00 |
| `prek-head run detect-private-key --all-files` | 17.2 ± 1.4 | 15.4 | 21.9 | 1.01 ± 0.12 |

prek run fix-byte-order-marker --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run fix-byte-order-marker --all-files` | 21.9 ± 1.6 | 19.4 | 24.5 | 1.04 ± 0.10 |
| `prek-head run fix-byte-order-marker --all-files` | 21.0 ± 1.3 | 18.9 | 23.6 | 1.00 |

Installation Performance

Benchmarking hook installation (fast path hooks skip Python setup):

prek install-hooks (cold - no cache)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base install-hooks` | 4.5 ± 0.0 | 4.5 | 4.6 | 1.02 ± 0.01 |
| `prek-head install-hooks` | 4.4 ± 0.0 | 4.4 | 4.5 | 1.00 |

prek install-hooks (warm - with cache)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base install-hooks` | 4.6 ± 0.1 | 4.5 | 4.7 | 1.03 ± 0.02 |
| `prek-head install-hooks` | 4.4 ± 0.1 | 4.4 | 4.5 | 1.00 |

File Filtering/Scoping Performance

Testing different file selection modes:

prek run (staged files only)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run` | 46.3 ± 0.7 | 45.2 | 47.3 | 1.00 |
| `prek-head run` | 46.4 ± 0.9 | 45.0 | 47.9 | 1.00 ± 0.02 |

prek run --files '*.json' (specific file type)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run --files '*.json'` | 8.0 ± 0.1 | 7.9 | 8.2 | 1.02 ± 0.02 |
| `prek-head run --files '*.json'` | 7.9 ± 0.1 | 7.7 | 8.3 | 1.00 |

Workspace Discovery & Initialization

Benchmarking hook discovery and initialization overhead:

prek run --dry-run --all-files (measures init overhead)

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run --dry-run --all-files` | 10.0 ± 0.1 | 9.9 | 10.2 | 1.02 ± 0.01 |
| `prek-head run --dry-run --all-files` | 9.9 ± 0.1 | 9.7 | 10.1 | 1.00 |

Meta Hooks Performance

Benchmarking meta hooks separately:

prek run check-hooks-apply --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run check-hooks-apply --all-files` | 12.2 ± 0.1 | 12.0 | 12.3 | 1.12 ± 0.04 |
| `prek-head run check-hooks-apply --all-files` | 10.9 ± 0.4 | 10.6 | 11.9 | 1.00 |

✅ Performance improvement for prek run check-hooks-apply --all-files: 10.7000% faster

prek run check-useless-excludes --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run check-useless-excludes --all-files` | 10.9 ± 0.1 | 10.7 | 11.1 | 1.00 |
| `prek-head run check-useless-excludes --all-files` | 11.4 ± 1.3 | 10.5 | 13.9 | 1.04 ± 0.12 |

prek run identity --all-files

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `prek-base run identity --all-files` | 9.7 ± 0.1 | 9.6 | 9.8 | 1.02 ± 0.01 |
| `prek-head run identity --all-files` | 9.5 ± 0.1 | 9.4 | 9.8 | 1.00 |

@j178 j178 force-pushed the identify-extension-precedence branch from d94642a to a9f44f5 Compare May 17, 2026 13:27
Copilot AI review requested due to automatic review settings May 17, 2026 13:31
@j178 j178 force-pushed the identify-extension-precedence branch from a9f44f5 to e931848 Compare May 17, 2026 13:31

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@j178 j178 merged commit d7c4503 into master May 17, 2026
33 of 35 checks passed
@j178 j178 deleted the identify-extension-precedence branch May 17, 2026 13:37
tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request May 20, 2026
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [prek](https://github.com/j178/prek) | patch | `0.4.0` → `0.4.1` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>j178/prek (prek)</summary>

### [`v0.4.1`](https://github.com/j178/prek/blob/HEAD/CHANGELOG.md#041)

[Compare Source](j178/prek@v0.4.0...v0.4.1)

Released on 2026-05-20.

##### Enhancements

- Fix pre-push range after rebase ([#2089](j178/prek#2089))
- Prefer extensions over loose filename tags ([#2092](j178/prek#2092))
- Skip installs for hooks that will not run ([#2103](j178/prek#2103))

##### Performance

- Optimize meta hook file scans ([#2106](j178/prek#2106))
- Reduce run filtering allocations ([#2090](j178/prek#2090))

##### Contributors

- [@j178](https://github.com/j178)

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - At any time (no schedule defined)
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate).

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Prek identifies a file named makefile.png as a text makefile

2 participants