feat(task): improve source freshness checking with edge case handling#7932
Pull request overview
This PR enhances task source freshness checking to handle edge cases where file modification times are unreliable (e.g., files extracted from tarballs). It adds epoch timestamp detection, includes file size in metadata hashing, and provides warnings for missing sources/outputs. Additionally, it introduces a Redactor implementation using Aho-Corasick for efficient multi-pattern string replacement, replacing the previous naive iteration approach.
Changes:
- Improved source freshness detection with epoch timestamp handling and size-based change detection
- Added two optional settings for content hashing and mtime comparison behavior
- Implemented `Redactor` using Aho-Corasick for O(n+z) performance vs O(n*m) naive approach
- Refactored redaction logic across logger, config, and command runner to use new `Redactor`
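The epoch check mentioned in the overview can be pictured with a minimal sketch. The function names `is_epoch_mtime` and `source_has_epoch_mtime` are hypothetical and are not taken from `src/task/task_source_checker.rs`; the sketch only assumes that a file whose mtime equals `UNIX_EPOCH` is treated as stale.

```rust
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

/// Returns true when the mtime is exactly the Unix epoch, which is what
/// some tarball extractions leave behind when timestamps are not preserved.
/// (Hypothetical helper name, for illustration only.)
pub fn is_epoch_mtime(mtime: SystemTime) -> bool {
    mtime == UNIX_EPOCH
}

/// Reads the file's mtime and reports whether it should be considered
/// unreliable, and therefore stale, under the epoch rule.
pub fn source_has_epoch_mtime(path: &Path) -> std::io::Result<bool> {
    let mtime = std::fs::metadata(path)?.modified()?;
    Ok(is_epoch_mtime(mtime))
}
```

A caller would presumably short-circuit the freshness check and re-run the task as soon as any source reports `true` here, instead of comparing that mtime against the outputs.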
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/task/task_source_checker.rs | Enhanced source freshness checks with epoch detection, size-based hashing, and missing file warnings |
| src/task/task_executor.rs | Updated redact calls to pass string reference instead of owned String |
| src/redactions.rs | Added Redactor implementation with Aho-Corasick automaton and comprehensive tests |
| src/logger.rs | Refactored to redact once and reuse result for both file and terminal output |
| src/config/mod.rs | Replaced _REDACTIONS with _REDACTOR and updated redaction methods |
| src/cmd.rs | Updated CmdLineRunner to use Redactor instead of IndexSet<String> |
| settings.toml | Added two new task settings for source freshness behavior |
| schema/mise.json | Updated JSON schema with new task settings |
| Cargo.toml | Added aho-corasick dependency |
Hyperfine Performance
mise x -- echo
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.1.12 x -- echo` | 21.3 ± 0.3 | 20.7 | 23.9 | 1.00 |
| `mise x -- echo` | 21.6 ± 0.2 | 21.1 | 22.4 | 1.01 ± 0.02 |
mise env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.1.12 env` | 20.6 ± 0.6 | 20.0 | 27.1 | 1.00 |
| `mise env` | 20.9 ± 0.3 | 20.4 | 24.4 | 1.02 ± 0.03 |
mise hook-env
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.1.12 hook-env` | 21.4 ± 0.2 | 20.9 | 22.4 | 1.00 |
| `mise hook-env` | 21.7 ± 0.2 | 21.3 | 22.5 | 1.02 ± 0.01 |
mise ls
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| `mise-2026.1.12 ls` | 19.0 ± 0.2 | 18.5 | 19.9 | 1.00 |
| `mise ls` | 19.4 ± 0.2 | 18.9 | 20.8 | 1.02 ± 0.02 |
xtasks/test/perf
| Command | mise-2026.1.12 | mise | Variance |
|---|---|---|---|
| install (cached) | 116ms | 118ms | -1% |
| ls (cached) | 73ms | 73ms | +0% |
| bin-paths (cached) | 78ms | 78ms | +0% |
| task-ls (cached) | 555ms | -77% |
- Add epoch timestamp detection: files with mtime == UNIX_EPOCH (e.g., from tarball extraction) are treated as stale
- Include file size in metadata hash to detect changes when mtimes are unreliable
- Add warning when sources are defined but no matching files found
- Add warning when task completes but expected outputs don't exist
- Add optional content hashing (blake3) via task.source_freshness_hash_contents
- Add optional equal mtime comparison via task.source_freshness_equal_mtime_is_fresh

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
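As an illustration of the missing-output warning, the following sketch collects the declared outputs that do not exist after a run. The name `missing_outputs` is hypothetical, and the sketch handles literal paths only; the real change also covers glob patterns, which are left out here.

```rust
use std::path::{Path, PathBuf};

/// Returns the declared outputs that do not exist on disk, resolved
/// against the task's working directory. A caller could emit one warning
/// per returned path. (Assumption: literal paths only, no glob expansion.)
pub fn missing_outputs(cwd: &Path, outputs: &[&str]) -> Vec<PathBuf> {
    outputs
        .iter()
        .map(|output| cwd.join(output))
        .filter(|path| !path.exists())
        .collect()
}
```

Resolving against an explicit `cwd` rather than the process directory mirrors the idea that the check should use the task's resolved working directory.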
The `touch -r` approach with equal mtimes didn't properly isolate the hash-based detection since equal mtimes also trigger rebuilds. Changed to `sleep + touch output` to ensure output is newer than source, so only the hash check (size/path change) triggers rebuild.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
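The reason equal mtimes trigger a rebuild by default comes down to a strict versus non-strict comparison. The sketch below is an assumption-laden illustration: the function name and the choice to compare the newest source against the oldest output are hypothetical, and only the `<` versus `<=` distinction comes from the PR description.

```rust
use std::time::SystemTime;

/// Decides freshness from two mtimes. With the default (strict) rule,
/// an output whose mtime equals the source's mtime is NOT fresh, so a
/// test that uses `touch -r` to copy the mtime would always rebuild.
/// (Hypothetical signature; which mtimes are compared is an assumption.)
pub fn outputs_are_fresh(
    newest_source: SystemTime,
    oldest_output: SystemTime,
    equal_mtime_is_fresh: bool,
) -> bool {
    if equal_mtime_is_fresh {
        newest_source <= oldest_output
    } else {
        newest_source < oldest_output
    }
}
```

With the strict rule, making the output strictly newer (as `sleep` followed by `touch output` does) is the only way to pass the mtime check, which leaves the hash check as the sole remaining trigger for a rebuild.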
### 🚀 Features

- **(edit)** add interactive config editor (`mise edit`) by @jdx in [#7930](#7930)
- **(lockfile)** graduate lockfiles from experimental by @jdx in [#7929](#7929)
- **(task)** add support for usage values in task confirm dialog by @roele in [#7924](#7924)
- **(task)** improve source freshness checking with edge case handling by @jdx in [#7932](#7932)

### 🐛 Bug Fixes

- **(activate)** preserve ordering of paths appended after mise activate by @jdx in [#7919](#7919)
- **(install)** sort failed installations for deterministic error output by @jdx in [#7936](#7936)
- **(lockfile)** preserve URL and prefer sha256 when merging platform info by @jdx in [#7923](#7923)
- **(lockfile)** add atomic writes and cache invalidation by @jdx in [#7927](#7927)
- **(templates)** use sha256 for hash filter instead of blake3 by @jdx in [#7925](#7925)
- **(upgrade)** respect tracked configs when pruning old versions by @jdx in [#7926](#7926)

### 🚜 Refactor

- **(progress)** migrate from indicatif to clx by @jdx in [#7928](#7928)

### 📚 Documentation

- improve clarity on uvx and pipx dependencies by @ygormutti in [#7878](#7878)

### ⚡ Performance

- **(install)** use Kahn's algorithm for dependency scheduling by @jdx in [#7933](#7933)
- use Aho-Corasick for efficient redaction by @jdx in [#7931](#7931)

### 🧪 Testing

- remove flaky test_http_version_list test by @jdx in [#7934](#7934)

### Chore

- use github backend instead of ubi in mise.lock by @jdx in [#7922](#7922)

### New Contributors

- @ygormutti made their first contribution in [#7878](#7878)
…jdx#7932)

## Summary

Improves task source freshness checking to handle edge cases with tarball extraction and unreliable mtimes:

- **Epoch timestamp detection**: Files with `mtime == UNIX_EPOCH` (common when extracted from tarballs without preserved timestamps) are treated as stale
- **Size-based detection**: Metadata hash now includes file size alongside path, catching changes even when mtimes are unreliable
- **Warning for missing outputs**: Warns when task completes but expected outputs don't exist

Also adds two optional settings:

- `task.source_freshness_hash_contents` (default: false) - Use blake3 content hashing instead of metadata (more accurate but slower)
- `task.source_freshness_equal_mtime_is_fresh` (default: false) - Use `<=` instead of `<` for mtime comparison

## Files Changed

- `settings.toml` - Add 2 new task settings
- `schema/mise.json` - Generated schema updates
- `src/task/task_source_checker.rs` - Core freshness logic improvements
- `e2e/tasks/test_task_source_freshness` - E2E test coverage

## Test plan

- [x] Build succeeds: `mise run build`
- [x] Lint passes: `mise run lint`
- [x] E2E tests pass: `mise run test:e2e test_task_source_freshness`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

> [!NOTE]
> **Medium Risk**
> Medium risk because it changes the cache-busting logic that determines whether tasks re-run (hashing + mtime comparison) and adds new warnings that may affect CI/log expectations; defaults preserve prior behavior but edge-case behavior changes.
>
> **Overview**
> Improves task incremental execution by making `sources_are_fresh` more robust: treats `UNIX_EPOCH` mtimes as stale, supports optional blake3 *content* hashing, and extends the default metadata hash to include file size and path (catching changes even when mtimes are unreliable). It also adds an optional setting to treat equal source/output mtimes as fresh (`<=`), and persists separate hash files for metadata vs content mode.
>
> After successful runs, tasks with explicit `outputs` now warn if expected output paths (including globs) were not produced, and `save_checksum` is updated to use the task’s resolved working directory. Adds E2E coverage for missing outputs, size/path change detection, rename detection, and glob output behavior, plus settings/schema entries for the new knobs.
>
> Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit 4fb13ed. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
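To make the size-and-path metadata hash concrete, here is a minimal sketch. It is not the project's implementation: `metadata_hash` is a hypothetical name, and the standard library's `DefaultHasher` is swapped in purely to keep the example dependency-free. The source does not say which hasher the metadata mode uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes (path, size) pairs into a single value. Entries are sorted
/// first so the result does not depend on directory listing order.
/// (Assumption: DefaultHasher stands in for whatever the real code uses.)
pub fn metadata_hash(files: &[(&str, u64)]) -> u64 {
    let mut sorted: Vec<(&str, u64)> = files.to_vec();
    sorted.sort();

    let mut hasher = DefaultHasher::new();
    for (path, size) in sorted {
        path.hash(&mut hasher);
        size.hash(&mut hasher);
    }
    hasher.finish()
}
```

Because both the path and the size feed the hash, a rename or a size change yields a different value even when every mtime is identical. An edit that keeps the size the same would go unnoticed by this mode, which is presumably the gap the optional content hashing is meant to close.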