feat(task): improve source freshness checking with edge case handling #7932

Merged: jdx merged 4 commits into main from feat/task-source-freshness on Jan 31, 2026

jdx (Owner) commented Jan 31, 2026

Summary

Improves task source freshness checking to handle edge cases with tarball extraction and unreliable mtimes:

  • Epoch timestamp detection: Files with mtime == UNIX_EPOCH (common when extracted from tarballs without preserved timestamps) are treated as stale
  • Size-based detection: Metadata hash now includes file size alongside path, catching changes even when mtimes are unreliable
  • Warning for missing outputs: Warns when task completes but expected outputs don't exist
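
A minimal sketch of the first two bullets, not the code from `src/task/task_source_checker.rs`: the `SourceMeta` struct and both function names are hypothetical, and the standard library's `DefaultHasher` stands in for whatever hash the real metadata fingerprint uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical per-file record; the fields the real checker reads are not shown in this PR.
struct SourceMeta<'a> {
    path: &'a str,
    size: u64,
    mtime: SystemTime,
}

/// An mtime of exactly UNIX_EPOCH carries no information (typical of tarball
/// extraction without preserved timestamps), so the source counts as stale.
fn has_epoch_mtime(meta: &SourceMeta<'_>) -> bool {
    meta.mtime == UNIX_EPOCH
}

/// Fingerprint over (path, size) pairs. Sorting makes the result independent
/// of the order in which files were discovered.
fn metadata_fingerprint(sources: &[SourceMeta<'_>]) -> u64 {
    let mut entries: Vec<(&str, u64)> = sources.iter().map(|s| (s.path, s.size)).collect();
    entries.sort();
    let mut hasher = DefaultHasher::new();
    for (path, size) in entries {
        path.hash(&mut hasher);
        size.hash(&mut hasher);
    }
    hasher.finish()
}
```

Under these assumptions, a rename or a size change alters the fingerprint even when every mtime is identical, while an edit that keeps the same path and size does not.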

Also adds two optional settings:

  • task.source_freshness_hash_contents (default: false) - Use blake3 content hashing instead of metadata (more accurate but slower)
  • task.source_freshness_equal_mtime_is_fresh (default: false) - Use <= instead of < for mtime comparison
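
To illustrate the trade-off behind `task.source_freshness_hash_contents`, here is a rough sketch under stated assumptions: `FingerprintMode` and `fingerprint` are hypothetical names, the file contents are passed in memory for simplicity, and `DefaultHasher` is a stand-in for blake3, which the PR names for content mode.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Which fingerprint to compute; mirrors the on/off choice of
/// `task.source_freshness_hash_contents` (the enum itself is hypothetical).
#[derive(Clone, Copy)]
enum FingerprintMode {
    Metadata,
    Contents,
}

/// Fingerprint one source given its path and bytes.
/// NOTE: DefaultHasher is a stand-in; the PR says content mode uses blake3.
fn fingerprint(path: &str, contents: &[u8], mode: FingerprintMode) -> u64 {
    let mut hasher = DefaultHasher::new();
    path.hash(&mut hasher);
    match mode {
        // Cheap: only the length is consulted.
        FingerprintMode::Metadata => (contents.len() as u64).hash(&mut hasher),
        // Thorough: every byte feeds the hash, so same-size edits are caught.
        FingerprintMode::Contents => contents.hash(&mut hasher),
    }
    hasher.finish()
}
```

With this sketch, changing `let x = 1;` to `let x = 2;` leaves the metadata fingerprint unchanged but alters the content fingerprint, which is the sense in which content hashing is "more accurate but slower".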

Files Changed

  • settings.toml - Add 2 new task settings
  • schema/mise.json - Generated schema updates
  • src/task/task_source_checker.rs - Core freshness logic improvements
  • e2e/tasks/test_task_source_freshness - E2E test coverage

Test plan

  • Build succeeds: mise run build
  • Lint passes: mise run lint
  • E2E tests pass: mise run test:e2e test_task_source_freshness

🤖 Generated with Claude Code


Note

Medium Risk
Medium risk because it changes the cache-busting logic that determines whether tasks re-run (hashing + mtime comparison) and adds new warnings that may affect CI/log expectations; defaults preserve prior behavior but edge-case behavior changes.

Overview
Improves task incremental execution by making sources_are_fresh more robust: treats UNIX_EPOCH mtimes as stale, supports optional blake3 content hashing, and extends the default metadata hash to include file size and path (catching changes even when mtimes are unreliable). It also adds an optional setting to treat equal source/output mtimes as fresh (<=), and persists separate hash files for metadata vs content mode.

After successful runs, tasks with explicit outputs now warn if expected output paths (including globs) were not produced, and save_checksum is updated to use the task’s resolved working directory. Adds E2E coverage for missing outputs, size/path change detection, rename detection, and glob output behavior, plus settings/schema entries for the new knobs.
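
The missing-output warning described above could look roughly like the following. This is a sketch only: both function names and the message wording are hypothetical, outputs are assumed to be literal paths relative to the task's working directory, and glob patterns (which the PR also handles) are left out.

```rust
use std::path::{Path, PathBuf};

/// Return the declared outputs that do not exist under `dir`.
/// Assumption: outputs are literal relative paths; globs are out of scope here.
fn missing_outputs(dir: &Path, outputs: &[&str]) -> Vec<PathBuf> {
    outputs
        .iter()
        .map(|rel| dir.join(rel))
        .filter(|path| !path.exists())
        .collect()
}

/// Build the warning text, or None when every output was produced.
/// The wording is illustrative, not the message mise prints.
fn missing_outputs_warning(task: &str, missing: &[PathBuf]) -> Option<String> {
    if missing.is_empty() {
        return None;
    }
    let list: Vec<String> = missing.iter().map(|p| p.display().to_string()).collect();
    Some(format!(
        "task {} finished but outputs are missing: {}",
        task,
        list.join(", ")
    ))
}
```

Resolving the paths against the task's working directory rather than the process's current directory matches the note that `save_checksum` now uses the resolved working directory.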

Written by Cursor Bugbot for commit 4fb13ed.

Copilot AI review requested due to automatic review settings January 31, 2026 19:39

Copilot AI left a comment

Pull request overview

This PR enhances task source freshness checking to handle edge cases where file modification times are unreliable (e.g., files extracted from tarballs). It adds epoch timestamp detection, includes file size in metadata hashing, and provides warnings for missing sources/outputs. Additionally, it introduces a Redactor implementation using Aho-Corasick for efficient multi-pattern string replacement, replacing the previous naive iteration approach.

Changes:

  • Improved source freshness detection with epoch timestamp handling and size-based change detection
  • Added two optional settings for content hashing and mtime comparison behavior
  • Implemented Redactor using Aho-Corasick for O(n+z) performance vs O(n*m) naive approach
  • Refactored redaction logic across logger, config, and command runner to use new Redactor

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| src/task/task_source_checker.rs | Enhanced source freshness checks with epoch detection, size-based hashing, and missing file warnings |
| src/task/task_executor.rs | Updated redact calls to pass string reference instead of owned String |
| src/redactions.rs | Added Redactor implementation with Aho-Corasick automaton and comprehensive tests |
| src/logger.rs | Refactored to redact once and reuse result for both file and terminal output |
| src/config/mod.rs | Replaced _REDACTIONS with _REDACTOR and updated redaction methods |
| src/cmd.rs | Updated CmdLineRunner to use Redactor instead of IndexSet<String> |
| settings.toml | Added two new task settings for source freshness behavior |
| schema/mise.json | Updated JSON schema with new task settings |
| Cargo.toml | Added aho-corasick dependency |

jdx force-pushed the feat/task-source-freshness branch 3 times, most recently from 98421fb to bba11d2, on January 31, 2026 19:47

jdx commented Jan 31, 2026

bugbot run

jdx force-pushed the feat/task-source-freshness branch from a6df071 to 3d338b3 on January 31, 2026 20:12

github-actions bot commented Jan 31, 2026

Hyperfine Performance

mise x -- echo

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| mise-2026.1.12 x -- echo | 21.3 ± 0.3 | 20.7 | 23.9 | 1.00 |
| mise x -- echo | 21.6 ± 0.2 | 21.1 | 22.4 | 1.01 ± 0.02 |

mise env

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| mise-2026.1.12 env | 20.6 ± 0.6 | 20.0 | 27.1 | 1.00 |
| mise env | 20.9 ± 0.3 | 20.4 | 24.4 | 1.02 ± 0.03 |

mise hook-env

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| mise-2026.1.12 hook-env | 21.4 ± 0.2 | 20.9 | 22.4 | 1.00 |
| mise hook-env | 21.7 ± 0.2 | 21.3 | 22.5 | 1.02 ± 0.01 |

mise ls

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|---|---|---|---|---|
| mise-2026.1.12 ls | 19.0 ± 0.2 | 18.5 | 19.9 | 1.00 |
| mise ls | 19.4 ± 0.2 | 18.9 | 20.8 | 1.02 ± 0.02 |

xtasks/test/perf

| Command | mise-2026.1.12 | mise | Variance |
|---|---|---|---|
| install (cached) | 116ms | 118ms | -1% |
| ls (cached) | 73ms | 73ms | +0% |
| bin-paths (cached) | 78ms | 78ms | +0% |
| task-ls (cached) | 555ms | ⚠️ 2506ms | -77% |

⚠️ Warning: task-ls cached performance variance is -77%

jdx commented Jan 31, 2026

bugbot run

- Add epoch timestamp detection: files with mtime == UNIX_EPOCH (e.g., from
  tarball extraction) are treated as stale
- Include file size in metadata hash to detect changes when mtimes are unreliable
- Add warning when sources are defined but no matching files found
- Add warning when task completes but expected outputs don't exist
- Add optional content hashing (blake3) via task.source_freshness_hash_contents
- Add optional equal mtime comparison via task.source_freshness_equal_mtime_is_fresh

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

jdx force-pushed the feat/task-source-freshness branch from d75abba to 3204679 on January 31, 2026 20:39

jdx commented Jan 31, 2026

bugbot run

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON, but a Cloud Agent failed to start.

The `touch -r` approach with equal mtimes didn't properly isolate
the hash-based detection since equal mtimes also trigger rebuilds.
Changed to `sleep + touch output` to ensure output is newer than
source, so only the hash check (size/path change) triggers rebuild.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
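
The commit message above turns on how equal mtimes are compared. As a sketch only (the function name is hypothetical, and which source and output timestamps the real checker compares is not shown here), the default and the `task.source_freshness_equal_mtime_is_fresh` behavior can be written as:

```rust
use std::time::SystemTime;

/// Decide freshness from two mtimes.
/// Default (`equal_is_fresh == false`): the source must be strictly older than
/// the output, so equal mtimes trigger a rebuild.
/// With the setting on, an equal mtime also counts as fresh.
fn mtime_is_fresh(source: SystemTime, output: SystemTime, equal_is_fresh: bool) -> bool {
    if equal_is_fresh {
        source <= output
    } else {
        source < output
    }
}
```

This is why a test that copies the source's mtime onto the output with `touch -r` cannot isolate the hash check under the default setting: the strict comparison already reports the sources as stale. Making the output strictly newer leaves the hash comparison as the only possible rebuild trigger.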

jdx merged commit 7370012 into main on Jan 31, 2026 (33 of 35 checks passed)
jdx deleted the feat/task-source-freshness branch on January 31, 2026 21:14
mise-en-dev added a commit that referenced this pull request Feb 1, 2026
### 🚀 Features

- **(edit)** add interactive config editor (`mise edit`) by @jdx in
[#7930](#7930)
- **(lockfile)** graduate lockfiles from experimental by @jdx in
[#7929](#7929)
- **(task)** add support for usage values in task confirm dialog by
@roele in [#7924](#7924)
- **(task)** improve source freshness checking with edge case handling
by @jdx in [#7932](#7932)

### 🐛 Bug Fixes

- **(activate)** preserve ordering of paths appended after mise activate
by @jdx in [#7919](#7919)
- **(install)** sort failed installations for deterministic error output
by @jdx in [#7936](#7936)
- **(lockfile)** preserve URL and prefer sha256 when merging platform
info by @jdx in [#7923](#7923)
- **(lockfile)** add atomic writes and cache invalidation by @jdx in
[#7927](#7927)
- **(templates)** use sha256 for hash filter instead of blake3 by @jdx
in [#7925](#7925)
- **(upgrade)** respect tracked configs when pruning old versions by
@jdx in [#7926](#7926)

### 🚜 Refactor

- **(progress)** migrate from indicatif to clx by @jdx in
[#7928](#7928)

### 📚 Documentation

- improve clarity on uvx and pipx dependencies by @ygormutti in
[#7878](#7878)

### ⚡ Performance

- **(install)** use Kahn's algorithm for dependency scheduling by @jdx
in [#7933](#7933)
- use Aho-Corasick for efficient redaction by @jdx in
[#7931](#7931)

### 🧪 Testing

- remove flaky test_http_version_list test by @jdx in
[#7934](#7934)

### Chore

- use github backend instead of ubi in mise.lock by @jdx in
[#7922](#7922)

### New Contributors

- @ygormutti made their first contribution in
[#7878](#7878)
lucasew pushed a commit to lucasew/CONTRIB-mise that referenced this pull request Feb 18, 2026