chore(deps): Bump trueno from 0.6.0 to 0.7.0#69

Closed
dependabot[bot] wants to merge 1 commit into main from dependabot/cargo/trueno-0.7.0

dependabot Bot commented on behalf of github Nov 24, 2025

Bumps trueno from 0.6.0 to 0.7.0.

Release notes

Sourced from trueno's releases.

v0.7.0 - Phase 3: Large Matrix Optimization

Trueno v0.7.0 - Phase 3: Large Matrix Optimization 🚀

Performance Achievements

18% improvement for 1024×1024 matrices via 3-level cache blocking

  • 3-level cache hierarchy (L3 → L2 → micro-kernel) for matrices ≥512×512

    • L3 blocks: 256×256 (fits in 4-16MB L3 cache)
    • L2 blocks: 64×64 (fits in 256KB L2 cache)
    • Micro-kernel: 4×1 AVX2/FMA (register blocking)
    • Smart threshold: Only activates for matrices ≥512×512
  • Zero-allocation implementation:

    • No Vec allocations in hot path
    • Code duplication with if/else branches
    • Preserves fast 2-level path for smaller matrices
  • Performance results:

    • 1024×1024: 47.4 ms (18% faster than v0.6.0's 57.8 ms)
    • 512×512: ~5.3 ms (8.5% improvement)
    • 256×256: No regression (uses 2-level path)
    • Target: Within 1.5× of NumPy (currently 1.64×)
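
The blocking scheme described above can be illustrated with a scalar sketch. This is not trueno's implementation: the function names (`matmul_naive`, `matmul_blocked`, `matmul`), the square row-major layout, and the loop order are assumptions made for illustration, and the 4×1 AVX2/FMA micro-kernel is replaced by a plain inner loop. Only the block sizes (256 and 64) and the 512 threshold come from the release notes.

```rust
// Block sizes and threshold quoted in the release notes.
const L3_BLOCK: usize = 256;
const L2_BLOCK: usize = 64;
const BLOCKING_THRESHOLD: usize = 512;

/// Plain i-k-j triple loop over row-major n×n matrices (hypothetical helper).
pub fn matmul_naive(a: &[f32], b: &[f32], c: &mut [f32], n: usize) {
    for i in 0..n {
        for k in 0..n {
            let aik = a[i * n + k];
            for j in 0..n {
                c[i * n + j] += aik * b[k * n + j];
            }
        }
    }
}

/// Two nested levels of tiling: outer `l3`-sized tiles, inner `l2`-sized tiles.
/// Only index arithmetic is used, so nothing is allocated inside the loops.
pub fn matmul_blocked(a: &[f32], b: &[f32], c: &mut [f32], n: usize, l3: usize, l2: usize) {
    for i3 in (0..n).step_by(l3) {
        let i3_end = (i3 + l3).min(n);
        for k3 in (0..n).step_by(l3) {
            let k3_end = (k3 + l3).min(n);
            for j3 in (0..n).step_by(l3) {
                let j3_end = (j3 + l3).min(n);
                // Inner tiles, sized for the L2 cache.
                for i2 in (i3..i3_end).step_by(l2) {
                    let i2_end = (i2 + l2).min(i3_end);
                    for k2 in (k3..k3_end).step_by(l2) {
                        let k2_end = (k2 + l2).min(k3_end);
                        for j2 in (j3..j3_end).step_by(l2) {
                            let j2_end = (j2 + l2).min(j3_end);
                            // Scalar stand-in for the SIMD micro-kernel.
                            for i in i2..i2_end {
                                for k in k2..k2_end {
                                    let aik = a[i * n + k];
                                    for j in j2..j2_end {
                                        c[i * n + j] += aik * b[k * n + j];
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

/// Dispatch: the blocked path is used only at or above the threshold.
pub fn matmul(a: &[f32], b: &[f32], n: usize) -> Vec<f32> {
    // Output buffer, allocated once outside the hot loops.
    let mut c = vec![0.0f32; n * n];
    if n >= BLOCKING_THRESHOLD {
        matmul_blocked(a, b, &mut c, n, L3_BLOCK, L2_BLOCK);
    } else {
        matmul_naive(a, b, &mut c, n);
    }
    c
}
```

Passing the block sizes as parameters makes it possible to exercise the tiled path on small inputs (for example `n = 70` with tiles of 32 and 8) and compare it against the naive loop.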

Quality & Testing

  • Test coverage: 90.20% (trueno library, exceeds 90% EXTREME TDD requirement)
  • Added 60+ new tests across xtask tooling and core library
  • Fixed clippy warnings (needless_range_loop)
  • Updated coverage policy: xtask (dev tooling) excluded from main coverage requirement
  • All quality gates passing: lint, format, tests, coverage

Installation

cargo add trueno@0.7.0

Or add to Cargo.toml:

[dependencies]
trueno = "0.7.0"

Full Changelog

See CHANGELOG.md for complete details.


... (truncated)

Changelog

Sourced from trueno's changelog.

[0.7.0] - 2025-11-22

Performance - Phase 3: Large Matrix Optimization 🚀

Achievement: 18% improvement for 1024×1024 matrices via 3-level cache blocking

  • 3-level cache hierarchy (L3 → L2 → micro-kernel) for matrices ≥512×512

    • L3 blocks: 256×256 (fits in 4-16MB L3 cache)
    • L2 blocks: 64×64 (fits in 256KB L2 cache)
    • Micro-kernel: 4×1 AVX2/FMA (register blocking)
    • Smart threshold: Only activates for matrices ≥512×512
  • Zero-allocation implementation:

    • No Vec allocations in hot path
    • Code duplication with if/else branches
    • Preserves fast 2-level path for smaller matrices
  • Performance results:

    • 1024×1024: 47.4 ms (18% faster than v0.6.0's 57.8 ms)
    • 512×512: ~5.3 ms (8.5% improvement)
    • 256×256: No regression (uses 2-level path)
    • Target: Within 1.5× of NumPy (currently 1.64×)
  • Testing:

    • Added test_matmul_3level_blocking for 512×512 matrices
    • 878 tests passing (all existing tests pass)
    • Coverage: 90.41% (improved from 90.00%)
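
The headline figures in the performance results can be checked with a little arithmetic. The helpers below are hypothetical and only restate the numbers quoted above; the second one assumes that the 1.64× NumPy ratio refers to the 1024×1024 measurement and that the NumPy baseline stays fixed, neither of which the changelog states.

```rust
/// Percentage reduction in runtime when going from `old_ms` to `new_ms`.
pub fn improvement_pct(old_ms: f64, new_ms: f64) -> f64 {
    (old_ms - new_ms) / old_ms * 100.0
}

/// Runtime needed to reach `target_ratio` times a baseline, given the current
/// runtime and the current ratio to that same (unchanged) baseline.
pub fn runtime_for_target(current_ms: f64, current_ratio: f64, target_ratio: f64) -> f64 {
    current_ms / current_ratio * target_ratio
}
```

With the quoted timings, `improvement_pct(57.8, 47.4)` comes out just under 18, matching the stated improvement. Under the assumptions above, `runtime_for_target(47.4, 1.64, 1.5)` suggests the 1024×1024 case would need to drop to roughly 43.4 ms to meet the 1.5× target.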

Quality & Testing

  • Test coverage: 90.20% (trueno library, exceeds 90% EXTREME TDD requirement)
  • Added 60+ new tests across xtask tooling and core library
  • Fixed clippy warnings (needless_range_loop)
  • Updated coverage policy: xtask (dev tooling) excluded from main coverage requirement
  • All quality gates passing: lint, format, tests, coverage

Documentation

  • Updated Phase 2 book chapter with 3-level blocking details
  • Added benchmark data for 512×512 and 1024×1024
  • GitHub issue #34 tracking Phase 3 progress

Commits

  • 6763a1b [FIX] Add clippy allow for clone_on_copy in tests
  • 24d8376 [RELEASE] Bump version to 0.7.0
  • d8e607e [FIX] Fix clippy warnings and improve test coverage to 90.20%
  • b9ff2f9 Merge pull request #37 from paiml/claude/continue-work-01QPEw1xeDsUogWMvMR7NEzE
  • 75bef25 [PERF] SIMD-optimized vecmat using row-wise accumulation
  • 76aaee7 [BENCH] Add matvec performance benchmark
  • c5d053f [PERF] SIMD-optimized matvec using AVX2/SSE2 dot products
  • ae5b8b1 [FIX] Add cfg attributes to parallel-only code to fix dead_code warnings
  • e13c116 Merge pull request #36 from paiml/claude/continue-work-01QPEw1xeDsUogWMvMR7NEzE
  • 836f226 [PERF] Phase 4: Lock-free multi-threading achieves 2.72× speedup (Refs #34)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [trueno](https://github.com/paiml/trueno) from 0.6.0 to 0.7.0.
- [Release notes](https://github.com/paiml/trueno/releases)
- [Changelog](https://github.com/paiml/trueno/blob/main/CHANGELOG.md)
- [Commits](paiml/trueno@v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: trueno
  dependency-version: 0.7.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot Bot commented on behalf of github Nov 24, 2025

Labels

The following labels could not be found: dependencies, rust. Please create them before Dependabot can add them to a pull request.

Please fix the above issues or remove invalid values from dependabot.yml.

dependabot Bot commented on behalf of github Nov 24, 2025

Looks like trueno is up-to-date now, so this is no longer needed.

dependabot Bot closed this Nov 24, 2025
dependabot Bot deleted the dependabot/cargo/trueno-0.7.0 branch November 24, 2025 12:34
noahgift added a commit that referenced this pull request Apr 18, 2026
Adds the third-format ship-blocker gate mirroring PM-007
(safetensors) — catches manifests that lie about GGUF quantization
before 30+ GiB of mis-labelled bytes move across the network.

Design: predominant non-float tensor type is the authoritative
signal. `general.file_type` is retained as a fallback only — real
llama.cpp quantize output (e.g. our 8 GiB teacher GGUF) has shipped
with stale ftype=0 despite fully Q4_K tensors, so trusting the
metadata_kv field would force a false FAIL on an artifact every
inference engine happily consumes.

New API:
  • read_gguf_signature(path) — reads metadata_kv + tensor_metadata
  • predominant_quant_type(counts) — majority non-float type,
    falling back to majority float only when all tensors are float
  • expected_ggml_tensor_type(quant) — manifest string → GGML type
    (ggml-common.h enum ggml_type)
  • ggml_type_name(t) — u32 → "Q4_K" etc.

Verified on real artifact:
  $ apr validate-manifest paiml-qwen2.5-coder-7b-apache-q4k-v1-gguf.yaml \
      --artifact qwen2.5-coder-7b-instruct-q4k.gguf
  [PASS] FALSIFY-PM-008: predominant tensor type = 12 (Q4_K)
         matches quantization 'q4_k'
         (note: general.file_type=0=ALL_F32 is stale)

Tests: 15 unit tests (10 original ftype-only + 5 new for
tensor-authoritative path, including the real teacher scenario
of Q4_K tensors + stale ftype=0).

Contract: publish-manifest-v1.yaml v1.2.1 — PM-008 entry rewritten
to describe tensor-authoritative semantics.

Refs: SPEC-SHIP-TWO-001 §12.7.2 (ship-blocker class)
Closes: task #69
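
The tensor-authoritative rule described in the commit message can be sketched as follows. The function names are taken from the commit's "New API" list, but the signatures, bodies, and the `check_quantization` helper are assumptions for illustration, not the project's code. The numeric ids follow ggml's `enum ggml_type` (the commit itself confirms 12 = Q4_K); the set of accepted manifest strings other than `q4_k` is hypothetical, and the tie-break toward the lowest id is an arbitrary choice made here for determinism.

```rust
use std::collections::HashMap;

/// Float tensor types in ggml's `enum ggml_type`: F32 = 0, F16 = 1, BF16 = 30.
fn is_float_type(t: u32) -> bool {
    matches!(t, 0 | 1 | 30)
}

/// Partial id-to-name table (a subset of `enum ggml_type`).
pub fn ggml_type_name(t: u32) -> &'static str {
    match t {
        0 => "F32",
        1 => "F16",
        2 => "Q4_0",
        3 => "Q4_1",
        6 => "Q5_0",
        7 => "Q5_1",
        8 => "Q8_0",
        10 => "Q2_K",
        11 => "Q3_K",
        12 => "Q4_K",
        13 => "Q5_K",
        14 => "Q6_K",
        30 => "BF16",
        _ => "UNKNOWN",
    }
}

/// Manifest string to expected tensor type. Only "q4_k" appears in the commit
/// message; the other accepted strings are assumed.
pub fn expected_ggml_tensor_type(quant: &str) -> Option<u32> {
    match quant {
        "f32" => Some(0),
        "f16" => Some(1),
        "q8_0" => Some(8),
        "q4_k" => Some(12),
        "q6_k" => Some(14),
        _ => None,
    }
}

/// Most common non-float tensor type; falls back to the most common float
/// type only when no non-float tensors are present. Ties go to the lowest id.
pub fn predominant_quant_type(counts: &HashMap<u32, usize>) -> Option<u32> {
    let pick = |want_float: bool| -> Option<u32> {
        counts
            .iter()
            .filter(|(t, n)| **n > 0 && is_float_type(**t) == want_float)
            .max_by_key(|(t, n)| (**n, std::cmp::Reverse(**t)))
            .map(|(t, _)| *t)
    };
    pick(false).or_else(|| pick(true))
}

/// Hypothetical gate: true when the manifest's quantization matches the
/// predominant tensor type, regardless of `general.file_type`.
pub fn check_quantization(quant: &str, counts: &HashMap<u32, usize>) -> bool {
    match (expected_ggml_tensor_type(quant), predominant_quant_type(counts)) {
        (Some(expected), Some(actual)) => expected == actual,
        _ => false,
    }
}
```

Because float tensors are ignored whenever any quantized tensor exists, a file with many F32 tensors alongside its Q4_K weights is still classified as Q4_K, which is the behaviour the commit describes for the stale `ftype=0` case.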