Summary
aprender has 499 files (≈33% of src/) with meaningless _part_XX mechanical split names. These names carry zero semantic signal and make the codebase unnavigable.
Cross-ref: paiml/paiml-mcp-agent-toolkit#233 (semantic-assisted rename tooling for pmat query)
Scale
Total _part_ files: 499
Double-nested (_part_XX_part_YY): 70+
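The counts above can be re-derived with a short script. The following is a minimal sketch, assuming the sources live under `src/` and that a split file is any `.rs` file whose stem contains `_part_` followed by digits; the helper names (`split_depth`, `count_part_files`, `count_in_tree`) are hypothetical and not part of aprender or pmat.

```python
import re
from pathlib import Path

# Matches one mechanical split suffix such as "_part_02".
PART_SUFFIX = re.compile(r"_part_\d+")


def split_depth(filename: str) -> int:
    """Return how many _part_NN suffixes appear in the file stem."""
    return len(PART_SUFFIX.findall(Path(filename).stem))


def count_part_files(paths):
    """Return (total, double_nested) for an iterable of file paths."""
    depths = [split_depth(str(p)) for p in paths]
    total = sum(1 for d in depths if d >= 1)
    double_nested = sum(1 for d in depths if d >= 2)
    return total, double_nested


def count_in_tree(root: str = "src"):
    """Count split files under root (assumed to be the crate's src/ directory)."""
    return count_part_files(Path(root).rglob("*.rs"))
```

Calling `count_in_tree()` from the repository root should give the two numbers to compare against the totals reported here; any difference would suggest the matching rule assumed above is not the one used for this issue.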
Worst Directories
| Directory | `_part_` Files |
|---|---|
| `format/` | 44 |
| `format/converter/tests/` | 25 |
| `format/converter/` | 24 |
| `online/` | 13 |
| `nn/` | 13 |
| `voice/` | 12 |
| `synthetic/` | 12 |
| `format/rosetta/` | 12 |
| `metaheuristics/` | 10 |
| `format/v2/` | 10 |
| `format/gguf/` | 10 |
| `tree/` | 9 |
| `stats/` | 8 |
| `optim/tests/` | 8 |
| `linear_model/` | 8 |
Concrete Rename Examples
format/converter/ — 24 files
The `format/converter/` module has one of the highest concentrations of `_part_` files in aprender: 24, plus another 25 in its `tests/` subdirectory. These are splits of the APR/GGUF/SafeTensors conversion pipeline:
| Current Pattern | Count | Likely Semantic Split |
|---|---|---|
| `export_part_02.rs` through `export_part_05.rs` | 4 | Split by format: `export_gguf.rs`, `export_safetensors.rs`, `export_apr.rs`, `export_helpers.rs` |
| `import_part_02.rs` through `import_part_06.rs` | 5 | Split by format: `import_gguf.rs`, `import_safetensors.rs`, etc. |
| `mod_part_02.rs` through `mod_part_04.rs` | 3 | Core converter types/dispatch |
| `merge_part_02.rs` | 1 | Model merge continuation |
| `write_part_02.rs`, `write_part_03.rs` | 2 | Writer splits |
Test files (format/converter/tests/): 25 _part_ files including:
- `coverage_functions_part_03_part_02.rs` (double-nested)
- `coverage_types_tests_part_05.rs`
- `pure_functions_part_05.rs`
These should be named by what they test: test_q4k_roundtrip.rs, test_metadata_parsing.rs, etc.
autograd/ — 5 files
| Current Name | Likely Content | Suggested Name |
|---|---|---|
| `grad_fn_part_02.rs` | Additional gradient function impls | `grad_fn_binary.rs` or `grad_fn_matmul.rs` |
| `grad_fn_part_03.rs` | More gradient functions | `grad_fn_loss.rs` or `grad_fn_activation.rs` |
| `grad_fn_part_03_part_02.rs` | Double-split overflow | Name by actual gradient ops inside |
| `grad_fn_part_03_part_03.rs` | Double-split overflow | Name by actual gradient ops inside |
| `ops/mod_part_02.rs` | Additional autograd ops | Name by op category |
format/gguf/ — 10 files
| Current Pattern | Count | Likely Semantic Split |
|---|---|---|
| `reader_part_02.rs` through `reader_part_03.rs` | 4 (with double nesting) | `reader_header.rs`, `reader_tensors.rs`, `reader_metadata.rs` |
| `dequant_part_02.rs`, `dequant_part_03.rs` | 2 | `dequant_q4.rs`, `dequant_q8.rs` (by quant type) |
| `reader_part_02_part_02.rs`, `reader_part_02_part_03.rs` | 2 | Double-nested reader splits |
| Various test parts | 4 | Name by test subject |
nn/ — 13 files
| Current Pattern | Likely Semantic Split |
|---|---|
| `quantization_part_02.rs`, `quantization_part_03.rs` | `quantization_calibration.rs`, `quantization_schemes.rs` |
| `quantization_part_03_part_02.rs`, `quantization_part_03_part_03.rs` | Double-nested quantization splits |
| `vae_part_02.rs` through `vae_part_03_part_03.rs` | `vae_encoder.rs`, `vae_decoder.rs`, `vae_loss.rs` |
| `transformer/mod_part_02.rs`, `mod_part_03.rs` | `transformer_attention.rs`, `transformer_ffn.rs` |
Other Directories
| Directory | Files | Notes |
|---|---|---|
| `citl/` | 16 | Compiler, encoder, neural, pattern (each with `_part_` splits) |
| `online/` | 13 | Corpus, curriculum, distillation splits |
| `voice/` | 12 | Isolation, style splits |
| `synthetic/` | 12 | Code EDA, shell generation splits |
| `tree/` | 9 | Helpers, random forest splits |
| `stats/` | 8 | Statistics module splits |
| `metaheuristics/` | 10 | NAS, constructive search splits |
| `embed/` | 7 | Tiny embedding model splits |
| `text/tokenize/` | 7 | Tokenizer splits |
| `graph/` | 7 | Graph module splits (double-nested) |
| `cluster/tests/` | 6 | Clustering test splits |
| `classification/` | 6 | Classifier splits |
Execution Strategy
Phase 1: format/ (69 files — highest concentration)
The format module (converter, gguf, rosetta, v2) has 69 _part_ files. These are the most navigated files (model I/O) and benefit most from semantic names.
Phase 2: nn/ + autograd/ (18 files — core ML)
Neural network and autograd modules are high-traffic for anyone doing ML work.
Phase 3: Domain Modules (remaining ~412 files)
Everything else: citl, online, voice, synthetic, tree, stats, etc.
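The three phases amount to a lookup on the top-level module directory. As a rough sketch (the `phase_for` helper and `PHASES` table are hypothetical names, and paths are assumed to be relative to the repository root or to `src/`), a rename plan could be bucketed like this:

```python
# Top-level module directory -> phase number, per the strategy above.
PHASES = {"format": 1, "nn": 2, "autograd": 2}


def phase_for(path: str) -> int:
    """Return the execution phase (1, 2, or 3) for a source file path."""
    parts = [p for p in path.split("/") if p]
    # Accept paths given with or without the leading src/ component.
    if parts and parts[0] == "src":
        parts = parts[1:]
    top = parts[0] if parts else ""
    # Anything outside format/, nn/, and autograd/ is a Phase 3 domain module.
    return PHASES.get(top, 3)
```

The phase sizes quoted above are consistent with the total: 69 + 18 + 412 = 499.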
Rename Mechanics
Each rename requires:
- `git mv old_part_name.rs new_semantic_name.rs`
- Update `#[path = "old_part_name.rs"] mod name;` in parent module
- Verify `cargo test --lib` still passes
- Batch renames per directory to keep commits reviewable
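The first two steps are mechanical enough to script. Below is a minimal sketch, assuming each split file is wired in through a `#[path = "..."]` attribute exactly as shown above; `update_path_attr` and `rename_commands` are hypothetical helpers, and the `export_part_02.rs` to `export_gguf.rs` mapping in the usage note is only the guess from the table earlier, not a confirmed rename.

```python
import re


def update_path_attr(parent_src: str, old_file: str, new_file: str) -> str:
    """Point every #[path = "old_file"] attribute at new_file instead.

    Only the attribute is rewritten; the `mod name;` item is left untouched.
    """
    pattern = re.compile(r'(#\[path\s*=\s*")' + re.escape(old_file) + r'("\])')
    # A function replacement avoids backslash/group escapes in new_file.
    return pattern.sub(lambda m: m.group(1) + new_file + m.group(2), parent_src)


def rename_commands(directory: str, renames: dict) -> list:
    """Build the git mv commands for one directory's batch of renames."""
    return [
        f"git mv {directory}/{old} {directory}/{new}"
        for old, new in sorted(renames.items())
    ]
```

For example, `rename_commands("src/format/converter", {"export_part_02.rs": "export_gguf.rs"})` yields a single `git mv` line. The commands are returned rather than executed so a batch can be reviewed before it is run and before `cargo test --lib` is used to verify it.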
Tooling Support
When paiml/paiml-mcp-agent-toolkit#233 (pmat query --suggest-rename) ships, use it to auto-generate the rename plan:
pmat query --suggest-rename --path src/format/converter/
pmat query --suggest-rename --path src/nn/
pmat query --suggest-rename --path src/autograd/
Until then, manual inspection of function signatures per file is required (as done for the examples above).
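Manual inspection can be sped up by listing the function names each split file defines. This is a rough heuristic sketch, not a Rust parser: it assumes one `fn` item per line, ignores forms such as `extern "C" fn`, and `list_fn_names` is a hypothetical helper rather than anything pmat provides.

```python
import re

# Start of line, optional visibility (pub, pub(crate), ...), optional
# const/async/unsafe qualifiers, then `fn name`.
FN_DECL = re.compile(
    r"^\s*(?:pub(?:\([^)]*\))?\s+)?(?:(?:const|async|unsafe)\s+)*fn\s+(\w+)",
    re.MULTILINE,
)


def list_fn_names(rust_source: str) -> list:
    """Return the names of functions declared in a Rust source string."""
    return FN_DECL.findall(rust_source)
```

Running this over the text of each `_part_` file gives a quick view of what the file contains, which is usually enough to propose a semantic name for it.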
Impact
- 499 files renamed to meaningful names
- Developer navigation time reduced significantly
- IDE go-to-file and fuzzy-find become useful (searching "loader" finds the loader, not `mod_part_02_part_02.rs`)
- `pmat query` results show readable file paths
- New contributors can understand the codebase structure without reading every file