Skip to content

fix: speed-up build newick#27

Merged
Neclow merged 3 commits intosbhattlab:mainfrom
Neclow:fix_build_newick2
May 13, 2025
Merged

fix: speed-up build newick#27
Neclow merged 3 commits intosbhattlab:mainfrom
Neclow:fix_build_newick2

Conversation

@Neclow
Copy link
Collaborator

@Neclow Neclow commented May 13, 2025

Supersedes #26

This avoids string inserts of left parens + minimises string clones during sub-newick formation (i.e., string concatenation)

@Neclow Neclow requested a review from Copilot May 13, 2025 11:50
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR aims to speed up the newick string construction by reducing string insert operations and minimizing unnecessary string clones.

  • Introduces a new helper function prepare_cache to pre-initialize string caches
  • Refactors build_newick to build the newick string by appending precomputed parts
  • Updates sample vector generation in profile_main and isolates the to_newick operation in benchmarks

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
phylo2vec/src/tree_vec/ops/newick/mod.rs Adds prepare_cache and updates build_newick to improve performance and clarity
phylo2vec/src/profile_main.rs Replaces randomized sample_vector with a fixed vector for improved stability
phylo2vec/benches/benchmarks/core.rs Adjusts benchmarking to focus solely on the to_newick operation

@Neclow Neclow requested a review from Copilot May 13, 2025 12:06
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR optimizes Newick string construction by eliminating intermediate string insertions and reducing clones during subtree concatenation, and updates benchmarks and profiling code to reflect the changes.

  • Introduce prepare_cache to pre-build leaf prefixes and avoid repeated allocations
  • Refactor build_newick and build_newick_with_bls to build in-place via String::push
  • Update test, benchmark, and profiling code to use fixed vectors and isolate to_newick performance

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
src/tree_vec/ops/newick/mod.rs Added prepare_cache, refactored build_newick* to in-place
src/tree_vec/ops/mod.rs Added a debug println! in a test
src/profile_main.rs Switched profiling to use a fixed vector
benches/benchmarks/core.rs Simplified benchmarks to isolate to_newick
Comments suppressed due to low confidence (1)

phylo2vec/src/tree_vec/ops/newick/mod.rs:260

  • [nitpick] This comment is vague; either expand it with a concrete benchmark result or remove it to avoid confusion.
// Faster than map+collect for some reason


for (i, (&(c1, c2), &[bl1, bl2])) in pairs.iter().zip(branch_lengths.iter()).enumerate() {
let s1 = std::mem::take(&mut cache[c1]);
// let s1 = std::mem::take(&mut cache[c1]);
Copy link

Copilot AI May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this commented-out code since it is no longer needed and clutters the function.

Suggested change
// let s1 = std::mem::take(&mut cache[c1]);

Copilot uses AI. Check for mistakes.
sub_newick.push(')');
sub_newick.push_str(&sp);
cache[c1] = sub_newick;
// let capacity = s1.len() + s2.len() + sp.len() + sb1.len() + sb2.len() + 5;
Copy link

Copilot AI May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple lines of commented-out legacy capacity calculations and push logic—consider removing them to improve readability.

Copilot uses AI. Check for mistakes.
], "((0:0.1,2:0.2)5:0.5,(1:1,3:3)4:0.7)6;")]
fn test_to_newick_from_matrix(#[case] m: Vec<Vec<f32>>, #[case] expected: &str) {
let newick = to_newick_from_matrix(&m);
println!("Newick: {}", newick);
Copy link

Copilot AI May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This debug print inside a test will clutter test output; consider removing it or using dbg! only when actively debugging.

Suggested change
println!("Newick: {}", newick);
dbg!(&newick); // Use dbg! for debugging purposes

Copilot uses AI. Check for mistakes.
@Neclow Neclow merged commit a3d3cf6 into sbhattlab:main May 13, 2025
7 checks passed
@Neclow Neclow mentioned this pull request May 13, 2025
@Neclow Neclow deleted the fix_build_newick2 branch October 16, 2025 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants