Skip to content

Regex cleanup#306

Merged
MordechaiHadad merged 16 commits into
MordechaiHadad:masterfrom
MrDwarf7:regex-cleanup
Aug 22, 2025
Merged

Regex cleanup#306
MordechaiHadad merged 16 commits into
MordechaiHadad:masterfrom
MrDwarf7:regex-cleanup

Conversation

@MrDwarf7

@MrDwarf7 MrDwarf7 commented Aug 4, 2025

Copy link
Copy Markdown
Contributor

Performance and Code Quality Improvements: Regex Optimization and Refactoring

Summary

This PR introduces significant performance improvements and code quality enhancements through regex compilation optimization, code de-duplication, and some minor structural refactoring.
The changes move repeatedly compiled regex patterns to compile-time constants and implement several architectural improvements.

Aimed to keep the revset fairly small to make it easy to review, let me know if it's a bit much much.

The Short & Sweet / tl;dr / lgtm - Technical Improvements

  • Replaced runtime Regex::new() calls with static LazyLock<Regex> patterns
  • Implemented EnvVarProcessor trait for cleaner environment variable handling
  • Consolidated platform-specific file extension logic into compile-time constants
  • Mild improvements to error handling patterns and reduced code duplication

Key Changes

Perf. Optimizations

  • Regex compilation: Moved all regex patterns from runtime compilation to compile-time static constants using the std lib's LazyLock.
    The docs for the std lib's Regex mention that regex compilation is an expensive operation, and compiling regex patterns at runtime run every time they're read in.
    This can be especially rough on performance in hot-loops. (I've always treated it as a case of 'if you can move it to compile time, do it'. (Such is Rusts philosophy anyway.))

    This will also remove any potential overhead bob may have been encountering when making repeat calls to the functions.

  • New constants module: Created 'src/consts.rs' with centralized regex patterns (VERSION_REGEX, NIGHTLY_REGEX, ENVIRONMENT_VAR_REGEX)
    Moving Regex to compile time is of course the primary reason for the PR; however, centralizing also makes it easier to maintain & write tests for in the future.

  • Platform specific constants: Introduced FILETYPE_EXT constants for platform-specific file extensions ('tar.gz' vs. 'zip')
    This was previously a function call with an internal cfg!(target_family = _ ) macro call.
    Given the function call only exists to make an if-else decision based on a compile-time flag set by cargo, it makes sense to move this to be just a straight-up constant instead.
    Conditional compilation can still be expanded via 'target_os' vs 'target_family' etc. or the function re-implemented again later if needed.

Code Structure & Quality

  • Env. Variable processing: Refactored config.rs to use a polymorphic trait-based approach, favoring array->trait(), which is more idiomatic in Rust and should also make it easier
    to extend in the future if bob decided to further support Environment Variables without moving towards something like the config crate.

  • install_handler.rs cleanup: Minor streamlining for readability, extracting the progress bar logic into PbWrapper struct, very minor code organization done in src/handlers/install_handler.rs
    Note: Had I more time, I would have liked to have done a more thorough refactor of the install_handler.rs file (ie: cleaning up boolean flags & preferring type-heavier pattern-matching).
    That being said this PR was mostly to resolve the Regex, so I didn't want to go too far off the beaten path.

  • Import path standardization: Consistently prefer crate::* over super::* import calls throughout the codebase.
    super::<> as a preference becomes a lot harder to manage if a project grows in size, with refactoring becoming a pain point due to imports when moving files around.

Note: I'd like to highlight that these specific changes did involve modifications to the #[cfg(test)] module items in src/helpers/mod.rs - by way of removal of the prefixed super:: and moving that to the module level.
I've run tests locally (via Arch-Linux) and they all pass and I've used the binary to do installs/uninstalls etc.

Performance Impact

  • Elimination of runtime Regex compilation: Previously regex patterns were compiled on call; they're now compiled once at compile time
  • Reduced memory allocations: Static patterns reduce heap allocations during regex operations (I've also attempted to minimize cloning and other allocations where possible)
  • Code maintainability/readability: Centralized constants make regex patterns easier to maintain and modify

@MordechaiHadad

Copy link
Copy Markdown
Owner

Damn thats one massive PR, will try to review this in the following days

@MrDwarf7

MrDwarf7 commented Aug 8, 2025

Copy link
Copy Markdown
Contributor Author

@MordechaiHadad - Yeah I sort of realized this after submitting it. Despite most of the files having been touched a lot of them are the result of formatting - If you'd like I can separate out some of the formatting context + rebase it and split that into a different PR, may help?
I've tried to keep the commits themselves fairly modular and broken up logically, which should assist with reviewing.

Either way let me know if there's feedback or preference to split it up 😃

Comment thread Cargo.toml
Comment thread .deepsource.toml
Comment thread src/github_requests.rs
@MrDwarf7

Copy link
Copy Markdown
Contributor Author

@MordechaiHadad - Let me know if there's anything else you'd like cleaned up or done before a merge, happy to do so or talk through changes etc.

Comment thread src/config.rs
Comment thread src/consts.rs
Comment thread src/handlers/install_handler.rs
Prior: 
-edition = "2021"
-rust-version = "1.82"

Rust edition 2024 cannot be used on 1.82 as it requires at least 1.85.
the min. version has been bumped to accomodate the edition change.

chore: dev profile for compilation speed
chore: `extern crate <>` is not needed anymore
split out the ProgressBar logic from the response parsing logic
MrDwarf7 added a commit to MrDwarf7/bob that referenced this pull request Aug 20, 2025
@MrDwarf7

Copy link
Copy Markdown
Contributor Author

I've also rebased this so it's up-to-date with master branch too.

@MordechaiHadad

Copy link
Copy Markdown
Owner

damn so many commits, ill need to rereview this again in a few days, btw you can safely ignore the deepsource and codacy nonsense, i am just testing those, so you dont need to bother.

Further review showed lingering Regex calls
that could be moved to `consts.rs`
for better organization and performance.
This change also cleans up test module nesting
@MrDwarf7

Copy link
Copy Markdown
Contributor Author

I've been using jj/jujutsu over git for a bit now but have yet to do PR's with it for the most part. Didn't realize when it does its own internal rebasing it actually handles them all individually (Makes sense I guess as the hash technically changes from git's perspective).

I'll look into how jj handles GitHub's PR threads as yeah, that's a bit of a mess. Genuinely didn't realize it would generate that much noise.

@MordechaiHadad

Copy link
Copy Markdown
Owner

I guess we done really? cant really think of anything to change it looks good.

@MrDwarf7

Copy link
Copy Markdown
Contributor Author

Perfect then 💯

I'm aware of the other PR(s) that potentially touch around regex (I've not looked at it yet, but I assume #315 probably does).
Once this is merged then I can look into it and see if it needs to be fixed up, given the changes from this.

@MordechaiHadad

Copy link
Copy Markdown
Owner

merging this now, works both on linux and windows. dont own a mac to tests, but if both work i doubt mac will fail.

@MordechaiHadad MordechaiHadad merged commit 3e7a70f into MordechaiHadad:master Aug 22, 2025
22 of 24 checks passed
@MrDwarf7 MrDwarf7 deleted the regex-cleanup branch August 22, 2025 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants