update to latest memchr + upgrade to Rust 2018 + bump MSRV to Rust 1.41 #767
Merged
BurntSushi merged 6 commits into master on May 1, 2021
BurntSushi
added a commit
to BurntSushi/aho-corasick
that referenced
this pull request
Apr 30, 2021
This is in line with similar changes to the regex and memchr crates: BurntSushi/memchr#82 and rust-lang/regex#767
69f66d9 to
a09f8d0
Compare
This removes the ad hoc FreqyPacked searcher and the implementation of Boyer-Moore, and replaces them with the new implementation of memmem in the memchr crate (introduced in memchr 2.4). Since memchr 2.4 also moves to Rust 2018, we'll do the same in subsequent commits. (Finally.) The benchmarks look about as expected. Latency on some of the smaller benchmarks has worsened slightly, by a nanosecond or two. The top throughput speed has also decreased, but some other benchmarks (especially ones with frequent literal matches) have improved dramatically.
This commit does a number of manual fixups to the code after the previous two commits were done via 'cargo fix' automatically. Actually, this contains more 'cargo fix' annotations, since I had forgotten to add 'edition = "2018"' to all sub-crates.
This was long overdue, and we were motivated by memchr's move to Rust 2018 in BurntSushi/memchr#82. Rust 1.41.1 was selected because it's the current version of Rust in Debian Stable. It also feels old enough to assure wide support.
It looks like 'cargo fix' didn't do this.
a09f8d0 to
dada2ce
Compare
This PR is on crates.io in |
The main motivation for this PR is to use the new memmem implementation in memchr 2.4 (not quite released at the time of writing, but in a PR). This lets us delete regex's own bespoke substring search implementations ("FreqyPacked" along with Boyer-Moore). The main benefit of the new implementation is that it should roughly match the speed of the old algorithms while keeping that speed in a lot more cases. That is, it should have far fewer weaknesses. Plus, the algorithm is now available for anyone to use without bringing in regex.

While we're here, we (finally) move to Rust 2018 and bump the MSRV to Rust 1.41 (since that's what's in Debian Stable). There's no particular reason why I waited so long to do this. It was never my intent to support such an old version of Rust for so long. There was just never a strong impetus to upgrade. But with Rust 2021 around the bend, it seems appropriate to at least migrate to Rust 2018. Hopefully we'll get to Rust 2021 sooner.

(The plan is to merge this PR once I do a similar change to the aho-corasick crate.)
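
For readers who haven't done an edition migration, here is a rough sketch of the kind of Rust 2018 idioms that 'cargo fix' and the manual fixups move code toward. It is not code from this PR: the `literal` module, the `NotFound` type and the `locate` function are hypothetical names made up for illustration, and the search itself just defers to `str::find` from std rather than to memchr's memmem.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical module standing in for a crate-internal search helper.
// In a multi-file Rust 2018 crate, other modules would import it with
// `use crate::literal::find;` (paths to local items start with `crate::`).
mod literal {
    /// Byte offset of the first occurrence of `needle`, if any.
    /// This sketch simply defers to std's `str::find`.
    pub fn find(haystack: &str, needle: &str) -> Option<usize> {
        haystack.find(needle)
    }
}

// Hypothetical error type, only here to exercise the idioms below.
#[derive(Debug)]
struct NotFound;

impl fmt::Display for NotFound {
    // 2018 idiom: write the elided lifetime explicitly as `'_`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "needle not found")
    }
}

impl Error for NotFound {}

// 2018 idioms: trait objects are spelled `dyn Trait`, and `?` is used
// instead of the `try!` macro (`try` is a reserved keyword in 2018).
fn locate(haystack: &str, needle: &str) -> Result<usize, Box<dyn Error>> {
    let at = literal::find(haystack, needle).ok_or(NotFound)?;
    Ok(at)
}

fn main() {
    match locate("foo bar baz", "bar") {
        Ok(at) => println!("found at byte offset {}", at),
        Err(err) => println!("error: {}", err),
    }
}
```

Nothing here needs a compiler newer than the Rust 1.41 MSRV; the point is only the spelling changes (`dyn`, `'_`, `?`), which account for most of the mechanical churn in a migration like this one.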