Add benchmarking using arbitrary fuzzing #465
Merged
Codecov Report

| | master | #465 | +/- |
|---|---|---|---|
| Coverage | 85.19% | 85.28% | +0.08% |
| Files | 66 | 72 | +6 |
| Lines | 8513 | 8850 | +337 |
| Hits | 7253 | 7548 | +295 |
| Misses | 1260 | 1302 | +42 |
Force-pushed from 09f17fe to 4af5152
?r @torkleyy @manunio This PR became quite big and contains several parts:
If you have some time, I'd appreciate any feedback you can give on this PR - thanks in advance!
P.S. The benchmarking CI test is expected to still fail, since it cannot yet compare against the benchmark on the main branch, which is only added in this PR.
… reading in the corpus
…any, also for Some + Fix check_struct_type lookahead
…zing deserialisation
…heck for unwrapping newtype variants
Force-pushed from c23cb59 to 93d06a7
juntyr added a commit to juntyr/ron that referenced this pull request on Aug 20, 2023
juntyr added a commit that referenced this pull request on Aug 20, 2023
* First steps towards a lossless Value::Number
* Allow parsing +unsigned as unsigned int
* Add additional tests for number parsing
* Added CHANGELOG entry
* Improve coverage by running tests across all features
* Refactor number parsing for better readability
* Extend number tests to typed ser+de
* Adjust #465 tests to lossless Value::Number
This is the start of my very roundabout way to get back to #444, where we really need a benchmark that captures something other than JSON-like-RON to `ron::Value`. I hope to upgrade our arbitrary fuzzer to use proper typing to generate an arbitrary data structure and its corresponding `Serialize` and `Deserialize` implementation. For a new PR, we would then first run the fuzzer, then extract the corpus for the arbitrary target, and then benchmark serialising and deserialising based on these examples. Ideally, the current main branch would also be pulled in and run on these benchmarks to provide an automatic comparison.

This will probably take me several weekends to fully implement, but I hope it will finally give us the needed insights to land #444 with the best perf-maintainability tradeoff.
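To make the "fuzz, extract the corpus, then benchmark" idea more concrete, here is a minimal, std-only sketch of the last step: timing one operation per corpus entry. It is not the harness added in this PR. The names `Timing`, `bench_corpus`, and `placeholder_roundtrip` are hypothetical, the corpus is assumed to be already loaded into memory, and the placeholder function merely stands in for a real deserialise-then-serialise round trip.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Timing result for a single corpus entry (hypothetical type).
pub struct Timing {
    /// Size of the corpus entry in bytes.
    pub bytes: usize,
    /// Wall-clock time spent in one run of the benchmarked operation.
    pub elapsed: Duration,
}

/// Runs `run` once per corpus entry and records how long each call took.
/// `run` stands in for a "deserialise, then serialise again" round trip.
pub fn bench_corpus<F>(corpus: &[Vec<u8>], mut run: F) -> Vec<Timing>
where
    F: FnMut(&[u8]) -> usize,
{
    corpus
        .iter()
        .map(|entry| {
            let start = Instant::now();
            let out = run(entry);
            let elapsed = start.elapsed();
            // Keep the result observable so the call is not optimised away.
            black_box(out);
            Timing { bytes: entry.len(), elapsed }
        })
        .collect()
}

/// Placeholder workload (an assumption for illustration only):
/// counts the non-whitespace bytes of the input.
pub fn placeholder_roundtrip(input: &[u8]) -> usize {
    input.iter().filter(|b| !b.is_ascii_whitespace()).count()
}
```

Calling `bench_corpus(&corpus, placeholder_roundtrip)` returns one `Timing` per entry. A real comparison against the main branch would need repeated runs and a statistics-aware harness, rather than a single measurement per entry.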
CHANGELOG.md

Add tests to document the following bugs found by fuzzing and now fixed:
- `r` can be parsed by `ron::Value` (which previously thought this was the start of a raw string)
- `'\\'` are serialised as raw strings when escaping is turned off
- `Option`s which are serialised with `#![enable(implicit_some)]` and contain a `None` cannot be uniquely deserialised, since we have no idea where the `None` came from. This case has to be tracked, so that `Some`s can be inserted in case a `None` is detected inside an unbroken stack of implicit `Some`s.
- `"A('/')"` into `ron::Value` fails as the struct type searcher reads into the char and then finds a weird comment starter there

Problematic bugs which need to be documented, tested, and discussed further:
- `Some(...)` inside `deserialize_any` with `#![enable(unwrap_variant_newtypes)]` cannot work as currently implemented, so it is now properly detected with a new (very specific) error code. Unwrapping variant newtypes currently reaches through `Option`s, and #413 ([v0.9] Breaking: Treat Some like any newtype variant) makes it more explicit by treating `Some` like a newtype variant. However, `deserialize_any` cannot support newtype variant `Some` in all cases, since it special-cases `Some(...)` to look at `...`. E.g. `Some(a: 4)` works great in typed mode and looks very nice, but cannot be supported here. Either we decide to make `Some` explicitly not a newtype variant (which is a breaking change, since it kind of escaped through it before, and loses us the nice syntax), or we keep this very obscure error, which should not be encountered often. The former would definitely be safer. Another alternative is to use #451 (Add minimal support for internally tagged and untagged enums) to pre-parse the struct type in `deserialize_any` when `unwrap_variant_newtypes` is enabled and to handle tuples, structs, and unit structs with special cases.

Future work