This strategy function:

    use proptest::prelude::*;

    fn invalid_ts() -> impl Strategy<Value = Vec<u8>> {
        prop::string::bytes_regex(
            r"(?s-u:|[^0-9].*|[0-9]+[^0-9.].*|[0-9]+\.[0-9]*[^0-9].*)"
        ).unwrap()
    }

is intended to generate, among other things, invalid UTF-8 byte sequences, because it is for testing a parser that works directly on untrusted data read from disk. What it actually does is panic on the unwrap() with:

    thread 'attrs::test::parse_xattr_ts_invalid' panicked at 'called `Result::unwrap()` on an `Err` value:
    RegexSyntax(Translate(Error {
        kind: InvalidUtf8,
        pattern: "(?s-u:|[^0-9].*|[0-9]+[^0-9.].*|[0-9]+\\.[0-9]*[^0-9].*)",
        span: Span(Position(o: 7, l: 1, c: 8), Position(o: 13, l: 1, c: 14))
    }))'

Looking at the code, I believe the change needed is for bytes_regex to have its own version of regex_to_hir that calls ParserBuilder::allow_invalid_utf8(true).
This change should have no visible effect on existing code that uses bytes_regex. Generating invalid UTF-8 also requires opting in within the regex itself (that is one of the things the (?s-u: does), so any existing regex that uses that flag must be using it in a way that cannot actually generate invalid UTF-8, or else it would hit the same crash I'm hitting.
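
For context, the span in the panic message (offsets 7 to 13) covers the first [^0-9] in the pattern. With Unicode mode off, that class is read byte-wise, so it admits bytes that are not valid UTF-8 on their own. The following is a minimal std-only sketch of that reasoning, not the regex-syntax implementation; the helper names are hypothetical and only illustrate why the translator flags this class.

```rust
/// Byte-wise reading of `[^0-9]` under `-u`: any single byte that is not
/// an ASCII digit, which includes 0x80..=0xFF.
/// Hypothetical helper for illustration; not part of proptest or regex-syntax.
fn in_negated_digit_class(b: u8) -> bool {
    !b.is_ascii_digit()
}

/// True if `bytes` is well-formed UTF-8.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Every byte the class admits that is not valid UTF-8 when it stands alone.
fn invalid_utf8_members() -> Vec<u8> {
    (0u8..=255)
        .filter(|&b| in_negated_digit_class(b) && !is_valid_utf8(&[b]))
        .collect()
}
```

Calling invalid_utf8_members() yields every byte from 0x80 upward, which is exactly the kind of input the parser under test needs to see, and exactly what the translator rejects unless invalid UTF-8 is allowed.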