Hir::is_match_empty returns false for \b, but should return true #859

@BurntSushi

The predicate in question: https://docs.rs/regex-syntax/latest/regex_syntax/hir/struct.Hir.html#method.is_match_empty

The issue here is that is_match_empty returns true for \B but false for \b. I had done this because \B matches "" but \b does not. However, as of regex 1.5.5, this program runs without panicking:

use regex::Regex;

fn main() {
    let wb = Regex::new(r"\b").unwrap();
    let notwb = Regex::new(r"\B").unwrap();

    // On an empty haystack, only \B matches.
    assert!(!wb.is_match(""));
    assert!(notwb.is_match(""));

    // On "a", \b reports two empty matches, one on each side of the word.
    let got: Vec<_> = wb.find_iter("a").map(|m| m.range()).collect();
    assert_eq!(vec![0..0, 1..1], got);

    // On "a", \B reports no matches at all.
    let got: Vec<_> = notwb.find_iter("a").map(|m| m.range()).collect();
    assert!(got.is_empty());
}

This shows that \b does indeed report matches that correspond to the empty string, so it is a bug that is_match_empty returns false for \b. The underlying issue is that neither \B nor \b matches every empty string. Instead, each matches only a subset of empty strings.
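
To make the "subset of empty strings" point concrete, here is a minimal standalone sketch that does not use the regex crate. It walks every empty-string position in a haystack and sorts it into the \b bucket or the \B bucket. The helper names (is_word_byte, is_word_boundary, classify_positions) are hypothetical, and the word-character test is an ASCII-only simplification, whereas the regex crate is Unicode-aware by default.

```rust
/// ASCII-only approximation of a regex "word" character.
/// Assumption: the real crate uses Unicode word characters by default.
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Returns true if the empty string at byte offset `at` sits on a word
/// boundary, i.e. exactly one of its two neighbors is a word character.
fn is_word_boundary(haystack: &[u8], at: usize) -> bool {
    let before = at > 0 && is_word_byte(haystack[at - 1]);
    let after = at < haystack.len() && is_word_byte(haystack[at]);
    before != after
}

/// Splits every empty-string position of `haystack` into the offsets where
/// `\b` would match (first vector) and where `\B` would match (second).
fn classify_positions(haystack: &str) -> (Vec<usize>, Vec<usize>) {
    let bytes = haystack.as_bytes();
    (0..=bytes.len()).partition(|&at| is_word_boundary(bytes, at))
}

fn main() {
    for hay in ["", "a", "ab", "a b"] {
        let (wb, notwb) = classify_positions(hay);
        println!("{:?}: \\b at {:?}, \\B at {:?}", hay, wb, notwb);
    }
}
```

For the haystack "a", both offsets 0 and 1 land in the \b bucket and none in the \B bucket, which lines up with the find_iter results in the program above. For "", the single offset 0 lands in the \B bucket. Every position belongs to exactly one of the two assertions, so each can match an empty string, but neither matches all of them.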
