Closed
Labels
bug (A bug.), rollup (A PR that has been merged with many others in a rollup.)
Description
What version of ripgrep are you using?
$ rg --version
ripgrep 11.0.1 (rev 1f1cd9b467)
-SIMD -AVX (compiled)
+SIMD -AVX (runtime)
How did you install ripgrep?
Using https://github.com/BurntSushi/ripgrep/releases/download/11.0.1/ripgrep_11.0.1_amd64.deb
What operating system are you using ripgrep on?
Ubuntu 16.04.1
Describe your question, feature request, or bug.
The behavior of the \b word anchor when applied to non-word characters differs from what is observed in other regex engines. I found this accidentally; it wasn't a use case I needed. I'm filing the issue anyway, in case this is a bug.
If this is a bug, what are the steps to reproduce the behavior?
See example below.
If this is a bug, what is the actual behavior?
$ echo 'I have 12, he has 2!' | rg -o '\b..\b'
I
12
If this is a bug, what is the expected behavior?
I feel ripgrep should also behave as seen in other regex engines.
$ # GNU grep 3.3
$ # grep -oP '\b..\b' and rg -oP '\b..\b' also produce same result
$ echo 'I have 12, he has 2!' | grep -o '\b..\b'
I
12
,
he
2
$ echo 'I have 12, he has 2!' | perl -lne 'print join "|", /\b..\b/g'
I |12|, |he| 2
$ echo 'I have 12, he has 2!' | ruby -ne 'puts $_.scan(/\b..\b/).join("|")'
I |12|, |he| 2
$ # python3.7
>>> import re
>>> re.findall(r'\b..\b', 'I have 12, he has 2!')
['I ', '12', ', ', 'he', ' 2']
The image below might help in understanding what is happening in all these regex engines. A vertical bar represents a word boundary. Note that even though ! is at the end of the line, there is no boundary after it, because it is not a word character.
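The boundary positions described above can also be reproduced in text form. The following is a minimal sketch using Python's re module only; the helper name mark_boundaries is hypothetical and is not part of ripgrep or any other tool mentioned here.

```python
import re

def mark_boundaries(text):
    # Collect every offset at which the zero-width \b assertion matches.
    positions = {m.start() for m in re.finditer(r'\b', text)}
    out = []
    for i, ch in enumerate(text):
        if i in positions:
            out.append('|')  # vertical bar marks a word boundary
        out.append(ch)
    # A boundary can also sit at the very end, if the last char is a word char.
    if len(text) in positions:
        out.append('|')
    return ''.join(out)

line = 'I have 12, he has 2!'
print(mark_boundaries(line))
# |I| |have| |12|, |he| |has| |2|!
```

There is a bar before the final ! but none after it, which is why ' 2' can match \b..\b while '2!' cannot.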
Note
grep -o '\<..\>' (also in vim) will be different compared to \b, as \b cannot differentiate between start and end word boundaries
grep -ow '..' is the same as rg -ow '..' (perhaps because both do (?<!\w)pattern(?!\w) for the -w option)
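To make the distinctions in this note concrete, here is a rough sketch in Python's re module. It rests on some assumptions: Python has no native \< and \> anchors, so START and END below are lookaround emulations chosen for illustration, and the (?<!\w)..(?!\w) pattern is only the guess stated above about how -w might work, not a confirmed description of grep or ripgrep internals. The results are those of Python's engine; the corresponding grep and rg outputs are not asserted here.

```python
import re

line = 'I have 12, he has 2!'

# \b matches at either edge of a word, so non-word pairs can match too.
either = re.findall(r'\b..\b', line)

# Emulated directional anchors (assumption: stand-ins for \< and \>).
START = r'\b(?=\w)'   # boundary followed by a word character
END = r'\b(?<=\w)'    # boundary preceded by a word character
directional = re.findall(START + '..' + END, line)

# Lookaround form guessed at in the note for the -w option.
whole_word = re.findall(r'(?<!\w)..(?!\w)', line)

print(either)       # ['I ', '12', ', ', 'he', ' 2']
print(directional)  # ['12', 'he']
print(whole_word)   # ['12', 'he', '2!']
```

The three patterns give three different result lists for the same input, which shows why \b, directional anchors, and a lookaround-based whole-word match should not be treated as interchangeable.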
