[REGRESSION] REGEXP does not match unicode chars #2054

@T1mL3arn

What did you do?

Run this query:

with tbl as (
	select 'hello 1' as name
	union
	select 'привет 1'
)
select name from tbl
where name REGEXP '\w+ \d'
;

What did you expect to see?

hello 1
привет 1

What did you see instead?

hello 1

Useful extra information

The problem appeared after the migration to the new regexp implementation. The expected result comes from version 3.11.2, and the actual result comes from the nightly build.

According to the Qt docs:

character classes only match ASCII characters by default when using QRegularExpression. It is possible to change this behaviour by using the UseUnicodePropertiesOption pattern option.
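The difference described in the Qt docs can be illustrated outside DB4S. The following is only an analogy, not DB4S or Qt code: it assumes Python's `re` module, where the `re.ASCII` flag restricts `\w` and `\d` to ASCII (similar to QRegularExpression's default), while the default for `str` patterns is Unicode-aware (similar to what UseUnicodePropertiesOption enables). The helper names `make_regexp` and `run_query` are hypothetical.

```python
import re
import sqlite3


def make_regexp(flags):
    """Build a REGEXP callback; SQLite evaluates `X REGEXP Y` as regexp(Y, X)."""
    def regexp(pattern, value):
        if value is None:
            return 0
        return 1 if re.search(pattern, value, flags) else 0
    return regexp


def run_query(flags):
    """Run the query from this report with a REGEXP function using `flags`."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("regexp", 2, make_regexp(flags))
    rows = conn.execute(
        r"""
        with tbl as (
            select 'hello 1' as name
            union
            select 'привет 1'
        )
        select name from tbl
        where name REGEXP '\w+ \d'
        order by name
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# ASCII-only character classes: the Cyrillic row is dropped.
ascii_only = run_query(re.ASCII)

# Unicode-aware character classes: both rows match.
unicode_aware = run_query(0)

print(ascii_only)
print(unicode_aware)
```

Under these assumptions, the ASCII-only run reproduces the behaviour seen in the nightly build, and the Unicode-aware run reproduces the result from 3.11.2.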

As a temporary workaround, \w can be rewritten as an explicit character class such as [a-zа-я...], extended with whatever other ranges are needed.
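A minimal sketch of why the workaround helps, again using Python's `re` with `re.ASCII` as a stand-in for an engine whose `\w` is ASCII-only (an assumption, not the DB4S engine). The character class `[a-zа-яё]` is illustrative only: it covers lowercase Latin and Russian letters and, unlike `\w`, no uppercase letters, digits, or underscore.

```python
import re

names = ["hello 1", "привет 1"]

# \w restricted to ASCII: the Cyrillic word is not recognised.
ASCII_WORD = re.compile(r"\w+ \d", re.ASCII)

# Explicit ranges are matched literally, so they are unaffected by the
# ASCII-only setting. "ё" is listed separately because it lies outside а-я.
EXPLICIT_RANGE = re.compile(r"[a-zа-яё]+ \d", re.ASCII)

ascii_hits = [n for n in names if ASCII_WORD.search(n)]
range_hits = [n for n in names if EXPLICIT_RANGE.search(n)]

print(ascii_hits)
print(range_hits)
```

The explicit class has to list every script the data may contain, which is why it is only a stopgap compared with enabling Unicode properties in the engine.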

Win 8.1 x64, DB4S Version 3.11.99 (Nov 23 2019)

Labels

bug (confirmed bugs or reports that are very likely to be bugs)
