Closed
Description
Combining RegexOptions.IgnoreCase with character-class intervals that involve \u0130 (Turkish I with dot) and \u0131 (Turkish i without dot) gives wrong matching results, as the repro below shows.
Configuration
.NET 6.0 preview
Regression?
Seems so. At least the below code works correctly in .NET 5.0.
Other information
Expected behavior: the following code should print True, but it prints False.
The pattern below must trivially match the input, because every letter of the input falls in the corresponding interval.
IgnoreCase can only add letters to a character class (never remove them), so the match must still hold.
If the IgnoreCase option is omitted, the code works correctly.
using System;
using System.Globalization;
using System.Text.RegularExpressions;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string input = "I\u0131\u0130i";
            string pattern = "[H-J][\u0131-\u0140][\u0120-\u0130][h-j]";

            // Construct the regex while the current culture is tr-TR, then restore the original culture.
            var culture = CultureInfo.CurrentCulture;
            CultureInfo.CurrentCulture = new CultureInfo("tr-TR");
            Regex re = new Regex(pattern, RegexOptions.IgnoreCase);
            CultureInfo.CurrentCulture = culture;

            // Expected: True. Actual on .NET 6.0 preview: False.
            Console.WriteLine(re.IsMatch(input));
        }
    }
}
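The claim that every input character falls in its interval can be checked without the regex engine. The following is a minimal sketch, not part of the original repro: the class name IntervalCheck and the intervals array are illustrative, and the check uses plain ordinal comparison of UTF-16 code units, so it involves no culture or case folding.

```csharp
using System;

class IntervalCheck
{
    static void Main()
    {
        string input = "I\u0131\u0130i";

        // Hypothetical helper data: the four intervals from the pattern, in order.
        (char Lo, char Hi)[] intervals =
        {
            ('H', 'J'),
            ('\u0131', '\u0140'),
            ('\u0120', '\u0130'),
            ('h', 'j'),
        };

        bool allInRange = true;
        for (int i = 0; i < input.Length; i++)
        {
            // Ordinal comparison of code units; no culture is consulted.
            bool inRange = intervals[i].Lo <= input[i] && input[i] <= intervals[i].Hi;
            Console.WriteLine($"U+{(int)input[i]:X4} in [U+{(int)intervals[i].Lo:X4}-U+{(int)intervals[i].Hi:X4}]: {inRange}");
            allInRange &= inRange;
        }

        // Prints True: each character lies in its interval even before case is ignored.
        Console.WriteLine(allInRange);
    }
}
```

Since all four checks succeed case-sensitively, a False result with IgnoreCase means that applying the option removed characters from at least one class, which should not happen.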