Closed
Description
Note: This is purely based on a code-review, so I have no test case currently (though I can make one up if needs be).
In the `IterableCodeExtractor::calculateMatchScore()` method there are two things which caught my attention:
- The `preg_quote()` function is used to escape arbitrary strings, while `mb_ereg()` is used for the regex matching.
  This is problematic because `preg_quote()` is part of the PCRE extension, which uses the PCRE regex engine, while `mb_ereg()` is part of the MBString extension, which uses the Oniguruma regex engine.
  These engines are not 100% compatible, so `preg_quote()` may in effect escape too much or too little. For example, it can escape a delimiter, which PCRE patterns use and MBString patterns do not.
- The `mb_ereg()` function is used without `mb_internal_encoding` or `mb_regex_encoding` being set.
  Whether `mb_internal_encoding` is needed may depend on where the input comes from, but `mb_regex_encoding` should most definitely be set.
  The default `mb_regex_encoding` is `EUC-JP` in PHP 5.4 and 5.5 and only became `UTF-8` in PHP 5.6. As other code may also have called this function, the default cannot be relied upon, and the encoding should be set before using `mb_ereg()`.
  Note: the `mbstring.internal_encoding` ini setting was deprecated in PHP 5.6 in favour of `default_charset`, so you need a compatibility layer here.
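
To make both points concrete, here is a minimal sketch of one possible fix. It is not the project's actual code: the function names `mb_ereg_quote_sketch()` and `mb_ereg_match_utf8_sketch()` are hypothetical, the list of escaped characters is my assumption for Oniguruma's default (Ruby) syntax mode, and it assumes the input is UTF-8.

```php
<?php
/**
 * Hypothetical helper: escape a literal string for use inside an mb_ereg() pattern.
 * No delimiter is escaped, because mb_ereg() patterns have none.
 * Assumption: this character list covers the default (Ruby) syntax mode.
 */
function mb_ereg_quote_sketch($literal)
{
    // The backslash must come first so the backslashes added for the
    // later characters are not escaped a second time.
    $special = array('\\', '.', '+', '*', '?', '[', ']', '^', '$', '(', ')', '{', '}', '|', '-');
    $escaped = array();
    foreach ($special as $char) {
        $escaped[] = '\\' . $char;
    }

    // str_replace() works on bytes; that is safe for UTF-8 input because
    // ASCII bytes never occur inside a multi-byte UTF-8 sequence.
    return str_replace($special, $escaped, $literal);
}

/**
 * Hypothetical helper: run mb_ereg() with the regex encoding pinned to UTF-8,
 * then restore whatever encoding other code had configured.
 */
function mb_ereg_match_utf8_sketch($pattern, $subject)
{
    $previous = mb_regex_encoding();   // current regex encoding
    mb_regex_encoding('UTF-8');        // assumption: the input is UTF-8

    // mb_ereg() returns false when there is no match.
    $matched = mb_ereg($pattern, $subject) !== false;

    mb_regex_encoding($previous);      // do not leak the change to callers

    return $matched;
}
```

With these helpers, `mb_ereg_match_utf8_sketch(mb_ereg_quote_sketch('a.b'), 'aXb')` no longer matches, since the dot is treated as a literal. Restoring the previous encoding is a design choice so that the extractor does not change behaviour for other plugins.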