c287929 to 2516179
pywhat/Data/regex.json
Outdated
"Regex": "^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$",
"plural_name": false,
"Description": null,
"Rarity": 1,
I would say that rarity should be lowered.
Any suggestions? 🙂
Why 0.3? I'd say higher, like 0.5 or 0.6, because:
- The string has to start with 1Z
- It needs 6 chars from 0-9A-Z
- It is followed by exactly 2 digits
- It ends with 8 more digits
Also, can we simplify the regex like this?
- ^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$
+ ^(1Z[0-9A-Z]{6}[0-9]{10})$
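As a quick sanity check on the suggestion above, the sketch below compares the split pattern and the merged pattern on a few made-up strings (not real tracking numbers). The names `SPLIT`, `MERGED`, and `samples` are only for illustration and do not come from the pywhat code base.

```python
import re

# Pattern from the diff: the trailing digits are written as 2 + 8.
SPLIT = re.compile(r"^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$")
# Proposed simplification: a single run of 10 digits.
MERGED = re.compile(r"^(1Z[0-9A-Z]{6}[0-9]{10})$")

# Made-up inputs chosen to exercise each constraint.
samples = [
    "1Z999AA10123456784",  # 1Z + 6 chars + 10 digits: well-formed
    "1Z999AA1012345678",   # one digit short
    "1Z999AA1012345678X",  # letter inside the digit section
    "2Z999AA10123456784",  # wrong prefix
    "1z999aa10123456784",  # lowercase is not accepted
]

for s in samples:
    split_ok = bool(SPLIT.match(s))
    merged_ok = bool(MERGED.match(s))
    # Both patterns should always agree; only the grouping differs.
    assert split_ok == merged_ok
    print(f"{s!r}: {merged_ok}")
```

Both patterns accept the same strings, so the choice between them only matters if the 2-digit part is meant to be handled separately later.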
I think 0.4 or 0.5. And yes, regex should be changed.
The 2+8 split is there because the first 2 digits in this group represent a service indicator code, which could perhaps be captured and handled in the future.
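To illustrate what capturing the service indicator could look like, here is a minimal sketch using a named group. The group name `service` and the helper `service_code` are hypothetical and are not part of pywhat; the sample string is made up.

```python
import re

# Same language as the original pattern, but the 2-digit service
# indicator is wrapped in a named group (the name is an assumption).
UPS = re.compile(r"^(1Z[0-9A-Z]{6}(?P<service>[0-9]{2})[0-9]{8})$")


def service_code(text):
    """Return the 2-digit service indicator, or None if text does not match."""
    match = UPS.match(text)
    return match.group("service") if match else None


print(service_code("1Z999AA10123456784"))  # prints 01
print(service_code("not a tracking number"))  # prints None
```

Keeping the 2+8 split in the regex makes this kind of capture a small change, whereas the merged `[0-9]{10}` form would need to be split again first.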
Aside: I wonder if the "rarity" could be estimated more reliably through some entropy-based measure 🤔
> service indicator code

We have a precedent for this, called sub-categories. See the Mastercard / Phone Numbers regex. I am not sure it'll work on data in the middle of the regex; we may need to change the code for that :)
> Aside: I wonder if the "rarity" could be estimated more reliably through some entropy-based measure 🤔

Probably! Currently I am estimating it based on what I see when people post this:

And also whether we have any false positives.
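One possible reading of the entropy idea, sketched under stated assumptions: treat the input as a uniformly random string over the 36 symbols 0-9 and A-Z, and measure how many bits of "surprise" a match against a fixed-length pattern carries. This is not how pywhat assigns rarity; the function `match_bits` and the segment encoding are hypothetical, and how to map bits onto the 0 to 1 Rarity scale is left open.

```python
import math


def match_bits(segments, alphabet_size=36):
    """Bits of surprise when a uniformly random string matches the pattern.

    segments is a list of (allowed_symbols, length) pairs describing a
    fixed-length pattern, one pair per run of identical character classes.
    """
    bits = 0.0
    for allowed, length in segments:
        # Each position contributes log2(alphabet / allowed) bits.
        bits += length * math.log2(alphabet_size / allowed)
    return bits


# ^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$ described as segments:
# 2 fixed characters, 6 alphanumerics, 10 digits.
ups = [(1, 2), (36, 6), (10, 10)]
print(round(match_bits(ups), 1))  # about 28.8 bits
```

A higher value means a random string is less likely to match by accident, which is one of the two signals mentioned above; it says nothing about false positives on real-world data.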
@P403n1x87 You can use subcategories with the regex method for that.
14b9fcc to 45d37fe
45d37fe to e3880c0
Codecov Report
@@           Coverage Diff           @@
##             main    #228   +/-   ##
=======================================
  Coverage   92.60%   92.60%
=======================================
  Files          15       15
  Lines        1217     1217
=======================================
  Hits         1127     1127
  Misses         90       90

Continue to review full report at Codecov.
Co-authored-by: piatrashkakanstantinass <74979584+piatrashkakanstantinass@users.noreply.github.com>
⚠ Pull Requests not made with this template will be automatically closed 🔥
Prerequisites
Why do we need this pull request?
What GitHub issues does this fix?
N. A.
Copy / paste of output