Skip to content

feat: UPS tracking numbers#228

Merged
bee-san merged 3 commits intobee-san:mainfrom
P403n1x87:feat/ups-tracking
Nov 7, 2021
Merged

feat: UPS tracking numbers#228
bee-san merged 3 commits intobee-san:mainfrom
P403n1x87:feat/ups-tracking

Conversation

@P403n1x87
Copy link
Copy Markdown
Contributor

@P403n1x87 P403n1x87 commented Nov 4, 2021

⚠ Pull Requests not made with this template will be automatically closed 🔥

Prerequisites

Why do we need this pull request?

  • This PR adds the regex for UPS tracking numbers

What GitHub issues does this fix?

N. A.

Copy / paste of output

pywhat 1Z123CAB9912345678    
Matched on: 1Z123CAB9912345678
Name: UPS Tracking Number
Link:  https://www.ups.com/track?tracknum=1Z123CAB9912345678

"Regex": "^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$",
"plural_name": false,
"Description": null,
"Rarity": 1,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that rarity should be lowered.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestions? 🙂

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something around 0.3

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 0.3? I'd say higher like 0.5 or 0.6 because:

  1. The string has to start with 1Z
  2. It needs 7 chars 0-9A-Z
  3. It has exactly 2 numbers
  4. It has 8 numbers

Also, can we make it:

- ^(1Z[0-9A-Z]{6}[0-9]{2}[0-9]{8})$
- + ^(1Z[0-9A-Z]{6}[0-9]{10})$

?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 0.4 or 0.5. And yes, regex should be changed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of the 2+8 split is because the first 2 digits in this group represent a service indicator code and perhaps it could be captured and handled in the future.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I wonder if the "rarity" could be estimated more reliably through some entropy-based measure 🤔

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

service indicator code

We have precedence for this called sub-categories. See the Mastercard / Phone Numbers regex. I am not sure it'll work on data in the middle of the regex, we may need to change the code for that :)

Aside: I wonder if the "rarity" could be estimated more reliably through some entropy-based measure 🤔

Probably! Currently I am estimating it based on what I see when people post this:
Screenshot 2021-11-04 at 16 09 24

And also whether we have any false positives.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@P403n1x87 You can use subcategories with regex method for that.

@P403n1x87 P403n1x87 force-pushed the feat/ups-tracking branch 3 times, most recently from 14b9fcc to 45d37fe Compare November 7, 2021 11:02
@P403n1x87 P403n1x87 requested review from a user and bee-san November 7, 2021 11:03
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Nov 7, 2021

Codecov Report

Merging #228 (62b3df8) into main (a5a4a3b) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main     #228   +/-   ##
=======================================
  Coverage   92.60%   92.60%           
=======================================
  Files          15       15           
  Lines        1217     1217           
=======================================
  Hits         1127     1127           
  Misses         90       90           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a5a4a3b...62b3df8. Read the comment docs.

Co-authored-by: piatrashkakanstantinass <74979584+piatrashkakanstantinass@users.noreply.github.com>
@P403n1x87 P403n1x87 requested a review from a user November 7, 2021 11:27
@bee-san bee-san merged commit 559a89c into bee-san:main Nov 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants