@schlunma schlunma commented May 9, 2025

Description

This PR allows reading facets from filenames and makes the extraction of facets from paths much more flexible. Examples:

path = "/climate_data/value1/xyz/filename.nc"
template = "{facet1}/xyz/{facet2}.nc"
path2facets(path, template) = {"facet1": "value1", "facet2": "filename"}

path = "/climate_data/Tier3/ds/tas_ds_Amon_1993.nc"
template = "Tier{tier}/{dataset}/{short_name}_*.nc"
path2facets(path, template) = {"tier": "3", "dataset": "ds", "short_name": "tas"}

path = "/climate_data/value-1-value-2/value-2/filename.nc"
template = "{facet1}-{facet2}/{facet2}/{facet3}.nc"
path2facets(path, template) = {"facet1": "value-1", "facet2": "value-2", "facet3": "filename"}

path = "/climate_data/1/2345/678/910/11/filename.nc"
template = "{f1}/{f2}{f3}{f4}{f5}/{f6}{f7}{f8}/{f9}{f10}/{f11}/{filename}.nc"
path2facets(path, template) = {"f1": "1", "f11": "11", "filename": "filename"}

# To be backwards-compatible
path = "/climate_data/value1/xyz/filename.nc"
template = "{facet1.upper}/xyz/{facet2.lower}.nc"
path2facets(path, template) = {"facet1": "value1", "facet2": "filename"}

Internally, the regex engine does the heavy lifting: each template is converted to a regex pattern with one named capture group per facet, e.g.

template = "{f2}/{f3}[._]{f4}*"
regex_pattern = r"(?P<f2>[^_/]*?)/(?P<f3>[^_/]*?)[\._](?P<f4>[^_/]*?).*?"
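
The conversion described above could be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the names `template_to_regex` and `path2facets_sketch` are hypothetical, and the sketch assumes that facet values contain neither `_` nor `/`, that a repeated facet becomes a named backreference, that `.upper`/`.lower` suffixes are simply ignored, and that the template is matched against the end of the path starting at a directory boundary. It does not reproduce the handling of ambiguous adjacent facets such as `{f2}{f3}{f4}{f5}` shown in the examples, and it emits `[._]` rather than the equivalent `[\._]`.

```python
import re

# Assumption: a facet value may contain neither "_" nor "/" (as in the pattern above).
FACET_VALUE = r"[^_/]*?"

# Template tokens: {facet} or {facet.upper}/{facet.lower}, "*" wildcards, [..] sets.
TOKEN = re.compile(r"\{([A-Za-z_]\w*)(?:\.(?:upper|lower))?\}|(\*)|(\[[^\]]+\])")


def template_to_regex(template: str) -> str:
    """Convert a path template to a regex pattern with named capture groups."""
    parts = []
    seen = set()
    pos = 0
    for match in TOKEN.finditer(template):
        # Literal text between tokens is matched verbatim.
        parts.append(re.escape(template[pos:match.start()]))
        facet, wildcard, charset = match.groups()
        if facet is not None:
            if facet in seen:
                # A repeated facet has to match the same value again.
                parts.append(f"(?P={facet})")
            else:
                seen.add(facet)
                parts.append(f"(?P<{facet}>{FACET_VALUE})")
        elif wildcard is not None:
            parts.append(".*?")
        else:
            # Character sets such as [._] are passed through unchanged.
            parts.append(charset)
        pos = match.end()
    parts.append(re.escape(template[pos:]))
    return "".join(parts)


def path2facets_sketch(path: str, template: str) -> dict[str, str]:
    """Extract facets from the tail of ``path``; return {} if nothing matches."""
    pattern = re.compile(r"(?:^|/)" + template_to_regex(template) + "$")
    match = pattern.search(path)
    return match.groupdict() if match else {}


print(template_to_regex("{f2}/{f3}[._]{f4}*"))
print(
    path2facets_sketch(
        "/climate_data/Tier3/ds/tas_ds_Amon_1993.nc",
        "Tier{tier}/{dataset}/{short_name}_*.nc",
    )
)
```

Because the groups are non-greedy, the regex engine's backtracking resolves cases like `{facet1}-{facet2}/{facet2}/...`: the first split that also satisfies the backreference wins, which gives `facet1 = "value-1"` and `facet2 = "value-2"` for the third example above.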

Closes #1943.



@schlunma schlunma added this to the v2.13.0 milestone May 9, 2025
@schlunma schlunma requested a review from bouweandela May 9, 2025 09:59
@schlunma schlunma added the enhancement New feature or request label May 9, 2025
codecov bot commented May 9, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.15%. Comparing base (170a938) to head (b8e2284).
⚠️ Report is 56 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2725   +/-   ##
=======================================
  Coverage   95.14%   95.15%           
=======================================
  Files         259      259           
  Lines       15113    15138   +25     
=======================================
+ Hits        14379    14404   +25     
  Misses        734      734           

@bouweandela bouweandela left a comment

Thanks!

@valeriupredoi valeriupredoi merged commit a116cc6 into main May 16, 2025
7 checks passed
@valeriupredoi valeriupredoi deleted the facets_from_filenames branch May 16, 2025 10:41