Support for generating a set of tracking-ids from a slicing operation into an aggregation. #451

@bnlawrence

Consider the following use case:

  • A cf aggregation points to 365 daily files, each of which holds 24 hours of a variable on a high-resolution 3D grid.
  • A user does a cf.within (or any other sort of valid slice) into that aggregation to extract the mean value of a particular variable over a week.
  • The calculation will touch seven files. We can think of those seven files as the data necessary to reproduce the calculation, so these are the digital artifacts we want to save for reproduction, identify in a workflow, and cite in a paper.

(This is obviously a trivial case. It gets more interesting if, say, the calculations are carried out across ensembles from multiple institutions.)

The feature request is that:

  1. the aggregation metadata includes the tracking-ids (if they are present) in such a way that the same cf.within (or other slice) can return a set of tracking-ids which can be added to a list of "provenance sources", so that a series of cf calculations could generate a list of all the files needed for reproduction and/or citation.
  2. cf-python supports this in its slicing operations, so that a slice into an aggregation actually returns those tracking-ids.
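
To make the request concrete, here is a minimal sketch in plain Python of the behaviour I have in mind. It does not use cf-python at all: `Fragment`, `tracking_ids_for_slice`, the file names and the tracking-id strings are all made up for illustration, and how this would actually hang off an aggregation is exactly what is up for discussion.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Fragment:
    """One file in the aggregation (hypothetical stand-in, not a cf-python class)."""
    filename: str
    start: datetime              # first instant covered by the file (inclusive)
    end: datetime                # end of the file's coverage (exclusive)
    tracking_id: Optional[str]   # None if the file carries no tracking-id


def tracking_ids_for_slice(fragments, t0, t1):
    """Return the tracking-ids of every fragment overlapping [t0, t1)."""
    return {
        f.tracking_id
        for f in fragments
        if f.tracking_id is not None and f.start < t1 and f.end > t0
    }


# 365 daily files for a non-leap year, with made-up tracking-ids
year_start = datetime(2001, 1, 1)
fragments = [
    Fragment(
        filename=f"var_day{d + 1:03d}.nc",
        start=year_start + timedelta(days=d),
        end=year_start + timedelta(days=d + 1),
        tracking_id=f"hdl:example/{d + 1:03d}",
    )
    for d in range(365)
]

# A one-week slice touches seven daily files
week = tracking_ids_for_slice(
    fragments, datetime(2001, 3, 1), datetime(2001, 3, 8)
)

# A series of calculations accumulates its "provenance sources" as a set,
# so files touched by more than one calculation are only listed once
provenance_sources = set()
provenance_sources |= week
provenance_sources |= tracking_ids_for_slice(
    fragments, datetime(2001, 3, 5), datetime(2001, 3, 12)
)
```

Using a set means overlapping slices do not produce duplicate entries, and fragments without a tracking-id are simply skipped, which matches the "if they are present" caveat above.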

Labels: aggregation (relating to metadata-based field and domain aggregation), enhancement (new feature or request)
