Skip to content

[SCHEMA] Add metadata term files#774

Merged
tsalo merged 76 commits intomasterfrom
metadata-schema
Jul 13, 2021
Merged

[SCHEMA] Add metadata term files#774
tsalo merged 76 commits intomasterfrom
metadata-schema

Conversation

@tsalo
Copy link
Copy Markdown
Member

@tsalo tsalo commented Apr 13, 2021

References #699. Continues from #762.

Changes proposed:

  • Take the metadata JSON files from the bids-validator and convert them to YAML format.
  • Add "description" and "key_name" fields to the metadata YAML files.
  • Add functions to render the metadata information in tables within the schema. These functions require the specification text to include the requirement levels.
  • Create files for a number of metadata terms that aren't present in the validator, but are present in specification tables.

To do:

Reviewing

I think the best approach to reviewing this PR is to (1) review the proposed metadata schema structure defined in the schema README and a handful or randomly chosen example terms, and (2) compare the rendered metadata tables against the hardcoded ones in the generated specification. These tasks can be split across multiple reviewers.

  • 02-common-principles
  • 03-modality-agnostic-files
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Common Metadata Fields
    • Scanner Hardware
    • Sequence Specifics
    • remaining sections
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Anatomy imaging data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Task (including resting state) imaging data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Diffusion imaging data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Arterial Spin Labeling perfusion data
  • 04-modality-specific-files/01-magnetic-resonance-imaging-data Fieldmap data
  • 04-modality-specific-files/02-magnetoencephalography
  • 04-modality-specific-files/03-electroencephalography
  • 04-modality-specific-files/04-intracranial-electroencephalography
  • 04-modality-specific-files/08-genetic-descriptor
  • 09-positron-emission-tomography
  • 05-derivatives/02-common-data-types
  • 05-derivatives/03-imaging
  • 99-appendices/11-qmri
  • schema/README.md : The README describing the schema structure.
  • The concept of path "formats". I tried to break down the types of paths that are possible and give each one its own name, but that was before [ENH] Major update to pointing to files within, outside of, or remote to a current BIDS dataset #820. If possible, could someone who is familiar with [ENH] Major update to pointing to files within, outside of, or remote to a current BIDS dataset #820 take a look at these? They're discussed in the README file.

* Draft a handful of metadata term files.

* Add example with specific possible values.

* Match the validator schemas better.

* Fix formatting.

* Fix formatting again!

* Draft semi-functional rendering functions.

* Use unit abbreviations.

* Add more fields.

* Get macro working.

* Add terms from first table.

* Add AnatomicalLandmarkCoordinateSystem.

* fMRI task information table.

* More terms.

* More terms.

* More terms.

* EchoTime and FlipAngle

* Add tables.

* More tables.

* More terms.

* More terms.

* More terms.

* Fix spacing.

* Clean things up.

* More terms.

* More terms!

* More terms.

* Some iEEG terms.

* More iEEG terms.

* Add ASL labeling terms.

* Next batch.

* More terms.

* More terms.

* More terms.

* Fix mistakes.

* Last terms.

* Reference yamls.

* Fix mistakes.

* Change format of coordinate system files.

* Use degree in associated files.

* Some of the missing terms.

* A few more terms.

* Fix typos.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* Last terms.

* Fix link.

* Fix internal links.

* Fix links for real.

* Derivative terms.

* Fix up code link.

* Use backslashes for continued strings.

* Replace $ref with file contents.

Also support plural datatype strings and all manner of newlines in descriptions.

* Fix genetics.

* Describe the structure of metadata YAML files.

* Make metadatatype function recursive.

* Improve search function.

* Start adding PET fields.

* Add some fields.

* More terms.

* More terms.

* More terms.

* Fix mistakes.

* More terms.

* Replace InstitutionDepartmentName with existing InstitutionalDepartmentName.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* More terms.

* Last terms.

* Add unit format for strings.

Unused for now but could be useful later.

* Add dataset_relative and participant_relative string formats.

* Update READMEs.

* Fix formats in README.

* Support table-specific metadata description extensions.

* Employ description extensions with IntendedFor.

* Remove explicit defaults from YAML files.

* Replace Minimum with minimum.

* Replace inclusiveMaximum with maximum.

* Replace implicit links with explicit ones.

* Rename key_name to name.

* Rename "Unit" to "Units"
@tsalo
Copy link
Copy Markdown
Member Author

tsalo commented Apr 13, 2021

Now that these changes are on a branch of the main repository, it should be easier for other folks to contribute directly. The main thing that I could use help with is generalizing the base metadata definitions when necessary and appending section-specific information to those base definitions in the macro calls.

@tsalo tsalo added the schema Issues related to the YAML schema representation of the specification. Patch version release. label Apr 13, 2021
@tsalo
Copy link
Copy Markdown
Member Author

tsalo commented Apr 26, 2021

Okay, I think this is finally ready! @bids-standard/maintainers if any of you have the time, I'd appreciate your review.

Copy link
Copy Markdown
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started by reviewing the code. Have not yet looked through the schema files or checked all of the rendered tables.

One thing I worry about, after moving to this, is that minor bugs in the code may result in visible changes to a table without being obviously part of the review process. Do we have a strategy to note when a change to the schema code produces a change in the generated markdown?

Thanks @effigies!

Co-authored-by: Chris Markiewicz <markiewicz@stanford.edu>
@effigies
Copy link
Copy Markdown
Collaborator

Quick note: I ran across yaml.safe_load which has been around forever as a convenience function equivalent to partial(yaml.load, loader=yaml.SafeLoader).

Copy link
Copy Markdown
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review for Task (including resting state) imaging data

{{ MACROS___make_metadata_table(
{
"RepetitionTime": "REQUIRED",
"VolumeTiming": "REQUIRED",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it make sense to say:

Suggested change
"VolumeTiming": "REQUIRED",
"VolumeTiming": "REQUIRED, unless `RepetitionTime` and `DelayTime` are defined.",

And make similar adjustments for RepetitionTime and DelayTime requirement levels?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I notice that there are more fields that could benefit from such an overhaul (AcquisitionDuration)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we push this one to a separate issue/PR? I think it ties into that weird dependency table for timing-related field and will require more thought than I can put in at the moment.

image

Copy link
Copy Markdown
Member

@sappelhoff sappelhoff Jul 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, you are right. Let's not open that can of worms (especially not me; EDIT: Given my lack of experience with MRI).

Copy link
Copy Markdown
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finished my run through the remaining MRI sections

Copy link
Copy Markdown
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I now finished going through all changes and provided that my remaining comments are addressed, I think we can merge this 👍

This is a great piece of work @tsalo, thanks a lot.

tsalo and others added 2 commits July 5, 2021 14:35
Co-authored-by: Stefan Appelhoff <stefan.appelhoff@mailbox.org>
Co-authored-by: Stefan Appelhoff <stefan.appelhoff@mailbox.org>
@tsalo
Copy link
Copy Markdown
Member Author

tsalo commented Jul 7, 2021

@Remi-Gau @effigies In yesterday's maintainers call, we decided that it would be unrealistic to request a full review from anyone else, so instead of reviewing all of the content we were wondering if you'd be willing to do more of a conceptual review or a code review. We should have plenty of time before the next release for folks to identify any issues with the content, so any small content issues can be resolved after this is merged. WDYT?

@Tokazama
Copy link
Copy Markdown
Member

Tokazama commented Jul 7, 2021

Is there any reason that all the metadata files are separate instead of consolidated into one file (or even just a few)?

@tsalo
Copy link
Copy Markdown
Member Author

tsalo commented Jul 8, 2021

The biggest reason is that I find individual files easier to work with than one big file. While the idea of a few files is appealing, I don't think it would work well for metadata. The obvious delineation is based on data type, but unfortunately many metadata terms can be used for multiple data types.

@Remi-Gau
Copy link
Copy Markdown
Collaborator

Remi-Gau commented Jul 9, 2021

@Remi-Gau @effigies In yesterday's maintainers call, we decided that it would be unrealistic to request a full review from anyone else, so instead of reviewing all of the content we were wondering if you'd be willing to do more of a conceptual review or a code review. We should have plenty of time before the next release for folks to identify any issues with the content, so any small content issues can be resolved after this is merged. WDYT?

Good with me though I will only be back "on duty" later this month (26th) so don't let my silence hold you back on the merge anyway (plus most of my comments will be on formatting and missing comas most likely).

Copy link
Copy Markdown
Member

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsalo could you please update this PR with respect to

Then I'd be fine with removing the hard-coded tables and merging, as Chris and Remi are currently unavailable and we didn't plan on having a "deep" review anyhow (that's already happened).

@tsalo
Copy link
Copy Markdown
Member Author

tsalo commented Jul 13, 2021

Awesome! I incorporated the changes for the two PRs, and looked at the rendered docs. The docs look good, so I'm going to remove the hardcoded tables now.

@tsalo
Copy link
Copy Markdown
Member Author

tsalo commented Jul 13, 2021

Okay the tables have been removed and the CI is passing.... should I merge? Is it time?!

Copy link
Copy Markdown
Collaborator

@effigies effigies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a code review. Only one suggestion, really. I haven't tested, so you may want to test before merging. Or reject if it seems less clear.

Co-authored-by: Chris Markiewicz <markiewicz@stanford.edu>
@tsalo tsalo mentioned this pull request Jul 13, 2021
3 tasks
@tsalo
Copy link
Copy Markdown
Member Author

tsalo commented Jul 13, 2021

The rendered docs look good with the new changes, so I'm merging!

@effigies
Copy link
Copy Markdown
Collaborator

Seems to work. If you're happy to merge, fire away!

@tsalo tsalo merged commit 5de7cfc into master Jul 13, 2021
@tsalo tsalo deleted the metadata-schema branch July 13, 2021 16:47
@sappelhoff
Copy link
Copy Markdown
Member

😎 nice work! Very happy to see this in

@tsalo tsalo added the schema-code Updates or changes to the code used to parse, filter, and render the schema. label Apr 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

schema Issues related to the YAML schema representation of the specification. Patch version release. schema-code Updates or changes to the code used to parse, filter, and render the schema.

Projects

Development

Successfully merging this pull request may close these issues.

7 participants