[INFRA] Convert entity table to yaml by tsalo · Pull Request #475 · bids-standard/bids-specification

tsalo · 2020-05-17T21:52:02Z

Closes #466, closes #343, and closes #290. Can be used, once an entity-table rendering script is written, to deal with #290.

The yaml/json structure will need to have sufficient information for tools like bids-validator, heudiconv, and pybids to be able to use it. I've loosely based the files on the bids-validator rules files.

Changes proposed:

Entities and datatypes have been described with yaml files, stored in the new schema/ folder.
- This structure is more a proof of concept than a final version of the schema.
Datatypes are broadly split into standard datatypes and "auxiliary" datatypes (i.e., datatypes that are associated with multiple modalities, and which don't contain imaging data directly).
@yarikoptic wrote code to generate the Entity Table from the schema files, which I have modified to some extent.
The Entity Table differs in some respects:
- Run is optional for anat data. See Update entity table #343.
- Rows are in alphabetical order, based on the Entity string.
- Under DWI, bvec and bval have been dropped and sbref has been added, since bvec and bval are extensions, not suffixes. The addition of sbref is related to Update entity table #343.
- Under beh, the suffix beh has been added. Related to Update entity table #343.
- Suffixes may be listed in slightly different orders, because in the new version they're listed based on subgroups within datatypes.
- Single-suffix datatypes include the suffix in parentheses.
- Auxiliary datatypes have the suffix followed by associated datatypes in parentheses (like the current table), but the associated datatypes are space-separated instead of slash-separated.
- Headshape and markers are labeled like standard suffixes, rather than auxiliary datatypes.
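
A minimal sketch of how a table with these properties could be rendered from already-loaded schema content. The dictionary layout (`ENTITIES`, `DATATYPES`, a `required`/`optional` value per entity) and the function name `render_entity_table` are assumptions made for illustration, not the actual rendering code mentioned above; only the alphabetical row order and the suffixes-in-parentheses headers are taken from the description.

```python
# Hypothetical, already-parsed schema content (in practice this would come
# from the yaml files under schema/; the exact keys are an assumption).
ENTITIES = {
    "sub": {"name": "Subject"},
    "run": {"name": "Run"},
    "acq": {"name": "Acquisition"},
}
DATATYPES = {
    "anat": [
        {
            "suffixes": ["T1w", "T2w"],
            "entities": {"sub": "required", "acq": "optional", "run": "optional"},
        }
    ],
    "beh": [{"suffixes": ["beh"], "entities": {"sub": "required"}}],
}


def render_entity_table(entities, datatypes):
    """Render a Markdown table with one row per entity and one column per suffix group."""
    columns = []
    for datatype, groups in datatypes.items():
        for group in groups:
            # Column header: datatype followed by its space-separated suffixes.
            header = f"{datatype} ({' '.join(group['suffixes'])})"
            columns.append((header, group["entities"]))

    lines = [
        "| Entity | " + " | ".join(header for header, _ in columns) + " |",
        "|" + "---|" * (len(columns) + 1),
    ]
    # Rows are sorted alphabetically by the entity string (e.g. "acq", "run", "sub").
    for key in sorted(entities):
        cells = [requirements.get(key, "").upper() for _, requirements in columns]
        lines.append(f"| {entities[key]['name']} (`{key}`) | " + " | ".join(cells) + " |")
    return "\n".join(lines)


table = render_entity_table(ENTITIES, DATATYPES)
print(table)
```

An empty cell means the entity is not allowed for that suffix group; the real table may well use a different layout.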

edit @sappelhoff 2020-06-07: add to do list:

to do

- Make sure Update entity table #343 is closed by this PR.
- Think about whether Split up the entity table into smaller sections #290 can be addressed with this PR as well.
- Will this PR make Entity Table (Appendix) improvements: Custom CSS #289 irrelevant? If yes, close it.
- Update CONTRIBUTING.md with information about the schema.
- Updates pending [ENH] Support run and acq entities in behavior-only data #556.

yarikoptic

You haven't decided on an inclusion mechanism yet, have you?
Left one more comment.
I think concentrating on entity table first would be best (just my reaction when I saw also descriptions of top level files/directories which I think could just go into a single yaml).

src/schema/top_level_files.yml

yarikoptic

Left some comments throughout

src/schema/datatypes/anat.yml

src/schema/associated_data.yml

src/schema/datatypes/anat.yml

yarikoptic · 2020-05-20T14:27:28Z

re "inclusion" -- I guess the terminology is not clear. The recent commit adds a required field, so it is about stating the requirement to include any given file in a BIDS dataset. I meant "inclusion" of one .yaml file into another. I.e., we need to establish a hierarchy with a single YAML file as an entry point, so that loading that one would load all of them. I would just use https://github.com/tanbro/pyyaml-include , which is in line with the other ad-hoc solutions proposed on https://stackoverflow.com/questions/528281/how-can-i-include-a-yaml-file-inside-another .

I could take care of it whenever I get to work on rendering the entity table.

src/schema/datatypes/anat.yaml

tsalo · 2020-05-22T00:08:39Z

> re "inclusion" -- I guess the terminology is not clear. The recent commit adds a required field, so it is about stating the requirement to include any given file in a BIDS dataset. I meant "inclusion" of one .yaml file into another. I.e., we need to establish a hierarchy with a single YAML file as an entry point, so that loading that one would load all of them. I would just use https://github.com/tanbro/pyyaml-include , which is in line with the other ad-hoc solutions proposed on https://stackoverflow.com/questions/528281/how-can-i-include-a-yaml-file-inside-another .
>
> I could take care of it whenever I get to work on rendering the entity table.

That would be great, thank you. I'm just not that familiar with yaml.

src/schema/datatypes/events.yaml

=== Do not change lines below === { "chain": [], "cmd": "git-sedi suffices suffixes", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "src/schema" } ^^^ Do not change lines above ^^^

yarikoptic · 2020-05-22T02:06:19Z

src/schema/datatypes/meg.yaml

+    - json
+    - ctf/
+    - fif
+    - 4d/


<rant> wow, I didn't know that (sub)folders are allowed here. I truly think this was a step backward for meeg proposal to just allow all those formats -- there is no 'standardization' now, tools will just support some but not the other formats etc. </rant>

yarikoptic · 2020-05-22T02:26:36Z

src/schema/datatypes/photo.yaml

+    - eeg
+    - ieeg
+  suffixes:
+    - photo


Dear @tsalo -- Thank you for leading this effort! It is really fun to look at BIDS in such a structured way!

So within this datatypes/ we indeed have datatypes which gained their own folder/ within the BIDS hierarchy (they also have themselves duplicated now in datatypes), such as anat/, func/, fmap/, etc. (+ beh/, which isn't imaging), and then we have another group -- those which could indeed have been a modality of their own, but which are not "imaging data". They became auxiliary files which can go along with the base imaging (and beh/) datatypes. I think that

- all the photo, channels, etc. should get their own "folder" in this schema, e.g. auxiliary or auxdatatypes, and be moved there (instead of being under datatypes);
- datatypes/ entries can get rid of the datatypes field in their .yaml files -- they would then have a 1-to-1 correspondence from the filename to the corresponding datatype.

On the top level (for inclusion) I see something like

    datatypes:
      anat: !include datatypes/anat.yaml
      func: !include datatypes/func.yaml
      ...
    auxdatatypes:
      channels: !include auxdatatypes/channels.yaml
      event: !include auxdatatypes/events.yaml

This would make it easier to tell one (which gets its own folder in BIDS) from another (just provides additional suffixed files).

An alternative would be to keep things as is and add an additional field (in a yet-to-be-created higher-level "inclusion" file) to signal which datatypes are auxiliary, but then it would be a bit less structured IMHO, and most likely we would need to maintain that redundant self-mention within datatypes for the base datatypes.
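
For illustration, a minimal sketch of how a single entry-point file with `!include` tags could be loaded. It assumes PyYAML is installed and uses a hand-rolled constructor rather than the pyyaml-include package mentioned earlier in this thread; the entry-point name `schema.yaml`, the helper names, and the file contents are hypothetical.

```python
import tempfile
from pathlib import Path

import yaml  # PyYAML (third-party), assumed to be installed


def make_include_loader(base_dir):
    """Return a SafeLoader subclass that resolves `!include <relative path>`."""
    base_dir = Path(base_dir)

    class IncludeLoader(yaml.SafeLoader):
        pass

    def include(loader, node):
        relative_path = loader.construct_scalar(node)
        with open(base_dir / relative_path) as handle:
            # Reuse the same loader class so included files may include others
            # (paths stay relative to the entry point's directory).
            return yaml.load(handle, Loader=IncludeLoader)

    # Registered on the subclass only, so yaml.SafeLoader itself is untouched.
    IncludeLoader.add_constructor("!include", include)
    return IncludeLoader


def load_schema(entry_point):
    """Load the entry-point file and everything it includes into one dict."""
    entry_point = Path(entry_point)
    with open(entry_point) as handle:
        return yaml.load(handle, Loader=make_include_loader(entry_point.parent))


# Build a tiny throwaway schema tree to demonstrate the idea.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "datatypes").mkdir()
    (root / "auxdatatypes").mkdir()
    (root / "datatypes" / "anat.yaml").write_text(
        "- suffixes: [T1w]\n  entities: {sub: required}\n"
    )
    (root / "auxdatatypes" / "events.yaml").write_text("- suffixes: [events]\n")
    (root / "schema.yaml").write_text(
        "datatypes:\n"
        "  anat: !include datatypes/anat.yaml\n"
        "auxdatatypes:\n"
        "  events: !include auxdatatypes/events.yaml\n"
    )
    schema = load_schema(root / "schema.yaml")

print(sorted(schema))
```

With this layout, whether something gets its own folder in BIDS can be read directly from which top-level key it sits under.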

I split auxdatatypes from datatypes and dropped the datatypes key from the datatype yamls. I have to say, it looks much cleaner.

src/schema/datatypes/anat.yaml

vsoch · 2020-06-02T20:27:27Z

@tsalo is there something I can help with here?

tsalo · 2020-06-03T14:42:00Z

@vsoch Absolutely! At this point, I think any input would be helpful, to ensure that the yaml/json files properly reflect the specification. Once the yaml/json files are more finalized, though, we will need scripts to build the tables for the specification, and bids-validator and pybids will also need interfaces. I know that @yarikoptic can help with that, but I won't be much use at that point, so your help would be amazing.

@yarikoptic I didn't realize I hadn't touched this in two weeks. Sorry about that! I will try to start responding more over the next couple of days.

vsoch · 2020-06-03T20:57:00Z

@tsalo I helped with the original development of BIDS long ago and far away, but haven't really looked at or used it in years, so I probably am not one to give feedback on that. Once that is done and you need to build tables, however, that might be something I can help with, as long as it's possible to scope out what exactly is needed (e.g., an example output table to produce).

tsalo · 2020-08-07T20:12:12Z

I think that all of the issues have been resolved, so I think a review would be helpful.

effigies · 2020-08-07T21:20:33Z

First thing on Monday, if I don't end up working this weekend. :-)

effigies

Overall looks good, a couple small issues. And obviously need to merge in or rebase onto master and rerun.

CONTRIBUTING.md

src/schema/datatypes/func.yaml

src/schema/datatypes/beh.yaml

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

tsalo · 2020-08-10T19:04:02Z

Thanks @effigies! I've merged your changes, updated from master, and added in the remaining associated changes from #556.

effigies

LGTM. Thanks for this huge effort!

All suggestions addressed

yarikoptic · 2020-08-10T20:06:33Z

src/schema/entities.yaml

+acq:
+  name: Acquisition
+  description: |
+    The OPTIONAL acq-<label> key/value pair corresponds to a custom label the


If `TXT` is allowed in yaml, we had better turn those acq-<label> into `acq-<label>` right away (as well as sample filenames, which could contain characters that markdown would interpret as formatting) to simplify any possible future rendering... Since we do not render them currently, though, I think it would be OK to proceed as is (unless some other really needed changes are suggested) and just defer it to a separate PR.
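
A rough sketch of what such a follow-up could look like, assuming the descriptions are available as plain strings. The helper name `backtick_placeholders` is hypothetical, and the pattern only covers standalone `<label>`/`<index>` placeholders; full sample filenames would need separate handling.

```python
import re

# Matches entity placeholders such as acq-<label> or run-<index> that are not
# already enclosed in backticks.
_PLACEHOLDER = re.compile(r"(?<!`)\b([a-z]+-<(?:label|index)>)(?!`)")


def backtick_placeholders(text):
    """Wrap bare entity placeholders in backticks so Markdown keeps them literal."""
    return _PLACEHOLDER.sub(r"`\1`", text)


raw = "The OPTIONAL acq-<label> key/value pair corresponds to a custom label"
print(backtick_placeholders(raw))
```

Placeholders that are already wrapped in backticks are left unchanged, so the function can be applied repeatedly.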

src/schema/entities.yaml

yarikoptic · 2020-08-10T20:15:59Z

src/schema/entities.yaml

+ce:
+  name: Contrast Enhancing Agent
+  description: |
+    Similarly the OPTIONAL ce-<label> key/value can be used to distinguish


- Optional or not is a property per specific modality etc.
- "Similarly ..." is the paragraph opening.
- ce-<label> duplicates the record itself.
- "can be used to distinguish" is also somewhat there just "to make it a sentence".

Here and in others I would have just kept it to the point and started from the 2nd sentence: "The label is the name of the contrast agent.".

Do you mind if we push updating entity definitions until a PR for #567? I'm planning to make a version of #568 that is compatible with this PR's changes.

I would not mind at all! Just wanted to leave a note ;-)

yarikoptic · 2020-08-10T20:16:32Z

src/schema/entities.yaml

+mod:
+  name: Corresponding Modality
+  description: |
+    In such cases the OPTIONAL `mod-<label>` key/value pair corresponds to


in what "Such cases"?

yarikoptic · 2020-08-10T20:18:38Z

Thank you @tsalo ! It looks great!
While approving I have clicked too fast though -- I want to note that ideally descriptions in entities.yaml should be adjusted. But that could be done even after this PR since they aren't used ATM anywhere yet.

tsalo · 2020-08-10T23:39:43Z

Now that there are two approvals, should we wait five days before merging (pending any requested changes, of course)?

effigies · 2020-08-10T23:52:42Z

We don't really count cleanup changes as restarting the clock, usually. I would say this can be merged at @sappelhoff's convenience.

yarikoptic · 2020-08-11T03:05:19Z

One quick click for @sappelhoff , one giant leap for BIDS!

sappelhoff

Thanks a lot for this thorough work that will pay off in the future!

Also thanks to the reviewers and contributors :-)

I found that LICENSE is missing next to README and CHANGES in the top_level_files.yaml, so I am just going to add it and then merge. I agree that everything else can be done in follow up PRs.

sappelhoff · 2020-08-11T07:51:36Z

src/schema/entities.yaml

+  name: Split
+  description: |
+    In the case of long data recordings that exceed a file size of 2Gb, the
+    .fif files are conventionally split into multiple parts.


for the future, the split entity could be used by any file format IMHO

src/schema/top_level_files.yaml

tsalo added 4 commits May 16, 2020 12:42

Draft entity and datatype files.

18a3f24

Draft more specifications.

02c3403

Add specifications for top-level files and directories.

6ea0dfd

Reorganize yaml files.

6626368

tsalo mentioned this pull request May 17, 2020

Convert entity table to machine-readable format #466

Closed

yarikoptic reviewed May 17, 2020

src/schema/top_level_files.yml (outdated, resolved)

Add inclusion for top level files.

5c26565

yarikoptic mentioned this pull request May 20, 2020

[ENH] Add part entity for complex-valued data #424

Merged

yarikoptic reviewed May 20, 2020

src/schema/top_level_files.yml (outdated, resolved)

yarikoptic previously requested changes May 20, 2020

src/schema/datatypes/anat.yml (outdated, resolved)

src/schema/associated_data.yml (outdated, resolved)

src/schema/datatypes/anat.yml (outdated, resolved)

tsalo added 2 commits May 21, 2020 20:04

Rename yaml files and add required field.

d6601e7

Convert dict to list in example file.

ee9535f

tsalo commented May 22, 2020

src/schema/datatypes/anat.yaml (outdated, resolved)

yarikoptic reviewed May 22, 2020

src/schema/datatypes/events.yaml (outdated, resolved)

[DATALAD RUNCMD] Replace suffices with suffixes

56a42b6


yarikoptic reviewed May 22, 2020

src/schema/datatypes/anat.yaml (outdated, resolved)

tsalo added 4 commits June 3, 2020 10:58

Add leading "." to extensions to match specification.

596ed00

Convert dict to list.

45fdfca

Fix iEEG extension based on issue.

9ec4e09

Split auxdatatypes and drop datatypes key.

d03d895

This was referenced Jun 3, 2020

Use bids-specification files for filename patterns bids-standard/pybids#626

Open

Use bids-specification files for filename patterns bids-standard/legacy-validator#974

Closed

Add descriptions to entities.

8d3436c

This was referenced Aug 10, 2020

[INFRA] Move entity definitions to a separate page #568

Merged

Move entity definitions to separate page(s) #567

Closed

yarikoptic mentioned this pull request Aug 10, 2020

Which BIDSVersion? #545

Closed

effigies reviewed Aug 10, 2020

CONTRIBUTING.md (outdated, resolved)

src/schema/datatypes/func.yaml (outdated, resolved)

src/schema/datatypes/beh.yaml (resolved)

tsalo and others added 3 commits August 10, 2020 14:54

Apply suggestions from code review

125c492

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

Implement changes from bids-standard#556.

5c426b0

Merge branch 'master' into ref/json-entity

86650bf

effigies requested review from nicholst and yarikoptic August 10, 2020 19:11

effigies approved these changes Aug 10, 2020


yarikoptic mentioned this pull request Aug 10, 2020

"Unknown" modalities are mentioned in the text #569

Closed

yarikoptic approved these changes Aug 10, 2020


sappelhoff approved these changes Aug 11, 2020


add LICENSE to top_level_files.yaml

177b15a

sappelhoff merged commit a3f92ce into bids-standard:master Aug 11, 2020

bids-maintenance added a commit that referenced this pull request Aug 11, 2020

[DOC] Auto-generate changelog entry for PR #475

32dd59d

tsalo deleted the ref/json-entity branch August 11, 2020 14:02

rwblair mentioned this pull request Aug 11, 2020

Reorganize file_level_rules.json bids-standard/legacy-validator#1034

Closed

tsalo added the schema and schema-structure labels Apr 11, 2022

tsalo removed the help wanted label Jun 15, 2023
