the subject / sample question #9

@lzehl


This task force is coordinated by: @chrisvdt and @lzehl
Instructions for contributing: The first comment provides a description of the issue and is meant to trigger discussions and collect possible solutions or concrete ideas/aspects that are important to consider. Its content can be adapted over time. For change or extension requests, please get in touch with @lzehl. The ongoing discussions around this issue can be held through comments (as usual).

GENERAL TOPIC:
While BIDS at the moment focuses on living (human) beings as a whole (in BIDS defined as subject or participant), neuroscience in general can be conducted on any living or dead (human or non-human) being as a whole ("subject") and on any living or dead tissue sample extracted from that being ("tissue sample"). A being itself can range from a human to an animal to a single-celled organism. What would a BIDS extension look like that covers all these subjects / tissue samples?

Other efforts to coordinate with considering this topic:

THE FOLDER HIERARCHY ISSUE:
The classical BIDS model foresees a hierarchical folder structure with an inheritance principle for the metadata associated with each hierarchical level. For the raw data on living (or diseased) human beings (subjects) that means:

rawdata/
....sub-(label)/
........ses-(label)/ (can be omitted if there is only one session)
............(data-type)/ (e.g., func, anat, dwi)
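
The hierarchy and the inheritance principle described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `bids_dir`, `inheritance_chain` and `merge_metadata` are hypothetical, metadata is modelled as one plain dictionary per directory rather than as sidecar files matched by file name, and deeper levels simply override shallower ones.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Optional


def bids_dir(sub: str, datatype: str, ses: Optional[str] = None,
             root: str = "rawdata") -> PurePosixPath:
    """Build rawdata/sub-<label>/[ses-<label>/]<data-type> (hypothetical helper)."""
    parts = [root, f"sub-{sub}"]
    if ses is not None:  # the session level is optional
        parts.append(f"ses-{ses}")
    parts.append(datatype)
    return PurePosixPath(*parts)


def inheritance_chain(leaf: PurePosixPath) -> List[PurePosixPath]:
    """Directories from the dataset root down to `leaf` (relative paths only)."""
    ancestors = list(leaf.parents)[::-1]  # '.', root, ..., parent of leaf
    return ancestors[1:] + [leaf]         # drop the leading '.'


def merge_metadata(leaf: PurePosixPath, sidecars: Dict[str, dict]) -> dict:
    """Merge per-directory metadata; deeper levels override shallower ones."""
    merged: dict = {}
    for directory in inheritance_chain(leaf):
        merged.update(sidecars.get(str(directory), {}))
    return merged


# Example values are made up for illustration.
leaf = bids_dir("01", "func", ses="pre")
sidecars = {
    "rawdata": {"RepetitionTime": 2.0, "TaskName": "rest"},
    "rawdata/sub-01": {"RepetitionTime": 1.5},
}
print(leaf)
print(merge_metadata(leaf, sidecars))
```

In this sketch the subject-level value of `RepetitionTime` replaces the dataset-level one, while `TaskName` is inherited unchanged.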

To be on the same page, here are the BIDS definitions of some terms:

  • Dataset - a set of neuroimaging and behavioral data acquired for a purpose of a particular study. A dataset consists of data acquired from one or more subjects, possibly from multiple sessions.
  • Subject - [sub-(label)] a person or animal participating in the study. Used interchangeably with term Participant.
  • Session - [ses-(label)] a logical grouping of neuroimaging and behavioral data consistent across subjects. Session can (but doesn't have to) be synonymous to a visit in a longitudinal study. In general, subjects will stay in the scanner during one session. However, for example, if a subject has to leave the scanner room and then be re-positioned on the scanner bed, the set of MRI acquisitions will still be considered as a session and match sessions acquired in other subjects. Similarly, in situations where different data types are obtained over several visits (for example fMRI on one day followed by DWI the day after) those can be grouped in one session. Defining multiple sessions is appropriate when several identical or similar data acquisitions are planned and performed on all -or most- subjects, often in the case of some intervention between sessions (for example, training).
  • Data type - [(data-type)] a functional group of different types of data. BIDS defines eight data types: func (task based and resting state functional MRI), dwi (diffusion weighted imaging), fmap (field inhomogeneity mapping data such as field maps), anat (structural imaging such as T1, T2, PD, and so on), meg (magnetoencephalography), eeg (electroencephalography), ieeg (intracranial electroencephalography), beh (behavioral). Data files are contained in a directory named for the data type. In raw datasets, the data type directory is nested inside subject and (optionally) session directories.

[Note that we do not have to stick to those definitions, but if we deviate from them we should state so explicitly to avoid misunderstandings in the discussions.]

To trigger the discussions: Let us assume we generalize the definition of a "subject" to be the "thing" that is studied (a whole species [living or dead] or any part of a species [living or dead]). If we strictly follow the inheritance principle of BIDS, the following structure could be assumed for several use cases:

rawdata/
....sub-(label)/ (e.g., a mouse)
........ses-(label)/ (can be omitted if there is only one session)
............(data-type)/ (e.g., func, anat, dwi)
........sub-(label)/ (e.g., the whole brain of that mouse)
............ses-(label)/ (can be omitted if there is only one session)
................(data-type)/ (e.g., anat, dwi, fixation)
............sub-(label)/ (e.g., a slice of that brain of that mouse)
................ses-(label)/ (can be omitted if there is only one session)
....................(data-type)/ (e.g., func [e.g., patch-clamp], anat [e.g., histology])
................sub-(label)/ (e.g., a biopsy of that slice of that brain of that mouse)
....................ses-(label)/ (can be omitted if there is only one session)
........................(data-type)/ (e.g., RNA analysis)
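
To make the nested proposal concrete, here is a minimal Python sketch that builds such a path from a chain of specimens and reads the chain back from the path. The helper names `nested_dir` and `lineage_of` and the example labels are hypothetical; the only assumption taken from the structure above is that every nesting level is a `sub-(label)` folder.

```python
from pathlib import PurePosixPath
from typing import List, Optional, Sequence


def nested_dir(lineage: Sequence[str], datatype: str,
               ses: Optional[str] = None,
               root: str = "rawdata") -> PurePosixPath:
    """Nest one sub-<label> folder per specimen, outermost specimen first."""
    parts = [root] + [f"sub-{label}" for label in lineage]
    if ses is not None:  # the session level is optional
        parts.append(f"ses-{ses}")
    parts.append(datatype)
    return PurePosixPath(*parts)


def lineage_of(path: PurePosixPath) -> List[str]:
    """Recover the chain of specimens from the sub-<label> folders of a path."""
    prefix = "sub-"
    return [p[len(prefix):] for p in path.parts if p.startswith(prefix)]


# A slice of the brain of a mouse, with histology stored under anat.
path = nested_dir(["mouse01", "brain", "slice03"], "anat")
print(path)
print(lineage_of(path))
```

The sketch shows one property of the nested layout: the provenance of a sample can be read directly from its path, at the cost of path depth growing with every extraction step.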

Questions:

  1. What are the advantages of such a solution and what are the disadvantages?
  2. Is it wise to group whole species together with extracted parts under one common term (e.g. "subject" or "specimen")?
    2.1) To what extent do the relevant metadata differ between a "subject" and a "tissue sample" (as defined in the general description)?
  3. To what extent does the "provenance" of a subject (as a whole or as a part) need to be covered in the repository/folder structure?
    3.1) Could a subject folder also be interpreted as a group / collection while still allowing the identification of a member of that group/collection?
  4. What would a solution for the opposite approach (keeping everything in a flat structure) look like?
    4.1) How would that affect the metadata storage / concept of BIDS?
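
As a starting point for discussing question 4, the following Python sketch shows one possible way a flat structure could still carry provenance: every subject or sample gets a single top-level label, and a lookup table records its parent. This is purely an assumption for illustration, not an existing BIDS mechanism; the table `PARENT_OF`, the function `provenance` and all labels are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical parent table for a flat layout: label -> parent label.
# None marks a whole being that was not extracted from anything else.
PARENT_OF: Dict[str, Optional[str]] = {
    "mouse01": None,
    "brain01": "mouse01",
    "slice03": "brain01",
    "biopsy01": "slice03",
}


def provenance(label: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Walk the parent table upwards and return the chain, origin first."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = label
    while current is not None:
        if current in seen:  # guard against circular parent entries
            raise ValueError(f"circular provenance at {current!r}")
        seen.add(current)
        chain.append(current)
        current = parent_of[current]
    return chain[::-1]


print(provenance("biopsy01", PARENT_OF))
```

In such a layout the folder structure stays shallow, but the provenance lives entirely in metadata, so any inheritance of metadata from a parent sample would have to be resolved through the table rather than through the folder hierarchy.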

[NOTE: Please get in touch with @lzehl to request changes / extensions for this first comment to keep it up-to-date with the result of the discussions in the remaining comments.]
