effigies
diff --git a/pdf_build_src/process_markdowns.py b/pdf_build_src/process_markdowns.py
Lines changed: 1 addition & 0 deletions
diff --git a/src/02-common-principles.md b/src/02-common-principles.md
Lines changed: 31 additions & 16 deletions
diff --git a/src/03-modality-agnostic-files.md b/src/03-modality-agnostic-files.md
Lines changed: 125 additions & 30 deletions
@@ -445,6 +445,7 @@ def process_macros(duplicated_src_dir_path):
 
             # Replace code snippets in the text with their outputs
             matches = re.findall("({{.*?}})", contents)
+            matches = re.findall(re.compile("({{.*?}})", re.DOTALL), contents)
             for m in matches:
                 # Remove macro delimiters to get *just* the function call
                 function_string = m.strip("{} ")
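
The added `re.DOTALL` flag matters because, by default, `.` does not match a newline, so a macro call written across several lines (as the `MACROS___make_metadata_table` calls in this change are) would not be found. A minimal sketch of the difference follows; the `contents` string and the `MACROS___one` name are hypothetical and only serve the illustration.

```python
import re

# Hypothetical page contents: one single-line macro and one multi-line macro.
contents = (
    "Intro {{ MACROS___one() }} text\n"
    "{{ MACROS___make_metadata_table(\n"
    "   {\n"
    '      "Name": "REQUIRED",\n'
    "   }\n"
    ") }}\n"
)

pattern = "({{.*?}})"

# Default flags: "." stops at newlines, so only the single-line macro matches.
single_line = re.findall(pattern, contents)

# re.DOTALL lets "." cross newlines, so the multi-line macro matches as well.
multi_line = re.findall(re.compile(pattern, re.DOTALL), contents)

# Same stripping as in process_macros: drop the delimiters and outer spaces.
calls = [m.strip("{} ") for m in multi_line]
```

Note that `strip("{} ")` removes braces and spaces from both ends, so this approach assumes the macro call itself does not begin or end with a brace.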
 
@@ -32,6 +32,13 @@ misunderstanding we clarify them here.
     context, a session may also indicate a group of related scans,
     taken in one or more visits.
 
+1.  **Sample** - a sample pertaining to a subject, such as tissue, a primary cell,
+    or a cell-free sample.
+    The `sample-<label>` key/value pair is used to distinguish between different
+    samples from the same subject.
+    The label MUST be unique per subject and is RECOMMENDED to be unique
+    throughout the dataset.
+
 1.  **Data acquisition** - a continuous uninterrupted block of time during which
     a brain scanning instrument was acquiring data according to particular
     scanning sequence/protocol.
@@ -156,6 +163,15 @@ correspond to a unique identifier of that subject, such as `01`.
 The same holds for the `session` entity with its `ses-` key and its `<label>`
 value.
 
+The extra session layer (at least one `/ses-<label>` subfolder) SHOULD
+be added for all subjects if at least one subject in the dataset has more than
+one session.
+If a `/ses-<label>` subfolder is included as part of the directory hierarchy,
+then the same [`ses-<label>`](./99-appendices/09-entities.md#ses)
+key/value pair MUST also be included as part of the file names themselves.
+The acquisition time of a session can
+be defined in the [sessions file](03-modality-agnostic-files.md#sessions-file).
+
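
The rule that a `/ses-<label>` directory implies the same `ses-<label>` pair in the file name can be sketched as a small check. This is only an illustration under stated assumptions: `session_label_mismatch` is a hypothetical helper (not part of any validator), paths are given relative to the dataset root with `/` separators, and labels are assumed to be alphanumeric.

```python
import re
from pathlib import PurePosixPath


def session_label_mismatch(relative_path):
    """Return True if a ses-<label> directory is not repeated in the file name.

    Hypothetical helper for illustration; not taken from the BIDS validator.
    """
    path = PurePosixPath(relative_path)
    # Collect every ses-<label> directory above the file.
    session_dirs = [
        part for part in path.parts[:-1]
        if re.fullmatch(r"ses-[A-Za-z0-9]+", part)
    ]
    # Entities in a file name are connected by underscores.
    entities = path.name.split("_")
    return any(ses not in entities for ses in session_dirs)
```

For example, `sub-01/ses-01/eeg/sub-01_task-rest_eeg.edf` would be flagged, while `sub-01/ses-01/eeg/sub-01_ses-01_task-rest_eeg.edf` would not.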
 A chain of entities, followed by a suffix, connected by underscores (`_`)
 produces a human readable file name, such as `sub-01_task-rest_eeg.edf`.
 It is evident from the file name alone that the file contains resting state
@@ -352,7 +368,7 @@ then Case 1 will be assumed for clarity in templates and examples, but removing
 Case 2.
 In both cases, every derivatives dataset is considered a BIDS dataset and must
 include a `dataset_description.json` file at the root level (see
-[Dataset description][dataset-description].
+[Dataset description][dataset-description]).
 Consequently, files should be organized to comply with BIDS to the full extent
 possible (that is, unless explicitly contradicted for derivatives).
 Any subject-specific derivatives should be housed within each subject’s directory;
@@ -694,14 +710,19 @@ Note that if a field name included in the data dictionary matches a column name
 then that field MUST contain a description of the corresponding column,
 using an object containing the following fields:
 
-| **Key name** | **Requirement level** | **Data type**                           | **Description**                                                                                                 |
-| ------------ | --------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
-| LongName     | OPTIONAL              | [string][]                              | Long (unabbreviated) name of the column.                                                                        |
-| Description  | RECOMMENDED           | [string][]                              | Description of the column.                                                                                      |
-| Levels       | RECOMMENDED           | [object][] of [strings][]               | For categorical variables: An object of possible values (keys) and their descriptions (values).                 |
-| Units        | RECOMMENDED           | [string][]                              | Measurement units. SI units in CMIXF formatting are RECOMMENDED (see [Units](./02-common-principles.md#units)). |
-| TermURL      | RECOMMENDED           | [string][]                              | URL pointing to a formal definition of this type of data in an ontology available on the web.                   |
-| HED          | OPTIONAL              | [object][] of [strings][] or [string][] | Hierarchical Event Descriptor (HED) information, see: [Appendix III](./99-appendices/03-hed.md) for details.    |
+{{ MACROS___make_metadata_table(
+   {
+        "LongName": "OPTIONAL",
+        "Description": (
+            "RECOMMENDED",
+            "The description of the column.",
+        ),
+        "Levels": "RECOMMENDED",
+        "Units": "RECOMMENDED",
+        "TermURL": "RECOMMENDED",
+        "HED": "OPTIONAL",
+   }
+) }}
 
 Please note that while both `Units` and `Levels` are RECOMMENDED, typically only one
 of these two fields would be specified for describing a single TSV file column.
@@ -890,6 +911,7 @@ individual files see descriptions in the next section:
 
 ```Text
 sub-control01/
+    sub-control01_scans.tsv
     anat/
         sub-control01_T1w.nii.gz
         sub-control01_T1w.json
@@ -910,7 +932,6 @@ sub-control01/
         sub-control01_phasediff.nii.gz
         sub-control01_phasediff.json
         sub-control01_magnitude1.nii.gz
-        sub-control01_scans.tsv
 code/
     deface.py
 derivatives/
@@ -944,12 +965,6 @@ to suppress warnings or provide interpretations of your file names.
 
 [derived-dataset-description]: 03-modality-agnostic-files.md#derived-dataset-and-pipeline-description
 
-[string]: https://www.w3schools.com/js/js_json_syntax.asp
-
-[strings]: https://www.w3schools.com/js/js_json_syntax.asp
-
-[object]: https://www.json.org/json-en.html
-
 [deprecated]: ./02-common-principles.md#definitions
 
 [uris]: ./02-common-principles.md#uniform-resource-indicator
@@ -14,21 +14,23 @@ Templates:
 The file `dataset_description.json` is a JSON file describing the dataset.
 Every dataset MUST include this file with the following fields:
 
-| **Key name**       | **Requirement level**              | **Data type**            | **Description**                                                                                                                                                                                                                                       |
-|--------------------|------------------------------------|--------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| Name               | REQUIRED                           | [string][]               | Name of the dataset.                                                                                                                                                                                                                                  |
-| BIDSVersion        | REQUIRED                           | [string][]               | The version of the BIDS standard that was used.                                                                                                                                                                                                       |
-| HEDVersion         | RECOMMENDED                        | [string][]               | If HED tags are used: The version of the HED schema used to validate HED tags for study.                                                                                                                                                              |
-| DatasetLinks       | REQUIRED if [BIDS URIs][] are used | [object][] of [uris][]   | Used to map a given `<dataset-name>` from a [BIDS URI][] of the form `bids:<dataset-name>:/absolute/path/within/dataset` to a local or remote location. The `<dataset-name>`: `local` is a reserved keyword that MUST NOT be a key in `DatasetLinks`  |
-| DatasetType        | RECOMMENDED                        | [string][]               | The interpretation of the dataset. MUST be one of `"raw"` or `"derivative"`. For backwards compatibility, the default value is `"raw"`.                                                                                                               |
-| License            | RECOMMENDED                        | [string][]               | The license for the dataset. The use of license name abbreviations is RECOMMENDED for specifying a license (see [Appendix II](./99-appendices/02-licenses.md)). The corresponding full license text MAY be specified in an additional `LICENSE` file. |
-| Authors            | OPTIONAL                           | [array][] of [strings][] | List of individuals who contributed to the creation/curation of the dataset.                                                                                                                                                                          |
-| Acknowledgements   | OPTIONAL                           | [string][]               | Text acknowledging contributions of individuals or institutions beyond those listed in Authors or Funding.                                                                                                                                            |
-| HowToAcknowledge   | OPTIONAL                           | [string][]               | Text containing instructions on how researchers using this dataset should acknowledge the original authors. This field can also be used to define a publication that should be cited in publications that use the dataset.                            |
-| Funding            | OPTIONAL                           | [array][] of [strings][] | List of sources of funding (grant numbers).                                                                                                                                                                                                           |
-| EthicsApprovals    | OPTIONAL                           | [array][] of [strings][] | List of ethics committee approvals of the research protocols and/or protocol identifiers.                                                                                                                                                             |
-| ReferencesAndLinks | OPTIONAL                           | [array][] of [strings][] | List of references to publications that contain information on the dataset. A reference may be textual or a [URI][uri].                                                                                                                               |
-| DatasetDOI         | OPTIONAL                           | [string][]               | The Digital Object Identifier of the dataset (not the corresponding paper). DOIs SHOULD be expressed as a valid [URI][uri]; bare DOIs such as `10.0.2.3/dfjj.10` are [DEPRECATED][deprecated].                                                        |
+{{ MACROS___make_metadata_table(
+   {
+      "Name": "REQUIRED",
+      "BIDSVersion": "REQUIRED",
+      "HEDVersion": "RECOMMENDED",
+      "DatasetLinks": "REQUIRED if [BIDS URIs][] are used",
+      "DatasetType": "RECOMMENDED",
+      "License": "RECOMMENDED",
+      "Authors": "OPTIONAL",
+      "Acknowledgements": "OPTIONAL",
+      "HowToAcknowledge": "OPTIONAL",
+      "Funding": "OPTIONAL",
+      "EthicsApprovals": "OPTIONAL",
+      "ReferencesAndLinks": "OPTIONAL",
+      "DatasetDOI": "OPTIONAL",
+   }
+) }}
 
 Example:
 
@@ -70,10 +72,12 @@ In addition to the keys for raw BIDS datasets,
 derived BIDS datasets include the following REQUIRED and RECOMMENDED
 `dataset_description.json` keys:
 
-| **Key name**   | **Requirement level** | **Data type**            | **Description**                                                                                                                                                                      |
-|----------------|-----------------------|--------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| GeneratedBy    | REQUIRED              | [array][] of [objects][] | Used to specify provenance of the derived dataset. See table below for contents of each object.                                                                                      |
-| SourceDatasets | RECOMMENDED           | [array][] of [objects][] | Used to specify the locations and relevant attributes of all source datasets. Valid keys in each object include `URL`, `DOI` (see [URI][uri]), and `Version` with [string][] values. |
+{{ MACROS___make_metadata_table(
+   {
+      "GeneratedBy": "REQUIRED",
+      "SourceDatasets": "RECOMMENDED",
+   }
+) }}
 
 Each object in the `GeneratedBy` list includes the following REQUIRED, RECOMMENDED
 and OPTIONAL keys:
@@ -252,13 +256,80 @@ to date of birth.
 }
 ```
 
+## Samples file
+
+Template:
+
+```Text
+samples.tsv
+samples.json
+```
+
+The purpose of this file is to describe properties of samples, indicated by the `sample` entity.
+This file is REQUIRED if `sample-<label>` is present in any file name within the dataset.
+If this file exists, it MUST contain the following three columns:
+
+-   `sample_id`: MUST consist of `sample-<label>` values identifying one row
+    for each sample
+
+-   `participant_id`: MUST consist of `sub-<label>`
+
+-   `sample_type`: MUST consist of sample type values, either `cell line`, `in vitro differentiated cells`,
+    `primary cell`, `cell-free sample`, `cloning host`, `tissue`, `whole organisms`, `organoid` or
+    `technical sample` from [ENCODE Biosample Type](https://www.encodeproject.org/profiles/biosample_type)
+
+Other optional columns MAY be used to describe the samples.
+Each sample MUST be described by one and only one row.
+
+Commonly used *optional* columns in `samples.tsv` files are `pathology` and
+`derived_from`. We RECOMMEND using these columns, and if you do use them,
+we RECOMMEND the following values for them:
+
+-   `pathology`: string value describing the pathology of the sample or type of control.
+    When different from `healthy`, pathology SHOULD be specified in `samples.tsv`.
+    The pathology MAY instead be specified in [Sessions files](./03-modality-agnostic-files.md#sessions-file)
+    in case it changes over time.
+
+-   `derived_from`: `sample-<label>` key/value pair from which a sample is derived,
+    for example a slice of tissue (`sample-02`) derived from a block of tissue (`sample-01`),
+    as illustrated in the example below.
+
+`samples.tsv` example:
+
+```Text
+sample_id	participant_id	sample_type	derived_from
+sample-01	sub-01	tissue	n/a
+sample-02	sub-01	tissue	sample-01
+sample-03	sub-01	tissue	sample-01
+sample-04	sub-02	tissue	n/a
+sample-05	sub-02	tissue	n/a
+```
+
+It is RECOMMENDED to accompany each `samples.tsv` file with a sidecar
+`samples.json` file to describe the TSV column names and properties of their values
+(see also the [section on tabular files](02-common-principles.md#tabular-files)).
+
+`samples.json` example:
+
+```JSON
+{
+    "sample_type": {
+        "Description": "type of sample from ENCODE Biosample Type (https://www.encodeproject.org/profiles/biosample_type)"
+    },
+    "derived_from": {
+        "Description": "sample_id from which the sample is derived"
+    }
+}
+```
+
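
The `samples.tsv` requirements in this section (three required columns, the listed `sample_type` values, one row per sample) could be checked roughly as follows. This is a sketch, not validator code: `check_samples_tsv` and the message wording are hypothetical, and uniqueness is interpreted here as one row per (`participant_id`, `sample_id`) pair.

```python
import csv
import io

REQUIRED_COLUMNS = ("sample_id", "participant_id", "sample_type")
SAMPLE_TYPES = {
    "cell line", "in vitro differentiated cells", "primary cell",
    "cell-free sample", "cloning host", "tissue", "whole organisms",
    "organoid", "technical sample",
}


def check_samples_tsv(text):
    """Return a list of problems found in the text of a samples.tsv file."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    columns = reader.fieldnames or []
    problems = [f"missing column: {c}" for c in REQUIRED_COLUMNS if c not in columns]
    if problems:
        return problems
    seen = set()
    for row in reader:
        # Short rows yield None for missing cells; treat those as empty.
        sample = row["sample_id"] or ""
        subject = row["participant_id"] or ""
        sample_type = row["sample_type"] or ""
        if not sample.startswith("sample-"):
            problems.append(f"bad sample_id: {sample}")
        if not subject.startswith("sub-"):
            problems.append(f"bad participant_id: {subject}")
        if sample_type not in SAMPLE_TYPES:
            problems.append(f"unknown sample_type: {sample_type}")
        # Each sample MUST be described by one and only one row.
        if (subject, sample) in seen:
            problems.append(f"duplicate row: {subject} {sample}")
        seen.add((subject, sample))
    return problems
```

An empty list means none of the checked rules was violated; optional columns such as `pathology` and `derived_from` are deliberately ignored in this sketch.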
 ## Phenotypic and assessment data
 
 Template:
 
 ```Text
-phenotype/<measurement_tool_name>.tsv
-phenotype/<measurement_tool_name>.json
+phenotype/
+    <measurement_tool_name>.tsv
+    <measurement_tool_name>.json
 ```
 
 Optional: Yes
@@ -330,9 +401,10 @@ questionnaire).
 Template:
 
 ```Text
-sub-<label>/[ses-<label>/]
-    sub-<label>[_ses-<label>]_scans.tsv
-    sub-<label>[_ses-<label>]_scans.json
+sub-<label>/
+    [ses-<label>/]
+        sub-<label>[_ses-<label>]_scans.tsv
+        sub-<label>[_ses-<label>]_scans.json
 ```
 
 Optional: Yes
@@ -380,6 +452,33 @@ meg/sub-control01_task-rest_split-01_meg.nii.gz	1877-06-15T12:15:27
 meg/sub-control01_task-rest_split-02_meg.nii.gz	1877-06-15T12:15:27
 ```
 
+## Sessions file
+
+Template:
+
+```Text
+sub-<label>/
+    sub-<label>_sessions.tsv
+```
+
+Optional: Yes
+
+In case of multiple sessions, there is an option of adding
+`sessions.tsv` files describing variables that change between sessions.
+In such a case, one file per participant SHOULD be added.
+These files MUST include a `session_id` column and describe each session by one and only one row.
+Column names in `sessions.tsv` files MUST be different from group-level participant key column names in the
+[`participants.tsv` file](./03-modality-agnostic-files.md#participants-file).
+
+`_sessions.tsv` example:
+
+```Text
+session_id	acq_time	systolic_blood_pressure
+ses-predrug	2009-06-15T13:45:30	120
+ses-postdrug	2009-06-16T13:45:30	100
+ses-followup	2009-06-17T13:45:30	110
+```
+
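
The sessions file rules (a `session_id` column, one row per session, no column names shared with `participants.tsv`) can be sketched in the same way. `check_sessions_tsv` is a hypothetical helper, and passing the `participants.tsv` column names as a plain list is an assumption made to keep the example self-contained.

```python
import csv
import io


def check_sessions_tsv(sessions_text, participants_columns):
    """Return a list of problems found in the text of a *_sessions.tsv file."""
    reader = csv.DictReader(io.StringIO(sessions_text), delimiter="\t")
    columns = reader.fieldnames or []
    if "session_id" not in columns:
        return ["missing column: session_id"]
    # Column names MUST differ from those used in participants.tsv.
    clashes = sorted(set(columns) & set(participants_columns))
    problems = [f"column also used in participants.tsv: {c}" for c in clashes]
    seen = set()
    for row in reader:
        session = row["session_id"] or ""
        # Each session MUST be described by one and only one row.
        if session in seen:
            problems.append(f"duplicate row: {session}")
        seen.add(session)
    return problems
```

Run against the example above with hypothetical `participants.tsv` columns such as `participant_id`, `age` and `sex`, the function returns an empty list.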
 ## Code
 
 Template: `code/*`
@@ -399,15 +498,11 @@ code organization of these scripts at the moment.
 
 [bids uris]: ./02-common-principles.md#bids-uri-pointing-to-files-within-and-outside-of-bids-datasets
 
-[objects]: https://www.json.org/json-en.html
-
 [object]: https://www.json.org/json-en.html
 
-[string]: https://www.w3schools.com/js/js_json_syntax.asp
-
-[strings]: https://www.w3schools.com/js/js_json_syntax.asp
+[objects]: https://www.json.org/json-en.html
 
-[array]: https://www.w3schools.com/js/js_json_arrays.asp
+[string]: https://www.w3schools.com/js/js_json_syntax.asp
 
 [uri]: ./02-common-principles.md#uniform-resource-indicator