layout.get_collections drops user-added columns (and 2 bonus questions) #273
Description
As discussed on neurostars, here:
https://neurostars.org/t/replicable-scripts-bids-and-curating-data/2623
I'm trying to document and automate the runs that I'm including in first-level analyses by adding custom columns to the scans.tsv file in the BIDS subject directory.
I've added a few columns to document excluded runs, as seen in the attached scans file (renamed with a .txt extension so that GitHub accepts the upload).
sub-SAXEIB06_scans.txt
When I try to use
bvcSessList = layout.get_collections(level='session',subject=sub)
df = bvcSessList[0].to_df()
print(bvcSessList[0].variables)
one of the columns ("OtherExclusion") has been dropped from both the df and the variables list.
I'm pretty sure that's happening because the column is a duplicate of another exclusion column ("RepeatSubjectExclusion") that has the same values of False for all runs, and this line kills it:
Line 347 in 393310f:
    _data = _data.T.drop_duplicates().T
I can code around this problem in a few ways, but it seems like maybe not the ideal behavior for get_collections().
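To illustrate the suspected cause outside of pybids, here is a minimal pandas-only sketch. The DataFrame is a made-up stand-in for my scans file (the filenames are hypothetical; only the two exclusion column names come from the real file), and the name-based alternative at the end is just one possible approach, not necessarily what pybids should do.

```python
import pandas as pd

# Hypothetical stand-in for the scans.tsv contents described above.
scans = pd.DataFrame({
    "filename": ["func/run-01_bold.nii.gz", "func/run-02_bold.nii.gz"],
    "RepeatSubjectExclusion": [False, False],
    "OtherExclusion": [False, False],
})

# Same idiom as the quoted line: transpose, drop duplicate rows, transpose back.
# Columns are compared by VALUE, so two distinct columns that happen to hold
# identical values are treated as duplicates and only the first one is kept.
deduped = scans.T.drop_duplicates().T
print(list(deduped.columns))

# One possible alternative (an assumption, not the pybids fix):
# treat columns as duplicates only when their NAMES repeat.
by_name = scans.loc[:, ~scans.columns.duplicated()]
print(list(by_name.columns))
```

With the value-based idiom, "OtherExclusion" disappears as soon as it matches another column for every run, which is the normal state for any exclusion flag that has not been used yet.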
Bonus questions:
- The scans.tsv filename is parsed into modality/run/type/subject/task correctly, and those columns show up in the dataframe, but I can't find them (or the original filename field) in the variables or entities dictionaries. I'd think that they should be available, no?
- get_collections(level='session') only seems to return the func modality, and omits the anat session in the scans.tsv file. Is this intended behavior?
Thanks!
Todd