Describe the new feature or enhancement
The dataset fetching code in mne/datasets/utils.py and mne/utils/fetching.py is actually very general. I was hoping to leverage it without copy/pasting the code, so I can benefit from any upstream bug fixes / performance improvements (if they ever occur).
However, in some cases I would like to unit test against private data stored on GitHub, which requires sending an API token with the HTTP request. Some of that data would eventually be made public, say after a publication, but in the meantime it would be nice to build it into CI for a private research project.
Is it possible to add an optional "token" to the dataset fetcher? This would also enable MNE to leverage private repos. In addition, it would reduce code duplication for anyone trying to implement a data fetcher, since they would not have to copy every single function from MNE.
Describe your proposed implementation
Add an optional token=None kwarg to the following functions:
_download
_fetch_file
_get_http
Then one can easily pass optional tokens in _data_path, depending on which dataset is being fetched. This would also enable any "mne" package, like mne-bids/connectivity/etc., to leverage data in private GitHub repos, with the token passed in via GitHub Actions.
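To make the idea concrete, here is a minimal sketch of how an optional token could be attached to a request. It does not reproduce the actual body of _get_http; `build_request` is a hypothetical helper name, and the `Authorization: token ...` header form is the one GitHub accepts for personal access tokens (other hosts may expect a different scheme).

```python
import urllib.request


def build_request(url, token=None):
    """Hypothetical helper: build a request, optionally authenticated.

    With token=None the request is identical to an unauthenticated one,
    so existing public-dataset downloads would be unaffected.
    """
    req = urllib.request.Request(url)
    if token is not None:
        # GitHub accepts "Authorization: token <TOKEN>" for API tokens.
        req.add_header("Authorization", f"token {token}")
    return req


# Public data: no token, no extra header.
public_req = build_request("https://example.com/public-data.tar.gz")

# Private data: the token would typically come from a CI secret.
private_req = build_request(
    "https://example.com/private-data.tar.gz", token="my-secret-token"
)
```

In a CI job the token could be read from an environment variable populated from a repository secret and passed down through the `token` kwarg, so it never has to be hard-coded.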
Describe possible alternatives
If we refactor things further, so that key, urls, archive_names, folder_origs, folder_names, md5_hashes are passed into _data_path rather than set inside _data_path, then to create an MNE-fetcher one simply needs to define a data_path that passes these to _data_path. The downstream package then has a fully functional mne_downstream_package.testing.data_path() that fetches its own datasets for testing, without having to rely on MNE-Python's datasets or copy its fetching code.
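A rough sketch of the calling structure this alternative implies is below. Nothing here is existing MNE code: `generic_data_path` is a hypothetical stand-in for a refactored _data_path, it only assembles a download plan instead of fetching anything, and the URL, hash, and folder names are placeholders.

```python
from pathlib import Path


def generic_data_path(key, urls, archive_names, folder_origs, folder_names,
                      md5_hashes, path, token=None):
    """Hypothetical stand-in for a refactored _data_path.

    All dataset-specific values are passed in rather than set inside the
    function. This sketch only returns a download plan; a real version
    would download, verify the hash, and extract each archive.
    """
    fields = (urls, archive_names, folder_origs, folder_names, md5_hashes)
    if len({len(f) for f in fields}) != 1:
        raise ValueError("All per-archive lists must have the same length")
    return [
        dict(key=key, url=url, archive=archive, folder_orig=orig,
             folder=Path(path) / name, md5=md5, token=token)
        for url, archive, orig, name, md5 in zip(*fields)
    ]


def data_path(path="~/downstream_data", token=None):
    """What a downstream package's testing.data_path() might look like."""
    return generic_data_path(
        key="downstream_testing",                      # placeholder
        urls=["https://example.com/testing.tar.gz"],   # placeholder URL
        archive_names=["testing.tar.gz"],
        folder_origs=["testing-data-main"],
        folder_names=["downstream-testing-data"],
        md5_hashes=["0" * 32],                         # placeholder hash
        path=Path(path).expanduser(),
        token=token,
    )


plan = data_path(token="my-secret-token")
```

The downstream package owns only the short `data_path` wrapper; everything generic stays upstream.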
Additional Information
I think this might also help further cement MNE-Python as a platform for developing neuroscience/clinical-neuroscience applications, which sometimes need data fetchers in their CI / testing pipeline for "private data".
Ref: https://chanzuckerberg.com/eoss/proposals/improving-usability-of-core-neuroscience-analysis-tools-with-mne-python/