Description
Is your feature request related to a problem? Please describe.
This is not a problem yet, but we anticipate that it may become one. We are currently working on reading datasets natively, and the original idea was to group them under one project, native6. In practice, however, many of these datasets have their own folder structure or naming convention, so it seems to make more sense to implement them as separate projects (see #494 for discussion on this topic). We anticipate that this policy could increase the number of datasets enough to clutter the configuration files. Furthermore, it requires users to keep track of the datasets and update their configuration files accordingly.
However, in practice many users share a configuration, as they work on a few HPC systems. We already anticipate this and handle it by defining different DRSs for these machines in the config-developer.yml file, and by providing default configurations that users can uncomment in config-user.yml.
This process could be simplified if users only had to define once which machine they are working on, so that its DRS would be used by default. Any exception that a user needs could be specified separately.
For example, from:
# Site-specific entries: Jasmin
# Uncomment the lines below to locate data on JASMIN
drs:
  CMIP6: BADC
  CMIP5: BADC
  CMIP3: BADC
  CORDEX: BADC
  OBS: BADC
  OBS6: BADC
  obs4mips: BADC
  ana4mips: BADC
to
drs:
  # Site-specific entries:
  # Uncomment the lines below to locate data on JASMIN
  default: BADC
Exceptions could be handled by using the current format, which would also make it backwards compatible:
drs:
  default: DKRZ
  CORDEX: BADC
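The lookup implied by a `default` entry can be sketched as follows. This is only an illustration of the idea, not existing ESMValCore code: the function name `resolve_drs` is hypothetical, and the choice to fall back to the DRS named `default` when no `default` key is present is an assumption based on the values already used in the configuration.

```python
def resolve_drs(drs_config, project):
    """Return the DRS name to use for a project (hypothetical helper).

    An explicit per-project entry wins; otherwise the site-wide
    ``default`` entry is used. If neither is present, fall back to the
    DRS named "default" (an assumption made for this sketch).
    """
    if project in drs_config:
        return drs_config[project]
    return drs_config.get("default", "default")


# The exception example above, as it would look after loading the YAML.
drs = {"default": "DKRZ", "CORDEX": "BADC"}

print(resolve_drs(drs, "CMIP6"))   # DKRZ (taken from the default entry)
print(resolve_drs(drs, "CORDEX"))  # BADC (explicit exception)
```

Because explicit entries take precedence, a configuration written in the current format, with every project listed, would resolve exactly as it does today.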
A similar approach could be used with rootpaths, although it would require defining the paths for each site somewhere else.
An alternative to the "default" field would be to use lists. For example, DKRZ currently has one main rootpath, two main DRSs and a few exceptions:
rootpath:
  CMIP6: /mnt/lustre02/work/ik1017/CMIP6/data/CMIP6
  CMIP5: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP5_DKRZ
  CMIP3: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP3
  CORDEX: /mnt/lustre02/work/ik1017/C3SCORDEX/data/c3s-cordex/output
  OBS: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  OBS6: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  obs4mips: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  ana4mips: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  native6: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/RAWOBS
  RAWOBS: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/RAWOBS
drs:
  CMIP6: DKRZ
  CMIP5: DKRZ
  CMIP3: DKRZ
  CORDEX: BADC
  obs4mips: default
  ana4mips: default
  OBS: default
  OBS6: default
  native6: default
Which could be expressed as:
rootpath:
  CMIP6: /mnt/lustre02/work/ik1017/CMIP6/data/CMIP6
  CMIP5: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP5_DKRZ
  CMIP3: /mnt/lustre02/work/bd0854/DATA/ESMValTool2/CMIP3
  CORDEX: /mnt/lustre02/work/ik1017/C3SCORDEX/data/c3s-cordex/output
  [OBS, OBS6, obs4mips, ana4mips]:
    /mnt/lustre02/work/bd0854/DATA/ESMValTool2/OBS
  [native6, RAWOBS]:
    /mnt/lustre02/work/bd0854/DATA/ESMValTool2/RAWOBS
drs:
  [CMIP6, CMIP5, CMIP3]: DKRZ
  CORDEX: BADC
  [obs4mips, ana4mips, OBS, OBS6, native6]: default
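For comparison, the list variant would need an extra expansion step before any lookup. The sketch below assumes the configuration loader hands grouped keys over as tuples, since a Python list is unhashable and cannot be a dict key. The function name `expand_grouped_keys` is hypothetical and not part of ESMValCore.

```python
def expand_grouped_keys(section):
    """Flatten a mapping whose keys may be tuples of project names.

    Each grouped key is expanded into one entry per project, all sharing
    the same value. A project listed more than once is reported as an
    error rather than silently overwritten.
    """
    flat = {}
    for key, value in section.items():
        projects = key if isinstance(key, tuple) else (key,)
        for project in projects:
            if project in flat:
                raise ValueError(
                    f"project {project!r} is configured more than once"
                )
            flat[project] = value
    return flat


# The grouped drs section above, with tuples standing in for YAML lists.
drs = {
    ("CMIP6", "CMIP5", "CMIP3"): "DKRZ",
    "CORDEX": "BADC",
    ("obs4mips", "ana4mips", "OBS", "OBS6", "native6"): "default",
}

flat_drs = expand_grouped_keys(drs)
print(flat_drs["CMIP5"])    # DKRZ
print(flat_drs["native6"])  # default
```

The duplicate check is a design choice for this sketch: with grouped keys it becomes easy to list a project twice by accident, and failing loudly seems safer than letting the last entry win.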
Personally I prefer having a default, so updates in the repository configuration files are immediately and seamlessly available to the user. I also find using lists a bit clunky and difficult to read.
Another solution can be found in #795, which also contains a collection of related issues and suggestions for a redesign of the configuration files.
Would you be able to help out?
I currently feel this is very low priority. As I said before, this is not currently an issue; I'm only opening this for discussion and record-keeping, as we anticipate it may become one.