If the part of a name before the first underscore is repeated, the query is flagged as having an identical filename, and everything after the underscore is discarded and replaced by an index.
For example, in the tests/data/hdf5/_generate_testdata.ipynb notebook, the "Generating 1ATN_ppi.hdf5" cell should add these files:
pdb_paths = [
    str(PATH_TEST / "data/pdb/1ATN/1ATN_1w.pdb"),
    str(PATH_TEST / "data/pdb/1ATN/1ATN_2w.pdb"),
    str(PATH_TEST / "data/pdb/1ATN/1ATN_3w.pdb"),
    str(PATH_TEST / "data/pdb/1ATN/1ATN_4w.pdb")]
but then gives the following:
Query with ID residue-ppi:A-B:1ATN has already been added to the collection. Renaming it as residue-ppi:A-B:1ATN_2
Query with ID residue-ppi:A-B:1ATN has already been added to the collection. Renaming it as residue-ppi:A-B:1ATN_3
Query with ID residue-ppi:A-B:1ATN has already been added to the collection. Renaming it as residue-ppi:A-B:1ATN_4
This is likely because the add function in query.py does not account for underscores already present in filenames and assumes they come from its own index numbering:
query_id_base = query_id.split("_")[0]
if query_id_base not in self.ids_count:
    self.ids_count[query_id_base] = 1
else:
    self.ids_count[query_id_base] += 1
    new_id = query.model_id.split("_")[0] + "_" + str(self.ids_count[query_id_base])
    query.model_id = new_id
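To make the behaviour concrete, here is a minimal standalone sketch that mimics the splitting logic on bare model IDs, next to one possible alternative that counts complete IDs instead. Both helper functions (`rename_current`, `rename_full_id`) are hypothetical and written only for illustration; they are not part of query.py, and the alternative is a suggestion under the assumption that only exact ID collisions need renaming.

```python
def rename_current(model_ids):
    """Mimic the split-on-underscore logic quoted above (illustrative only)."""
    ids_count = {}
    result = []
    for model_id in model_ids:
        # Everything after the first underscore is ignored when counting.
        base = model_id.split("_")[0]
        if base not in ids_count:
            ids_count[base] = 1
        else:
            ids_count[base] += 1
            # The original suffix is dropped and replaced by the counter.
            model_id = base + "_" + str(ids_count[base])
        result.append(model_id)
    return result


def rename_full_id(model_ids):
    """Hypothetical alternative: count complete IDs so existing suffixes survive."""
    ids_count = {}
    result = []
    for model_id in model_ids:
        if model_id not in ids_count:
            ids_count[model_id] = 1
        else:
            ids_count[model_id] += 1
            # Only exact duplicates get an index appended.
            model_id = model_id + "_" + str(ids_count[model_id])
        result.append(model_id)
    return result


models = ["1ATN_1w", "1ATN_2w", "1ATN_3w", "1ATN_4w"]
print(rename_current(models))   # ['1ATN_1w', '1ATN_2', '1ATN_3', '1ATN_4']
print(rename_full_id(models))   # ['1ATN_1w', '1ATN_2w', '1ATN_3w', '1ATN_4w']
```

The first output matches the renaming reported in the warnings above. With the alternative, an index is appended only when the same full ID is added twice (for example, a second "1ATN_1w" would become "1ATN_1w_2"). Note that such a generated ID could still clash with a file that is genuinely named that way, so this would need checking against the real add function.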