Creating queries from pdb files with underscore in the filename gives unexpected query ids #411

If several files share the same name up to the first underscore, they are flagged as having identical filenames: everything after the underscore is dropped and replaced by a running index.

For example, in the `tests/data/hdf5/_generate_testdata.ipynb` notebook, the "Generating 1ATN_ppi.hdf5" cell adds these files:
```python
pdb_paths = [
    str(PATH_TEST / "data/pdb/1ATN/1ATN_1w.pdb"),
    str(PATH_TEST / "data/pdb/1ATN/1ATN_2w.pdb"),
    str(PATH_TEST / "data/pdb/1ATN/1ATN_3w.pdb"),
    str(PATH_TEST / "data/pdb/1ATN/1ATN_4w.pdb"),
]
```

but then gives the following:

```text
Query with ID residue-ppi:A-B:1ATN has already been added to the collection. Renaming it as residue-ppi:A-B:1ATN_2
Query with ID residue-ppi:A-B:1ATN has already been added to the collection. Renaming it as residue-ppi:A-B:1ATN_3
Query with ID residue-ppi:A-B:1ATN has already been added to the collection. Renaming it as residue-ppi:A-B:1ATN_4
```
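The renaming is consistent with all four files sharing the same prefix before the first underscore. A quick check that does not depend on the library, assuming (this is not confirmed above) that the model ID is taken from the file stem:

```python
from pathlib import Path

# Same file names as in the notebook cell; the directory prefix is irrelevant here.
pdb_names = ["1ATN_1w.pdb", "1ATN_2w.pdb", "1ATN_3w.pdb", "1ATN_4w.pdb"]

# Assumption: the model ID is the file name without its extension.
stems = [Path(name).stem for name in pdb_names]

# Cutting at the first underscore discards the part that tells the files apart.
bases = [stem.split("_")[0] for stem in stems]

print(stems)
print(bases)
```

All four stems are distinct, but they collapse to the same base once everything after the first underscore is cut off.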

This is likely because the `add` function in query.py does not handle underscores that are already part of a filename and assumes they result from its own index numbering:

```python
query_id_base = query_id.split("_")[0]
if query_id_base not in self.ids_count:
    self.ids_count[query_id_base] = 1
else:
    self.ids_count[query_id_base] += 1
    new_id = query.model_id.split("_")[0] + "_" + str(self.ids_count[query_id_base])
    query.model_id = new_id
```
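One possible direction, sketched below, is to count occurrences of the full ID instead of the part before the first underscore, and to append an index only when the exact same ID has been seen before. `IdRegistry` and `register` are hypothetical names used for illustration; they are not part of the library, and the real `add` method would need to be adapted accordingly.

```python
class IdRegistry:
    """Hypothetical stand-in for the ID bookkeeping done by the collection."""

    def __init__(self):
        self.used = set()      # every ID handed out so far
        self.ids_count = {}    # highest index tried per original ID

    def register(self, model_id: str) -> str:
        """Return a unique ID, adding an index suffix only for exact repeats."""
        if model_id not in self.used:
            self.used.add(model_id)
            self.ids_count[model_id] = 1
            return model_id

        # Exact duplicate: try increasing suffixes until one is free,
        # so a generated name never clashes with a real file name.
        count = self.ids_count.get(model_id, 1)
        while True:
            count += 1
            candidate = f"{model_id}_{count}"
            if candidate not in self.used:
                break
        self.ids_count[model_id] = count
        self.used.add(candidate)
        return candidate


registry = IdRegistry()
ids = [registry.register(m) for m in ["1ATN_1w", "1ATN_2w", "1ATN_3w", "1ATN_4w"]]
print(ids)  # the four distinct names are kept unchanged

duplicate = registry.register("1ATN_1w")
print(duplicate)  # only a true repeat gets an index suffix
```

With this approach, underscores in the original filename are never interpreted, so `1ATN_1w` through `1ATN_4w` keep their names and only a second `1ATN_1w` would be renamed.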
