problem with check_pssm  #428

@rgayatri

I am trying to preprocess variant data using SingleResidueVariantAtomicQuery, and I get the following error:

Graph/Query with ID atomic-graph:A:14:Arginine->Cysteine:pdb2g98 ran into an Exception (ValueError: Amino acids in PSSM files do not match pdb file for pdb2g98.ent.), and it has not been written to the hdf5 file. More details below:
Amino acids in PSSM files do not match pdb file for pdb2g98.ent.
Traceback (most recent call last):
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/query.py", line 65, in _check_pssm
    if pdb_truth[residue] != pssm_data[residue]:
KeyError: 'B0001'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/query.py", line 244, in _process_one_query
    graph = query.build(feature_modules)
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/query.py", line 570, in build
    structure = self._load_structure(self._pdb_path, self._pssm_paths, include_hydrogens, load_pssms)
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/query.py", line 129, in _load_structure
    _check_pssm(pdb_path, pssm_paths)
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/query.py", line 68, in _check_pssm
    raise ValueError(error_message) #pylint: disable = raise-missing-from
ValueError: Amino acids in PSSM files do not match pdb file for pdb2g98.ent.

My query involves only a single chain (A), not multiple chains.
I also note that the residues in the PDB and PSSM files are identical.
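The `KeyError: 'B0001'` in the traceback looks like a key built from a chain ID plus a residue number, which would point at a residue in chain B rather than chain A. One way to check which chains the PDB file contains, and which of them have no PSSM file, is sketched below. It reads the fixed-column ATOM records directly and does not use deeprankcore. The function names `residue_keys_by_chain` and `chains_without_pssm` are hypothetical, and the key format (chain ID followed by the residue number zero-padded to 4 digits) is an assumption inferred from the error message, not taken from the library source.

```python
def residue_keys_by_chain(pdb_lines):
    """Collect residue keys such as 'B0001' per chain from PDB ATOM records.

    Assumption: a key is the chain ID followed by the residue number
    zero-padded to 4 digits (inferred from the KeyError, not from deeprankcore).
    """
    keys = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, REMARK, etc.
        chain_id = line[21]          # PDB column 22: chain identifier
        res_num = int(line[22:26])   # PDB columns 23-26: residue sequence number
        key = f"{chain_id}{res_num:04d}"
        chain_keys = keys.setdefault(chain_id, [])
        if key not in chain_keys:
            chain_keys.append(key)
    return keys


def chains_without_pssm(keys_by_chain, pssm_paths):
    """Return the chain IDs present in the PDB that have no entry in pssm_paths."""
    return [chain for chain in keys_by_chain if chain not in pssm_paths]


def check_pdb_file(pdb_path, pssm_paths):
    """Read a PDB file and report the chains that lack a PSSM file."""
    with open(pdb_path, encoding="utf-8") as handle:
        keys = residue_keys_by_chain(handle)
    return chains_without_pssm(keys, pssm_paths)
```

Calling `check_pdb_file` with the `pdb_path` and `pssm_paths` from the query would list any chain in the structure file that is not covered by `pssm_paths`. If chain B shows up there, the structure file contains more than one chain even though the query targets only chain A.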

Code to reproduce:

from deeprankcore.query import QueryCollection, SingleResidueVariantAtomicQuery
from deeprankcore.features import components, conservation, contact, surfacearea
from deeprankcore.utils.grid import GridSettings, MapMethod
from deeprankcore.domain.aminoacidlist import (arginine, cysteine)
from deeprankcore.domain import targetstorage as targets

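# Create the collection that the query is added to (it was not defined in the original snippet)
queries = QueryCollection()

queries.add(SingleResidueVariantAtomicQuery(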
    pdb_path = "/home/gayatrir/DATA/pdb/pdb2g98.ent",
    chain_id = "A",
    residue_number = 14,
    insertion_code = None,
    wildtype_amino_acid = arginine,
    variant_amino_acid = cysteine,
    pssm_paths = {"A": "/mnt/csb/DeepRank-Mut-DATA/pssm_update/2g98.A.pdb.pssm"},
    targets={targets.BINARY: 1},
    radius= 10.0,
    distance_cutoff= 4.5,
    ))


Metadata

Labels

Query: query module related issues
bug: Something isn't working
stale: issue not touched from too much time

Projects

Status

Done
