problem with get_pssm when preprocessing variants #429

@rgayatri

When check_pssm is skipped, I run into a different error, this time raised by get_pssm:


Traceback (most recent call last):
  File "/home/gayatrir/miniconda3/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/gayatrir/miniconda3/lib/python3.10/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/query.py", line 244, in _process_one_query
    graph = query.build(feature_modules)
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/query.py", line 609, in build
    feature_module.add_features(self._pdb_path, graph, variant)
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/features/conservation.py", line 29, in add_features
    pssm_row = residue.get_pssm()
  File "/mnt/home2/gayatrir/deeprank-core/deeprankcore/molstruct/residue.py", line 54, in get_pssm
    raise FileNotFoundError(f'No pssm file found for Chain {self._chain}.')
FileNotFoundError: No pssm file found for Chain pdb2g98 B

My variant of interest is in chain A, so the pssm_paths I provide only contains an entry for chain A.

Code to reproduce:

from deeprankcore.query import QueryCollection, SingleResidueVariantAtomicQuery
from deeprankcore.features import components, conservation, contact, surfacearea
from deeprankcore.utils.grid import GridSettings, MapMethod
from deeprankcore.domain.aminoacidlist import (arginine, cysteine)
from deeprankcore.domain import targetstorage as targets

# Collection that holds the queries to be processed
queries = QueryCollection()

queries.add(SingleResidueVariantAtomicQuery(
    pdb_path = "/home/gayatrir/DATA/pdb/pdb2g98.ent",
    chain_id = "A",
    residue_number = 14,
    insertion_code = None,
    wildtype_amino_acid = arginine,
    variant_amino_acid = cysteine,
    pssm_paths = {"A": "/mnt/csb/DeepRank-Mut-DATA/pssm_update/2g98.A.pdb.pssm"},
    targets={targets.BINARY: 1},
    radius= 10.0,
    distance_cutoff= 4.5,
    ))

feature_modules = [components, conservation, contact, surfacearea]

hdf5_paths = queries.process(
    "/home/gayatrir/coretrain.hdf5",
    feature_modules = feature_modules,
    grid_settings = GridSettings(
        points_counts = [20, 20, 20],
        sizes = [1.0, 1.0, 1.0]),
    grid_map_method = MapMethod.GAUSSIAN)
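
The traceback suggests that the conservation module asks for a PSSM for every residue in the graph, and a radius of 10.0 around residue 14 of chain A may well pull in residues from chain B. As a quick check, the sketch below lists the chains in a PDB file that have no entry in pssm_paths. It uses only the standard library; chain_ids and chains_missing_pssm are hypothetical helpers, not part of deeprankcore, and the two ATOM records are made-up sample lines.

```python
from typing import Dict, Iterable, List, Set


def chain_ids(pdb_lines: Iterable[str]) -> Set[str]:
    """Collect chain identifiers from ATOM/HETATM records.

    In the PDB format the chain identifier is column 22 (index 21).
    """
    chains: Set[str] = set()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains.add(line[21])
    return chains


def chains_missing_pssm(pdb_lines: Iterable[str], pssm_paths: Dict[str, str]) -> List[str]:
    """Return the chains found in the structure that have no PSSM path."""
    return sorted(chain_ids(pdb_lines) - set(pssm_paths))


if __name__ == "__main__":
    # Made-up sample records; in practice, read the lines of the real PDB file:
    #     with open(pdb_path) as handle: lines = handle.readlines()
    sample_lines = [
        "ATOM      1  N   ARG A  14",
        "ATOM      2  CA  GLY B  20",
    ]
    pssm_paths = {"A": "2g98.A.pdb.pssm"}
    print(chains_missing_pssm(sample_lines, pssm_paths))
```

If chain B shows up in the result, that would be consistent with the FileNotFoundError above. Whether supplying a PSSM for that chain is the intended fix is an assumption on my side.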


Labels: Query (query module related issues)

Status: Done