compatibility issue with v3.6 #329

@micgrab

Hello.

I have a problem running percolator v3.6 on a file that worked with v3.5.

Part 1

To reproduce the issue, here is a toy data set:

SpecId	Label	ScanNr	ExpMass	Mass	feature_1	Peptide	Protein1
34292	-1	88	1103.59885	1103.59885	0.2029	KGPYLHPHR	_placeholder_
97071	-1	89	1103.59885	1103.59885	0.3172	KGPYLHPHR	_placeholder_
127175	1	110	1485.78140	1485.78140	0.1040	RAHERPPPHPHR	_placeholder_
181611	1	112	2469.91930	2469.91930	0.0123	STGHGGHCTNCQDNTDGAHCER	_placeholder_
113801	1	239	1646.73185	1646.73185	0.0065	EHTGKPTTSSSEACR	_placeholder_
237191	1	245	2041.96648	2041.96648	0.2799	KTEEERPQETTNQHSTK	_placeholder_
105065	-1	249	919.46241	919.46241	0.1106	RNPASYGR	_placeholder_
96772	1	256	1088.50982	1088.50982	0.0867	ESESTAAAPAR	_placeholder_
96773	-1	256	1444.78989	1444.78989	0.0169	GRPPKQEPAAAAPR	_placeholder_
127254	1	258	1488.77972	1488.77972	0.0047	SVQPQSHKPQPTR	_placeholder_
127255	1	258	1490.72260	1490.72260	0.0685	GTHDRDPSEKPPR	_placeholder_
141975	1	262	1488.77972	1488.77972	0.0040	SVQPQSHKPQPTR	_placeholder_
74393	1	263	1386.68516	1386.68516	0.4173	SPEQSRSSPEKR	_placeholder_
127960	1	264	1122.61456	1122.61456	0.4747	LSHPTTSRPK	_placeholder_
74104	1	266	1386.68516	1386.68516	0.0389	SPEQSRSSPEKR	_placeholder_
74105	-1	266	1844.87520	1844.87520	0.0768	KLKDSEETHETGAASDK	_placeholder_
87069	1	267	1764.80270	1764.80270	0.0012	NRPEPHSDENGSTTPK	_placeholder_
175837	1	268	2173.87443	2173.87443	0.0025	NHSGNDERDEEDEERESK	_placeholder_
49377	1	269	1498.70120	1498.70120	0.0016	ESRPENEEERPK	_placeholder_

The percolator versions are percolator-v3-06-linux-amd64.deb and percolator-v3-05-linux-amd64.deb.

The command I used was percolator -Y toy_data.tsv and it shows this error:
Couldn't find Protein header in tab-file

But there is a protein header (Protein1), and the changelog does not mention any change to the input format. The Wiki also shows an example with "proteinId1". This header works with v3.5 but not with v3.6.

I looked into the code, and it seems that the header name now needs to be Proteins. I also noticed that even when you use Proteins as the header but misspell Label (e.g. as label), you get the same error: Couldn't find Protein header in tab-file.
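
A minimal sketch of a pre-flight header check, in case it helps others hitting the same error. It assumes that v3.6 requires the exact, case-sensitive column names Label and Proteins; that is only inferred from the behavior described above, not from documentation. The function name `check_header` and the near-miss rules are hypothetical helpers for illustration, not part of percolator.

```python
# Assumption: these exact, case-sensitive names are what v3.6 looks for.
# Each entry maps a required name to a rule that spots likely near misses.
REQUIRED_COLUMNS = {
    "Label": lambda c: c.lower() == "label",
    "Proteins": lambda c: c.lower().startswith("protein"),
}


def check_header(header_line):
    """Return a list of problems found in a tab-delimited header line."""
    columns = header_line.rstrip("\r\n").split("\t")
    problems = []
    for name, looks_similar in REQUIRED_COLUMNS.items():
        if name in columns:
            continue
        # Report the first column that resembles the missing one, if any.
        near = [c for c in columns if looks_similar(c)]
        hint = f" (found {near[0]!r})" if near else ""
        problems.append(f"missing {name!r} column{hint}")
    return problems


header = "SpecId\tLabel\tScanNr\tExpMass\tMass\tfeature_1\tPeptide\tProtein1"
print(check_header(header))
```

For the toy data set above this reports the Protein1 column as the likely culprit, which is more specific than the single error message percolator prints for both cases.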

Part 2

So I changed the header and percolator works again, but now I get a lot of warnings:

...
Features:
Mass feature_1 
Warning: Set decoy prefix don't match
Warning: Set decoy prefix don't match
Warning: Set decoy prefix don't match
Warning: Set decoy prefix don't match
Warning: Set decoy prefix don't match
Found 19 PSMs
...

What does this warning mean? What do I need to do to get rid of it?

Also, the result changed. With v3.5 I get:

PSMId	score	q-value	posterior_error_prob	peptide	proteinIds
175837	0	0.125	0.205014	NHSGNDERDEEDEERESK	_placeholder_
...

With v3.6 I get:

PSMId	score	q-value	posterior_error_prob	peptide	proteinIds
175837	0	0.125	0.205014	NHSGNDERDEEDEERESK	
...

The protein entries are missing here.
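
To check whether this affects every row, the two result files can be compared programmatically. The following is a rough sketch, assuming the output layout shown above (PSMId in the first column, protein IDs from the sixth column onward, tab-delimited). The function names are made up for this example.

```python
def protein_ids_by_psm(output_text):
    """Map each PSMId to its list of non-empty proteinIds entries."""
    result = {}
    for line in output_text.splitlines()[1:]:  # skip the header row
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a data row (e.g. a blank or truncated line)
        # Everything after the peptide column is treated as protein IDs.
        result[fields[0]] = [p for p in fields[5:] if p]
    return result


def diff_protein_ids(old_text, new_text):
    """Return {PSMId: (old_ids, new_ids)} for PSMs whose protein IDs differ."""
    old = protein_ids_by_psm(old_text)
    new = protein_ids_by_psm(new_text)
    return {
        psm: (old[psm], new[psm])
        for psm in old
        if psm in new and old[psm] != new[psm]
    }


header = "PSMId\tscore\tq-value\tposterior_error_prob\tpeptide\tproteinIds\n"
v35 = header + "175837\t0\t0.125\t0.205014\tNHSGNDERDEEDEERESK\t_placeholder_\n"
v36 = header + "175837\t0\t0.125\t0.205014\tNHSGNDERDEEDEERESK\t\n"
print(diff_protein_ids(v35, v36))
```

In practice the two strings would be read from the result files written by each version; the inline rows here are just the ones quoted above.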


I hope you can help me.
