Adapter for Fido (protein inference engine), solves #808 #1027

Merged
timosachsenberg merged 48 commits into OpenMS:develop from hendrikweisser:FidoAdapter
Nov 27, 2014
@hendrikweisser
Contributor

This contains the adapter itself (which can run Fido with/without parameter estimation and now works properly even for inputs with multiple ID runs), adaptations to PeptideIndexer and ProteinQuantifier, and test cases (which may not be properly integrated into the build system yet - see my e-mail to the developers list).
Fido itself is not included, but sources that should work across platforms are available here: https://github.com/hendrikweisser/Fido

Thanks to @liangoaix for her contributions, including the test file.

Hendrik Weisser added 24 commits April 3, 2014 16:49
…arget/decoy tags in PeptideIndexer.

Also made cosmetic changes in PeptideIndexer documentation/log output and added a test for the protein annotation.
…OpenMS into FidoAdapter

Need the fix of the "prob_correct" option in IDPosteriorErrorProbability.
Conflicts:
	src/topp/PeptideIndexer.cpp
…tion; added saving of parameter estimation results
…dapter

Conflicts:
	src/tests/topp/CMakeLists.txt
	src/topp/PeptideIndexer.cpp
@timosachsenberg
Contributor

nice

@lars20070
Contributor

@liangoaix Thanks for the G3UZX1 example. Good point.

@liangoaix
Contributor

When applying IDFilter (with flag Peptide/Protein score=0.05) after Fido, does anyone know why IDFilter produced the error here:
https://github.com/OpenMS/OpenMS/blob/develop/src/openms/source/FORMAT/IdXMLFile.cpp#L781
It seems to have problems storing ProteinGroup information after successfully filtering.

Hendrik Weisser added 7 commits November 25, 2014 10:51
…dapter

Conflicts:
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.index
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.phr
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.pin
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.psd
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.psi
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.psq
	src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.mzML
	src/tests/topp/SEARCHENGINES/proteins.fasta
	src/tests/topp/SEARCHENGINES/proteins.fasta.index
	src/tests/topp/SEARCHENGINES/proteins.fasta.phr
	src/tests/topp/SEARCHENGINES/proteins.fasta.pin
	src/tests/topp/SEARCHENGINES/proteins.fasta.psd
	src/tests/topp/SEARCHENGINES/proteins.fasta.psi
	src/tests/topp/SEARCHENGINES/proteins.fasta.psq
	src/tests/topp/SEARCHENGINES/spectra.mzML
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.index
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.phr
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.pin
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.psd
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.psi
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.psq
	src/tests/topp/THIRDPARTY/OMSSAAdapter_1.mzML
	src/tests/topp/THIRDPARTY/third_party_tests.cmake
	src/topp/PeptideIndexer.cpp
@hendrikweisser
Contributor Author

@lars20070: I looked at your simplified example and found the problem (to some extent): The vast majority of your proteins get assigned a probability of zero by Fido and are thus removed from the results (unless you set the "keep_zero_group" flag). The cause of this seems to be that IDPEP produced very low probabilities (0.00...) for (almost?) all of your PSMs. So you/we would have to investigate what goes wrong in IDPEP.
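
To illustrate the behaviour described above, here is a minimal Python sketch, not the adapter's actual code. The tuple layout for a protein group and the function name `filter_groups` are hypothetical; only the idea (zero-probability groups are dropped unless a "keep zero" switch is set) comes from the comment.

```python
from typing import List, Tuple

# Hypothetical layout: (probability, [protein accessions])
ProteinGroup = Tuple[float, List[str]]


def filter_groups(groups: List[ProteinGroup],
                  keep_zero_group: bool = False) -> List[ProteinGroup]:
    """Drop groups with probability zero unless keep_zero_group is set."""
    if keep_zero_group:
        return list(groups)
    return [group for group in groups if group[0] > 0.0]


# Made-up example data: most groups received probability zero.
groups = [(0.97, ["P1"]), (0.0, ["P2", "P3"]), (0.0, ["P4"])]
kept = filter_groups(groups)
kept_all = filter_groups(groups, keep_zero_group=True)
```

With the default setting only the first group survives, which matches the symptom of "most proteins disappear" when the upstream PSM probabilities are close to zero.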

@hendrikweisser
Contributor Author

@liangoaix: I've added an option now to read the peptide probabilities from a "UserParam" in the input idXML. Can you check whether this works for you, and maybe provide me with a (small) file for a test case?

@liangoaix
Contributor

@hendrikweisser I tested this OMSSA output: https://github.com/OpenMS/OpenMS/blob/develop/src/tests/topp/FalseDiscoveryRate_OMSSA.idXML
by running IDPEP (prob_correct) -> FDR -> Fido. It failed with this error:

Error: All protein hits must be annotated with target/decoy meta data. Run PeptideIndexer with the 'annotate_proteins' option to accomplish this.
Error: Unexpected internal error (Error: All protein hits must be annotated with target/decoy meta data. Run PeptideIndexer with the 'annotate_proteins' option to accomplish this.)

Another FDR output (used as the FidoAdapter input test file) works well: https://www.dropbox.com/s/fnu8ntraa8jkvid/FidoAdapter_input_2.idXML?dl=0

@hendrikweisser
Contributor Author

@liangoaix: Before I update the code with the correct default value for "prob_param", you could try setting the value manually and running the test again.

@liangoaix
Contributor

@hendrikweisser Yes, after manually setting to "Posterior Probability_score", it worked well on this test file: https://www.dropbox.com/s/fnu8ntraa8jkvid/FidoAdapter_input_2.idXML?dl=0
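
For readers unfamiliar with the "UserParam" mechanism discussed here, the following Python sketch shows the general idea of looking up a per-hit probability by parameter name. It is an illustration only: the XML snippet is a made-up, heavily simplified fragment rather than a complete idXML file, and `hit_probabilities` is a hypothetical helper, not part of OpenMS.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified fragment; a real idXML file contains much more.
SNIPPET = """
<PeptideIdentification>
  <PeptideHit sequence="PEPTIDEK" score="0.01">
    <UserParam type="float" name="Posterior Probability_score" value="0.93"/>
  </PeptideHit>
  <PeptideHit sequence="ANOTHERK" score="0.20"/>
</PeptideIdentification>
"""


def hit_probabilities(xml_text, prob_param):
    """Map each hit's sequence to the value of the named UserParam (or None)."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for hit in root.iter("PeptideHit"):
        value = None
        for param in hit.findall("UserParam"):
            if param.get("name") == prob_param:
                value = float(param.get("value"))
        result[hit.get("sequence")] = value
    return result


probs = hit_probabilities(SNIPPET, "Posterior Probability_score")
```

A hit without the named parameter maps to `None` here, which mirrors why a wrong parameter name leaves the tool without usable probabilities.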

@hendrikweisser
Contributor Author

Everything should work correctly now. Thanks for the test file, Xiao!

@liangoaix
Contributor

@lars20070 For the IDFilter issue: if you want to filter peptides (score:pep) after FidoAdapter, it should now work when setting keep_unreferenced_protein_hits=true. I do that filtering before FidoAdapter, though.

If we want to filter proteins afterwards, IDFilter has a bug with protein groups. For example, IDFilter can filter away protein hit PH_145 (which has a low protein score) even though it is still referenced here:

            <UserParam type="string" name="indistinguishable_proteins_103" value="0.6997149674,PH_145"/>

The reference to PH_145 can then no longer be resolved because the hit was filtered away. I think it's better to solve this IDFilter bug outside this pull request.
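
The dangling-reference problem can be sketched in a few lines of Python. This is one possible remedy under stated assumptions, not the fix that IDFilter eventually received: it assumes a group value has the form "probability,hit_id[,hit_id...]" as in the UserParam above, and `prune_groups` is a hypothetical helper.

```python
def prune_groups(group_values, surviving_hits):
    """Remove filtered-out hit ids from each group; drop groups left empty."""
    pruned = []
    for value in group_values:
        # First field is the group probability, the rest are protein hit ids.
        prob, *members = value.split(",")
        kept = [m for m in members if m in surviving_hits]
        if kept:
            pruned.append(",".join([prob] + kept))
    return pruned


# PH_145 and PH_13 were removed by a (hypothetical) score filter.
groups = ["0.6997149674,PH_145", "0.95,PH_12,PH_13"]
surviving = {"PH_12"}
result = prune_groups(groups, surviving)
```

Here the group that referenced only PH_145 is dropped, and the second group keeps just PH_12, so no group points at a hit that no longer exists.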

@timosachsenberg
Contributor

I would propose to merge but we currently have a travis problem.

@timosachsenberg
Contributor

Could you push again (needs a small change)? The Travis hiccups seem to be resolved.

@timosachsenberg
Contributor

No need to do this, as I just learned that I can retrigger a Travis build.
see http://stackoverflow.com/questions/17606874/trigger-a-travis-ci-rebuild-without-pushing-a-commit

@lars20070
Contributor

Interesting. (But cannot find the restart button. I am logged in but probably still need the rights.)

FYI, @bgruening will generate first Galaxy wrappers once this PR is merged.

@hendrikweisser
Contributor Author

I'm working on the IDFilter problem. Expect a separate pull request soon.

@timosachsenberg: Could you add Fido and FidoChooseParameters executables to the Travis machines to enable the tests (analogous to MSGFPlusAdapter)?


6 participants