Adapter for Fido (protein inference engine), solves #808 #1027
timosachsenberg merged 48 commits into OpenMS:develop
…arget/decoy tags in PeptideIndexer. Also made cosmetic changes in PeptideIndexer documentation/log output and added a test for the protein annotation.
… engine Fido). Work in progress.
…OpenMS into FidoAdapter Need the fix of the "prob_correct" option in IDPosteriorErrorProbability.
…o SILACAnalyzer/FeatureFinderRaw)
Conflicts: src/topp/PeptideIndexer.cpp
…ein_groups' to signify more general usage
…ier output when available
…tion; added saving of parameter estimation results
…dapter Conflicts: src/tests/topp/CMakeLists.txt src/topp/PeptideIndexer.cpp
…RDPARTY' to reflect more general use
…rameter 'separate_runs' for such cases
src/topp/FidoAdapter.cpp
|
nice |
|
@liangoaix Thanks for the G3UZX1 example. Good point. |
|
When applying IDFilter (with flag Peptide/Protein score=0.05) after Fido, does anyone know why IDFilter produced the error here: |
…dapter Conflicts: src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.index src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.phr src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.pin src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.psd src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.psi src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.fasta.psq src/tests/topp/SEARCHENGINES/OMSSAAdapter_1.mzML src/tests/topp/SEARCHENGINES/proteins.fasta src/tests/topp/SEARCHENGINES/proteins.fasta.index src/tests/topp/SEARCHENGINES/proteins.fasta.phr src/tests/topp/SEARCHENGINES/proteins.fasta.pin src/tests/topp/SEARCHENGINES/proteins.fasta.psd src/tests/topp/SEARCHENGINES/proteins.fasta.psi src/tests/topp/SEARCHENGINES/proteins.fasta.psq src/tests/topp/SEARCHENGINES/spectra.mzML src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.index src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.phr src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.pin src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.psd src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.psi src/tests/topp/THIRDPARTY/OMSSAAdapter_1.fasta.psq src/tests/topp/THIRDPARTY/OMSSAAdapter_1.mzML src/tests/topp/THIRDPARTY/third_party_tests.cmake src/topp/PeptideIndexer.cpp
…Identification::ProteinGroup')
…ated tests accordingly
|
@lars20070: I looked at your simplified example and found the problem (to some extent): The vast majority of your proteins get assigned a probability of zero by Fido and are thus removed from the results (unless you set the "keep_zero_group" flag). The cause of this seems to be that IDPEP produced very low probabilities (0.00...) for (almost?) all of your PSMs. So you/we would have to investigate what goes wrong in IDPEP. |
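One way to spot this situation before running Fido is to check how many PSM probabilities are effectively zero. The sketch below is illustrative only: it works on a plain list of floats rather than on OpenMS data structures, and the function name `fraction_near_zero` and the cutoff `eps` are assumptions made for this example, not part of FidoAdapter or IDPosteriorErrorProbability.

```python
def fraction_near_zero(probabilities, eps=1e-6):
    """Return the fraction of PSM probabilities that are <= eps.

    Hypothetical helper: `probabilities` is a plain list of floats in [0, 1],
    and `eps` is an arbitrary cutoff chosen for illustration.
    """
    if not probabilities:
        raise ValueError("no probabilities given")
    near_zero = sum(1 for p in probabilities if p <= eps)
    return near_zero / len(probabilities)


# Example input (made up): four of five PSMs carry almost no evidence,
# so protein inference has very little to work with.
psm_probs = [0.0, 0.0, 1e-9, 0.0, 0.93]
share = fraction_near_zero(psm_probs)
```

If `share` is close to 1, the problem most likely sits upstream of the protein inference step, which matches the diagnosis above.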
…' in the idXML (new parameter: 'prob_param')
|
@liangoaix: I've added an option now to read the peptide probabilities from a "UserParam" in the input idXML. Can you check whether this works for you, and maybe provide me with a (small) file for a test case? |
|
@hendrikweisser I tested this OMSSA output: https://github.com/OpenMS/OpenMS/blob/develop/src/tests/topp/FalseDiscoveryRate_OMSSA.idXML Another FDR output (used as the FidoAdapter input test file) works well: https://www.dropbox.com/s/fnu8ntraa8jkvid/FidoAdapter_input_2.idXML?dl=0 |
|
@liangoaix: Before I update the code with the correct default value for "prob_param", you could try setting the value manually and running the test again. |
|
@hendrikweisser Yes, after manually setting "prob_param" to "Posterior Probability_score", it worked well on this test file: https://www.dropbox.com/s/fnu8ntraa8jkvid/FidoAdapter_input_2.idXML?dl=0 |
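The lookup discussed here, taking the peptide probability from a named user parameter instead of the main score, can be sketched roughly as follows. This is not the FidoAdapter code: the `hit` dictionary is a hypothetical stand-in for a peptide hit read from an idXML file, and falling back to the main score when `prob_param` is empty is an assumption of this sketch.

```python
def peptide_probability(hit, prob_param="Posterior Probability_score"):
    """Pick the probability for one peptide hit.

    `hit` is a hypothetical dict with a "score" entry and a "user_params"
    dict. If `prob_param` is empty, the main score is used instead
    (an assumption of this sketch).
    """
    if not prob_param:
        return float(hit["score"])
    if prob_param not in hit["user_params"]:
        raise KeyError(f"peptide hit has no user parameter {prob_param!r}")
    value = float(hit["user_params"][prob_param])
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{prob_param!r} is not a probability: {value}")
    return value


# Made-up example: the main score is a q-value after FDR estimation,
# while the probability is stored as a user parameter.
hit = {"score": 0.01, "user_params": {"Posterior Probability_score": 0.97}}
prob = peptide_probability(hit)
```

Failing loudly when the parameter is missing makes a wrong parameter name visible immediately, instead of silently feeding the wrong score type to the inference step.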
|
Everything should work correctly now. Thanks for the test file, Xiao! |
|
@lars20070 Regarding the IDFilter issue: if you want to filter peptides (score:pep) after FidoAdapter, it should now work when setting keep_unreferenced_protein_hits=true, although I do this filtering before FidoAdapter. If we want to filter proteins afterwards, IDFilter has a bug with protein groups. For example, IDFilter can filter away protein hit PH_145 (which has a low protein score) while it is still kept here: PH_145 then cannot be resolved as a reference because the hit has been filtered away. I think it's better to fix this IDFilter bug outside of this pull request. |
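The dangling-reference problem described above can be illustrated with a small sketch that prunes protein groups together with the protein hits. It uses plain Python containers as hypothetical stand-ins for protein hits and protein groups, and it is not how IDFilter works or was later fixed; the scores and the identifiers other than PH_145 are made up.

```python
def filter_proteins(hits, groups, min_score):
    """Keep protein hits with score >= min_score and prune groups to match.

    hits:   dict mapping a protein hit identifier to its score (stand-in)
    groups: list of (probability, [identifiers]) tuples (stand-in)
    """
    kept_hits = {pid: s for pid, s in hits.items() if s >= min_score}
    kept_groups = []
    for probability, members in groups:
        # Remove references to hits that did not survive the filter.
        remaining = [pid for pid in members if pid in kept_hits]
        if remaining:  # drop groups that lost all of their members
            kept_groups.append((probability, remaining))
    return kept_hits, kept_groups


# Made-up example: PH_145 falls below the threshold, so it has to
# disappear from its group as well.
hits = {"PH_145": 0.02, "PH_146": 0.9, "PH_147": 0.01}
groups = [(0.9, ["PH_145", "PH_146"]), (0.01, ["PH_147"])]
kept_hits, kept_groups = filter_proteins(hits, groups, min_score=0.05)
```

Filtering hits and groups in one pass keeps the two structures consistent, so no group can refer to a hit that no longer exists.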
|
I would propose to merge but we currently have a travis problem. |
|
Could you push again (needs a small change)? The Travis hiccups seem to be resolved. |
|
No need to do this, as I just learned that I can retrigger a Travis build. |
|
Interesting. (But I cannot find the restart button. I am logged in but probably still need the rights.) FYI, @bgruening will generate the first Galaxy wrappers once this PR is merged. |
|
I'm working on the IDFilter problem. Expect a separate pull request soon. @timosachsenberg: Could you add Fido and FidoChooseParameters executables to the Travis machines to enable the tests (analogous to MSGFPlusAdapter)? |
Adapter for Fido (protein inference engine), solves #808
This contains the adapter itself (which can run Fido with/without parameter estimation and now works properly even for inputs with multiple ID runs), adaptations to PeptideIndexer and ProteinQuantifier, and test cases (which may not be properly integrated into the build system yet - see my e-mail to the developers list).
Fido itself is not included, but sources that should work across platforms are available here: https://github.com/hendrikweisser/Fido
Thanks to @liangoaix for her contributions, including the test file.