Example
To verify that Percolator is installed correctly, we can process one of the data sets used in Percolator's original publication, which is available from the Noble lab web site.
First, install both the percolator and the percolator-converters packages from the latest release. When building from source, run cmake; make; make install in both the percolator root directory and the src/converters directory. Then run the command sequence below:
$ mkdir test; cd test
$ wget -q http://noble.gs.washington.edu/proj/percolator/data/yeast-01.sqt.tar.gz
$ tar xvzf yeast-01.sqt.tar.gz
$ sqt2pin -o pin.tab yeast-01.sqt yeast-01.shuffled.sqt
$ percolator -X pout.xml pin.tab > yeast-01.psms
Percolator version 1.15, Build Date Oct 21 2010 10:43:31
Copyright (c) 2006-9 University of Washington. All rights reserved.
Written by Lukas Käll (lukall@u.washington.edu) in the
Department of Genome Sciences at the University of Washington.
Issued command:
percolator -E pin.xml -X pout.xml
Started Thu Oct 21 13:22:36 2010 on unknown_host
Hyperparameters fdr=0.01, Cpos=0, Cneg=0, maxNiter=10
Train/test set contains 69705 positives and 69705 negatives, size ratio=1 and pi0=1
selecting cpos by cross validation
selecting cneg by cross validation
Estimating 7086 over q=0.01 in initial direction
Reading in data and feature calculation took 46.96 cpu seconds or 47 seconds wall time
---Training with Cpos selected by cross validation, Cneg selected by cross validation, fdr=0.01
Iteration 1 : After the iteration step, 10212[1] target PSMs with q<0.01 were estimated by cross validation
Iteration 2 : After the iteration step, 11256 target PSMs with q<0.01 were estimated by cross validation
Iteration 3 : After the iteration step, 11564 target PSMs with q<0.01 were estimated by cross validation
Iteration 4 : After the iteration step, 11667 target PSMs with q<0.01 were estimated by cross validation
Iteration 5 : After the iteration step, 11706 target PSMs with q<0.01 were estimated by cross validation
Iteration 6 : After the iteration step, 11714 target PSMs with q<0.01 were estimated by cross validation
Iteration 7 : After the iteration step, 11716 target PSMs with q<0.01 were estimated by cross validation
Iteration 8 : After the iteration step, 11741 target PSMs with q<0.01 were estimated by cross validation
Iteration 9 : After the iteration step, 11742 target PSMs with q<0.01 were estimated by cross validation
Iteration 10 : After the iteration step, 11747 target PSMs with q<0.01 were estimated by cross validation
Obtained weights (only showing weights of first cross validation set)
first line contains normalized weights, second line the raw weights
lnrSp deltLCn deltCn Xcorr Sp IonFrac Mass PepLen Charge1 Charge2 Charge3 enzN enzC enzInt lnNumSP dM absdM m0
-0.454 0.094 0.586 0.554[2] -0.044 -0.0211 0.757 -0.647 0.138 0.0316 -0.0599 1.19 1.22 -1.7 -0.0401 0.246 -0.222[3] -6.23
-0.244 0.562 6.71 0.921 -0.000166 -0.13 0.00125 -0.118 1.35 0.0632 -0.12 3.01 2.79 -1.27 -1.29 0.382 -0.601 7.92
After all training done, 11621 target PSMs with q<0.01 were found when measuring on the test set
Found 11621 target PSMs scoring over 1% FDR level on testset
Merging results from 3 datasets
Selecting pi_0=0.813 [4]
Calibrating statistics - calculating q values
New pi_0 estimate on merged list gives 11854 PSMs over q=0.01
Calibrating statistics - calculating Posterior error probabilities (PEPs)
Processing took 81.07 cpu seconds or 82 seconds wall time
We have labelled a few interesting features in the output, marked with bracketed numbers above:
[1] The number of target PSMs with a q-value below 0.01 is estimated by cross-validation.
[2] The weight of Xcorr is positive, meaning that a higher Xcorr gives a better score. This is a good sign.
[3] The weight of absdM is negative, meaning that a larger difference between observed and calculated mass gives a worse score. This is also a good sign.
[4] We estimate that 81.3% of the PSMs are incorrect matches.
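The checks described in the notes above can also be done programmatically. The following is a minimal sketch, not part of Percolator itself: the function name `parse_log` and the regular expressions are assumptions based only on the log excerpt shown on this page, and the sample text has the bracketed annotation markers removed. Other Percolator versions may print a different log format.

```python
import re

# Excerpt of the log shown above, with the [n] annotation markers removed.
SAMPLE_LOG = """\
Iteration 1 : After the iteration step, 10212 target PSMs with q<0.01 were estimated by cross validation
Iteration 10 : After the iteration step, 11747 target PSMs with q<0.01 were estimated by cross validation
Obtained weights (only showing weights of first cross validation set)
first line contains normalized weights, second line the raw weights
lnrSp deltLCn deltCn Xcorr Sp IonFrac Mass PepLen Charge1 Charge2 Charge3 enzN enzC enzInt lnNumSP dM absdM m0
-0.454 0.094 0.586 0.554 -0.044 -0.0211 0.757 -0.647 0.138 0.0316 -0.0599 1.19 1.22 -1.7 -0.0401 0.246 -0.222 -6.23
-0.244 0.562 6.71 0.921 -0.000166 -0.13 0.00125 -0.118 1.35 0.0632 -0.12 3.01 2.79 -1.27 -1.29 0.382 -0.601 7.92
Selecting pi_0=0.813
"""

# Patterns are assumptions derived from the log excerpt on this page.
ITERATION_RE = re.compile(r"^Iteration\s+(\d+)\s*:.*?(\d+)\s+target PSMs")
PI0_RE = re.compile(r"^Selecting pi_0=([0-9.]+)")
WEIGHTS_MARKER = "first line contains normalized weights"


def parse_log(text):
    """Return (iterations, pi0, weights) extracted from Percolator log text.

    iterations maps iteration number -> estimated number of target PSMs,
    pi0 is the selected pi_0 value (or None if absent),
    weights maps feature name -> normalized weight.
    """
    lines = text.splitlines()
    iterations = {}
    pi0 = None
    weights = {}
    for i, line in enumerate(lines):
        match = ITERATION_RE.match(line)
        if match:
            iterations[int(match.group(1))] = int(match.group(2))
            continue
        match = PI0_RE.match(line)
        if match:
            pi0 = float(match.group(1))
            continue
        # The marker line is followed by the feature names, then the
        # normalized weights (the raw weights on the third line are ignored).
        if line.startswith(WEIGHTS_MARKER) and i + 2 < len(lines):
            names = lines[i + 1].split()
            values = [float(v) for v in lines[i + 2].split()]
            weights = dict(zip(names, values))
    return iterations, pi0, weights


if __name__ == "__main__":
    iterations, pi0, weights = parse_log(SAMPLE_LOG)
    print("PSMs per iteration:", iterations)
    print("pi_0:", pi0)
    # Sanity checks matching notes [2] and [3].
    print("Xcorr weight positive:", weights["Xcorr"] > 0)
    print("absdM weight negative:", weights["absdM"] < 0)
```

Percolator writes this log to the terminal rather than to the redirected yeast-01.psms file in the example above, so it would need to be captured separately before being passed to `parse_log`.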
The source code ships with a series of system-test scripts that verify that several frequently used modes of operation have the intended behavior and performance. They only work if the binaries were built from source with CMake, and they require Python to be installed. To run them, execute ctest -V in the corresponding build directory, e.g. <build-directory>/percolator or <build-directory>/converters, and a test report will be generated.
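If the system tests need to be run from a script, for example once per build directory, a small wrapper around ctest -V is enough. This is only a sketch: the helper names `ctest_command` and `run_system_tests` are hypothetical, and the build directory path is whatever was used when configuring CMake.

```python
import subprocess
from pathlib import Path


def ctest_command(verbose=True):
    """Build the ctest invocation; -V asks ctest for verbose output."""
    cmd = ["ctest"]
    if verbose:
        cmd.append("-V")
    return cmd


def run_system_tests(build_dir):
    """Run ctest inside build_dir and return its exit code.

    build_dir is assumed to be a CMake build directory such as
    <build-directory>/percolator or <build-directory>/converters.
    """
    build_path = Path(build_dir)
    if not build_path.is_dir():
        raise FileNotFoundError(f"build directory not found: {build_path}")
    # ctest exits with a non-zero code when any test fails.
    result = subprocess.run(ctest_command(), cwd=build_path)
    return result.returncode
```

A call such as `run_system_tests("<build-directory>/percolator")`, with the placeholder replaced by the real path, returns 0 when all tests pass.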