Closed
Hi,
Please help me resolve the error I get when running GRIDSS.
Here is the command I used:
../../gridss.sh -r ../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa \
-o VCF/BH004_grids.vcf.gz \
-a BAM/BH004_grids.bam \
--labels BH004 \
../BAM/BH004_ReSorted.bam
Here is the output displayed:
Fri Mar 19 03:15:27 EDT 2021: Full log file is: ./gridss.full.20210319_031527.node155.hpc.local.301999.log
which: no time in (/usr/bin)
Fri Mar 19 03:15:27 EDT 2021: Not found /usr/bin/time
Fri Mar 19 03:15:27 EDT 2021: Using GRIDSS jar /home/rajk/gridss-2.11.0-gridss-jar-with-dependencies.jar
Fri Mar 19 03:15:27 EDT 2021: Using reference genome "../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa"
Fri Mar 19 03:15:27 EDT 2021: Using output VCF VCF/BH004_grids.vcf.gz
Fri Mar 19 03:15:27 EDT 2021: Using assembly bam BAM/BH004_grids.bam
Fri Mar 19 03:15:27 EDT 2021: Using 8 worker threads.
Fri Mar 19 03:15:27 EDT 2021: Using no blacklist bed. The encode DAC blacklist is recommended for hg19.
Fri Mar 19 03:15:27 EDT 2021: Using JVM maximum heap size of 30g for assembly and variant calling.
Fri Mar 19 03:15:27 EDT 2021: Using input file ../BAM/BH004_ReSorted.bam
Fri Mar 19 03:15:27 EDT 2021: label is BH004
Fri Mar 19 03:15:27 EDT 2021: Found /opt/software/R/4.0.2/bin/Rscript
Fri Mar 19 03:15:27 EDT 2021: Found /opt/software/samtools/1.11/bin/samtools
Fri Mar 19 03:15:27 EDT 2021: Found /etc/alternatives/java_sdk_1.8.0/bin/java
Fri Mar 19 03:15:27 EDT 2021: Found /opt/software/bwa/0.7.10/bin/bwa
Fri Mar 19 03:15:27 EDT 2021: samtools version: 1.11+htslib-1.11
Fri Mar 19 03:15:27 EDT 2021: R version: R scripting front-end version 4.0.2 (2020-06-22)
Fri Mar 19 03:15:27 EDT 2021: bwa Version: 0.7.10-r789
which: no time in (/usr/bin)
Fri Mar 19 03:15:27 EDT 2021: bash version: GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
Fri Mar 19 03:15:28 EDT 2021: java version: openjdk version "1.8.0_242" OpenJDK Runtime Environment (build 1.8.0_242-b08) OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
Fri Mar 19 03:15:28 EDT 2021: Max file handles: 4096
Fri Mar 19 03:15:28 EDT 2021: Running GRIDSS steps: setupreference, preprocess, assemble, call,
Fri Mar 19 03:15:28 EDT 2021: Start pre-processing ../BAM/BH004_ReSorted.bam
Fri Mar 19 03:15:28 EDT 2021: Running CollectGridssMetrics ../BAM/BH004_ReSorted.bam first 10000000 records
"$timecmd java -Xmx$otherjvmheap $jvm_args -cp $gridss_jar gridss.analysis.CollectGridssMetrics REFERENCE_SEQUENCE=$reference TMP_DIR=$dir ASSUME_SORTED=true I=$f O=$tmp_prefix THRESHOLD_COVERAGE=$maxcoverage FILE_EXTENSION=null GRIDSS_PROGRAM=null GRIDSS_PROGRAM=CollectIdsvMetrics PROGRAM=null PROGRAM=CollectInsertSizeMetrics STOP_AFTER=$metricsrecords $picardoptions" command completed with exit code 1.
*****
The underlying error message can be found in ./gridss.full.20210319_031527.node155.hpc.local.301999.log.
*****
Here is the text in the log file:
Fri Mar 19 03:15:27 EDT 2021: Full log file is: ./gridss.full.20210319_031527.node155.hpc.local.301999.log
Fri Mar 19 03:15:27 EDT 2021: Not found /usr/bin/time
Fri Mar 19 03:15:27 EDT 2021: Using GRIDSS jar /home/rajk/gridss-2.11.0-gridss-jar-with-dependencies.jar
Fri Mar 19 03:15:27 EDT 2021: Using reference genome "../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa"
Fri Mar 19 03:15:27 EDT 2021: Using output VCF VCF/BH004_grids.vcf.gz
Fri Mar 19 03:15:27 EDT 2021: Using assembly bam BAM/BH004_grids.bam
Fri Mar 19 03:15:27 EDT 2021: Using 8 worker threads.
Fri Mar 19 03:15:27 EDT 2021: Using no blacklist bed. The encode DAC blacklist is recommended for hg19.
Fri Mar 19 03:15:27 EDT 2021: Using JVM maximum heap size of 30g for assembly and variant calling.
Fri Mar 19 03:15:27 EDT 2021: Using input file ../BAM/BH004_ReSorted.bam
Fri Mar 19 03:15:27 EDT 2021: label is BH004
Fri Mar 19 03:15:27 EDT 2021: Found /opt/software/R/4.0.2/bin/Rscript
Fri Mar 19 03:15:27 EDT 2021: Found /opt/software/samtools/1.11/bin/samtools
Fri Mar 19 03:15:27 EDT 2021: Found /etc/alternatives/java_sdk_1.8.0/bin/java
Fri Mar 19 03:15:27 EDT 2021: Found /opt/software/bwa/0.7.10/bin/bwa
Fri Mar 19 03:15:27 EDT 2021: samtools version: 1.11+htslib-1.11
Fri Mar 19 03:15:27 EDT 2021: R version: R scripting front-end version 4.0.2 (2020-06-22)
Fri Mar 19 03:15:27 EDT 2021: bwa Version: 0.7.10-r789
Fri Mar 19 03:15:27 EDT 2021: bash version: GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
Fri Mar 19 03:15:28 EDT 2021: java version: openjdk version "1.8.0_242" OpenJDK Runtime Environment (build 1.8.0_242-b08) OpenJDK 64-Bit Server VM (build 25.242-b08, mixed mode)
Fri Mar 19 03:15:28 EDT 2021: Max file handles: 4096
Fri Mar 19 03:15:28 EDT 2021: Running GRIDSS steps: setupreference, preprocess, assemble, call,
Fri Mar 19 03:15:28 EDT 2021: Start pre-processing ../BAM/BH004_ReSorted.bam
Fri Mar 19 03:15:28 EDT 2021: Running CollectGridssMetrics ../BAM/BH004_ReSorted.bam first 10000000 records
INFO 2021-03-19 03:15:28 Defaults Found file for property samjdk.reference_fasta: /home/rajk/SV_MAC_BH/GRIDSS/../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa
INFO 2021-03-19 03:15:28 CollectGridssMetrics
********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
********** https://github.com/broadinstitute/picard/wiki/Command-Line-Syntax-Transition-For-Users-(Pre-Transition)
**********
********** The command line looks like this in the new syntax:
**********
********** CollectGridssMetrics -REFERENCE_SEQUENCE ../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa -TMP_DIR ./BH004_ReSorted.bam.gridss.working -ASSUME_SORTED true -I ../BAM/BH004_ReSorted.bam -O ./BH004_ReSorted.bam.gridss.working/tmp.BH004_ReSorted.bam -THRESHOLD_COVERAGE 50000 -FILE_EXTENSION null -GRIDSS_PROGRAM null -GRIDSS_PROGRAM CollectIdsvMetrics -PROGRAM null -PROGRAM CollectInsertSizeMetrics -STOP_AFTER 10000000
**********
03:15:28.683 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/rajk/gridss-2.11.0-gridss-jar-with-dependencies.jar!/com/intel/gkl/native/libgkl_compression.so
[Fri Mar 19 03:15:28 EDT 2021] CollectGridssMetrics GRIDSS_PROGRAM=[CollectIdsvMetrics] THRESHOLD_COVERAGE=50000 INPUT=../BAM/BH004_ReSorted.bam ASSUME_SORTED=true STOP_AFTER=10000000 OUTPUT=./BH004_ReSorted.bam.gridss.working/tmp.BH004_ReSorted.bam FILE_EXTENSION=null PROGRAM=[CollectInsertSizeMetrics] TMP_DIR=[./BH004_ReSorted.bam.gridss.working] REFERENCE_SEQUENCE=../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa METRIC_ACCUMULATION_LEVEL=[ALL_READS] INCLUDE_UNPAIRED=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Fri Mar 19 03:15:28 EDT 2021] Executing as rajk@node155 on Linux 3.10.0-1127.el7.x86_64 amd64; OpenJDK 64-Bit Server VM 1.8.0_242-b08; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.11.0-gridss
ERROR 2021-03-19 03:15:29 ReferenceCommandLineProgram Reference genome used by ../BAM/BH004_ReSorted.bam does not match reference genome ../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa. The reference supplied must match the reference used for every input.
[Fri Mar 19 03:15:29 EDT 2021] gridss.analysis.CollectGridssMetrics done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=2075918336
Exception in thread "main" htsjdk.samtools.util.SequenceUtil$SequenceListsDifferException: In files /home/rajk/SV_MAC_BH/GRIDSS/../BAM/BH004_ReSorted.bam and /home/rajk/SV_MAC_BH/GRIDSS/../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa
at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:345)
at gridss.cmdline.ReferenceCommandLineProgram.ensureDictionaryMatches(ReferenceCommandLineProgram.java:117)
at gridss.analysis.CollectGridssMetrics.doWork(CollectGridssMetrics.java:75)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:196)
at gridss.analysis.CollectGridssMetrics.main(CollectGridssMetrics.java:57)
Caused by: htsjdk.samtools.util.SequenceUtil$SequenceListsDifferException: Sequences at index 1 don't match: 1/69331447/10 1/85426708/2/M5=526c549b204117f61cd292042a7127d2/UR=file:/home/rajk/SV_MAC_BH/GRIDSS/../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa
at htsjdk.samtools.util.SequenceUtil.assertSequenceListsEqual(SequenceUtil.java:272)
at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:334)
at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:320)
at htsjdk.samtools.util.SequenceUtil.assertSequenceDictionariesEqual(SequenceUtil.java:343)
... 5 more
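For context, the "Caused by" line seems to list each sequence as index/length/name. On that reading, the second contig in the BAM header is "10" (69331447 bp), while the second contig in the reference is "2" (85426708 bp), so the two files may simply list their contigs in a different order. The sketch below is one way to check this; it is not part of GRIDSS. The function name `first_dict_mismatch` and the file name `bam_header.sam` are hypothetical, and it assumes the BAM header has been dumped to a text file and the reference has a `.fai` index.

```shell
#!/bin/sh
# Hypothetical helper (not part of GRIDSS): report the first position at which
# the @SQ lines of a SAM/BAM header disagree with a reference .fai index.
#
# The inputs are plain text files, which could be produced with:
#   samtools view -H ../BAM/BH004_ReSorted.bam > bam_header.sam
#   samtools faidx ../Reference/Canis_lupus_familiaris.CanFam3.1.dna.toplevel.fa
#
# Usage: first_dict_mismatch HEADER_FILE FAI_FILE
# Exit status is 0 when names, lengths and order all agree, 1 otherwise.
# Assumes the .fai file is not empty.
first_dict_mismatch() {
    awk -F'\t' '
        BEGIN { n_bam = 0; n_ref = 0 }
        # First file: the .fai index (column 1 = contig name, column 2 = length).
        NR == FNR { ref_name[n_ref] = $1; ref_len[n_ref] = $2; n_ref++; next }
        # Second file: the SAM header; only @SQ lines describe contigs.
        $1 == "@SQ" {
            name = ""; len = ""
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len = substr($i, 4)
            }
            if (name != ref_name[n_bam] || len != ref_len[n_bam]) {
                printf "index %d: BAM %s/%s vs reference %s/%s\n", n_bam, name, len, ref_name[n_bam], ref_len[n_bam]
                bad = 1
                exit 1
            }
            n_bam++
        }
        END {
            if (bad) exit 1
            if (n_bam != n_ref) {
                printf "contig counts differ: BAM %d vs reference %d\n", n_bam, n_ref
                exit 1
            }
            print "sequence dictionaries match"
        }
    ' "$2" "$1"
}
```

With a header and index that match the exception above, `first_dict_mismatch bam_header.sam <fasta>.fai` would print `index 1: BAM 10/69331447 vs reference 2/85426708`. The check is strict about order on purpose, since the error message says the reference supplied must match the reference used for every input.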
I appreciate your help.