assign_lineage

Assigning viral lineage

Assembled SARS-CoV-2 viral genomes may be classified as belonging to a particular phylogenetic lineage using Pangolin. Lineage designations evolve and change as the virus does, meaning that a sequence may be assigned to different lineages at different points in time depending on the models in use at the time. For this reason, lineage assignment should be rerun regularly as Pangolin's underlying models are updated to ensure that those lineage assignments are up-to-date with the latest available information.

Input files

The input to the workflow is an array of paths to assembled SARS-CoV-2 genomes.

For details on preparing the inputs.json file, see the corresponding section in the README.

If the workflow is being run locally, you should first download the assembly files, and use absolute paths to the files in the inputs.json file (or paths relative to the directory from which you are executing the workflow).
If the workflow is being run on AWS (e.g. via the Amazon Genomics CLI), s3:// URLs to the files in the s3://dnastack-covid-19-sra-data can be used instead, and local download is not required. Note that if the workflow is run on AWS, cloud compute and storage costs will be incurred

Example `inputs.json` files

Each of the example_inputs.*.json files in this directory will run the workflow using 10 samples as input.

The example_inputs.local.json file should be used when running the workflow locally
The example_inputs.aws.json file should be used when running the workflow in AWS

To run locally, first download the files from s3:

mkdir data
for sample in ERR4082025 ERR4082026 ERR4082027 ERR4082028 ERR4082029 ERR4082030 ERR4082031 ERR4082032 ERR4082033 ERR4082034; do
    aws s3 cp s3://dnastack-covid-19-sra-data/Illumina_PE/fasta/${sample}/${sample}.consensus.fasta ./data/ --no-sign-request
done

Running on AWS will not require any additional steps; data will be pulled directly from s3 as part of workflow execution.

Output files

The workflow outputs a CSV file containing a lineage assignment for each of the assembled genomes input to the workflow.

Running the workflow

See the README for details on running the tutorial workflows, either locally or in the AWS cloud.

Name		Name	Last commit message	Last commit date
parent directory ..
MANIFEST.json		MANIFEST.json
README.md		README.md
assign_lineage.wdl		assign_lineage.wdl
example_inputs.aws.json		example_inputs.aws.json
example_inputs.local.json		example_inputs.local.json
inputs.json		inputs.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Assigning viral lineage

Input files

Example `inputs.json` files

Output files

Running the workflow

FilesExpand file tree

assign_lineage

Directory actions

More options

Directory actions

More options

Latest commit

History

assign_lineage

Folders and files

parent directory

README.md

Assigning viral lineage

Input files

Example inputs.json files

Output files

Running the workflow

Example `inputs.json` files