HaloPlex pipeline

Variant calling

ratatosk_run.py HaloPlex --indir inputdir --custom-config custom_config_file.yaml

Figure 1. HaloPlex pipeline, part 1. The input consists of two read pairs from one sample, thus illustrating the merge operation at sample level. The figure has been partitioned for clarity.

Figure 2. HaloPlex pipeline, continued.

Blue boxes indicate active processes (the command was run with --workers 4). Note that we need to know which labels are applied to the file names (see issues). In this iteration, the file names are hardcoded for the predefined pipelines.
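
The invocation with several workers can be sketched as a small helper that assembles the argument list. This is only an illustration: the helper name ratatosk_command is hypothetical, and placing --workers after the other options is an assumption, since the text only states that the flag was used.

```python
import shlex
from typing import List, Optional


def ratatosk_command(task: str, indir: str, config: str,
                     workers: Optional[int] = None) -> List[str]:
    """Build the argument list for a ratatosk_run.py call (hypothetical helper)."""
    argv = ["ratatosk_run.py", task, "--indir", indir, "--custom-config", config]
    if workers is not None:
        # Assumption: --workers is appended after the other options.
        argv += ["--workers", str(workers)]
    return argv


cmd = ratatosk_command("HaloPlex", "inputdir", "custom_config_file.yaml", workers=4)
# Print the command as a single shell-quoted string.
print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run if the command should be launched from Python rather than from the shell.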

Variant summary

ratatosk_run.py HaloPlexSummary --indir inputdir --custom-config custom_config_file.yaml

Figure 3. HaloPlexSummary pipeline. Often there are so many samples that this step needs to be performed separately from the HaloPlex task.

Combining variants

The task HaloPlexCombine combines variants from samples, genotyping them at the ‘union’ of candidate positions generated from sample-level variant calls.

ratatosk_run.py HaloPlexCombine --indir inputdir --custom-config custom_config_file.yaml

Figure 4. HaloPlexCombine pipeline. The task uses GATK CombineVariants to combine the sample VCF files into one output file. A master VCF is generated first, and all samples are then genotyped at these alleles before being combined into the final output.
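
The idea of a master site list built from the union of sample-level calls can be illustrated with a minimal sketch. It does not call GATK and is not the pipeline's implementation. The sample names, sites, and function names are invented for illustration, and the presence check merely stands in for the real re-genotyping step.

```python
from typing import Dict, List, Set, Tuple

# A candidate site: (chromosome, position, reference allele, alternate allele).
Site = Tuple[str, int, str, str]


def master_sites(sample_calls: Dict[str, Set[Site]]) -> List[Site]:
    """Return the sorted union of candidate sites over all samples."""
    union: Set[Site] = set()
    for sites in sample_calls.values():
        union |= sites
    return sorted(union)


def presence_matrix(sample_calls: Dict[str, Set[Site]],
                    sites: List[Site]) -> Dict[str, List[bool]]:
    """For each sample, record whether it had a call at each master site.

    A stand-in for genotyping every sample at every master site.
    """
    return {sample: [site in calls for site in sites]
            for sample, calls in sample_calls.items()}


# Hypothetical sample-level calls.
sample_calls = {
    "sample1": {("chr1", 100, "A", "G"), ("chr2", 50, "C", "T")},
    "sample2": {("chr1", 100, "A", "G"), ("chr1", 200, "G", "A")},
}

sites = master_sites(sample_calls)
matrix = presence_matrix(sample_calls, sites)
```

Because every sample is evaluated against the same site list, a site called in only one sample still gets an explicit entry for the others, rather than being silently missing from the combined output.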