Venus Pipeline
Venus is a clinical tumor variant calling pipeline built on oxo-flow. It is included as a workspace crate (crates/venus/) and ships with pre-defined workflow files, environment specs, and report templates.
Overview
Venus implements a complete somatic variant detection workflow for tumor samples. It is designed for clinical genomics laboratories that need:
- Validated, reproducible analysis
- Structured clinical reports
- Audit trails for regulatory compliance
- Container-packaged execution
Pipeline Steps
graph TD
A[Quality Control] --> B[Read Trimming]
B --> C[Alignment]
C --> D[Duplicate Marking]
D --> E[Base Recalibration]
E --> F[Variant Calling]
F --> G[Variant Filtering]
G --> H[Annotation]
H --> I[Clinical Report]
| Step | Tools | Description |
|---|---|---|
| Quality Control | FastQC | Raw read quality assessment |
| Read Trimming | fastp | Adapter removal and quality filtering |
| Alignment | BWA-MEM2 | Read alignment to reference genome |
| Duplicate Marking | GATK MarkDuplicates | PCR duplicate identification |
| Base Recalibration | GATK BQSR | Base quality score recalibration |
| Variant Calling | GATK Mutect2 | Somatic variant detection |
| Variant Filtering | bcftools | Quality-based variant filtering |
| Annotation | VEP / SnpEff | Functional variant annotation |
| Clinical Report | oxo-flow report | Structured report generation |
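The step order in the table can also be written down as a dependency graph. The Python sketch below is illustrative only and does not show how oxo-flow schedules rules internally: the `STEP_DEPENDENCIES` mapping is a hand-written assumption that mirrors the diagram, and the standard-library `graphlib` module derives an execution order from it.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
# Hand-copied from the diagram; not read from any Venus file.
STEP_DEPENDENCIES = {
    "Quality Control": set(),
    "Read Trimming": {"Quality Control"},
    "Alignment": {"Read Trimming"},
    "Duplicate Marking": {"Alignment"},
    "Base Recalibration": {"Duplicate Marking"},
    "Variant Calling": {"Base Recalibration"},
    "Variant Filtering": {"Variant Calling"},
    "Annotation": {"Variant Filtering"},
    "Clinical Report": {"Annotation"},
}


def execution_order(dependencies):
    """Return the steps in an order that satisfies every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for index, step in enumerate(execution_order(STEP_DEPENDENCIES), start=1):
        print(f"{index}. {step}")
```

Because the graph is a single chain, there is exactly one valid order, running from Quality Control to Clinical Report.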
Project Structure
pipelines/venus/
├── rules/ # Individual step definitions
├── envs/ # Conda/container environment specs
├── schemas/ # Validation schemas for config
└── report/ # Report templates
The pipeline definitions live in pipelines/venus/ while the Rust crate at crates/venus/ provides programmatic access.
Running Venus
With the CLI
# Validate the pipeline
oxo-flow validate pipelines/venus/venus.oxoflow
# Dry-run
oxo-flow dry-run pipelines/venus/venus.oxoflow
# Execute
oxo-flow run pipelines/venus/venus.oxoflow -j 16
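For scripted runs, the three commands above can be chained so that execution only starts after validation and the dry-run succeed. This is a minimal Python sketch, assuming that `oxo-flow` is on `PATH` and exits with a non-zero status on failure; `build_commands` and `run_all` are hypothetical helper names, and only the subcommands and the `-j` flag shown above are used.

```python
import subprocess

WORKFLOW = "pipelines/venus/venus.oxoflow"


def build_commands(workflow=WORKFLOW, jobs=16):
    """Return the validate, dry-run, and run commands in the order shown above."""
    return [
        ["oxo-flow", "validate", workflow],
        ["oxo-flow", "dry-run", workflow],
        ["oxo-flow", "run", workflow, "-j", str(jobs)],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in turn; check=True raises on the first non-zero exit."""
    for command in commands:
        runner(command, check=True)
```

Calling `run_all(build_commands())` would run the full sequence. The `runner` parameter exists so the sequence can be exercised in tests without invoking the real CLI.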
Configuration
Venus expects a configuration file with sample and reference information:
[config]
reference = "/data/references/hg38/hg38.fa"
known_sites = "/data/references/hg38/known_sites.vcf.gz"
tumor_sample = "TUMOR_001"
normal_sample = "NORMAL_001"
results = "results/venus"
Clinical Reporting
Venus generates clinical-grade reports using oxo-flow's reporting system. Reports include:
- Patient/sample metadata — sample IDs, sequencing date, reference genome
- Quality metrics — read counts, coverage depth, duplication rates
- Variant summary — total variants, filtered variants, variant types
- Variant table — gene, position, consequence, allele frequency
- Methodology — tools used, versions, parameters
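To make the variant summary item concrete, the following Python sketch counts total variants, variants that passed filtering, and variants per type. It is not the reporting system's data model: the record fields (`ref`, `alt`, `filter`) and the example records are hypothetical, and the classification by allele length is a common simplification.

```python
from collections import Counter


def variant_type(ref, alt):
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) < len(alt):
        return "insertion"
    if len(ref) > len(alt):
        return "deletion"
    return "MNV"  # same length, more than one base


def summarize(variants):
    """Count total variants, variants passing filters, and variants per type."""
    return {
        "total": len(variants),
        "passed": sum(1 for v in variants if v["filter"] == "PASS"),
        "by_type": dict(Counter(variant_type(v["ref"], v["alt"]) for v in variants)),
    }


if __name__ == "__main__":
    # Made-up records for illustration only.
    records = [
        {"ref": "A", "alt": "T", "filter": "PASS"},
        {"ref": "AT", "alt": "A", "filter": "PASS"},
        {"ref": "G", "alt": "GCC", "filter": "weak_evidence"},
    ]
    print(summarize(records))
```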
Container Packaging
Package Venus into a self-contained container:
# Docker
oxo-flow package pipelines/venus/venus.oxoflow -o Dockerfile
docker build -t venus:1.0.0 .
# Singularity
oxo-flow package pipelines/venus/venus.oxoflow -f singularity -o venus.def
singularity build venus.sif venus.def
Customization
Venus is designed to be customized for specific laboratory needs.
Add a new analysis step
- Create a new rule in pipelines/venus/rules/
- Add the environment spec to pipelines/venus/envs/
- Include the rule in the main .oxoflow file
- Add report sections for the new step
Modify variant filters
Edit the filtering rule parameters in the workflow file to adjust quality thresholds, minimum depth, or allele frequency cutoffs.
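The effect of the three kinds of cutoff can be sketched in Python as a single predicate. This is an illustration of the logic only, not the bcftools expression used in the workflow file. The threshold values are hypothetical defaults rather than Venus's shipped settings, and the record fields (`qual`, `depth`, `af`) are assumed names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterThresholds:
    # Hypothetical values for illustration; not Venus's shipped defaults.
    min_quality: float = 30.0
    min_depth: int = 20
    min_allele_frequency: float = 0.05


def passes(variant, thresholds):
    """Return True only if the variant meets every threshold."""
    return (
        variant["qual"] >= thresholds.min_quality
        and variant["depth"] >= thresholds.min_depth
        and variant["af"] >= thresholds.min_allele_frequency
    )


if __name__ == "__main__":
    candidate = {"qual": 50.0, "depth": 100, "af": 0.12}
    print(passes(candidate, FilterThresholds()))  # prints True
    print(passes(candidate, FilterThresholds(min_allele_frequency=0.2)))  # prints False
```

Raising any one threshold can only shrink the set of passing variants, which makes it straightforward to reason about a change before re-running the pipeline.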
Custom report templates
Add Tera templates to pipelines/venus/report/ for custom report sections or layouts.
See Also
- Variant Calling tutorial — build a similar pipeline step by step
- Reporting System — report architecture
- System Architecture — how Venus fits into the workspace