Introduction

oxo-flow is a Rust-native bioinformatics pipeline engine designed for performance, reproducibility, and clinical-grade rigor. It compiles workflows into a directed acyclic graph (DAG), manages software environments automatically, and runs jobs in parallel, all from a single binary with no external runtime required.

# Define your workflow in TOML
cat pipeline.oxoflow

# Execute it
oxo-flow run pipeline.oxoflow -j 8

oxo-flow 0.1.0 — Bioinformatics Pipeline Engine
DAG: 5 rules in execution order
  1. fastqc
  2. trim_reads
  3. bwa_align
  4. sort_bam
  5. call_variants
Done: 5 succeeded, 0 failed

What Is oxo-flow?

oxo-flow is a high-performance workflow engine written in Rust for bioinformatics and clinical genomics. You define pipelines in TOML (.oxoflow files), and oxo-flow handles dependency resolution, environment activation, parallel execution, and report generation. The engine is compiled Rust, so it benefits from compile-time safety checks and carries no interpreter overhead.

Core capabilities

Capability	Description
DAG engine	Automatic dependency resolution, topological sorting, cycle detection, and parallel execution groups
Environment management	First-class support for conda, pixi, docker, singularity, and Python venv — per rule
Clinical reporting	Generate structured HTML and JSON reports with Tera templates for clinical and research use
Web API	Built-in REST API (axum-based) for building, validating, and monitoring workflows remotely
Container packaging	Package entire workflows into Docker or Singularity images for portable, reproducible execution
Cluster backends	Submit jobs to SLURM, PBS, SGE, and LSF clusters with resource-aware scheduling
Wildcard expansion	`{sample}`, `{chr}` patterns that expand automatically from inputs or config
Venus pipeline	Built-in clinical tumor variant calling pipeline ready for somatic analysis
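
As a rough illustration of what the DAG engine row describes, the sketch below groups rules into parallel execution levels with Kahn's algorithm and reports a cycle when no rule is ready to run. It is not oxo-flow's implementation (the engine itself is written in Rust); Python is used only for brevity, and the `execution_groups` helper and the dependency edges between the five example rules are assumptions made for this example.

```python
def execution_groups(deps: dict[str, set[str]]) -> list[list[str]]:
    """Split rules into ordered groups; rules within a group do not depend on each other.

    `deps` maps each rule name to the set of rules it depends on.
    Raises ValueError on unknown dependencies or on a dependency cycle.
    """
    unknown = {d for ds in deps.values() for d in ds} - deps.keys()
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    remaining = {rule: set(ds) for rule, ds in deps.items()}  # copy, input is not mutated
    groups = []
    while remaining:
        # A rule is ready once all of its dependencies have been scheduled.
        ready = sorted(rule for rule, ds in remaining.items() if not ds)
        if not ready:
            # Nothing is ready but rules remain: they sit on or behind a cycle.
            raise ValueError(f"cycle detected involving: {sorted(remaining)}")
        groups.append(ready)
        for rule in ready:
            del remaining[rule]
        for ds in remaining.values():
            ds.difference_update(ready)
    return groups


# Hypothetical dependency edges for the five rules shown in the run output above.
deps = {
    "fastqc": set(),
    "trim_reads": set(),
    "bwa_align": {"trim_reads"},
    "sort_bam": {"bwa_align"},
    "call_variants": {"sort_bam"},
}
groups = execution_groups(deps)
print(groups)
# [['fastqc', 'trim_reads'], ['bwa_align'], ['sort_bam'], ['call_variants']]
```

Rules in the same group have no dependencies on each other, so a scheduler could run them concurrently, up to whatever job limit is in effect.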

Who Is This For?

Bioinformaticians who build and maintain analysis pipelines — oxo-flow gives you a faster, type-safe workflow engine with clear error messages, reproducibility guarantees, and no external runtime dependency.

Clinical laboratories running accredited genomics workflows — the reporting system produces structured, auditable reports, and container packaging ensures reproducibility across environments.

Researchers who need reproducible science — every workflow execution is deterministic, and environments are locked per rule so results are the same on any machine.

Core facility staff managing multi-sample, multi-assay workloads — the DAG engine and cluster backends handle parallelism and resource scheduling automatically.

How to Use This Guide

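This documentation follows the Diátaxis framework and is organized into four sections: tutorials, how-to guides, reference, and explanation. Pick the entry point below that matches your goal.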

If you are new to oxo-flow

Start with the Tutorials in order:

Installation — install the binary
Quick Start — run your first workflow in 5 minutes
Your First Workflow — build a pipeline from scratch
Variant Calling Pipeline — complete NGS analysis
Environment Management — use conda, docker, and more

If you want to learn by example

Explore the Workflow Gallery — 8 curated workflows from hello-world to multi-omics integration, each with validation output, DAG visualizations, and scientific context:

Hello World ⭐ — Minimal rule structure
File Pipeline ⭐⭐ — Multi-rule dependencies
Parallel Samples ⭐⭐ — Wildcard expansion
Scatter-Gather ⭐⭐⭐ — Parallel chunk processing
Environment Management ⭐⭐⭐ — Per-rule isolation
RNA-seq Quantification ⭐⭐⭐⭐ — Transcriptomics pipeline
WGS Germline Calling ⭐⭐⭐⭐⭐ — GATK best practices
Multi-Omics Integration ⭐⭐⭐⭐⭐ — WGS + RNA-seq + Methylation

If you need to accomplish a specific task

Jump to the How-to Guides.

If you need exact syntax and options

See the Command Reference for all 12 CLI subcommands with usage, options, and examples.

If you want the full technical details

See Architecture & Design for in-depth documentation of the DAG engine, environment system, .oxoflow format specification, web API, and Venus pipeline.

Quick Example

Here is a complete workflow that aligns paired-end reads, sorts the alignments, and indexes the resulting BAM file:

# align.oxoflow
[workflow]
name = "align-and-sort"
version = "1.0.0"

[config]
reference = "/data/ref/hg38.fa"

[defaults]
threads = 4
memory = "8G"

[[rules]]
name = "bwa_align"
input = ["{sample}_R1.fastq.gz", "{sample}_R2.fastq.gz"]
output = ["aligned/{sample}.bam"]
threads = 16
memory = "32G"
# The image must provide both bwa and samtools, since the command pipes bwa into samtools sort
environment = { docker = "biocontainers/bwa:0.7.17" }
shell = "bwa mem -t {threads} {config.reference} {input} | samtools sort -o {output}"

[[rules]]
name = "index_bam"
input = ["aligned/{sample}.bam"]
output = ["aligned/{sample}.bam.bai"]
environment = { conda = "envs/samtools.yaml" }
shell = "samtools index {input}"

Run it:

# Validate the workflow
oxo-flow validate align.oxoflow

# Preview the execution plan
oxo-flow dry-run align.oxoflow

# Execute with 8 parallel jobs
oxo-flow run align.oxoflow -j 8

# Visualize the DAG
oxo-flow graph align.oxoflow | dot -Tpng -o dag.png
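
To make the `{sample}` wildcard in align.oxoflow more concrete, the following sketch shows one way such patterns could be matched against file names and expanded into output paths. It is an illustrative Python model, not oxo-flow's actual resolver; the helper names (`pattern_to_regex`, `discover`, `expand`) and the sample file names are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Compile a wildcard pattern such as '{sample}_R1.fastq.gz' into a regex.

    Each {name} becomes a named group matching one path component.
    Assumes every wildcard name appears at most once per pattern.
    """
    parts = re.split(r"\{(\w+)\}", pattern)  # literals at even indexes, names at odd
    pieces = [
        re.escape(part) if i % 2 == 0 else f"(?P<{part}>[^/]+)"
        for i, part in enumerate(parts)
    ]
    return re.compile("".join(pieces))


def discover(pattern: str, filenames: list[str]) -> list[dict[str, str]]:
    """Return the wildcard values of every file name that fully matches the pattern."""
    regex = pattern_to_regex(pattern)
    matches = (regex.fullmatch(name) for name in filenames)
    return [m.groupdict() for m in matches if m]


def expand(pattern: str, wildcards: dict[str, str]) -> str:
    """Substitute wildcard values back into a pattern."""
    return pattern.format(**wildcards)


# Hypothetical directory listing
files = [
    "normal_R1.fastq.gz",
    "normal_R2.fastq.gz",
    "tumor_R1.fastq.gz",
    "tumor_R2.fastq.gz",
    "notes.txt",
]

found = discover("{sample}_R1.fastq.gz", files)
outputs = [expand("aligned/{sample}.bam", w) for w in found]
print(outputs)  # ['aligned/normal.bam', 'aligned/tumor.bam']
```

In this model, each discovered sample yields one concrete instance of a rule, which is why a single rule definition can cover any number of samples.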

Project Status

oxo-flow is under active development. The current release (v0.1.0) includes the complete core engine, CLI, web API, and Venus pipeline. See the Changelog for release history and the Contributing guide if you want to get involved.

Join the Community

oxo-flow is a community-driven, open-source project licensed under Apache 2.0. Bug reports, feature requests, and contributions are welcome.

How to contribute	Link
🐛 Report a bug	Bug report
💡 Request a feature	Feature request
🤝 Contribute code	Contributing guide

Try it, break it, and tell us what happened. Even a short comment about what worked — or didn't — helps improve oxo-flow for everyone.