Introduction

oxo-flow is a Rust-native bioinformatics pipeline engine built from first principles for performance, reproducibility, and clinical-grade rigor. It compiles workflows into a Directed Acyclic Graph, manages software environments automatically, and runs jobs in parallel — all from a single, fast binary with no external runtime required.

# Define your workflow in TOML
cat pipeline.oxoflow

# Execute it
oxo-flow run pipeline.oxoflow -j 8
oxo-flow 0.6.1 — Bioinformatics Pipeline Engine
DAG: 5 rules in execution order
  1. fastqc
  2. trim_reads
  3. bwa_align
  4. sort_bam
  5. call_variants
Done: 5 succeeded, 0 failed

What Is oxo-flow?

oxo-flow is a high-performance workflow engine built from the ground up in Rust for bioinformatics and clinical genomics. You define pipelines in a clean TOML format (.oxoflow files), and oxo-flow handles dependency resolution, environment activation, parallel execution, and report generation — with compile-time safety guarantees and zero interpreter overhead.

Core capabilities

| Capability | Description |
| --- | --- |
| DAG engine | Automatic dependency resolution, topological sorting, cycle detection, and parallel execution groups |
| Environment management | First-class, per-rule support for conda, pixi, docker, singularity, and Python venv |
| Clinical reporting | Structured HTML and JSON reports, generated with Tera templates, for clinical and research use |
| Web API | Built-in REST API (axum-based) for building, validating, and monitoring workflows remotely |
| Container packaging | Packages entire workflows into Docker or Singularity images for portable, reproducible execution |
| Cluster backends | Submits jobs to SLURM, PBS, SGE, and LSF clusters with resource-aware scheduling |
| Wildcard expansion | Patterns such as {sample} and {chr} that expand automatically from inputs or config |
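
To make the wildcard expansion row concrete, here is a minimal Python sketch of the general idea: every combination of wildcard values is substituted into a file pattern. It is not oxo-flow's implementation. The helper name `expand_pattern` and the sample and chromosome values are hypothetical and chosen only for illustration.

```python
from itertools import product


def expand_pattern(pattern: str, wildcards: dict[str, list[str]]) -> list[str]:
    """Expand a pattern such as 'aligned/{sample}.{chr}.bam' over all value combinations.

    Hypothetical helper for illustration only; not part of oxo-flow.
    """
    # Only expand wildcards that actually appear in the pattern.
    names = [name for name in wildcards if "{" + name + "}" in pattern]
    if not names:
        return [pattern]
    expanded = []
    # Cartesian product of the value lists, in the order the names were given.
    for values in product(*(wildcards[name] for name in names)):
        path = pattern
        for name, value in zip(names, values):
            path = path.replace("{" + name + "}", value)
        expanded.append(path)
    return expanded


if __name__ == "__main__":
    # Hypothetical wildcard values.
    values = {"sample": ["tumor", "normal"], "chr": ["chr1", "chr2"]}
    for path in expand_pattern("aligned/{sample}.{chr}.bam", values):
        print(path)
```

With two samples and two chromosomes the sketch yields four concrete paths, one per combination. A pattern without wildcards is returned unchanged.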

Who Is This For?

Bioinformaticians who build and maintain analysis pipelines — oxo-flow gives you a faster, type-safe workflow engine with clear error messages, reproducibility guarantees, and no external runtime dependency.

Clinical laboratories running accredited genomics workflows — the reporting system produces structured, auditable reports, and container packaging ensures reproducibility across environments.

Researchers who need reproducible science — every workflow execution is deterministic, and environments are locked per rule so results are the same on any machine.

Core facility staff managing multi-sample, multi-assay workloads — the DAG engine and cluster backends handle parallelism and resource scheduling automatically.


How to Use This Guide

This documentation follows the Diátaxis framework, which groups material into tutorials, how-to guides, reference, and explanation. Pick the entry point below that matches your goal.

If you are new to oxo-flow

Start with the Tutorials in order:

  1. Installation — install the binary
  2. Quick Start — run your first workflow in 5 minutes
  3. Your First Workflow — build a pipeline from scratch
  4. Variant Calling Pipeline — complete NGS analysis
  5. Environment Management — use conda, docker, and more

If you want to learn by example

Explore the Workflow Gallery — 9 curated workflows from hello-world to multi-omics integration, each with validation output, DAG visualizations, and scientific context:

  1. Hello World ⭐ — Minimal rule structure
  2. File Pipeline ⭐⭐ — Multi-rule dependencies
  3. Parallel Samples ⭐⭐ — Wildcard expansion
  4. Scatter-Gather ⭐⭐⭐ — Parallel chunk processing
  5. Environment Management ⭐⭐⭐ — Per-rule isolation
  6. RNA-seq Quantification ⭐⭐⭐⭐ — Transcriptomics pipeline
  7. WGS Germline Calling ⭐⭐⭐⭐⭐ — GATK best practices
  8. Multi-Omics Integration ⭐⭐⭐⭐⭐ — WGS + RNA-seq + Methylation
  9. Single-Cell RNA-seq ⭐⭐⭐⭐ — scRNA-seq analysis

If you need to accomplish a specific task

Jump to the How-to Guides.

If you need exact syntax and options

See the Command Reference for all 31 CLI subcommands with usage, options, and examples.

If you want the full technical details

See Architecture & Design for in-depth documentation of the DAG engine, environment system, .oxoflow format specification, and web API.


Quick Example

Here is a complete workflow that aligns paired-end reads, sorts the alignments into a BAM file, and indexes the result:

# align.oxoflow
[workflow]
name = "align-and-sort"
version = "1.0.0"

[config]
reference = "/data/ref/hg38.fa"

[defaults]
threads = 4
memory = "8G"

[[rules]]
name = "bwa_align"
input = ["{sample}_R1.fastq.gz", "{sample}_R2.fastq.gz"]
output = ["aligned/{sample}.bam"]
threads = 16
memory = "32G"
environment = { docker = "biocontainers/bwa:0.7.17" }
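# The image used for this rule must provide both bwa and samtools, since the command pipes one into the other.
shell = "bwa mem -t {threads} {config.reference} {input} | samtools sort -o {output}"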

[[rules]]
name = "index_bam"
input = ["aligned/{sample}.bam"]
output = ["aligned/{sample}.bam.bai"]
environment = { conda = "envs/samtools.yaml" }
shell = "samtools index {input}"

Run it:

# Validate the workflow
oxo-flow validate align.oxoflow

# Preview the execution plan
oxo-flow dry-run align.oxoflow

# Execute with 8 parallel jobs
oxo-flow run align.oxoflow -j 8

# Visualize the DAG
oxo-flow graph align.oxoflow | dot -Tpng -o dag.png

Key Concepts

If you are new to pipeline engines, here are the three core concepts used in oxo-flow:

Workflow

A Workflow is the entire pipeline definition (usually a .oxoflow file). It contains a collection of rules, configuration settings, and software requirements.

Rule

A Rule is a single processing step. It defines:

  • Input: The files needed to run (e.g., raw reads).
  • Output: The files produced by the step (e.g., aligned BAM).
  • Command: The actual shell command to execute (e.g., bwa mem).
  • Environment: The software tools needed (e.g., a specific Conda environment).
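
The four parts of a rule can be pictured as a small record plus a templating step. The Python sketch below models a rule and fills its shell command for one sample. The `Rule` class, the `render_command` helper, the sample name `S1`, and the choice to join multiple files with spaces are all assumptions made for illustration; they are not oxo-flow's actual types or substitution rules.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class Rule:
    """Hypothetical in-memory view of one [[rules]] entry; not oxo-flow's actual type."""
    name: str
    input: list[str]
    output: list[str]
    shell: str
    threads: int = 1


def render_command(rule: Rule, wildcards: dict[str, str], config: dict[str, str]) -> str:
    """Fill a rule's shell template for one concrete set of wildcard values."""
    # Resolve wildcards such as {sample} in the file patterns first.
    inputs = [pattern.format(**wildcards) for pattern in rule.input]
    outputs = [pattern.format(**wildcards) for pattern in rule.output]
    # Assumption: multiple files are joined with spaces when substituted.
    return rule.shell.format(
        input=" ".join(inputs),
        output=" ".join(outputs),
        threads=rule.threads,
        config=SimpleNamespace(**config),  # lets {config.reference} resolve
    )


if __name__ == "__main__":
    bwa_align = Rule(
        name="bwa_align",
        input=["{sample}_R1.fastq.gz", "{sample}_R2.fastq.gz"],
        output=["aligned/{sample}.bam"],
        shell="bwa mem -t {threads} {config.reference} {input} | samtools sort -o {output}",
        threads=16,
    )
    # "S1" is a hypothetical sample name.
    print(render_command(bwa_align, {"sample": "S1"}, {"reference": "/data/ref/hg38.fa"}))
```

The environment is left out of the sketch on purpose: it determines where the rendered command runs, not what the command text is.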

DAG (Directed Acyclic Graph)

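A DAG is a graph in which each rule is a node and each dependency is a directed edge: when one rule's output is another rule's input, the second rule depends on the first. "Acyclic" means that no rule can depend on itself, directly or indirectly. You can think of the DAG as a "map" of how rules are connected by their inputs and outputs. oxo-flow builds this map automatically to determine which rules can run in parallel and which must wait for others to finish.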
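
The following Python sketch shows one common way to turn rules into a DAG and then into groups that can run in parallel: match each input to the rule that produces it, then peel off layers of rules whose dependencies are already satisfied (a layered variant of Kahn's algorithm). It illustrates the concept only and is not oxo-flow's scheduler. The function names and file names are hypothetical, and the dependencies between the five rule names are assumed for the example.

```python
def build_dag(rules: dict[str, dict[str, list[str]]]) -> dict[str, set[str]]:
    """Map each rule to the set of rules it depends on, by matching outputs to inputs."""
    producer = {out: name for name, io in rules.items() for out in io["output"]}
    return {
        name: {producer[path] for path in io["input"] if path in producer}
        for name, io in rules.items()
    }


def parallel_groups(deps: dict[str, set[str]]) -> list[list[str]]:
    """Layered topological sort: each group depends only on rules in earlier groups."""
    remaining = {name: set(d) for name, d in deps.items()}
    groups = []
    while remaining:
        # Rules with no unmet dependencies can all run at the same time.
        ready = sorted(name for name, d in remaining.items() if not d)
        if not ready:
            raise ValueError(f"cycle detected among: {sorted(remaining)}")
        groups.append(ready)
        for name in ready:
            del remaining[name]
        for d in remaining.values():
            d.difference_update(ready)
    return groups


if __name__ == "__main__":
    # Hypothetical inputs and outputs; only the rule names echo this page.
    rules = {
        "fastqc": {"input": ["raw.fastq.gz"], "output": ["qc.html"]},
        "trim_reads": {"input": ["raw.fastq.gz"], "output": ["trimmed.fastq.gz"]},
        "bwa_align": {"input": ["trimmed.fastq.gz"], "output": ["aligned.bam"]},
        "sort_bam": {"input": ["aligned.bam"], "output": ["sorted.bam"]},
        "call_variants": {"input": ["sorted.bam"], "output": ["variants.vcf"]},
    }
    for level, group in enumerate(parallel_groups(build_dag(rules)), start=1):
        print(level, group)
```

In this example fastqc and trim_reads share the first group because both need only the raw reads, while the remaining rules form a chain and run one after another. If two rules consumed each other's outputs, no rule would ever become ready and the sketch would report a cycle.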


Project Status

oxo-flow is under active development. The current release (v0.6.1) includes the complete core engine, CLI, and web API. See the Changelog for release history and the Contributing guide if you want to get involved.


How to Cite

If you use oxo-flow in academic research, please cite:

Jia Ding, Yun Peng, Ruochen Wei, Boquan Wang, Jian-Guo Zhou, Shixiang Wang, BLIT: an R package for seamless integration of command-line bioinformatics tool universe, Bioinformatics Advances, Volume 6, Issue 1, 2026, vbag088, https://doi.org/10.1093/bioadv/vbag088

A dedicated oxo-flow manuscript is in preparation.

Join the Community

oxo-flow is a community-driven, open-source project licensed under Apache 2.0. Bug reports, feature requests, and contributions are welcome.

| How to contribute | Link |
| --- | --- |
| 🐛 Report a bug | Bug report |
| 💡 Request a feature | Feature request |
| 🤝 Contribute code | Contributing guide |

Try it, break it, and tell us what happened. Even a short comment about what worked — or didn't — helps improve oxo-flow for everyone.