Introduction
oxo-flow is a Rust-native bioinformatics pipeline engine built from first principles for performance, reproducibility, and clinical-grade rigor. It compiles workflows into a Directed Acyclic Graph, manages software environments automatically, and runs jobs in parallel — all from a single, fast binary with no external runtime required.
```text
oxo-flow 0.6.1 — Bioinformatics Pipeline Engine
DAG: 5 rules in execution order
1. fastqc
2. trim_reads
3. bwa_align
4. sort_bam
5. call_variants
Done: 5 succeeded, 0 failed
```
What Is oxo-flow?
oxo-flow is a high-performance workflow engine built from the ground up in Rust for bioinformatics and clinical genomics. You define pipelines in a clean TOML format (.oxoflow files), and oxo-flow handles dependency resolution, environment activation, parallel execution, and report generation — with compile-time safety guarantees and zero interpreter overhead.
Core capabilities
| Capability | Description |
|---|---|
| DAG engine | Automatic dependency resolution, topological sorting, cycle detection, and parallel execution groups |
| Environment management | First-class support for conda, pixi, docker, singularity, and Python venv — per rule |
| Clinical reporting | Generate structured HTML and JSON reports with Tera templates for clinical and research use |
| Web API | Built-in REST API (axum-based) for building, validating, and monitoring workflows remotely |
| Container packaging | Package entire workflows into Docker or Singularity images for portable, reproducible execution |
| Cluster backends | Submit jobs to SLURM, PBS, SGE, and LSF clusters with resource-aware scheduling |
| Wildcard expansion | {sample}, {chr} patterns that expand automatically from inputs or config |
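To make the wildcard row concrete, here is a minimal Python sketch of how a `{sample}` or `{chr}` pattern could be expanded against lists of values. It only illustrates the idea and is not oxo-flow's implementation: the `expand` helper, the sample names, and the `calls/` path are hypothetical.

```python
from itertools import product


def expand(pattern: str, **wildcards: list[str]) -> list[str]:
    """Return one concrete path per combination of wildcard values."""
    names = list(wildcards)
    paths = []
    for combo in product(*(wildcards[name] for name in names)):
        paths.append(pattern.format(**dict(zip(names, combo))))
    return paths


# Hypothetical values, e.g. as they might be listed in a config section.
bams = expand("aligned/{sample}.bam", sample=["S1", "S2"])
per_chr = expand("calls/{sample}.{chr}.vcf", sample=["S1"], chr=["chr1", "chr2"])
print(bams)
print(per_chr)
```

Each wildcard multiplies the number of concrete files, which is why a single rule definition can fan out into many independent jobs.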
Who Is This For?
Bioinformaticians who build and maintain analysis pipelines — oxo-flow gives you a faster, type-safe workflow engine with clear error messages, reproducibility guarantees, and no external runtime dependency.
Clinical laboratories running accredited genomics workflows — the reporting system produces structured, auditable reports, and container packaging ensures reproducibility across environments.
Researchers who need reproducible science — every workflow execution is deterministic, and environments are locked per rule so results are the same on any machine.
Core facility staff managing multi-sample, multi-assay workloads — the DAG engine and cluster backends handle parallelism and resource scheduling automatically.
How to Use This Guide
This documentation follows the Diátaxis framework and is organized into four sections. Choose the starting point below that matches your goal.
If you are new to oxo-flow
Start with the Tutorials in order:
- Installation — install the binary
- Quick Start — run your first workflow in 5 minutes
- Your First Workflow — build a pipeline from scratch
- Variant Calling Pipeline — complete NGS analysis
- Environment Management — use conda, docker, and more
If you want to learn by example
Explore the Workflow Gallery — 9 curated workflows from hello-world to multi-omics integration, each with validation output, DAG visualizations, and scientific context:
- Hello World ⭐ — Minimal rule structure
- File Pipeline ⭐⭐ — Multi-rule dependencies
- Parallel Samples ⭐⭐ — Wildcard expansion
- Scatter-Gather ⭐⭐⭐ — Parallel chunk processing
- Environment Management ⭐⭐⭐ — Per-rule isolation
- RNA-seq Quantification ⭐⭐⭐⭐ — Transcriptomics pipeline
- WGS Germline Calling ⭐⭐⭐⭐⭐ — GATK best practices
- Multi-Omics Integration ⭐⭐⭐⭐⭐ — WGS + RNA-seq + Methylation
- Single-Cell RNA-seq ⭐⭐⭐⭐ — scRNA-seq analysis
If you need to accomplish a specific task
Jump to the How-to Guides.
If you need exact syntax and options
See the Command Reference for all 31 CLI subcommands with usage, options, and examples.
If you want the full technical details
See Architecture & Design for in-depth documentation of the DAG engine, environment system, .oxoflow format specification, and web API.
Quick Example
Here is a complete workflow that aligns paired-end reads, sorts the alignments, and indexes the resulting BAM file:
```toml
# align.oxoflow
[workflow]
name = "align-and-sort"
version = "1.0.0"

[config]
reference = "/data/ref/hg38.fa"

[defaults]
threads = 4
memory = "8G"

[[rules]]
name = "bwa_align"
input = ["{sample}_R1.fastq.gz", "{sample}_R2.fastq.gz"]
output = ["aligned/{sample}.bam"]
threads = 16
memory = "32G"
environment = { docker = "biocontainers/bwa:0.7.17" }
shell = "bwa mem -t {threads} {config.reference} {input} | samtools sort -o {output}"

[[rules]]
name = "index_bam"
input = ["aligned/{sample}.bam"]
output = ["aligned/{sample}.bam.bai"]
environment = { conda = "envs/samtools.yaml" }
shell = "samtools index {input}"
```
Run it:
```bash
# Validate the workflow
oxo-flow validate align.oxoflow

# Preview the execution plan
oxo-flow dry-run align.oxoflow

# Execute with 8 parallel jobs
oxo-flow run align.oxoflow -j 8

# Visualize the DAG
oxo-flow graph align.oxoflow | dot -Tpng -o dag.png
```
Key Concepts
If you are new to pipeline engines, here are the three core concepts used in oxo-flow:
Workflow
A Workflow is the entire pipeline definition (usually a .oxoflow file). It contains a collection of rules, configuration settings, and software requirements.
Rule
A Rule is a single processing step. It defines:
- Input: The files needed to run (e.g., raw reads).
- Output: The files produced by the step (e.g., aligned BAM).
- Command: The actual shell command to execute (e.g., bwa mem).
- Environment: The software tools needed (e.g., a specific Conda environment).
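The four parts of a rule can be pictured as a small record plus a template-filling step. The Python sketch below models the `bwa_align` rule from the Quick Example and shows one plausible way placeholders such as `{input}`, `{threads}`, and `{config.reference}` could be resolved into a final command. The `Rule` class and `render` function are hypothetical, and the substitution rules (inputs joined by spaces, `config` exposed as attributes) are assumptions for illustration, not a description of oxo-flow's internals.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace


@dataclass
class Rule:
    name: str
    input: list[str]
    output: list[str]
    shell: str
    threads: int = 1
    environment: dict[str, str] = field(default_factory=dict)


def render(rule: Rule, config: dict[str, str], **wildcards: str) -> str:
    """Fill wildcards into the file paths, then fill the shell template."""
    inputs = [path.format(**wildcards) for path in rule.input]
    outputs = [path.format(**wildcards) for path in rule.output]
    return rule.shell.format(
        input=" ".join(inputs),
        output=" ".join(outputs),
        threads=rule.threads,
        config=SimpleNamespace(**config),  # lets {config.reference} resolve
    )


bwa_align = Rule(
    name="bwa_align",
    input=["{sample}_R1.fastq.gz", "{sample}_R2.fastq.gz"],
    output=["aligned/{sample}.bam"],
    shell="bwa mem -t {threads} {config.reference} {input} | samtools sort -o {output}",
    threads=16,
    environment={"docker": "biocontainers/bwa:0.7.17"},
)
command = render(bwa_align, {"reference": "/data/ref/hg38.fa"}, sample="S1")
print(command)
```

A real engine would also need to handle literal braces in shell commands, which this sketch ignores.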
DAG (Directed Acyclic Graph)
A DAG models your workflow as a graph: each rule is a node, and an edge connects two rules whenever one rule's output is another rule's input. "Acyclic" means no rule can depend on itself, directly or indirectly. oxo-flow builds this graph automatically to determine which rules can run in parallel and which must wait for others to finish.
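As an illustration of how such a graph can be derived, the following Python sketch links rules by matching each input file to the rule that produces it, then groups the rules into levels that could run in parallel. It relies on the standard-library `graphlib` module rather than anything from oxo-flow. The `execution_groups` helper and all file names are hypothetical; the rule names are borrowed from the banner at the top of this page.

```python
from graphlib import TopologicalSorter  # standard library in Python 3.9+


def execution_groups(rules: dict[str, dict[str, list[str]]]) -> list[list[str]]:
    """Group rules into levels; rules in the same level can run in parallel."""
    # Map every output file to the rule that produces it.
    producer = {out: name for name, r in rules.items() for out in r["output"]}
    # A rule depends on whichever rules produce its input files.
    deps = {
        name: {producer[path] for path in r["input"] if path in producer}
        for name, r in rules.items()
    }
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the rules form a cycle
    groups = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        groups.append(ready)
        sorter.done(*ready)
    return groups


# Hypothetical inputs and outputs for the five rules named in the banner.
rules = {
    "fastqc": {"input": ["raw.fastq.gz"], "output": ["qc/report.html"]},
    "trim_reads": {"input": ["raw.fastq.gz"], "output": ["trimmed.fastq.gz"]},
    "bwa_align": {"input": ["trimmed.fastq.gz"], "output": ["aligned.bam"]},
    "sort_bam": {"input": ["aligned.bam"], "output": ["sorted.bam"]},
    "call_variants": {"input": ["sorted.bam"], "output": ["variants.vcf"]},
}
groups = execution_groups(rules)
print(groups)
```

With these made-up files, `fastqc` and `trim_reads` share the first level because both need only the raw reads, while the remaining rules form a chain of one rule per level.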
Project Status
oxo-flow is under active development. The current release (v0.6.1) includes the complete core engine, CLI, and web API. See the Changelog for release history and the Contributing guide if you want to get involved.
How to Cite
If you use oxo-flow in academic research, please cite:
Jia Ding, Yun Peng, Ruochen Wei, Boquan Wang, Jian-Guo Zhou, Shixiang Wang, BLIT: an R package for seamless integration of command-line bioinformatics tool universe, Bioinformatics Advances, Volume 6, Issue 1, 2026, vbag088, https://doi.org/10.1093/bioadv/vbag088
A dedicated oxo-flow manuscript is in preparation.
Join the Community
oxo-flow is a community-driven, open-source project licensed under Apache 2.0. Bug reports, feature requests, and contributions are welcome.
| How to contribute | Link |
|---|---|
| 🐛 Report a bug | Bug report |
| 💡 Request a feature | Feature request |
| 🤝 Contribute code | Contributing guide |
Try it, break it, and tell us what happened. Even a short comment about what worked — or didn't — helps improve oxo-flow for everyone.