System Architecture

oxo-flow is organized as a Cargo workspace with four crates that form a layered architecture.

Workspace Layout

oxo-flow/
├── crates/
│   ├── oxo-flow-core/    # Core library
│   ├── oxo-flow-cli/     # CLI binary
│   ├── oxo-flow-web/     # Web API server
│   └── venus/            # Clinical pipeline
├── pipelines/            # Pipeline definitions
├── examples/             # Example workflows
└── tests/                # Integration tests

Crate Dependencies

graph TD
    CLI[oxo-flow-cli] --> Core[oxo-flow-core]
    CLI --> Web[oxo-flow-web]
    Web --> Core
    Venus[oxo-flow-venus] --> Core

oxo-flow-core is the foundation: all other crates depend on it.
oxo-flow-cli is the user-facing binary; it depends on both oxo-flow-core and oxo-flow-web.
oxo-flow-web provides the REST API layer on top of core.
oxo-flow-venus is a domain-specific pipeline crate built on core.

Core Library Modules

The oxo-flow-core crate is organized into focused modules:

Module	Responsibility
`config`	Parse `.oxoflow` TOML files into `WorkflowConfig`
`rule`	Rule definitions: inputs, outputs, shell, resources, environment
`dag`	Build and validate the dependency DAG and sort it topologically
`executor`	Execute rules locally with checkpointing
`scheduler`	Resource-aware job scheduling
`environment`	Resolve and activate conda, docker, singularity, pixi, venv
`wildcard`	Expand `{sample}` patterns in file paths
`report`	Generate HTML and JSON reports from templates
`container`	Generate Dockerfile and Singularity definitions
`cluster`	Generate SLURM, PBS, SGE, LSF job scripts
`error`	Unified error types (`OxoFlowError`)
`format`	Output formatting utilities
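
To make the `wildcard` row concrete, here is a minimal sketch of `{sample}`-style pattern expansion. The helper names `expand_wildcard` and `expand_all` are hypothetical and are not taken from the oxo-flow API; the real module may handle multiple wildcards, constraints, or inference from existing files.

```rust
/// Replace every `{name}` placeholder in `pattern` with `value`.
/// Hypothetical helper, not part of the oxo-flow API.
fn expand_wildcard(pattern: &str, name: &str, value: &str) -> String {
    // `{{` and `}}` are escaped braces, so this builds the literal text `{name}`.
    pattern.replace(&format!("{{{name}}}"), value)
}

/// Expand one pattern once per value, preserving the order of `values`.
fn expand_all(pattern: &str, name: &str, values: &[&str]) -> Vec<String> {
    values
        .iter()
        .map(|v| expand_wildcard(pattern, name, v))
        .collect()
}
```

Calling `expand_all("results/{sample}.bam", "sample", &["s1", "s2"])` yields one concrete path per sample.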

Data Flow

A typical workflow execution follows this path:

sequenceDiagram
    participant User
    participant CLI
    participant Config
    participant DAG
    participant Scheduler
    participant Executor
    participant Environment

    User->>CLI: oxo-flow run pipeline.oxoflow -j 4
    CLI->>Config: WorkflowConfig::from_file()
    Config-->>CLI: WorkflowConfig
    CLI->>DAG: WorkflowDag::from_rules()
    DAG-->>CLI: WorkflowDag
    CLI->>DAG: execution_order()
    DAG-->>CLI: Vec<String> (topological order)
    loop For each rule
        CLI->>Executor: execute_rule()
        Executor->>Environment: resolve & activate
        Environment-->>Executor: activated
        Executor->>Executor: run shell command
        Executor-->>CLI: JobRecord
    end
    CLI->>User: Done: N succeeded, M failed
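
The loop at the end of the diagram can be sketched as follows. This is an illustration only: `JobOutcome` is a hypothetical, simplified stand-in for `JobRecord` (whose fields are not shown here), `run_in_order` is an invented name, and continuing after a failed rule is an assumption rather than documented behaviour.

```rust
/// Hypothetical, simplified job record; not the real `JobRecord` type.
#[derive(Debug, PartialEq)]
struct JobOutcome {
    rule: String,
    success: bool,
}

/// Run rules in the given order with `run_rule`, then build a summary line
/// shaped like the final CLI message in the diagram.
fn run_in_order(
    order: &[&str],
    mut run_rule: impl FnMut(&str) -> bool,
) -> (Vec<JobOutcome>, String) {
    let outcomes: Vec<JobOutcome> = order
        .iter()
        .map(|rule| JobOutcome {
            rule: (*rule).to_string(),
            success: run_rule(*rule),
        })
        .collect();
    let ok = outcomes.iter().filter(|o| o.success).count();
    let summary = format!("Done: {} succeeded, {} failed", ok, outcomes.len() - ok);
    (outcomes, summary)
}
```

Passing the runner as a closure keeps the sketch independent of how a rule is actually executed.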

Key Design Decisions

DAG-first execution

All workflows are compiled into a Directed Acyclic Graph before any execution begins. This ensures:

Dependencies are resolved up front
Cycles are detected before compute is wasted
Parallel execution groups are identified
The execution order is deterministic
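
The four guarantees above can be illustrated with a small standard-library sketch of level-by-level topological sorting. It is not the `dag` module's implementation (the technology stack lists petgraph for graph algorithms); `execution_levels` and its input shape are assumptions made for this example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group rules into levels: each rule depends only on rules in earlier levels,
/// so all rules within one level could run in parallel.
/// `deps` maps each rule name to the rules it depends on.
/// Returns `Err` with the unschedulable rules if a cycle (or a dependency on an
/// unknown rule) blocks progress.
fn execution_levels<'a>(
    deps: &BTreeMap<&'a str, Vec<&'a str>>,
) -> Result<Vec<Vec<String>>, Vec<String>> {
    let mut remaining: BTreeMap<&'a str, BTreeSet<&'a str>> = deps
        .iter()
        .map(|(rule, ds)| (*rule, ds.iter().copied().collect()))
        .collect();
    let mut levels = Vec::new();
    while !remaining.is_empty() {
        // Rules with no unmet dependencies are ready. BTreeMap iterates in
        // sorted key order, which makes the result deterministic.
        let ready: Vec<&str> = remaining
            .iter()
            .filter(|(_, ds)| ds.is_empty())
            .map(|(rule, _)| *rule)
            .collect();
        if ready.is_empty() {
            // Nothing can make progress: report the blocked rules before any work runs.
            return Err(remaining.keys().map(|r| r.to_string()).collect());
        }
        for rule in &ready {
            remaining.remove(rule);
        }
        // Mark the finished rules as satisfied for everything still waiting.
        for ds in remaining.values_mut() {
            for rule in &ready {
                ds.remove(rule);
            }
        }
        levels.push(ready.iter().map(|r| r.to_string()).collect());
    }
    Ok(levels)
}
```

Flattening the levels gives one valid execution order, and an `Err` result corresponds to rejecting the workflow before any compute is spent.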

Environment isolation

Every rule can declare its own software environment. The executor resolves the environment specification, activates it, runs the command, and deactivates it. This prevents tool version conflicts between pipeline steps.
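
One way to picture per-rule isolation is as a wrapper around the rule's shell command. The sketch below is an assumption about how such wrapping could look, not the `environment` module's actual code: `EnvSpec` and `wrap_command` are hypothetical names, only three backends are shown, and the real executor may activate and deactivate environments differently.

```rust
/// Hypothetical environment specification covering a few of the supported backends.
enum EnvSpec {
    None,
    Conda { name: String },
    Docker { image: String },
    Singularity { image: String },
}

/// Build the argument vector that runs `shell_cmd` inside the given environment.
/// Passing the command as a single argument to `sh -c` avoids re-quoting it.
fn wrap_command(env: &EnvSpec, shell_cmd: &str) -> Vec<String> {
    let inner = ["sh", "-c", shell_cmd];
    let prefix: Vec<String> = match env {
        EnvSpec::None => vec![],
        EnvSpec::Conda { name } => {
            vec!["conda".into(), "run".into(), "-n".into(), name.clone()]
        }
        EnvSpec::Docker { image } => {
            vec!["docker".into(), "run".into(), "--rm".into(), image.clone()]
        }
        EnvSpec::Singularity { image } => {
            vec!["singularity".into(), "exec".into(), image.clone()]
        }
    };
    prefix
        .into_iter()
        .chain(inner.iter().map(|s| s.to_string()))
        .collect()
}
```

Because each rule gets its own wrapper, two rules can use different versions of the same tool without interfering with each other.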

Error types

The core library uses thiserror for typed errors:

pub enum OxoFlowError {
    Config(String),
    Dag(String),
    Execution(String),
    Environment(String),
    // ...
}

The CLI uses anyhow for ergonomic error handling at the binary level.
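
The split between typed library errors and type-erased binary errors can be sketched with the standard library alone. Here `SketchError` is a hypothetical two-variant stand-in for `OxoFlowError`, the hand-written `Display` and `Error` impls stand in for what a thiserror derive would generate, and `Box<dyn Error>` plays the role that anyhow plays in the CLI. The error messages are invented for the example.

```rust
use std::error::Error;
use std::fmt;

/// Stand-in for a typed library error (hypothetical variants and messages).
#[derive(Debug)]
enum SketchError {
    Config(String),
    Dag(String),
}

// Hand-written equivalents of the impls a thiserror derive would provide.
impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Config(msg) => write!(f, "config error: {msg}"),
            SketchError::Dag(msg) => write!(f, "DAG error: {msg}"),
        }
    }
}

impl Error for SketchError {}

/// Library-style function: returns the typed error so callers can match on it.
fn check_rule_count(n: usize) -> Result<usize, SketchError> {
    if n == 0 {
        Err(SketchError::Config("workflow has no rules".to_string()))
    } else {
        Ok(n)
    }
}

/// Binary-style function: `?` converts the typed error into a boxed,
/// type-erased error that is only reported, not matched on.
fn run(n: usize) -> Result<String, Box<dyn Error>> {
    let count = check_rule_count(n)?;
    Ok(format!("{count} rules"))
}
```

Library callers keep the ability to match on a specific variant, while the binary layer only needs to print the message and exit.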

Async runtime

The executor uses tokio for async task execution. Each rule runs as a tokio task, so independent rules can execute concurrently, up to the limit set by -j (for example, -j 4 allows at most four rules to run at once).
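
The effect of a concurrency limit can be shown without tokio. The sketch below uses plain OS threads as a stand-in for tokio tasks, and `run_with_limit` is a hypothetical function, not the executor's API. It treats all jobs as independent and ignores dependency order, which the real executor must respect.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Run `jobs` on at most `limit` worker threads and return the peak number of
/// jobs that were running at the same time.
fn run_with_limit(jobs: Vec<Box<dyn FnOnce() + Send>>, limit: usize) -> usize {
    let queue = Arc::new(Mutex::new(jobs));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let workers: Vec<_> = (0..limit.max(1))
        .map(|_| {
            let (queue, running, peak) = (queue.clone(), running.clone(), peak.clone());
            thread::spawn(move || loop {
                // Take the next job; the lock is released at the end of this
                // statement, before the job runs. Jobs are popped from the back,
                // so start order is not significant in this sketch.
                let job = queue.lock().unwrap().pop();
                let Some(job) = job else { break };
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                job();
                running.fetch_sub(1, Ordering::SeqCst);
            })
        })
        .collect();
    for w in workers {
        w.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Since only `limit` workers exist, the returned peak can never exceed the limit, which is the property a -j style flag is meant to guarantee.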

Serialization

All configuration is TOML-based, parsed with serde and the toml crate. Report output supports both HTML (via Tera templates) and JSON (via serde_json).

Technology Stack

Component	Technology
Language	Rust (edition 2024)
Async runtime	tokio
CLI framework	clap (derive macros)
Web framework	axum
Serialization	serde + toml
Logging	tracing
Error handling	thiserror (lib) + anyhow (bin)
Templating	Tera
Graph algorithms	petgraph