Run on a Cluster#
This guide explains how to execute oxo-flow workflows on HPC clusters using SLURM, PBS, SGE, and LSF backends.
Overview#
oxo-flow's cluster module translates each rule into a cluster job submission. Resource requirements declared in the .oxoflow file (threads, memory, gpu, disk, time_limit) are mapped to the appropriate scheduler directives.
Supported Schedulers#
| Scheduler | Status | Directive prefix |
|---|---|---|
| SLURM | Supported | #SBATCH |
| PBS/Torque | Supported | #PBS |
| SGE | Supported | #$ |
| LSF | Supported | #BSUB |
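To make the mapping concrete, here is a minimal sketch of how a single resource (threads) could be turned into a directive line for each scheduler in the table. The function `threads_directive` is hypothetical and is not part of oxo-flow. The SLURM, PBS, and SGE forms mirror the generated scripts shown later in this guide. The LSF form (`-n`) is an assumption based on the standard `bsub` option.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of oxo-flow): print the CPU request line
# for a given scheduler and thread count.
threads_directive() {
  local scheduler="$1" threads="$2"
  case "$scheduler" in
    slurm) printf '#SBATCH --cpus-per-task=%s\n' "$threads" ;;
    pbs)   printf '#PBS -l ncpus=%s\n' "$threads" ;;
    sge)   printf '#$ -pe smp %s\n' "$threads" ;;  # "smp" is a site-specific PE name
    lsf)   printf '#BSUB -n %s\n' "$threads" ;;    # assumed form, not shown in this guide
    *)     printf 'unknown scheduler: %s\n' "$scheduler" >&2; return 1 ;;
  esac
}

threads_directive slurm 16   # prints: #SBATCH --cpus-per-task=16
```

The same pattern (one `case` branch per scheduler) would extend to memory and wall time, though the units and option names differ between schedulers.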
Declaring Resources#
Set resource requirements per rule:
[[rules]]
name = "align"
input = ["{sample}_R1.fastq.gz"]
output = ["aligned/{sample}.bam"]
threads = 16
memory = "32G"
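environment = { singularity = "docker://biocontainers/bwa:0.7.17" }
shell = "bwa mem -t {threads} ref.fa {input} | samtools sort -o {output}"
# Rule-level keys must come before the sub-table header below;
# in TOML, any key written after it would belong to rules.resources.
[rules.resources]
gpu = 0
disk = "100G"
time_limit = "24h"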
Resource fields#
| Field | Type | Example | Description |
|---|---|---|---|
| threads | Integer | 16 | Number of CPU cores |
| memory | String | "32G" | RAM allocation |
| gpu | Integer | 1 | Number of GPUs |
| disk | String | "100G" | Local disk space |
| time_limit | String | "24h" | Wall-time limit |
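The scheduler scripts in the following sections express the wall-time limit as HH:MM:SS (for example 24:00:00 for "24h"). The sketch below shows one way such a conversion could work. The function `to_walltime` is hypothetical, and the set of accepted suffixes (only `h` and `m` here) is an assumption; the formats oxo-flow actually accepts may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert "<N>h" or "<N>m" into HH:MM:SS.
to_walltime() {
  local limit="$1" value unit minutes
  value="${limit%[hm]}"      # numeric part, with a trailing h or m removed
  unit="${limit#"$value"}"   # the suffix that was removed (may be empty)
  case "$value" in
    ''|*[!0-9]*) echo "invalid time limit: $limit" >&2; return 1 ;;
  esac
  case "$unit" in
    h) minutes=$(( 10#$value * 60 )) ;;   # 10# forces base 10 for values like "08"
    m) minutes=$(( 10#$value )) ;;
    *) echo "invalid time limit: $limit" >&2; return 1 ;;
  esac
  printf '%02d:%02d:00\n' $(( minutes / 60 )) $(( minutes % 60 ))
}

to_walltime 24h   # prints: 24:00:00
to_walltime 90m   # prints: 01:30:00
```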
SLURM Example#
oxo-flow generates SLURM job scripts automatically. For the align rule above, the generated script looks like:
#!/bin/bash
#SBATCH --job-name=align
#SBATCH --cpus-per-task=16
#SBATCH --mem=32G
#SBATCH --time=24:00:00
#SBATCH --output=logs/align_%j.out
#SBATCH --error=logs/align_%j.err
# Environment activation
singularity exec docker://biocontainers/bwa:0.7.17 \
bwa mem -t 16 ref.fa sample1_R1.fastq.gz | samtools sort -o aligned/sample1.bam
PBS Example#
#!/bin/bash
#PBS -N align
#PBS -l ncpus=16
#PBS -l mem=32gb
#PBS -l walltime=24:00:00
#PBS -o logs/align.out
#PBS -e logs/align.err
cd $PBS_O_WORKDIR
bwa mem -t 16 ref.fa sample1_R1.fastq.gz | samtools sort -o aligned/sample1.bam
SGE Example#
In SGE, h_vmem is typically a per-slot limit, which is why the rule's 32G appears as 2G across the 16 slots requested below. The parallel environment name (smp here) is site-specific.
#!/bin/bash
#$ -N align
#$ -pe smp 16
#$ -l h_vmem=2G
#$ -l h_rt=24:00:00
#$ -o logs/align.out
#$ -e logs/align.err
#$ -cwd
bwa mem -t 16 ref.fa sample1_R1.fastq.gz | samtools sort -o aligned/sample1.bam
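This guide does not show an LSF script, so the following is a hand-written sketch of what an equivalent header might look like, not actual oxo-flow output. The function `lsf_header` is hypothetical. The options used (`-J`, `-n`, `-R "span[hosts=1]"`, `-M`, `-W`, `-o`, `-e`) are standard `bsub` options, but whether `-M` accepts a unit suffix, and its default unit, depend on the site's LSF version and its `LSF_UNIT_FOR_LIMITS` setting.

```shell
#!/usr/bin/env bash
# Hand-written sketch, not generated by oxo-flow.
# Arguments: job name, threads, memory limit, wall time as HH:MM.
lsf_header() {
  local name="$1" threads="$2" mem="$3" walltime="$4"
  cat <<EOF
#!/bin/bash
#BSUB -J ${name}
#BSUB -n ${threads}
#BSUB -R "span[hosts=1]"
#BSUB -M ${mem}
#BSUB -W ${walltime}
#BSUB -o logs/${name}.out
#BSUB -e logs/${name}.err
EOF
}

# Header for the align rule; the memory value is passed through unchanged.
lsf_header align 16 32G 24:00
```

The `span[hosts=1]` resource string keeps all 16 slots on one host, which a multithreaded tool such as bwa needs.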
Best Practices#
Use Singularity on clusters
Most HPC clusters do not allow Docker. Use Singularity instead: when you specify singularity = "docker://...", oxo-flow handles the image conversion automatically.
Set realistic time limits
Generous wall-time limits prevent premature job termination but may lower scheduling priority. Profile your jobs first.
Use --keep-going for large batches
When running hundreds of samples, use oxo-flow run -k so that a single failure does not abort the entire run.
Check resource availability
Use sinfo (SLURM), pbsnodes (PBS), qhost (SGE), or bhosts (LSF) to verify available resources before submitting.
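If you are unsure which scheduler a login node provides, a quick check is to see which of these commands is on the PATH. The sketch below is a minimal illustration. The function `detect_scheduler` is hypothetical, and `bhosts` is used as the LSF counterpart. The order of checks is an arbitrary choice and may need adjusting on sites that install more than one scheduler.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess the scheduler from the commands available on PATH.
detect_scheduler() {
  if   command -v sinfo    >/dev/null 2>&1; then echo slurm
  elif command -v pbsnodes >/dev/null 2>&1; then echo pbs
  elif command -v qhost    >/dev/null 2>&1; then echo sge
  elif command -v bhosts   >/dev/null 2>&1; then echo lsf
  else echo none
  fi
}

echo "Detected scheduler: $(detect_scheduler)"
```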
Monitoring Jobs#
After submission, use your cluster's native tools, for example squeue (SLURM), qstat (PBS and SGE), or bjobs (LSF).
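As a rough illustration of the native commands, the sketch below prints the usual "list my jobs" command for each scheduler. The function `queue_command` is hypothetical. The commands themselves (`squeue -u`, `qstat -u`, `bjobs -u`) are the standard ones, though sites sometimes wrap or restrict them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the command that lists a user's jobs.
# Arguments: scheduler name, optional user name (defaults to the current user).
queue_command() {
  local scheduler="$1" user="${2:-$(id -un)}"
  case "$scheduler" in
    slurm)   echo "squeue -u $user" ;;
    pbs|sge) echo "qstat -u $user" ;;
    lsf)     echo "bjobs -u $user" ;;
    *)       echo "unknown scheduler: $scheduler" >&2; return 1 ;;
  esac
}

queue_command slurm alice   # prints: squeue -u alice
```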
Alternatively, use oxo-flow's status command with a checkpoint file.
See Also#
- Architecture: Cluster backends — internal cluster module design
- Environment System — Singularity and Docker on HPC