Security Considerations#

This page documents the security model, threat mitigations, and privacy considerations for oxo-call deployments.

Threat Model#

oxo-call generates and optionally executes shell commands using LLM output. The primary security concerns are:

LLM output trust: Generated commands are executed through the shell, so malicious or incorrect LLM output could damage data or the system
API token exposure: LLM API tokens stored in config files may be readable by other users on the same machine
Data sent to LLM providers: Task descriptions and tool documentation are sent to external APIs
Supply chain: Vulnerable or compromised dependencies could introduce security flaws into the binary
License tampering: License files could be forged or modified

Mitigations#

Input Validation#

Tool name sanitization: validate_tool_name() in src/docs.rs rejects path traversal attempts (../, /), empty names, and names with invalid characters
URL validation: Remote documentation URLs are restricted to http:// and https:// schemes only
LLM response format: Strict ARGS:/EXPLANATION: format validation with retry — malformed responses are rejected, not executed
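
The checks listed above can be sketched with the standard library alone. This is a minimal illustration and not oxo-call's actual code: only the name validate_tool_name() comes from this page, while the allowed character set, the helpers is_allowed_doc_url(), parse_llm_response(), and generate_with_retry(), and all error messages are assumptions made for the example.

```rust
// Hypothetical sketch of the three input checks; rules and names are assumptions.

/// Rejects empty names, path-like names, and characters outside an assumed allowlist.
fn validate_tool_name(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("tool name is empty".to_string());
    }
    if name.contains("..") || name.contains('/') || name.contains('\\') {
        return Err(format!("tool name '{name}' looks like a path"));
    }
    // Assumed allowlist: ASCII letters, digits, '-', '_', '.', '+'.
    let allowed = |c: char| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.' | '+');
    if !name.chars().all(allowed) {
        return Err(format!("tool name '{name}' contains invalid characters"));
    }
    Ok(())
}

/// Accepts only URLs whose scheme is http or https (compared case-insensitively).
fn is_allowed_doc_url(url: &str) -> bool {
    let lower = url.to_ascii_lowercase();
    lower.starts_with("http://") || lower.starts_with("https://")
}

/// Returns (args, explanation) only if both an ARGS: line and an EXPLANATION: line exist.
fn parse_llm_response(raw: &str) -> Option<(String, String)> {
    let mut args = None;
    let mut explanation = None;
    for line in raw.lines() {
        let line = line.trim();
        if let Some(rest) = line.strip_prefix("ARGS:") {
            args = Some(rest.trim().to_string());
        } else if let Some(rest) = line.strip_prefix("EXPLANATION:") {
            explanation = Some(rest.trim().to_string());
        }
    }
    match (args, explanation) {
        (Some(a), Some(e)) => Some((a, e)),
        _ => None, // malformed: the caller retries or gives up, nothing is executed
    }
}

/// Calls `ask` up to `max_attempts` times and returns the first well-formed response.
fn generate_with_retry<F: FnMut() -> String>(
    mut ask: F,
    max_attempts: usize,
) -> Option<(String, String)> {
    (0..max_attempts).find_map(|_| parse_llm_response(&ask()))
}

fn main() {
    println!("{:?}", validate_tool_name("samtools"));
    println!("{:?}", validate_tool_name("../etc/passwd"));
    println!("{}", is_allowed_doc_url("file:///etc/passwd"));
    let reply = "ARGS: sort input.bam\nEXPLANATION: Sorts the BAM file.";
    println!("{:?}", generate_with_retry(|| reply.to_string(), 3));
}
```

Returning None for a malformed response, instead of trying to salvage part of it, keeps the fail-closed behavior described above: text that does not match the expected format never reaches the shell.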

Data Anonymization#

The sanitize module (src/sanitize.rs) provides two anonymization functions applied before sending data to LLM providers:

redact_paths(): Replaces absolute file paths (e.g., /home/user/data/patient.bam) with <PATH>, preserving relative paths and filenames that have semantic value for command generation
redact_env_tokens(): Redacts the values of environment variable assignments that match TOKEN=, KEY=, or SECRET= patterns, replacing each value with <REDACTED> while preserving the variable name
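
To make the two functions concrete, here is a rough sketch of how such redaction could work on whitespace-separated tokens. It is not the code in src/sanitize.rs: the token-based matching, the map_tokens() and is_sensitive_name() helpers, and the rule that a variable name must end in TOKEN, KEY, or SECRET are all assumptions chosen to keep the example dependency-free.

```rust
// Hypothetical sketch; the real sanitize module may match patterns differently.

/// Applies `f` to every whitespace-delimited token and keeps the whitespace as-is.
fn map_tokens(text: &str, f: impl Fn(&str) -> String) -> String {
    let mut out = String::with_capacity(text.len());
    let mut token = String::new();
    for c in text.chars() {
        if c.is_whitespace() {
            if !token.is_empty() {
                out.push_str(&f(token.as_str()));
                token.clear();
            }
            out.push(c);
        } else {
            token.push(c);
        }
    }
    if !token.is_empty() {
        out.push_str(&f(token.as_str()));
    }
    out
}

/// Replaces tokens that start with '/' (absolute paths) with <PATH>.
/// Relative paths and bare filenames are left untouched.
fn redact_paths(text: &str) -> String {
    map_tokens(text, |tok: &str| {
        if tok.len() > 1 && tok.starts_with('/') {
            "<PATH>".to_string()
        } else {
            tok.to_string()
        }
    })
}

/// Assumed rule: a name is sensitive if it ends with TOKEN, KEY, or SECRET.
fn is_sensitive_name(name: &str) -> bool {
    let upper = name.to_ascii_uppercase();
    ["TOKEN", "KEY", "SECRET"].iter().any(|s| upper.ends_with(*s))
}

/// Rewrites NAME=value to NAME=<REDACTED> when NAME is sensitive.
fn redact_env_tokens(text: &str) -> String {
    map_tokens(text, |tok: &str| match tok.split_once('=') {
        Some((name, value)) if !value.is_empty() && is_sensitive_name(name) => {
            format!("{name}=<REDACTED>")
        }
        _ => tok.to_string(),
    })
}

fn main() {
    let prompt = "API_TOKEN=abc123 samtools sort /home/user/data/patient.bam -o out/sorted.bam";
    println!("{}", redact_env_tokens(&redact_paths(prompt)));
    // prints: API_TOKEN=<REDACTED> samtools sort <PATH> -o out/sorted.bam
}
```

The suffix rule in this sketch is deliberately conservative and will also redact harmless assignments such as monkey=1. Over-redaction is the safer failure mode when the text is about to leave the machine.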

What Data Is Sent to the LLM#

When you run oxo-call run or oxo-call dry-run, the following is sent to the LLM provider:

Sent	Not Sent
Tool name (e.g., "samtools")	Actual file contents
Your task description (natural language)	License file
Tool `--help` output (cached)	Config file / API tokens for other providers
Skill content (concepts, pitfalls, examples)	Command execution output
System prompt rules	History entries

For maximum privacy, use Ollama with local models — no data leaves your machine. See Switch LLM Provider.

Dry-Run Mode#

Always use dry-run to preview generated commands before execution:

# Preview without executing
oxo-call dry-run samtools "sort input.bam"

# Execute with confirmation prompt
oxo-call run --ask samtools "sort input.bam"

API Token Security#

API tokens are stored in config.toml or passed via environment variables
Tokens are never logged, never written to history, and never sent to any provider other than the one they belong to
Use environment variables in shared or multi-user environments to avoid storing tokens in files
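
One common way to back up a "never logged" guarantee in Rust is to wrap the secret in a type whose Debug output is redacted. The sketch below illustrates that pattern together with an environment-first lookup. It is an assumption-laden illustration, not a description of oxo-call's internals: the ApiToken type, the resolve_token() helper, the precedence of environment over config, and the EXAMPLE_LLM_TOKEN variable name are all hypothetical.

```rust
use std::env;
use std::fmt;

/// Wrapper that keeps the secret out of Debug output, and so out of most log lines.
struct ApiToken(String);

impl fmt::Debug for ApiToken {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("ApiToken(<REDACTED>)")
    }
}

impl ApiToken {
    /// The only way to read the raw value, so every use is easy to find in review.
    fn expose(&self) -> &str {
        &self.0
    }
}

/// Prefers a non-blank environment value over the config-file value
/// (this precedence is an assumption made for the example).
fn resolve_token(env_value: Option<String>, config_value: Option<String>) -> Option<ApiToken> {
    env_value
        .filter(|v| !v.trim().is_empty())
        .or(config_value)
        .map(ApiToken)
}

fn main() {
    // EXAMPLE_LLM_TOKEN is a placeholder name, not a variable oxo-call is known to read.
    let from_env = env::var("EXAMPLE_LLM_TOKEN").ok();
    let token = resolve_token(from_env, Some("value-from-config".to_string()));
    // Debug formatting never reveals the secret.
    println!("{:?}", token);
    if let Some(t) = &token {
        println!("token length: {}", t.expose().len());
    }
}
```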

License Security#

Offline verification: Ed25519 signature verification requires no network calls
Tamper-resistant: The public key is compiled into the binary, so a forged or modified license file fails verification unless it is signed with the private signing key
No phone-home: License verification is entirely local

Supply Chain Security#

cargo audit runs in CI to detect known vulnerabilities in dependencies
SHA256 checksums (SHA256SUMS.txt) are published with each release for binary integrity verification
Minimal dependencies: The project uses well-audited crates (ed25519-dalek, reqwest, tokio)

Deployment Recommendations#

Single-User Workstation#

The default configuration is appropriate. Consider using Ollama for privacy-sensitive data.

Shared HPC Cluster#

Use environment variables for API tokens (not config files)
Set OXO_CALL_LICENSE to a shared license path
Consider running Ollama as a shared service on a dedicated node

Clinical / Regulated Environment#

Use Ollama exclusively (no external API calls)
Use the --ask flag for all commands (human-in-the-loop confirmation)
Audit command history via oxo-call history list
Keep license.oxo.json in a secure, access-controlled directory

License System — Ed25519 verification details
LLM Integration — provider configuration and prompt architecture
Configuration — API token setup