The roz-worker binary runs on the robot’s onboard computer (Jetson, Raspberry Pi, or any Linux ARM/x86 system), connects to a roz server over NATS for task dispatch, and executes WASM controllers locally at 100Hz through the Copper runtime. A separate roz-safety daemon monitors heartbeats and enforces the emergency stop.
Architecture
roz-worker
The worker binary is the edge runtime. It connects to a roz server over NATS, receives task assignments, runs the agent reasoning loop locally, and executes WASM controllers through the Copper runtime.
Installation
Build from source for your target architecture. The resulting binary is at target/release/roz-worker (or target/aarch64-unknown-linux-gnu/release/roz-worker when building for the aarch64 target).
Configuration
Configure the worker with environment variables or roz.toml.
Running
Copper Runtime
The Copper runtime drives the control loop at 100Hz. Each cycle:
- Read sensor state from the hardware abstraction layer.
- Execute the active WASM controller in the wasmtime sandbox.
- Write motor commands through the channel interface.
- Enforce safety limits (velocity, acceleration, position clamps).
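The cycle above can be sketched as a fixed-period loop. This is a minimal illustration, not the Copper or wasmtime API: `SensorState`, `MotorCommand`, `Limits`, `controller`, and `enforce_limits` are hypothetical names, a proportional controller stands in for the sandboxed WASM controller, and the sketch assumes limits are applied before a command reaches the (simulated) actuator.

```rust
use std::time::{Duration, Instant};

/// Hypothetical sensor snapshot (not roz's real types).
#[derive(Debug, Clone, Copy, PartialEq)]
struct SensorState {
    position: f64, // rad
    velocity: f64, // rad/s
}

/// Hypothetical motor command.
#[derive(Debug, Clone, Copy, PartialEq)]
struct MotorCommand {
    velocity: f64, // rad/s
}

/// Hypothetical safety limits checked on every cycle.
struct Limits {
    max_velocity: f64,
    max_accel: f64,
    min_position: f64,
    max_position: f64,
}

/// Stand-in for the WASM controller: proportional control toward `target`.
fn controller(state: SensorState, target: f64) -> MotorCommand {
    MotorCommand { velocity: 4.0 * (target - state.position) }
}

/// Clamp a command against velocity, acceleration, and position limits.
fn enforce_limits(cmd: MotorCommand, state: SensorState, limits: &Limits, dt: f64) -> MotorCommand {
    // Velocity clamp.
    let mut v = cmd.velocity.clamp(-limits.max_velocity, limits.max_velocity);
    // Acceleration clamp: bound the change relative to the measured velocity.
    let max_dv = limits.max_accel * dt;
    v = v.clamp(state.velocity - max_dv, state.velocity + max_dv);
    // Position clamp: never command motion further past a bound.
    if (state.position >= limits.max_position && v > 0.0)
        || (state.position <= limits.min_position && v < 0.0)
    {
        v = 0.0;
    }
    MotorCommand { velocity: v }
}

/// One cycle: run the controller, limit its output, apply it to a simulated joint.
fn cycle(state: SensorState, target: f64, limits: &Limits, dt: f64) -> SensorState {
    let raw = controller(state, target);
    let cmd = enforce_limits(raw, state, limits, dt);
    SensorState { position: state.position + cmd.velocity * dt, velocity: cmd.velocity }
}

fn main() {
    let period = Duration::from_millis(10); // 100 Hz
    let dt = period.as_secs_f64();
    let limits = Limits { max_velocity: 1.0, max_accel: 5.0, min_position: -1.0, max_position: 1.0 };
    let mut state = SensorState { position: 0.0, velocity: 0.0 };
    let mut next = Instant::now() + period;
    for _ in 0..100 {
        state = cycle(state, 0.5, &limits, dt);
        // Sleep until the next absolute deadline so timing error does not accumulate.
        let now = Instant::now();
        if next > now {
            std::thread::sleep(next - now);
        }
        next += period;
    }
    println!("after 100 cycles: {:?}", state);
}
```

Scheduling against absolute deadlines rather than sleeping a fixed 10 ms after each cycle keeps the average rate at 100Hz even when individual cycles take varying time.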
roz-safety Daemon
The safety daemon runs as a separate process from the worker. This isolation ensures that if the worker crashes, hangs, or enters an unexpected state, the safety system remains operational.
What It Does
- Heartbeat monitoring — The worker sends heartbeats to the safety daemon at a configurable interval. If heartbeats stop (worker crash, network partition, deadlock), the safety daemon triggers an emergency stop.
- E-stop enforcement — On any safety event (missed heartbeat, limit violation, external e-stop signal), the daemon immediately commands all actuators to a safe state: zero velocity, brakes engaged, controllers halted.
- Independent watchdog — The daemon monitors system health metrics (CPU temperature, memory pressure, actuator faults) and can trigger protective shutdown independently of the agent.
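The heartbeat and e-stop behaviour described above amounts to a timeout check on the last heartbeat plus an override for external signals. The sketch below illustrates that logic only; `Watchdog`, `Action`, and the 200 ms timeout are hypothetical and do not reflect roz-safety's actual types or defaults.

```rust
use std::time::{Duration, Instant};

/// What the daemon should do after a check (hypothetical type).
#[derive(Debug, PartialEq)]
enum Action {
    Continue,
    EmergencyStop(&'static str),
}

/// Tracks the most recent heartbeat received from the worker.
struct Watchdog {
    last_beat: Instant,
    timeout: Duration,
}

impl Watchdog {
    fn new(now: Instant, timeout: Duration) -> Self {
        Watchdog { last_beat: now, timeout }
    }

    /// Record a heartbeat from the worker.
    fn beat(&mut self, now: Instant) {
        self.last_beat = now;
    }

    /// Decide what to do at time `now`; `external_estop` models an external e-stop input.
    fn check(&self, now: Instant, external_estop: bool) -> Action {
        if external_estop {
            Action::EmergencyStop("external e-stop signal")
        } else if now.duration_since(self.last_beat) > self.timeout {
            Action::EmergencyStop("missed heartbeat")
        } else {
            Action::Continue
        }
    }
}

fn main() {
    let start = Instant::now();
    let mut dog = Watchdog::new(start, Duration::from_millis(200));
    dog.beat(start + Duration::from_millis(100));
    // 150 ms since the last heartbeat: still within the timeout.
    println!("{:?}", dog.check(start + Duration::from_millis(250), false));
    // 300 ms since the last heartbeat: the worker is presumed dead.
    println!("{:?}", dog.check(start + Duration::from_millis(400), false));
}
```

Passing `now` in explicitly, instead of calling `Instant::now()` inside `check`, keeps the decision logic deterministic and easy to test.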
Running
Start roz-safety before roz-worker.
Crash Recovery
The worker uses write-ahead log (WAL) persistence to survive crashes and power cycles:
- Session state is journaled to disk. On restart, the worker resumes the active session from the last checkpoint rather than starting from scratch.
- Controller state is persisted. If a WASM controller was active at the time of the crash, the worker reloads it and resumes execution.
- NATS reconnection is automatic. The worker reconnects to the server and re-registers itself without manual intervention.
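The general write-ahead pattern behind this recovery can be sketched as append, sync, then replay on restart. The record format (`key=value` lines), the file name, and the `Recovered` struct are assumptions made for illustration; the worker's real journal format is not documented here.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Hypothetical state rebuilt from the journal after a restart.
#[derive(Debug, Default, PartialEq)]
struct Recovered {
    session: Option<String>,
    controller: Option<String>,
}

/// Append one record and sync it to disk before the caller acts on the change.
fn append(path: &Path, key: &str, value: &str) -> io::Result<()> {
    let mut f = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(f, "{}={}", key, value)?;
    f.sync_all()
}

/// Replay the journal from the start; later records override earlier ones.
fn replay(path: &Path) -> io::Result<Recovered> {
    let mut state = Recovered::default();
    let file = match File::open(path) {
        Ok(f) => f,
        // No journal yet means a clean first start.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(state),
        Err(e) => return Err(e),
    };
    for line in BufReader::new(file).lines() {
        let line = line?;
        match line.split_once('=') {
            Some(("session", v)) => state.session = Some(v.to_string()),
            Some(("controller", v)) => state.controller = Some(v.to_string()),
            _ => {} // skip unknown or malformed records
        }
    }
    Ok(state)
}

fn main() -> io::Result<()> {
    // Hypothetical journal location and record values.
    let path = std::env::temp_dir().join("roz-wal-sketch.log");
    let _ = std::fs::remove_file(&path);
    append(&path, "session", "example-session")?;
    append(&path, "controller", "example_controller.wasm")?;
    // Simulated restart: rebuild state purely from the journal.
    println!("{:?}", replay(&path)?);
    Ok(())
}
```

A production journal would typically also checksum each record and compact old entries at checkpoints; both are omitted to keep the sketch short.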
Network Resilience
Edge robots operate in environments where network connectivity is intermittent or unavailable. The worker handles disconnection gracefully:
- Ollama fallback: If the network drops and the configured cloud LLM provider (Anthropic, OpenAI, Google) becomes unreachable, the worker falls back to a local Ollama instance for agent reasoning. Configure the fallback model in roz.toml.
- Autonomous operation: Active WASM controllers continue executing at 100Hz regardless of network state. The control loop is entirely local.
- Task queuing: If the worker receives a task while disconnected from the LLM provider and no local fallback is available, the task is queued and executed when connectivity is restored.
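The fallback and queuing rules above reduce to a small decision: prefer the cloud provider, fall back to local Ollama, and otherwise hold the task. The sketch below models only that decision; `Provider`, `choose_provider`, and `Dispatcher` are hypothetical names, and reachability is passed in as plain booleans rather than probed over the network.

```rust
use std::collections::VecDeque;

/// Reasoning backends the worker could use (hypothetical enum).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Provider {
    Cloud,
    LocalOllama,
}

/// Pick a backend for the current connectivity, if any is usable.
fn choose_provider(cloud_reachable: bool, ollama_available: bool) -> Option<Provider> {
    if cloud_reachable {
        Some(Provider::Cloud)
    } else if ollama_available {
        Some(Provider::LocalOllama)
    } else {
        None
    }
}

/// Holds tasks that arrived while no provider was usable.
#[derive(Default)]
struct Dispatcher {
    pending: VecDeque<String>,
}

impl Dispatcher {
    /// Enqueue a task, then return every (task, provider) pair that can run now, oldest first.
    fn submit(&mut self, task: &str, cloud: bool, ollama: bool) -> Vec<(String, Provider)> {
        self.pending.push_back(task.to_string());
        self.drain(cloud, ollama)
    }

    /// Release all queued tasks once a provider is usable; otherwise keep them queued.
    fn drain(&mut self, cloud: bool, ollama: bool) -> Vec<(String, Provider)> {
        match choose_provider(cloud, ollama) {
            Some(p) => self.pending.drain(..).map(|t| (t, p)).collect(),
            None => Vec::new(),
        }
    }
}

fn main() {
    let mut d = Dispatcher::default();
    // No cloud and no Ollama: the task stays queued and nothing runs.
    println!("{:?}", d.submit("inspect shelf", false, false));
    // Ollama becomes available: both tasks run locally, in arrival order.
    println!("{:?}", d.submit("dock", false, true));
}
```

Note that nothing here touches the control loop: consistent with the text above, only agent reasoning depends on provider availability.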
Systemd Service
For production deployments, run both the worker and safety daemon as systemd services.
Next Steps
- Safety Architecture — details on the tiered safety system.
- WASM Controllers — how the agent generates and deploys real-time control code.
- Self-Hosting — set up the server that the worker connects to.