Workflow
8-Step Orchestration
Step 1 — Decompose
Break the user's task into 2-5 independent subtasks. Each subtask must be self-contained, produce a concrete artifact, and not depend on other workers' output (unless you plan sequential phases).
Step 2 — Spawn Workers
Launch up to 6 workers across two provider pools. OpenCode workers are spawned via execute_code with Python subprocess.Popen and stdin piping. Codex workers use terminal(pty=true).
Step 3 — Monitor
OpenCode workers run inside execute_code, so monitoring is batch-style — you get all output when the subprocess exits. Codex workers on PTY can be polled with process() tools if spawned via terminal(background=true).
Look for: progress markers, error patterns (Error:, Exception:, timeout), completion signals (apply_patch Success, Wrote to...).
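The marker-scanning step can be sketched as a small classifier (hypothetical helper; the patterns are illustrative of the signals listed above, not exhaustive):

```python
import re

# Error and completion markers taken from the monitoring checklist above.
ERROR_PATTERNS = re.compile(r"Error:|Exception:|timeout", re.IGNORECASE)
DONE_PATTERNS = re.compile(r"apply_patch Success|Wrote to")

def classify_output(text: str) -> str:
    """Classify a chunk of worker output as complete, error, or still running."""
    if DONE_PATTERNS.search(text):
        return "complete"
    if ERROR_PATTERNS.search(text):
        return "error"
    return "running"
```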
Step 4 — Evaluate Output
Every worker is scored against a 4-criterion rubric. PASS requires Presence=Pass AND an average of at least 3.0 across the three 1-5 criteria.
| Criterion | Scale | Description |
|---|---|---|
| Presence | Pass / Fail | Did the worker produce any output file or result? |
| Correctness | 1-5 | Does output match what the subtask asked for? |
| Completeness | 1-5 | All parts of the subtask addressed? |
| Quality | 1-5 | Professional standard, no obvious errors? |
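The rubric above reduces to a short scoring function (a sketch; the real logic lives in scripts/evaluate_output.py):

```python
def evaluate(presence_pass: bool, correctness: int, completeness: int, quality: int) -> str:
    """Apply the rubric: PASS requires Presence=Pass AND the average
    of the three 1-5 scores >= 3.0."""
    if not presence_pass:
        return "FAIL"  # no output at all is an automatic fail
    avg = (correctness + completeness + quality) / 3
    return "PASS" if avg >= 3.0 else "FAIL"
```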
Step 5 — Feedback & Respawn
Failed workers receive specific corrective feedback and are respawned. Max 2 respawns per subtask (3 total attempts). After 3 failures, synthesize what you have or flag for user review.
Circuit-breaker respawns (for provider rate-limiting) bypass quality analysis — the failure is infrastructure, not task quality. Immediately respawn on the alternate provider.
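The attempt-counting policy can be sketched as a loop (hypothetical `run_with_respawns` helper; `spawn`, `evaluate`, and `feedback` are caller-supplied callables standing in for the real orchestration steps):

```python
MAX_ATTEMPTS = 3  # initial spawn + up to 2 respawns

def run_with_respawns(spawn, evaluate, feedback):
    """Retry a subtask with corrective feedback, up to MAX_ATTEMPTS times.
    Returns the passing result, or None after 3 failures (at which point
    the orchestrator synthesizes what it has or flags for user review)."""
    notes = ""
    for _ in range(MAX_ATTEMPTS):
        result = spawn(notes)
        if evaluate(result) == "PASS":
            return result
        notes = feedback(result)  # specific corrective feedback for the respawn
    return None
```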
Step 6 — Dynamic Worker Spawning
The orchestrator does not stop after the first batch. If a worker's output reveals new subtasks or gaps, spawn additional workers. If user feedback arrives mid-orchestration, spawn workers to address it.
Finished workers no longer count toward the 6-worker global cap or the 3-per-provider cap.
Step 7 — Provider Rebalancing
Treat OpenCode and Codex as two concurrent worker pools, not a fallback chain. If one provider is saturated, place the next eligible subtask on the other. If a worker fails, decide whether the failure was prompt-specific (retry same provider) or provider-specific (reassign to other pool).
Step 8 — Synthesize
Combine all worker outputs into a single coherent deliverable. Read terminal logs and specific output files. Ignore node_modules, vendor dirs, and auto-generated noise. Resolve conflicts between worker outputs and credit which worker produced what.
Spawning Methods
How Workers Are Launched
OpenCode — execute_code + subprocess
The reliable method uses Python subprocess.Popen with stdin piping inside an execute_code block. This avoids shell-quoting fragility with multi-line prompts.
Trade-off: Batch output only — no live process() monitoring. You get all output when the process exits.
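The stdin-piping pattern can be sketched as follows (hypothetical `spawn_worker` helper; the actual worker command, e.g. an `opencode run` invocation, is an assumption supplied by the caller):

```python
import subprocess

def spawn_worker(cmd: list[str], prompt: str, timeout: int = 300):
    """Spawn a CLI worker, piping the prompt via stdin to avoid
    shell-quoting fragility with quotes, newlines, and backticks."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    # Batch-style: all output arrives only when the process exits.
    out, _ = proc.communicate(input=prompt, timeout=timeout)
    return proc.returncode, out
```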
Codex — terminal(pty=true)
Codex is an interactive CLI app that requires a PTY. Use terminal(pty=true) for reliable execution.
Note: Codex may have TTY issues in background mode. Foreground is more reliable.
Resilience
Circuit Breaker
When OpenCode or Codex hits a provider rate limit (e.g., OpenAI 429), the process may block indefinitely. The circuit breaker detects this and recommends immediate action.
| Severity | Reset Time | Action |
|---|---|---|
| Critical | > 5 minutes | Kill worker, respawn on alternate provider immediately |
| High | > 1 minute | Kill worker, respawn on alternate provider immediately |
| Medium | < 1 minute | Brief 30s pause, retry same provider once; if still limited, respawn on alternate |
| Both limited | — | Queue the task, synthesize what you have, flag remaining for user review |
Circuit-breaker respawns count against the max 2 respawns per subtask limit.
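The severity table maps onto a simple decision function (a sketch; real detection lives in scripts/circuit_breaker.py, and the threshold values mirror the table above):

```python
def breaker_action(reset_seconds: float, both_limited: bool = False):
    """Map a reported rate-limit reset time to a (severity, action) pair."""
    if both_limited:
        return "both", "queue-and-flag"
    if reset_seconds > 300:          # Critical: > 5 minutes
        return "critical", "respawn-alternate"
    if reset_seconds > 60:           # High: > 1 minute
        return "high", "respawn-alternate"
    return "medium", "pause-30s-retry-once"  # Medium: < 1 minute
```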
Tools
Helper Scripts
| Script | Purpose |
|---|---|
| `scripts/evaluate_output.py` | Score worker output against criteria (Presence / Correctness / Completeness / Quality) |
| `scripts/evaluate_worker_logs.py` | Parse worker terminal output into structured JSON findings |
| `scripts/circuit_breaker.py` | Detect rate-limit / API exhaustion errors and recommend respawn actions |
| `scripts/synthesize_outputs.py` | Merge multiple worker outputs into one deliverable |
Lessons Learned
Known Pitfalls
terminal(background=true) with opencode run hangs indefinitely
The opencode TUI blocks on a pseudo-TTY that never receives input. Output stream stays empty and the process never completes. Always spawn OpenCode workers via execute_code with Python subprocess.Popen.
Shell quoting with opencode run is fragile
Complex prompts containing quotes, newlines, or backticks break shell parsing when passed directly to terminal(command="opencode run '...'"). Prefer stdin piping via execute_code for all non-trivial prompts.
Multiple workers modifying the same config files will overwrite each other
If Worker A updates package.json and Worker B updates package.json, copying both sequentially causes the second copy to overwrite the first. Always diff and merge config files rather than blind copy.
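A diff-and-merge pass can be sketched like this (hypothetical helper under a stated policy: each worker's changes are applied relative to the original, and later workers win on genuinely conflicting keys — a deliberate, visible choice rather than an accident of copy order):

```python
def merge_configs(original: dict, *worker_versions: dict) -> dict:
    """Merge worker edits to a shared config (e.g. a parsed package.json).
    Only keys a worker actually changed relative to the original are
    applied, so non-overlapping edits from different workers both survive."""
    merged = dict(original)
    for changed in worker_versions:
        for key, value in changed.items():
            if original.get(key) != value:  # this worker changed the key
                merged[key] = value
    return merged
```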
Worker sandboxing is strict
OpenCode cannot read files outside its workdir — confirmed by permission denied errors. Always cp -r project/* workdir/ before spawning.
Synthesis picks up node_modules noise
When workers copy entire projects, synthesize_outputs.py will ingest node_modules READMEs. Always instruct workers to write findings to a specific file (e.g., review.md) and read only that file during synthesis.
Git init timing matters
If you git init an empty workdir before copying project files, git won't track the copied files as modifications. Either copy files first then git init && git add -A, or use diff original_file modified_file instead of git diff.
Lean copied workdirs can break npm/TypeScript verification
If you exclude node_modules to keep worker copies small, commands like npm run lint or tsc --noEmit may fail inside the worker workdir even though the real repo validates fine. Do not treat missing local toolchain as proof the patch is bad.
Provider API rate limits hang workers silently
When OpenCode or Codex hits a provider rate limit (e.g., OpenAI 429), the process may block indefinitely rather than exiting. There is no automatic circuit breaker. Mitigations: pre-check provider quota, set aggressive timeouts (60-90s when limits are tight), respawn on alternate provider immediately after detecting 429.
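The timeout-plus-detection mitigation can be sketched as follows (hypothetical helper; the `429` / `usage_limit_reached` markers match the failure mode described above, and the worker command is supplied by the caller):

```python
import subprocess

def run_worker_with_timeout(cmd: list[str], prompt: str, timeout: int = 90):
    """Run a worker with an aggressive timeout so a rate-limited process
    is killed instead of hanging indefinitely. Returns (status, output)."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        out, _ = proc.communicate(input=prompt, timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # reap the killed process
        return "timeout", ""
    if "429" in out or "usage_limit_reached" in out:
        return "rate_limited", out  # respawn on the alternate provider
    return "ok", out
```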
Evaluation
E2E Test — 2026-04-21
A live end-to-end test spawning 2 OpenCode workers + 1 Codex worker in parallel. Honest results — no sugarcoating.
Worker A (OpenCode) — CSV-to-JSON CLI
Outcome: TIMEOUT after 300s
Root cause: OpenAI API rate limit (429 usage_limit_reached)
Verdict: Infrastructure PASS (orchestration, skill loading, sandboxing worked). Provider FAIL (API exhaustion).
Worker B (OpenCode) — Trigger Optimization Audit
Outcome: TIMEOUT after 300s (same API limit)
Verdict: Same as Worker A — infrastructure healthy, provider blocked.
Worker C (Codex) — Release Packaging
Status: Not spawned. Blocked by A/B timeouts; would have same API risk.
Key Findings
- Worker isolation works — opencode correctly ran in `/tmp/worker-a` without escaping.
- Skill auto-loading works — the `worker-orchestrator` skill was discovered and evaluated.
- Prompt-file pattern is correct — no shell-quoting failures observed.
- Provider rate-limiting is the critical failure mode. When API limits hit, workers hang indefinitely.
- Monitoring gap — `execute_code` + `subprocess.Popen` gives no live visibility.