Complete Reference

Workflow, Limits & Pitfalls

Everything the skill actually does, exactly as documented. No invented metrics, no fake benchmarks.

Workflow

8-Step Orchestration

Step 1 — Decompose

Break the user's task into 2-5 independent subtasks. Each subtask must be self-contained, produce a concrete artifact, and not depend on other workers' output (unless you plan sequential phases).

Step 2 — Spawn Workers

Launch up to 6 workers across two provider pools. OpenCode workers are spawned via execute_code with Python subprocess.Popen and stdin piping. Codex workers use terminal(pty=true).

OpenCode max: 3 active | Codex max: 3 active | Global max: 6 active
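
The caps above can be enforced with a small counter. The sketch below is illustrative only: `WorkerPool` and its method names are hypothetical and not part of the skill, and it assumes the orchestrator calls `finish` itself when a worker exits.

```python
from collections import Counter

# Caps taken from the limits listed above.
PROVIDER_CAP = {"opencode": 3, "codex": 3}
GLOBAL_CAP = 6


class WorkerPool:
    """Tracks active workers per provider (hypothetical helper)."""

    def __init__(self):
        self.active = Counter()

    def can_spawn(self, provider: str) -> bool:
        under_provider_cap = self.active[provider] < PROVIDER_CAP[provider]
        under_global_cap = sum(self.active.values()) < GLOBAL_CAP
        return under_provider_cap and under_global_cap

    def spawn(self, provider: str) -> None:
        if not self.can_spawn(provider):
            raise RuntimeError(f"{provider} pool is at capacity")
        self.active[provider] += 1

    def finish(self, provider: str) -> None:
        # A finished worker frees one slot in its provider pool.
        if self.active[provider] > 0:
            self.active[provider] -= 1
```

With both per-provider caps at 3, the global cap of 6 is reached only when both pools are full; it is checked separately so the two limits can be tuned independently.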

Step 3 — Monitor

OpenCode workers run inside execute_code, so monitoring is batch-style — you get all output when the subprocess exits. Codex workers on PTY can be polled with process() tools if spawned via terminal(background=true).

Look for: progress markers, error patterns (Error:, Exception:, timeout), completion signals (apply_patch Success, Wrote to...).
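
As a rough sketch of that scan, the markers listed above can be matched line by line. The function name and return shape are assumptions made for illustration; this is not the contents of scripts/evaluate_worker_logs.py, and matching is case-sensitive here.

```python
import re

# Patterns copied from the list above.
ERROR_RE = re.compile(r"Error:|Exception:|timeout")
DONE_RE = re.compile(r"apply_patch Success|Wrote to")


def scan_worker_output(text: str) -> dict:
    """Split worker output into error lines and completion lines."""
    errors, completions = [], []
    for line in text.splitlines():
        if ERROR_RE.search(line):
            errors.append(line.strip())
        elif DONE_RE.search(line):
            completions.append(line.strip())
    return {"errors": errors, "completions": completions}
```

A line that matches both an error pattern and a completion signal is counted as an error, which errs on the side of re-checking the worker.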

Step 4 — Evaluate Output

Every worker is scored against a 4-criterion rubric. PASS requires Presence = Pass and an average of at least 3.0 across the three scored criteria (Correctness, Completeness, Quality).

| Criterion | Scale | Description |
| --- | --- | --- |
| Presence | Pass / Fail | Did the worker produce any output file or result? |
| Correctness | 1-5 | Does output match what the subtask asked for? |
| Completeness | 1-5 | All parts of the subtask addressed? |
| Quality | 1-5 | Professional standard, no obvious errors? |
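
The PASS rule can be written down in a few lines. This is a minimal sketch, not the contents of scripts/evaluate_output.py: the `Score` class is hypothetical, and it assumes the average is taken over the three 1-5 criteria only, since Presence is pass/fail.

```python
from dataclasses import dataclass


@dataclass
class Score:
    presence: bool      # Pass / Fail
    correctness: int    # 1-5
    completeness: int   # 1-5
    quality: int        # 1-5

    def average(self) -> float:
        # Assumption: Presence is a gate and is not part of the average.
        return (self.correctness + self.completeness + self.quality) / 3

    def passed(self) -> bool:
        return self.presence and self.average() >= 3.0
```

For example, scores of 4, 3, and 2 average exactly 3.0 and pass, while a worker that produced no output fails regardless of the other scores.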

Step 5 — Feedback & Respawn

Failed workers receive specific corrective feedback and are respawned. Max 2 respawns per subtask (3 total attempts). After 3 failures, synthesize what you have or flag for user review.

Circuit-breaker respawns (for provider rate-limiting) bypass quality analysis — the failure is infrastructure, not task quality. Immediately respawn on the alternate provider.
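
The quality-driven respawn loop might look like the sketch below. `run_attempt` and `evaluate` are hypothetical callables supplied by the orchestrator: the first launches a worker with optional corrective feedback, the second returns a pass flag plus feedback for the next attempt. Circuit-breaker respawns are not modeled here.

```python
from typing import Callable, Optional, Tuple

MAX_RESPAWNS = 2  # 3 total attempts per subtask


def run_with_respawns(
    run_attempt: Callable[[Optional[str]], str],
    evaluate: Callable[[str], Tuple[bool, str]],
) -> Tuple[str, int, bool]:
    """Return (last_output, attempts_used, passed)."""
    feedback: Optional[str] = None
    output = ""
    for attempt in range(1, MAX_RESPAWNS + 2):
        output = run_attempt(feedback)
        passed, feedback = evaluate(output)
        if passed:
            return output, attempt, True
    # All attempts failed: the caller synthesizes what exists or flags it.
    return output, MAX_RESPAWNS + 1, False
```

Returning the last output even on failure leaves the caller free to synthesize partial results or flag the subtask for user review.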

Step 6 — Dynamic Worker Spawning

The orchestrator does not stop after the first batch. If a worker's output reveals new subtasks or gaps, spawn additional workers. If user feedback arrives mid-orchestration, spawn workers to address it.

Finished workers no longer count toward the 6-worker global cap or the 3-per-provider cap.

Step 7 — Provider Rebalancing

Treat OpenCode and Codex as two concurrent worker pools, not a fallback chain. If one provider is saturated, place the next eligible subtask on the other. If a worker fails, decide whether the failure was prompt-specific (retry same provider) or provider-specific (reassign to other pool).
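
One way to express these placement rules is sketched below. The function names and the "prompt" / "provider" failure labels are illustrative assumptions, and choosing the less-loaded pool when both have room is a design choice made here, not something the skill prescribes.

```python
from typing import Dict, Optional

PROVIDERS = ("opencode", "codex")
OTHER = {"opencode": "codex", "codex": "opencode"}


def place(active: Dict[str, int], cap: int = 3) -> Optional[str]:
    """Pick a provider with a free slot, or None if both are saturated."""
    candidates = [p for p in PROVIDERS if active.get(p, 0) < cap]
    if not candidates:
        return None
    return min(candidates, key=lambda p: active.get(p, 0))


def reassign(provider: str, failure_kind: str) -> str:
    """Retry on the same provider for prompt issues, switch for provider issues."""
    if failure_kind == "prompt":
        return provider
    if failure_kind == "provider":
        return OTHER[provider]
    raise ValueError(f"unknown failure kind: {failure_kind}")
```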

Step 8 — Synthesize

Combine all worker outputs into a single coherent deliverable. Read terminal logs and specific output files. Ignore node_modules, vendor dirs, and auto-generated noise. Resolve conflicts between worker outputs and credit which worker produced what.
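
A minimal sketch of the file-gathering part of synthesis follows. It is not the contents of scripts/synthesize_outputs.py; the function name is hypothetical, the default file name `review.md` is only an example, and skipping `.git` is an added assumption.

```python
import os
from pathlib import Path

# Directories treated as noise; ".git" is an assumption added here.
IGNORED_DIRS = {"node_modules", "vendor", ".git"}


def collect_outputs(workdir, filename="review.md"):
    """Find the named output file under workdir, skipping noise directories."""
    found = []
    for root, dirs, files in os.walk(workdir):
        # Prune in place so os.walk never descends into ignored directories.
        dirs[:] = [d for d in dirs if d not in IGNORED_DIRS]
        if filename in files:
            found.append(Path(root) / filename)
    return sorted(found)
```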

Spawning Methods

How Workers Are Launched

OpenCode — execute_code + subprocess

The reliable method uses Python subprocess.Popen with stdin piping inside an execute_code block. This avoids shell-quoting fragility with multi-line prompts.

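    import os
    import pathlib
    import subprocess

    workdir = pathlib.Path("/tmp/worker-a")
    workdir.mkdir(parents=True, exist_ok=True)

    # Write the prompt to a file so no shell quoting is involved.
    prompt_path = workdir / "prompt.txt"
    prompt_path.write_text("YOUR MULTI-LINE PROMPT HERE")

    # Pipe the prompt file to opencode on stdin and capture combined output.
    with open(prompt_path, "r") as prompt_file:
        proc = subprocess.Popen(
            ["opencode", "run", "--dir", str(workdir)],
            cwd=workdir,
            stdin=prompt_file,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            env=os.environ.copy(),
        )
        try:
            stdout, _ = proc.communicate(timeout=300)
        except subprocess.TimeoutExpired:
            # communicate() does not kill the child on timeout.
            proc.kill()
            stdout, _ = proc.communicate()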

Trade-off: Batch output only — no live process() monitoring. You get all output when the process exits.

Codex — terminal(pty=true)

Codex is an interactive CLI app that requires a PTY. Use terminal(pty=true) for reliable execution.

terminal(command="codex exec 'YOUR_TASK_PROMPT_HERE'", workdir="/tmp/worker-b", pty=true)

Note: Codex may have TTY issues in background mode. Foreground is more reliable.

Resilience

Circuit Breaker

When OpenCode or Codex hits a provider rate limit (e.g., OpenAI 429), the process may block indefinitely. The circuit breaker detects this and recommends immediate action.

| Severity | Reset Time | Action |
| --- | --- | --- |
| Critical | > 5 minutes | Kill worker, respawn on alternate provider immediately |
| High | > 1 minute | Kill worker, respawn on alternate provider immediately |
| Medium | < 1 minute | Brief 30s pause, retry same provider once; if still limited, respawn on alternate |
| Both limited | | Queue the task, synthesize what you have, flag remaining for user review |

Circuit-breaker respawns count against the max 2 respawns per subtask limit.
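
The severity table maps onto a short decision function. This sketch is not the contents of scripts/circuit_breaker.py: the function names and action labels are hypothetical, and because the table does not say how a reset time of exactly one minute is classified, it is treated as High here.

```python
def classify_severity(reset_seconds: float) -> str:
    if reset_seconds > 300:
        return "critical"
    # Assumption: exactly 60s is unspecified in the table; treated as high.
    if reset_seconds >= 60:
        return "high"
    return "medium"


def recommend_action(reset_seconds: float, both_limited: bool = False) -> str:
    if both_limited:
        # Neither pool can take the task: queue it and flag for review.
        return "queue"
    if classify_severity(reset_seconds) in ("critical", "high"):
        return "respawn_alternate"
    return "pause_30s_then_retry_once"
```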

Tools

Helper Scripts

| Script | Purpose |
| --- | --- |
| scripts/evaluate_output.py | Score worker output against criteria (Presence / Correctness / Completeness / Quality) |
| scripts/evaluate_worker_logs.py | Parse worker terminal output into structured JSON findings |
| scripts/circuit_breaker.py | Detect rate-limit / API exhaustion errors and recommend respawn actions |
| scripts/synthesize_outputs.py | Merge multiple worker outputs into one deliverable |

Lessons Learned

Known Pitfalls

terminal(background=true) with opencode run hangs indefinitely

The opencode TUI blocks on a pseudo-TTY that never receives input. Output stream stays empty and the process never completes. Always spawn OpenCode workers via execute_code with Python subprocess.Popen.

Shell quoting with opencode run is fragile

Complex prompts containing quotes, newlines, or backticks break shell parsing when passed directly to terminal(command="opencode run '...'"). Prefer stdin piping via execute_code for all non-trivial prompts.

Multiple workers modifying the same config files will overwrite each other

If Worker A updates package.json and Worker B updates package.json, copying both sequentially causes the second copy to overwrite the first. Always diff and merge config files rather than blind copy.
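
For JSON configs such as package.json, a merge can be sketched as a recursive dictionary union that reports conflicting keys instead of silently overwriting them. `merge_dicts` is a hypothetical helper; keeping the first worker's value on conflict is an arbitrary choice made for this example, and flagged keys still need manual review.

```python
def merge_dicts(a: dict, b: dict, path: str = ""):
    """Merge b into a copy of a; return (merged, conflicting_key_paths)."""
    merged = dict(a)
    conflicts = []
    for key, b_val in b.items():
        here = f"{path}.{key}" if path else key
        if key not in merged:
            merged[key] = b_val
        elif isinstance(merged[key], dict) and isinstance(b_val, dict):
            merged[key], sub = merge_dicts(merged[key], b_val, here)
            conflicts.extend(sub)
        elif merged[key] != b_val:
            # Keep a's value and flag the key for manual review.
            conflicts.append(here)
    return merged, conflicts
```

In practice the two inputs would come from `json.load` on each worker's copy of the file, and the merged result would be written back with `json.dump` once the conflicts are resolved.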

Worker sandboxing is strict

OpenCode cannot read files outside its workdir — confirmed by permission denied errors. Always cp -r project/* workdir/ before spawning.
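
When the orchestrator is already running Python inside execute_code, the same copy can be done with `shutil.copytree`. This is a sketch under assumptions: `prepare_workdir` is a hypothetical name, and unlike `cp -r project/*` it also copies top-level dotfiles.

```python
import shutil
from pathlib import Path


def prepare_workdir(project, workdir, exclude=()):
    """Copy the project tree into the worker's sandbox before spawning."""
    shutil.copytree(
        project,
        workdir,
        ignore=shutil.ignore_patterns(*exclude),
        dirs_exist_ok=True,  # requires Python 3.8+
    )
    return Path(workdir)
```

Passing `exclude=("node_modules",)` keeps the copy small, at the cost of local lint or type-check commands possibly failing inside the workdir.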

Synthesis picks up node_modules noise

When workers copy entire projects, synthesize_outputs.py will ingest node_modules READMEs. Always instruct workers to write findings to a specific file (e.g., review.md) and read only that file during synthesis.

Git init timing matters

If you git init an empty workdir before copying project files, git won't track the copied files as modifications. Either copy files first then git init && git add -A, or use diff original_file modified_file instead of git diff.
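
The file-to-file comparison can also be done without git or the diff binary. The sketch below uses the standard-library `difflib`; `file_diff` is a hypothetical helper name.

```python
import difflib
from pathlib import Path


def file_diff(original, modified) -> str:
    """Return a unified diff between two text files ('' if identical)."""
    a = Path(original).read_text().splitlines(keepends=True)
    b = Path(modified).read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(a, b, fromfile=str(original), tofile=str(modified))
    )
```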

Lean copied workdirs can break npm/TypeScript verification

If you exclude node_modules to keep worker copies small, commands like npm run lint or tsc --noEmit may fail inside the worker workdir even though the real repo validates fine. Do not treat missing local toolchain as proof the patch is bad.

Provider API rate limits hang workers silently

When OpenCode or Codex hits a provider rate limit (e.g., OpenAI 429), the process may block indefinitely rather than exiting. Nothing kills or reroutes a hung worker automatically; the circuit breaker only detects the condition and recommends an action. Mitigations: pre-check provider quota, set aggressive timeouts (60-90s when limits are tight), and respawn on the alternate provider immediately after detecting a 429.

Evaluation

E2E Test — 2026-04-21

A live end-to-end test spawning 2 OpenCode workers + 1 Codex worker in parallel. Honest results — no sugarcoating.

Worker A (OpenCode) — CSV-to-JSON CLI

Outcome: TIMEOUT after 300s

Root cause: OpenAI API rate limit (429 usage_limit_reached)

Verdict: Infrastructure PASS (orchestration, skill loading, sandboxing worked). Provider FAIL (API exhaustion).

Worker B (OpenCode) — Trigger Optimization Audit

Outcome: TIMEOUT after 300s (same API limit)

Verdict: Same as Worker A — infrastructure healthy, provider blocked.

Worker C (Codex) — Release Packaging

Status: Not spawned. Blocked by the A/B timeouts; it would have faced the same API risk.

Key Findings

  • Worker isolation works — opencode correctly ran in /tmp/worker-a without escaping.
  • Skill auto-loading works — worker-orchestrator skill was discovered and evaluated.
  • Prompt-file pattern is correct — no shell-quoting failures observed.
  • Provider rate-limiting is the critical failure mode. When API limits hit, workers hang indefinitely.
  • Monitoring gap — execute_code + subprocess.Popen gives no live visibility.