Rinne: composing local AI coding agents into a verified generator–evaluator loop
Rinne is the first product shipped under GIKSN Research. It is a terminal-first orchestration harness that plans work into a JSON DAG, distributes that work across the AI coding tools and model APIs a user already has installed, and drives the DAG to completion through a generator–evaluator loop with critique. This paper argues for the design: a decoupled conductor for cheap planning, a durable loop for long-running work, and a filesystem blackboard as the shared substrate that lets heterogeneous workers collaborate without a bespoke protocol.
Most engineers with a coding-agent subscription end up locked to one vendor and one model. The frontier moves week to week, but the harness you sit inside does not. Rinne is an attempt to invert that: keep the harness stable, keep the workers swappable, and route each subtask to the tool best suited to it, whether that tool is a subscription-backed CLI like claude-code or a raw OpenAI-compatible API call.
Rinne is at v0.1: actively developed, single-machine, and single-user. There is no hosted component, no telemetry, and no account. The rest of this paper argues for the choices behind that shape.
Problem
A user with one or more coding-agent subscriptions, or with raw API keys, or with both, cannot easily compose those assets into a single workflow. Each harness has its own login, its own context model, and its own idea of what a task looks like. Multi-model orchestration exists commercially, but it charges again on top of what the user has already paid for, and it introduces a hosted middleman.
Behind that surface complaint sits a research question: what is the minimum interface a heterogeneous pool of AI coding tools needs to share so that a planner can compose them into an ad-hoc team for a single task and verify the result? Rinne argues the answer is small.
Approach
Rinne unifies two ideas from the last two years of AI-agent research. The first is the conductor: a small, cheap model that composes a team of larger workers for a given task, instead of one monolithic worker doing everything. The second is the loop: state kept durably on disk, a generator whose output goes to an evaluator, and the evaluator's critique fed back to the generator until the goal is met or a budget runs out.
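The loop idea can be illustrated with a minimal Python sketch. This is not Rinne's implementation: the names run_loop and Verdict, the attempt-count budget, and the toy generator and evaluator are all assumptions made for illustration, and durable on-disk state is left out to keep the control flow visible.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical evaluator result: pass/fail plus a critique on failure."""
    passed: bool
    critique: str = ""


def run_loop(
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str, str], Verdict],
    goal: str,
    max_attempts: int = 3,
) -> tuple[bool, str, int]:
    """Generate, evaluate, and feed critique back until pass or budget exhausted."""
    critique: Optional[str] = None
    output = ""
    for attempt in range(1, max_attempts + 1):
        output = generate(goal, critique)   # the generator sees the last critique
        verdict = evaluate(goal, output)    # the evaluator gates the output
        if verdict.passed:
            return True, output, attempt
        critique = verdict.critique
    return False, output, max_attempts


# Toy stand-ins for a real worker and a real evaluator.
def toy_generate(goal: str, critique: Optional[str]) -> str:
    return goal.upper() if critique else goal


def toy_evaluate(goal: str, output: str) -> Verdict:
    if output.isupper():
        return Verdict(True)
    return Verdict(False, "output must be upper-case")


result = run_loop(toy_generate, toy_evaluate, "add a health check")
```

Here the first attempt fails, the critique reaches the generator, and the second attempt passes. A real budget could just as well be measured in tokens, cost, or wall-clock time; an attempt count is simply the easiest to show.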
The synthesis is a three-layer harness. A conductor plans a JSON DAG from the goal, a digest of the blackboard, and the live worker registry. A loop engine schedules that DAG across workers, gates each node with an evaluator, and re-plans on failure. A blackboard on the filesystem holds the plan, the progress, and the working outputs, and it is the only channel workers need to share.
The conductor composes a per-task team, the loop drives verification, and the filesystem is the substrate that lets heterogeneous workers collaborate.
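To make the plan-and-blackboard idea concrete, here is a small Python sketch of a JSON DAG driven to completion with a directory as the only shared state. The plan schema (id, worker, deps), the file names plan.json, progress.json, and <id>.out, and the worker names are hypothetical; the paper does not specify Rinne's actual formats. Evaluator gating and re-planning are omitted.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical plan shape: node ids, a worker name, and dependency edges.
PLAN_JSON = """
{
  "goal": "add a health check endpoint",
  "nodes": [
    {"id": "tests", "worker": "api-small", "deps": ["impl"]},
    {"id": "impl",  "worker": "harness-a", "deps": ["spec"]},
    {"id": "spec",  "worker": "api-small", "deps": []}
  ]
}
"""


def ready_nodes(plan: dict, done: set[str]) -> list[str]:
    """Return ids of unfinished nodes whose dependencies are all finished."""
    return [
        n["id"]
        for n in plan["nodes"]
        if n["id"] not in done and all(d in done for d in n["deps"])
    ]


def drive(plan: dict, board: Path, run_node) -> list[str]:
    """Run the DAG to completion, persisting plan and progress to the blackboard."""
    board.mkdir(parents=True, exist_ok=True)
    (board / "plan.json").write_text(json.dumps(plan, indent=2))
    progress_file = board / "progress.json"
    # Resume from disk if an earlier run left progress behind.
    done = json.loads(progress_file.read_text()) if progress_file.exists() else []
    while len(done) < len(plan["nodes"]):
        ready = ready_nodes(plan, set(done))
        if not ready:
            raise ValueError("plan has a cycle or an unknown dependency")
        for node_id in ready:
            output = run_node(node_id, board)
            (board / f"{node_id}.out").write_text(output)
            done.append(node_id)
            progress_file.write_text(json.dumps(done))
    return done


def echo_worker(node_id: str, board: Path) -> str:
    """Toy worker: reports which outputs it can already see on the blackboard."""
    seen = sorted(p.stem for p in board.glob("*.out"))
    return f"{node_id} saw {seen}"


plan = json.loads(PLAN_JSON)
with tempfile.TemporaryDirectory() as tmp:
    board = Path(tmp) / "blackboard"
    order = drive(plan, board, echo_worker)
    impl_output = (board / "impl.out").read_text()
```

The toy worker never talks to another worker directly. It only reads what earlier nodes wrote to the directory, which is the property the blackboard is meant to provide.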
The worker contract
Rinne defines one worker interface and two families that implement it. A harness worker wraps a native headless call to an existing coding-agent CLI (for example claude -p, codex exec, or opencode). It is autonomous: give it a chunky, self-contained subtask and it does its own reading and editing. An API worker is a direct OpenAI-compatible model call on the user's own key. It is raw: give it a precise instruction with inlined context and it returns one focused result.
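One way to picture a single interface with two families is the following Python sketch. It is an illustration under stated assumptions, not Rinne's actual contract: the Task fields, the Worker protocol, and the class names are hypothetical, the harness worker assumes the CLI accepts the prompt as its final argument, and the API worker takes an injected complete function rather than calling any real client library.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass
class Task:
    instruction: str
    context: str = ""    # inlined context, used by API workers
    workdir: str = "."   # directory a harness worker reads and edits


class Worker(Protocol):
    """Hypothetical shared contract: a name and one run method."""
    name: str

    def run(self, task: Task) -> str: ...


@dataclass
class HarnessWorker:
    """Wraps a headless CLI call; the CLI does its own reading and editing."""
    name: str
    argv: list[str]  # assumption: the prompt is appended as the last argument

    def run(self, task: Task) -> str:
        proc = subprocess.run(
            self.argv + [task.instruction],
            cwd=task.workdir, capture_output=True, text=True, check=True,
        )
        return proc.stdout


@dataclass
class ApiWorker:
    """Wraps a single model call; all context must be inlined into the prompt."""
    name: str
    complete: Callable[[str], str]  # hypothetical: prompt in, completion out

    def run(self, task: Task) -> str:
        prompt = f"{task.instruction}\n\n{task.context}".strip()
        return self.complete(prompt)


# Stand-ins so the sketch runs without any agent CLI or API key installed:
# a Python one-liner plays the CLI, and str.upper plays the model.
fake_cli = [sys.executable, "-c", "import sys; print(sys.argv[-1])"]
workers: list[Worker] = [
    HarnessWorker(name="harness-demo", argv=fake_cli),
    ApiWorker(name="api-demo", complete=str.upper),
]
outputs = [w.run(Task("say hi", context="ctx")) for w in workers]
```

Because both families expose the same run method, a scheduler can hold them in one list and dispatch without knowing which kind it is calling. The difference shows up only in what the task must carry: a working directory for the autonomous worker, inlined context for the raw one.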
