LLM as a constrained decision engine¶

In reyn, the LLM is not the orchestrator. It is a decision-making node that the OS calls between transitions. The OS hands it a small, finite set of choices, and the LLM picks one. Anything outside that set is rejected.

What the LLM is allowed to choose¶

For every phase visit, the OS builds a context frame containing:

the current phase's instructions and input artifact,
the candidate outputs — for each allowed next phase (or end), the input schema it expects,
the available Control IR ops — what side effects are unlocked for this phase.
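
As a rough illustration of the three parts listed above, a context frame could be modelled as a small immutable record. This is a sketch only: the names `ContextFrame`, `Candidate`, `allowed_targets`, the `"end"` sentinel, the schema layout, and the `write_file` op are all hypothetical and not taken from reyn's code.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Candidate:
    """One allowed output: a target phase (or the end) plus its input schema."""
    target: str                    # assumed: phase name, or "end" to finish
    input_schema: dict[str, Any]   # assumed layout: {"type": ..., "required": [...]}


@dataclass(frozen=True)
class ContextFrame:
    """Everything the LLM sees for one phase visit (hypothetical shape)."""
    phase: str
    instructions: str
    input_artifact: dict[str, Any]
    candidates: tuple[Candidate, ...]
    available_ops: frozenset[str]

    def allowed_targets(self) -> set[str]:
        # The finite choice set the LLM must pick from.
        return {c.target for c in self.candidates}


# Example frame for an imaginary "draft" phase.
frame = ContextFrame(
    phase="draft",
    instructions="Write a first draft from the outline.",
    input_artifact={"type": "outline", "sections": ["intro", "body"]},
    candidates=(
        Candidate("review", {"type": "draft", "required": ["text"]}),
        Candidate("end", {"type": "final", "required": ["text"]}),
    ),
    available_ops=frozenset({"write_file"}),
)
```

Keeping the frame immutable reflects the idea that the OS, not the LLM, decides what is on offer for a given visit.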

The LLM responds with a single JSON object:

control — pick one candidate (transition to a phase, or finish),
artifact — data that conforms to the chosen target's input schema,
control_ir — zero or more side-effect ops drawn from the available list.

That's the contract. There is no other channel.
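
To make the contract concrete, here is a minimal sketch of parsing such a response and refusing anything that is not exactly that object. The three top-level keys come from the list above. Everything else is an assumption made for illustration: the `Decision` type, the `parse_decision` helper, the inner `next_phase` and `op` field names, and treating an absent `control_ir` as an empty list.

```python
import json
from dataclasses import dataclass
from typing import Any

ALLOWED_KEYS = {"control", "artifact", "control_ir"}
REQUIRED_KEYS = {"control", "artifact"}  # assumption: control_ir may be omitted


@dataclass(frozen=True)
class Decision:
    control: dict[str, Any]
    artifact: dict[str, Any]
    control_ir: list[dict[str, Any]]


def parse_decision(raw: str) -> Decision:
    """Parse one LLM response; raise ValueError if it breaks the contract."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("response must be a single JSON object")
    extra = set(data) - ALLOWED_KEYS
    if extra:
        raise ValueError(f"unexpected keys: {sorted(extra)}")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return Decision(
        control=data["control"],
        artifact=data["artifact"],
        control_ir=data.get("control_ir", []),
    )


raw_response = """
{
  "control": {"next_phase": "review"},
  "artifact": {"type": "draft", "text": "First pass."},
  "control_ir": [{"op": "write_file", "path": "draft.md"}]
}
"""
decision = parse_decision(raw_response)
```

Rejecting unknown top-level keys is one way to enforce "no other channel": the response either fits the object or it is not accepted.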

Why this is not "the LLM is a tool the OS calls"¶

It would be more accurate to flip the framing: the LLM is the decision policy, and the OS provides the constrained action space. The OS is not the LLM's tool — it's the rule-keeper that bounds what the LLM can do.

This bounding is what gives reyn its three guarantees:

Replayable. A saved event log fully captures the workflow; a re-run on the same inputs follows the same edges (modulo the LLM's stochasticity within each phase).
Validatable. Every artifact is checked against the target schema before the OS commits the transition. A malformed output triggers a re-prompt, not a crash and not silent drift.
Extensible. Because the LLM only picks from the OS-injected candidate set, adding a new phase or new control op never requires retraining or prompt-engineering — the OS just exposes one more option.

The "what if the LLM is wrong?" cases¶

| The LLM emits… | The OS does |
| --- | --- |
| A `next_phase` not in the graph | Reject; emit `validation_error`; re-prompt |
| An `artifact` whose `type` doesn't match | Reject; emit `validation_error`; re-prompt |
| Required schema fields missing | Reject; emit `validation_error`; re-prompt |
| Control IR ops the phase didn't declare | Reject; emit `permission_denied` |
| Free-form text outside the JSON contract | Normalizer attempts a recovery; if it fails, emit `normalization_error` |
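
The structured rows of the table can be sketched as a single check that returns the event name for the first rule violated. This is an illustrative sketch, not reyn's validator: the `check_decision` function, the shape of `candidates` (target name mapped to an expected `type` and a list of `required` fields), the `op` key on each Control IR entry, and the order of the checks are all assumptions. The event names are the ones from the table. The normalization row is left out because it applies before a JSON object exists.

```python
from typing import Any, Optional


def check_decision(
    decision: dict[str, Any],
    candidates: dict[str, dict[str, Any]],
    declared_ops: set[str],
) -> Optional[str]:
    """Return the event to emit for the first violation, or None if accepted."""
    target = decision.get("control", {}).get("next_phase")
    if target not in candidates:
        return "validation_error"      # next_phase not in the graph

    schema = candidates[target]
    artifact = decision.get("artifact", {})
    if artifact.get("type") != schema["type"]:
        return "validation_error"      # artifact type mismatch
    if any(name not in artifact for name in schema["required"]):
        return "validation_error"      # required fields missing

    for op in decision.get("control_ir", []):
        if op.get("op") not in declared_ops:
            return "permission_denied"  # op the phase did not declare
    return None


candidates = {"review": {"type": "draft", "required": ["text"]}}
declared_ops = {"write_file"}

ok = {
    "control": {"next_phase": "review"},
    "artifact": {"type": "draft", "text": "First pass."},
    "control_ir": [{"op": "write_file"}],
}
```

Note that the function only reports; it never edits the decision it was given.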

After a configurable number of failed re-prompts the run aborts. The OS never silently fixes up the LLM's output.
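
A minimal sketch of that re-prompt-then-abort behaviour follows. The names `decide_with_retries`, `RunAborted`, `ask_llm`, `validate`, and `max_reprompts` are hypothetical, and so is the counting convention (one initial attempt plus up to `max_reprompts` re-prompts). The point it illustrates is that an accepted decision is returned exactly as the LLM produced it.

```python
from typing import Any, Callable, Optional


class RunAborted(Exception):
    """Raised when the re-prompt budget is exhausted (hypothetical name)."""


def decide_with_retries(
    ask_llm: Callable[[Optional[str]], dict[str, Any]],
    validate: Callable[[dict[str, Any]], Optional[str]],
    max_reprompts: int = 2,
) -> dict[str, Any]:
    """Ask, validate, and re-prompt with the error until accepted or out of budget."""
    feedback: Optional[str] = None
    for _ in range(max_reprompts + 1):   # initial attempt + re-prompts
        decision = ask_llm(feedback)     # feedback is None on the first attempt
        error = validate(decision)
        if error is None:
            return decision              # returned unmodified, never patched up
        feedback = error                 # tell the LLM what was rejected
    raise RunAborted(f"no valid decision after {max_reprompts} re-prompts: {feedback}")


# Usage with stand-ins: the fake LLM fails once, then answers correctly.
answers = iter([{"control": {"next_phase": "nowhere"}},
                {"control": {"next_phase": "review"}}])
seen_feedback: list[Optional[str]] = []


def fake_llm(feedback: Optional[str]) -> dict[str, Any]:
    seen_feedback.append(feedback)
    return next(answers)


def fake_validate(decision: dict[str, Any]) -> Optional[str]:
    return None if decision["control"]["next_phase"] == "review" else "validation_error"


result = decide_with_retries(fake_llm, fake_validate)
```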

Why not give the LLM more freedom?¶

Unconstrained LLM control flow is unstable in three measurable ways:

Drift over long runs. Each free choice is a chance to wander off-task. Bounding the choice set keeps the trajectory in the workflow's design.
Untestability. "Will this run eventually finish?" is undecidable in general for a free agent. On a finite graph, whether every phase can still reach an end state is a simple graph search.
No clean re-entry point. When something goes wrong, you want to point to the failing phase. Free-form orchestration has no phases to point at.

So reyn pays the cost of writing skill graphs explicitly and gets predictability in return.
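
The testability point can be illustrated with a small reachability check over a skill graph. This is a sketch under assumptions: the graph is represented here as a plain mapping from phase name to allowed next phases, with `"end"` as the finish marker, and `phases_that_cannot_finish` is a hypothetical helper, not part of reyn. It reports phases from which no path to the end exists; it does not prove that a run through a cycle terminates.

```python
from collections import deque


def phases_that_cannot_finish(graph: dict[str, list[str]], end: str = "end") -> set[str]:
    """Return the phases from which `end` is unreachable."""
    # Build reverse edges so we can walk backwards from the end marker.
    reverse: dict[str, set[str]] = {}
    for src, targets in graph.items():
        for dst in targets:
            reverse.setdefault(dst, set()).add(src)

    # Breadth-first search from `end` over the reversed graph.
    can_finish = {end}
    queue = deque([end])
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, ()):
            if pred not in can_finish:
                can_finish.add(pred)
                queue.append(pred)

    return set(graph) - can_finish


skill_graph = {
    "draft": ["review"],
    "review": ["draft", "end"],   # a cycle that still has an exit
    "stuck": ["stuck"],           # a phase that can only loop on itself
}
dead_ends = phases_that_cannot_finish(skill_graph)
```

A check like this can run when the graph is defined, before any LLM call is made.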
