
For Reyn developers

Orientation for contributors to the Reyn OS core. If you're adding a new op kind, fixing a runtime bug, or extending the event system, start here.

If you're building workflows on top of Reyn rather than modifying the OS itself, see the workflow authoring docs instead.


Read first

CLAUDE.md (in the repo root) — the invariants every code-writing agent (and human contributor) must follow. P1–P8 are hard constraints, not guidelines.

CLAUDE.md — the why behind P1–P8, with worked examples.

principles-and-code.md — P1–P8 mapped to the exact files and classes that enforce them. Read this when you need to find where something lives.


The OS in one paragraph

User → Agent → Workflow → OS → Phase → Workspace

The OS (kernel/runtime.py) is the only component that calls the LLM, executes Control IR ops, validates outputs, and emits events. Workflows describe what to do; the OS decides how to do it. A new workflow must never require an OS change (P7).
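To make the what/how split concrete, here is a minimal sketch, not Reyn's actual runtime. `PhaseSpec`, `MiniRuntime`, and the event names are hypothetical stand-ins invented for illustration. The point is only that a workflow is plain data, while a single runtime object owns the LLM call, output validation, and event emission.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class PhaseSpec:
    """Declarative description of one phase: the 'what'. Hypothetical shape."""
    name: str
    prompt: str
    required_keys: tuple[str, ...]


@dataclass
class MiniRuntime:
    """Toy stand-in for the OS: the only place that calls the LLM and emits events."""
    llm: Callable[[str], dict]
    events: list[dict] = field(default_factory=list)

    def run(self, phases: list[PhaseSpec]) -> dict:
        outputs = {}
        for phase in phases:
            self.events.append({"type": "phase_started", "phase": phase.name})
            result = self.llm(phase.prompt)  # the single LLM call site
            # Validate the output before anything downstream can see it.
            missing = [k for k in phase.required_keys if k not in result]
            if missing:
                self.events.append(
                    {"type": "phase_failed", "phase": phase.name, "missing": missing}
                )
                raise ValueError(f"{phase.name}: missing output keys {missing}")
            self.events.append({"type": "phase_completed", "phase": phase.name})
            outputs[phase.name] = result
        return outputs


# A "workflow" is plain data; adding another one changes nothing in MiniRuntime.
summarise = [
    PhaseSpec(name="summarise", prompt="Summarise the diff", required_keys=("summary",))
]
```

In this sketch, adding a second workflow means writing another list of `PhaseSpec` values. If a new workflow forced a change to `MiniRuntime.run`, that would be the toy equivalent of violating P7.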


How-tos

Adding capabilities

  • Add a new op kind — register a new Control IR operation. Three touch points: model, registry, handler.
  • Write LLMReplay tests — test LLM-dependent behaviour deterministically without live API calls.
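The three touch points for a new op kind (model, registry, handler) can be sketched roughly as follows. This is an illustration under stated assumptions, not the API in registry.py: `OP_REGISTRY`, `register_op`, `EchoOp`, and `execute` are hypothetical names, and a standard-library dataclass stands in for the Pydantic model so the example runs without dependencies. Follow the linked how-to for the real steps.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical registry: op kind -> (model class, handler). Not Reyn's actual API.
OP_REGISTRY: dict[str, tuple[type, Callable]] = {}


def register_op(kind: str, model: type):
    """Decorator that records the model and handler for an op kind."""
    def decorator(handler):
        if kind in OP_REGISTRY:
            raise ValueError(f"op kind already registered: {kind}")
        OP_REGISTRY[kind] = (model, handler)
        return handler
    return decorator


# Touch point 1: the model describing the op's payload
# (a dataclass here; the real code uses Pydantic models).
@dataclass(frozen=True)
class EchoOp:
    text: str
    repeat: int = 1


# Touch points 2 and 3: register the kind and supply its handler.
@register_op("echo", EchoOp)
def handle_echo(op: EchoOp) -> str:
    return " ".join([op.text] * op.repeat)


def execute(raw: dict) -> object:
    """Look up the op kind, build the model from the payload, run the handler."""
    payload = dict(raw)  # copy so the caller's dict is not mutated
    kind = payload.pop("kind")
    if kind not in OP_REGISTRY:
        raise KeyError(f"unknown op kind: {kind}")
    model, handler = OP_REGISTRY[kind]
    return handler(model(**payload))
```

Keeping the kind-to-handler mapping in one place is what lets a dispatcher like `execute` stay unchanged as op kinds are added.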

Benchmarking

  • Run SWE-bench — run Reyn against SWE-bench: solve a single instance, run a batch, and the optional-dep / honest-skip scoring gotcha.

Understanding the system


Key source files

| File | What it does |
| --- | --- |
| src/reyn/core/op_runtime/registry.py | Single source of truth for op kinds, Pydantic models |
| src/reyn/core/context_builder.py | Builds the ContextFrame injected into every LLM call (P4 candidates here) |
| src/reyn/schemas/models.py | All Pydantic models: Phase, Skill, SkillGraph, Op |
| src/reyn/core/events/events.py | Append-only EventLog (P6) |
| src/reyn/data/workspace/workspace.py | Workspace read/write with permission gating (P5) |

Testing policy

Read deep-dives/contributing/testing.md before writing any test. Key rules:

  • Tests belong to exactly one Tier (1: Contract / 2: OS invariant / 3: LLM-replay).
  • Never use MagicMock / AsyncMock / patch on collaborators. Use real instances or LLMReplay.
  • Never assert on private state. Use public surface or snapshot().
  • Tier 4 ("doesn't fit a tier") → don't write it.
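As a rough illustration of the second and third rules, the sketch below tests a unit against a real collaborator that replays recorded responses, and asserts only on return values and `snapshot()`. `RecordedLLM` and `Summariser` are hypothetical classes written for this example. They are not Reyn's LLMReplay, whose actual interface is described in the testing doc.

```python
class RecordedLLM:
    """Hypothetical replay double: returns recorded responses, never calls an API."""

    def __init__(self, recordings: dict[str, str]):
        self._recordings = dict(recordings)

    def complete(self, prompt: str) -> str:
        # An unrecorded prompt fails loudly instead of falling back to a live call.
        if prompt not in self._recordings:
            raise KeyError(f"no recording for prompt: {prompt!r}")
        return self._recordings[prompt]


class Summariser:
    """Hypothetical unit under test that exposes a public snapshot()."""

    def __init__(self, llm):
        self._llm = llm
        self._history = []

    def summarise(self, text: str) -> str:
        summary = self._llm.complete(f"Summarise: {text}")
        self._history.append(summary)
        return summary

    def snapshot(self) -> dict:
        last = self._history[-1] if self._history else None
        return {"calls": len(self._history), "last": last}


def test_summariser_replays_recorded_response():
    llm = RecordedLLM({"Summarise: long diff": "short diff"})
    unit = Summariser(llm)  # real collaborator, no MagicMock or patch
    assert unit.summarise("long diff") == "short diff"
    # Assert on the public surface only, never on unit._history.
    assert unit.snapshot() == {"calls": 1, "last": "short diff"}
```

The test never touches `_history` directly, so the internal representation can change without breaking it.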

The full rationale is in the testing doc. The rules are non-obvious and easy to violate.


See also