Add a Python preprocessor step

Goal: Run a Python function before the LLM call to enrich the input artifact with a deterministically computed field (statistics, normalization, structured parses).

When to use

  • The computation is deterministic and you want it run identically every time.
  • It's expensive or error-prone for the LLM (numerical stats, regex parsing, JSON shape transforms).
  • You'd rather pay code-review cost once than prompt-engineering cost forever.

Two modes

| Mode | Sandboxing | Use for |
|---|---|---|
| safe | AST-validated, restricted builtins, allowlisted imports, subprocess | Standard math/stats/regex work |
| unsafe | None (full Python) | File I/O, custom packages, anything safe blocks |

Default to safe. Reach for unsafe only when safe blocks something you actually need.

Step 1 — write the function

<skill_dir>/stats.py:

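def compute(artifact):
    # Read the input text, defaulting to an empty string when the field is absent.
    text = artifact["data"].get("text", "")
    # Return a JSON-serializable dict that matches the declared output_schema.
    return {"word_count": len(text.split())}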

The function takes the input artifact and returns a JSON-serializable dict.
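Before wiring the function into a phase, it can help to call it directly. The sketch below assumes only what the function itself reads, a "data" mapping with a "text" key; the sample artifact is hypothetical, and the real artifact reyn passes may carry more fields.

```python
import json


def compute(artifact):
    # Same logic as stats.py above.
    text = artifact["data"].get("text", "")
    return {"word_count": len(text.split())}


# Hypothetical artifact: only the "data" -> "text" path comes from the function above.
sample_artifact = {"data": {"text": "the quick brown fox"}}

result = compute(sample_artifact)

# json.dumps raises TypeError if the result is not JSON-serializable.
encoded = json.dumps(result)
print(encoded)  # {"word_count": 4}
```

A missing "text" key falls back to the empty string, so the function returns a word count of 0 instead of raising.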

Step 2 — declare it in the phase

phases/draft.md:

---
type: phase
name: draft
input: user_message
preprocessor:
  - python:
      module: stats
      function: compute
      mode: safe
      output_schema:
        type: object
        required: [word_count]
        properties:
          word_count: { type: integer }
      into: stats
---

Use `stats.word_count` to decide whether to summarize or expand the
text.

output_schema is required — the LLM needs to know the shape, and reyn won't run user code at compile time to infer it.
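To make concrete what the schema above promises, here is a minimal hand-rolled check of a return value against it. This is an illustrative sketch, not reyn's validator: `check_against_schema` is a hypothetical helper that handles only the `type`, `required`, and `properties` keywords used in this example.

```python
def check_against_schema(value, schema):
    """Return a list of problems; an empty list means the value fits the schema."""
    type_map = {"object": dict, "integer": int, "string": str}
    kind = schema["type"]
    # bool is a subclass of int in Python, so reject it explicitly for "integer".
    is_bool_as_int = kind == "integer" and isinstance(value, bool)
    if not isinstance(value, type_map[kind]) or is_bool_as_int:
        return [f"expected {kind}, got {type(value).__name__}"]

    problems = []
    if kind == "object":
        for key in schema.get("required", []):
            if key not in value:
                problems.append(f"missing required field: {key}")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                nested = check_against_schema(value[key], sub_schema)
                problems.extend(f"{key}: {p}" for p in nested)
    return problems


# The output_schema from the phase frontmatter, written as a Python dict.
schema = {
    "type": "object",
    "required": ["word_count"],
    "properties": {"word_count": {"type": "integer"}},
}

print(check_against_schema({"word_count": 4}, schema))    # []
print(check_against_schema({"word_count": "4"}, schema))  # ['word_count: expected integer, got str']
```

Running a check like this in a unit test for stats.py catches a shape mismatch before the phase ever runs.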

Step 3 — declare permissions

In the phase frontmatter:

permissions:
  python:
    - module: stats
      function: compute
      mode: safe
      timeout: 30

The module and function must match the ones declared in the preprocessor step.
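The matching rule can be pictured as a set lookup over (module, function) pairs. The sketch below is a hypothetical illustration, with the two frontmatter sections written as plain Python dicts; `unmatched_steps` is not a reyn function, and reyn's own check may compare more than these two fields.

```python
def unmatched_steps(preprocessor_steps, python_permissions):
    """Return the preprocessor steps that have no matching permissions entry."""
    granted = {(p["module"], p["function"]) for p in python_permissions}
    return [
        step
        for step in preprocessor_steps
        if (step["module"], step["function"]) not in granted
    ]


# Values copied from the frontmatter examples above.
steps = [{"module": "stats", "function": "compute", "mode": "safe"}]
permissions = [{"module": "stats", "function": "compute", "mode": "safe", "timeout": 30}]

print(unmatched_steps(steps, permissions))  # []

# A typo in the permissions entry leaves the step without a grant.
typo_permissions = [{"module": "stats", "function": "compute_stats", "mode": "safe"}]
print(unmatched_steps(steps, typo_permissions))
```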

Step 4 — approve at startup

Even safe mode steps need approval the first time they run. To pre-approve them for the whole project:

# reyn.yaml — pre-approve project-wide
permissions:
  python:
    safe: allow

For unsafe:

permissions:
  python:
    unsafe: allow

…and run with --allow-unsafe-python.

What safe mode disallows

  • open, eval, exec, __import__, compile, globals, locals
  • subprocess and other risky modules
  • Imports outside the curated allowlist (math, statistics, json, re, random, time, datetime, …)

Extend the allowlist via reyn.yaml's python.allowed_modules.
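To give a feel for what AST-level validation of this kind involves, here is a minimal sketch built on Python's standard `ast` module. It is not reyn's implementation: `find_violations` is a hypothetical helper, the banned names are the ones listed above, and the allowlist contains only the modules named above (the real list is longer).

```python
import ast

# Names taken from the list above.
BANNED_NAMES = {"open", "eval", "exec", "__import__", "compile", "globals", "locals"}
# Only the modules named above; the actual allowlist contains more.
ALLOWED_MODULES = {"math", "statistics", "json", "re", "random", "time", "datetime"}


def find_violations(source):
    """Parse source without executing it and report disallowed constructs."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            modules = [node.module or ""]
        else:
            modules = []
        for name in modules:
            # Compare the top-level package, so "os.path" is checked as "os".
            if name.split(".")[0] not in ALLOWED_MODULES:
                violations.append(f"line {node.lineno}: import of {name!r} is not allowlisted")
        if isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            violations.append(f"line {node.lineno}: use of {node.id!r} is not allowed")
    return violations


print(find_violations("import math\nx = math.sqrt(4)"))  # []
for problem in find_violations("import subprocess\nf = open('x')"):
    print(problem)
```

Because the source is only parsed, never run, a check like this can reject a function before any of its code executes. A static scan of names is easy to evade on its own, which is presumably why safe mode also restricts builtins and runs in a subprocess.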

See also