`eval_builder`¶

Auto-generate an eval spec (eval.md) for a skill.

Entry¶

analyze_skill

Final output¶

eval_spec_result — path to the generated eval.md, case count, criterion count, and a summary.

How it works¶

Reads the target skill's skill.md and phase files, infers test cases that exercise the graph, and proposes per-phase quality criteria. The user runs the spec separately with reyn eval <eval_md_path>.

When phases use Python preprocessors¶

eval_builder writes DO/DON'T templates for criteria when a phase has a Python step — this avoids "vacuously true" criteria like "char_count is correct" that the LLM judge can't actually verify.

Example¶

reyn run eval_builder "build an eval for my_explainer"

Source¶

src/reyn/stdlib/skills/eval_builder/skill.md

eval_builder¶