eval_builder
Auto-generate an eval spec (eval.md) for a skill.
Entry
analyze_skill
Final output
eval_spec_result — path to the generated eval.md, case count, criterion count, and a summary.
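As a rough illustration of the shape described above, the result could be modeled as a small record. This is only a sketch: the class name, the field names, and the sample values are assumptions made for illustration, since the source lists what the result contains but not how it is keyed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalSpecResult:
    """Hypothetical model of eval_spec_result; field names are assumed."""

    eval_md_path: str      # path to the generated eval.md
    case_count: int        # number of test cases in the spec
    criterion_count: int   # number of quality criteria in the spec
    summary: str           # short human-readable summary


# Illustrative values only, not real output from eval_builder.
result = EvalSpecResult(
    eval_md_path="skills/example/eval.md",
    case_count=4,
    criterion_count=9,
    summary="4 cases, 9 per-phase criteria.",
)

# The spec is run separately, using the path from the result.
command = f"reyn eval {result.eval_md_path}"
print(command)
```

Keeping the path in the result is what lets the follow-up reyn eval call be built without guessing where the spec was written.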
How it works
eval_builder reads the target skill's skill.md and phase files, infers test cases that exercise the skill's graph, and proposes quality criteria for each phase. It only generates the spec; the user runs it separately with reyn eval <eval_md_path>.
When phases use Python preprocessors
When a phase has a Python step, eval_builder writes its criteria from DO/DON'T templates. This avoids vacuously true criteria such as "char_count is correct", which the LLM judge cannot actually verify.
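To make the DO/DON'T distinction concrete, here is a minimal sketch of a check that flags criteria of the form "<field> is correct". It is not eval_builder's implementation: the regular expression, the is_vacuous helper, and the wording of the DO example are all assumptions chosen for illustration.

```python
import re

# Hypothetical heuristic: a criterion that only asserts a field "is correct"
# gives the judge nothing observable to check against the output.
VACUOUS = re.compile(
    r"^\s*[\w.]+\s+is\s+(correct|accurate|valid|right)\.?\s*$",
    re.IGNORECASE,
)


def is_vacuous(criterion: str) -> bool:
    """Return True when the criterion matches the '<field> is correct' shape."""
    return bool(VACUOUS.match(criterion))


criteria = [
    # DON'T: the judge cannot recompute char_count, so this always "passes".
    "char_count is correct",
    # DO (one plausible phrasing): something visible in the phase output.
    "The reply states the char_count value and does not contradict it elsewhere",
]

flags = [is_vacuous(c) for c in criteria]
for criterion, flag in zip(criteria, flags):
    print("DON'T" if flag else "DO", "-", criterion)
```

The point of the pattern is that a value computed by a Python step is better referenced through what the judge can observe in the output than through a bare claim of correctness.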