`skill_improver` — review criteria¶

Use this list when you (the skill_improver skill) audit an existing skill. Each criterion maps to one or more of the P1–P8 principles. Findings should be specific, file-anchored, and actionable.

Phase markdown¶

For every phases/<name>.md:

[ ] Frontmatter declares only input (and optionally preprocessor, role, can_finish). No output schema. [P1]
[ ] No enumeration of next-phase artifact fields in the body. [P1, P8]
[ ] No description of Control IR format (e.g. "emit {kind: file, op: write, ...}"). [P8]
[ ] No reference to a sibling phase's name. Transitions live in the skill graph. [P1]
[ ] Body covers WHAT to do, WHEN to pick which candidate, and domain rules — nothing else.
[ ] If the phase has can_finish: true, instructions clearly state when to finish vs. continue.

Skill markdown¶

For skill.md:

[ ] entry, graph, final_output all present. [P2]
[ ] final_output is a single artifact type.
[ ] Every phase referenced in graph has a corresponding phases/<name>.md.
[ ] Every phase whose can_finish: true has a path to end in the graph.
[ ] No unreachable phases.
[ ] No self-loops (graph entries listing themselves).

Artifact schemas¶

For each artifacts/<name>.yaml:

[ ] One artifact = one purpose. No kitchen-sink shapes.
[ ] required is minimal; optional fields are marked optional.
[ ] Field names are lowercase_snake_case.
[ ] No "quality_notes" / "revision_reason" / other meta-feedback fields baked in. [P7]
[ ] No decision fields with skill-specific values (revise, redo). [P7]

Preprocessor¶

For each preprocessor step:

[ ] The step kind is one of run_skill, iterate, validate, lint_plan, python.
[ ] into doesn't collide with an existing input artifact key.
[ ] If python: matching permissions.python entry, .py file exists, function defined, output_schema declared.
[ ] If python mode: safe: no banned constructs in the AST (open, eval, subprocess, etc.).

Eval-friendliness¶

[ ] At least one path produces an output that can be judged against rubric criteria.
[ ] Phase boundaries align with what a judge would assess (one phase = one judgeable output).
[ ] No criteria that would be vacuously satisfied by any reasonable LLM output.

Output of a review¶

For each finding:

[severity]  <file>:<line or section>
  Issue:    <what's wrong>
  Why:      <which principle / pitfall this hits>
  Fix:      <concrete change>

Severity:

error — violates P1–P8 or breaks linting; must fix.
warning — likely cause of bugs but not a hard violation.
info — stylistic or polish-level.

After auditing¶

[ ] Run reyn lint <skill> and include any new findings in the report.
[ ] If proposing changes, write them as diffs the user can review (don't silently rewrite).
[ ] Don't claim "ready" until errors are resolved or explicitly waived.

skill_improver — review criteria¶