Figure Gallery

PI Distillation Research - All Generated Visualizations

Main Results

Main Null Result

PI conditions (online RL, offline distillation, hybrid) yield equivalent final performance across domains.

fig1_main_results.png
Mechanism Decomposition

Mechanism Decomposition

Comparison of training with and without the failure signal, isolating the critic's contribution.

fig2_mechanism_decomposition.png
Method Hierarchy

4-Level Method Hierarchy

Hierarchy of PI methods, from simple rejection sampling through full asynchronous critic loops.

fig3_hierarchy.png
Cross-Domain Scatter

Cross-Domain Scatter

Scatter plot of performance gains across the Lean, MATH, and code domains, showing domain-agnostic patterns.

fig4_cross_domain.png
K Sweep

Retry Count Inverted-U

Performance as a function of retry count K (1, 2, 3, 5), showing diminishing and then negative returns.

fig_k_sweep.png
Baseline Competence Inverted-U

Baseline Competence Inverted-U

Seven-point curve showing that PI gains peak at intermediate baseline competence and vanish at the extremes.

fig_inverted_u_complete.png
Mechanism Waterfall

Mechanism Quantification Waterfall

Waterfall chart decomposing the total PI gain into its constituent mechanism contributions.

fig_mechanism_waterfall.png
Per-Problem Overlap

Cross-Seed Agreement

Overlap analysis showing per-problem solve consistency across independently trained seeds.

per_problem_overlap.png
Amortized Inference

Amortized Inference Gap

Analysis of the gap between online search cost and amortized (distilled) inference performance.

amortized_inference.png