Retry actually hurts in Lean! Unlike code, Lean's type system makes "just try again" counterproductive. The failure mode is conceptual, not stochastic.
Learning rate sensitivity resolved: lr=1e-6 is optimal for 8B. Too high destabilizes, too low undertrains. Standard hyperparameter finding.
Running the retry loop for 2 iterations compounds gains. Each round filters harder, pushing the model further along the success distribution.
Quantified the frontier-rejection contribution at +1.2pp. Confirms that rejection sampling at the capability boundary adds a modest but real boost.
Temperature annealing helps, but primarily through increased failure signal (higher temp = more failures = stronger selective pressure), not through diversity.
Dynamic curriculum scheduling underperforms static selection. Adaptive difficulty doesn't help when the mechanism is binary pass/fail filtering.
Both code and Lean show similar path diversity metrics. The difference in retry effectiveness isn't about solution space structure but about error signal informativeness.