Monitorability

Analyzing and Improving Chain-of-Thought Monitorability Through Information Theory

We use information theory to analyze and improve chain-of-thought (CoT) monitorability, proposing training methods that raise monitor accuracy while preventing CoT degeneration.