☆ Bonus / exploring anomalies

Problem Setup: Parsing or rephrasing the problem
Plan Generation: Stating or deciding on a plan of action
Fact Retrieval: Recalling facts, formulas, problem details
Active Computation: Algebra, calculations, manipulations
Uncertainty Management: Expressing confusion, re-evaluating, backtracking
Result Consolidation: Aggregating intermediate results, summarizing
Self Checking: Verifying previous steps
Final Answer Emission: Explicitly stating the final answer

Here are some directions for further exploration:

On-Policy vs Off-Policy Interventions

The Thought Branches paper compares: - On-policy: Resample from the model (what we've been doing) - Off-policy: Hand-edit the text or transplant from another model

They find that off-policy interventions are less stable - the model often "notices" the edit and behaves unexpectedly.

Faithfulness Analysis

The paper also studies unfaithfulness - cases where the model's CoT doesn't reflect its actual reasoning. They use "hinted" MMLU problems where a teacher provides a (possibly wrong) hint.

Open-Ended Exploration

Apply these techniques to your own model/domain
Investigate the relationship between model size and thought anchor patterns
Look for receiver heads in other model architectures
Compare thought anchor patterns across different reasoning tasks