Monthly Algorithmic Problems

This is the homepage for the ARENA Monthly Algorithmic Problems sequence. These problems were designed in the spirit of Stephen Casper's challenges, but with the more specific aim of fitting alongside the rest of the ARENA material and helping people put into practice what they've learned so far.

This is a series of 7 algorithmic problems, run periodically between mid-2023 and late 2024. They are designed to test some of the skills and tools you'll have gathered during the rest of this section. Note that these are better thought of as fun challenges / hackathon-type problems than as opportunities to learn about specific topics or tools. We therefore recommend not attempting them while you're working through the ARENA material as part of a structured program (except as a hackathon).

Available Problems

Each problem below links to a problem description and, where available, detailed solutions. The problems are listed in reverse chronological order (newest first).

Problem	Date	Description	Colab Links
Trigrams	Nov 2024	Predict the next token in sequences containing special trigram patterns	problem
Caesar Cipher	Jan 2024	Decode sequences encrypted with a Caesar cipher	problem | solutions
Cumulative Sum	Nov 2023	Compute running cumulative sums of sequences	problem | solutions
Sorted List	Oct 2023	Determine if a sequence of numbers is sorted	problem | solutions
4-Digit Sum	Sep 2023	Add two 4-digit numbers together	problem | solutions
First Unique Character	Aug 2023	Find the first unique character in a string	problem | solutions
Palindromes	Jul 2023	Classify whether a sequence is a palindrome	problem | solutions

Prerequisites

The following ARENA material should be considered essential for all problems:

[1.1] Transformer from scratch (sections 1-3)
[1.2] Intro to Mech Interp (sections 1-3)

The following material isn't essential, but is recommended:

[1.2] Intro to Mech Interp (section 4)
[1.5.1] Balanced Bracket Classifier (all sections)

Motivation

Neel Nanda's post 200 COP in MI: Interpreting Algorithmic Problems does a good job of explaining the motivation behind solving algorithmic problems such as these. I'd strongly recommend reading the whole post, because it also gives some high-level advice for approaching such problems.

The main purpose of these challenges isn't to break new ground in mech interp. Rather, they're designed to help you practice using, and develop a better understanding of, standard MI tools (e.g. interpreting attention, direct logit attribution), and more generally to get comfortable working with libraries like TransformerLens.

Also, they're hopefully pretty fun, because why shouldn't we have some fun while we're learning?

What counts as a solution?

Going through the exercises in [1.5.1] Balanced Bracket Classifier should give you a good idea of what a full solution looks like. In particular, I'd expect you to:

Describe a mechanism for how the model solves the task, in the form of the QK and OV circuits of various attention heads (and possibly any other mechanisms the model uses, e.g. the direct path, or nonlinear effects from layernorm)
Provide evidence for your mechanism, e.g. with tools like attention plots, targeted ablation / patching, or direct logit attribution
(Optional) Include additional detail, e.g. identifying the linear subspaces that the model uses for certain forms of information transmission, or using your understanding of the model's behaviour to construct adversarial examples
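
As a rough illustration of the kind of evidence mentioned above, here is a minimal sketch of direct logit attribution on a toy example. It is not taken from any of the problems: the component names, dimensions, random weights, and the `direct_logit_attribution` helper are all hypothetical, and it assumes there is no final layernorm, so that the logits are a linear function of the residual stream. In a real solution you'd take the component outputs from a cached forward pass of the actual model rather than from random numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_vocab = 8, 5  # toy sizes, chosen arbitrarily

# Hypothetical outputs that each component writes to the residual stream at
# the final sequence position (illustrative names, not from a real model).
components = {
    "embed": rng.normal(size=d_model),
    "head_0.0": rng.normal(size=d_model),
    "head_0.1": rng.normal(size=d_model),
    "head_1.0": rng.normal(size=d_model),
}
W_U = rng.normal(size=(d_model, d_vocab))  # toy unembedding matrix


def direct_logit_attribution(components, W_U, correct_token, incorrect_token):
    """Per-component contribution to logit(correct) - logit(incorrect).

    Assumes no final layernorm, so logits are linear in the residual stream.
    """
    logit_diff_direction = W_U[:, correct_token] - W_U[:, incorrect_token]
    return {name: float(out @ logit_diff_direction) for name, out in components.items()}


attribution = direct_logit_attribution(components, W_U, correct_token=2, incorrect_token=4)

# Sanity check: the final residual stream is the sum of the component outputs,
# so the contributions should add up to the full logit difference.
final_resid = sum(components.values())
logits = final_resid @ W_U
total_logit_diff = float(logits[2] - logits[4])

# Print components ordered by the size of their contribution.
for name, value in sorted(attribution.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>10}: {value:+.3f}")
```

Components with large positive values push the model towards the correct token, and those with large negative values push it away. With a real model, layernorm breaks the exact linearity assumed here, so the contributions would only approximately sum to the true logit difference unless the layernorm scale is handled explicitly.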