4-Digit Sum (September 2023)

Colab: problem | solutions

Difficulty

This problem is probably a step up in difficulty from the August problem. The solution is more involved and may require different or more creative interpretability approaches.

Task & Dataset

The model takes in a sequence of 11 tokens: 4 digits, a + sign, 4 digits, an = sign, and then 1 token representing the first digit of the sum. The model outputs the next 4 characters of the sum (one for each of the last 4 positions). In other words, the model is trained to add two 4-digit numbers.

Note that both the input and the output are given in "little-endian" format (i.e. the units digit comes first, then the tens digit, and so on), because this makes the model easier to train.

Here is an example:

Input:  1234 + 5678 = ?
Output: ????0
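
To make the little-endian layout concrete, here is a minimal sketch of how one training example could be built. It is not the actual dataset code: the token ids PLUS = 10 and EQUALS = 11, the helper names, and the choice to pad the sum to 5 digits (so that the "first digit of the sum" is its units digit and the 4 targets are the remaining digits) are all assumptions made for illustration.

```python
# Hypothetical token ids for the special characters (assumed, not from the source).
PLUS, EQUALS = 10, 11


def to_little_endian(n: int, width: int) -> list[int]:
    """Return the decimal digits of n, units digit first, zero-padded to `width`."""
    return [(n // 10**i) % 10 for i in range(width)]


def encode_example(a: int, b: int) -> tuple[list[int], list[int]]:
    """Build (input tokens, target digits) for a + b under the assumed layout."""
    if not (0 <= a <= 9999 and 0 <= b <= 9999):
        raise ValueError("both operands must fit in 4 digits")
    # Assumption: the sum is padded to 5 digits, since 9999 + 9999 = 19998.
    total = to_little_endian(a + b, 5)
    # 4 digits, +, 4 digits, =, then the first (units) digit of the sum: 11 tokens.
    tokens = (
        to_little_endian(a, 4) + [PLUS] + to_little_endian(b, 4) + [EQUALS] + total[:1]
    )
    # The remaining 4 digits of the sum are what the model has to predict.
    targets = total[1:]
    return tokens, targets


tokens, targets = encode_example(4321, 8765)
print(tokens)   # little-endian operands, separators, units digit of the sum
print(targets)  # tens, hundreds, thousands and ten-thousands digits of the sum
```

Under these assumptions, 4321 + 8765 = 13086 is fed in as the digits 1 2 3 4, the + token, the digits 5 6 7 8, the = token, and the units digit 6, with the remaining sum digits 8 0 3 1 as targets.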

Model

The model is a 3-layer transformer with 3 attention heads and causal attention. It includes layernorm but no MLP layers.