ARENA

    Chapter 2: Reinforcement Learning

    Take a whirlwind tour through RL, starting from tabular learning and Atari, and ending with some of the cutting-edge techniques used in current LLM post-training.

    Sections

    2.1 Intro to RL
        RL fundamentals: MDPs, policies, value functions, and multi-armed bandits.
    2.2 DQN & VPG
        Implement DQN and Vanilla Policy Gradient for CartPole and beyond.
    2.3 PPO
        Build a PPO agent from scratch and train it to master CartPole.
    2.4 RLHF
        Implement RLHF end-to-end, applying PPO to language model finetuning.

    ARENA - Alignment Research Engineer Accelerator