[3.4] - LLM Agents

Colab: exercises | solutions

Please send any problems / bugs on the #errata channel in the Slack group, and ask any questions on the dedicated channels for this chapter of material.

If you want to change to dark mode, you can do this by clicking the three horizontal lines in the top-right, then navigating to Settings → Theme.

Links to all other chapters: (0) Fundamentals, (1) Transformer Interpretability, (2) RL.

Introduction

This set of exercises spans 2 days and involves building and working with LLM agents. An LLM agent consists of a scaffolding program interacting with an LLM API. We'll also build and analyse two tasks for LLM agents to complete (one simple, one complex), in order to see how LLM agents behave in practice.

We'll begin by building a simple Arithmetic Task and Arithmetic Agent. This should teach you the basics of function calling via the OpenAI API (Anthropic's API has minor differences, but operates in essentially the same way). Then, once we're comfortable with function calling and the general setup of LLM agents and tasks, we'll move on to building a more complex agent that plays the Wikipedia Game.
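
To give a flavour of what function calling looks like before we build it properly in the exercises, here's a minimal sketch using the OpenAI chat completions API. The calculate tool, its schema, and the model name are illustrative choices, not fixed by the exercises:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# A tool is described to the model as a JSON schema. The tool name,
# description and parameters here are made up for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate an arithmetic expression, e.g. '2 + 3 * 4'.",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "The expression to evaluate.",
                    }
                },
                "required": ["expression"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any tool-capable chat model works here
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
    tools=tools,
)

# If the model decided to use the tool, it returns a tool call (name plus
# JSON-encoded arguments) instead of a plain text answer; the scaffolding
# is then responsible for running the tool and sending the result back.
tool_calls = response.choices[0].message.tool_calls
if tool_calls:
    print(tool_calls[0].function.name, tool_calls[0].function.arguments)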

Then we'll explore a variety of elicitation methods. These are methods for getting the best capabilities out of models, and are crucial for evaluating LLM agents. Looking at model performance under different elicitation methods helps us answer the question "Can the model do this?" Unfortunately, we'll almost never be able to prove that a model doesn't have a capability; we can only say that, despite our best efforts, we couldn't get the model to demonstrate it. This means we'll have to put a lot of effort into trying to elicit the behavior from the model (so that we can be as confident as possible when we claim the model can't exhibit it). This will involve:

  • Improving our prompting
  • Improving our tools
  • Improving the way the relevant information is stored
  • Ensuring the model can access good information

Each exercise will have a difficulty and importance rating out of 5, as well as an estimated maximum time you should spend on it, and sometimes a short annotation. You should interpret the ratings & time estimates relatively (e.g. if you find yourself spending about 50% longer on the exercises than the time estimates, adjust accordingly). Please do skip exercises / look at solutions if you don't feel like they're important enough to be worth doing, and you'd rather get to the good stuff!

For a lecture on today's material, which provides some high-level understanding before you dive in, watch the video below:

Content & Learning Objectives

1️⃣ Intro to LLM Agents

Learning Objectives
  • Understand why we want to evaluate LLM agents.
  • Read resources about LLM agent evaluations to understand the current state of the field.
  • Understand the common failure modes of LLM agents.

2️⃣ Building a Simple Arithmetic Agent

Learning Objectives
  • Understand that an LLM agent is just a "glorified for-loop" (of the scaffolding program interacting with the LLM API); see the sketch after this list.
  • Learn how to use function calling to allow LLMs to use external tools.
  • Understand the main functionalities of an LLM agent.
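
To make the "glorified for-loop" point concrete, here's a rough sketch of the control flow. The names get_model_response, execute_tool_call, and task.instruction are placeholders for illustration, not part of any library or of the exercise code:

def run_agent(task, max_steps: int = 10):
    """Sketch of the agent loop: call the model, run any tools it asks for,
    feed the results back, and repeat until it stops calling tools."""
    messages = [{"role": "user", "content": task.instruction}]
    for _ in range(max_steps):
        response = get_model_response(messages)  # hypothetical: one call to the LLM API
        messages.append(response)
        if response.tool_calls:
            for tool_call in response.tool_calls:
                result = execute_tool_call(tool_call, task)  # hypothetical: run the requested tool
                messages.append({"role": "tool", "tool_call_id": tool_call.id, "content": result})
        else:
            break  # no tool call means the model has given its final answer
    return messages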

3️⃣ Building a more Complex Agent: WikiGame

Learning Objectives
  • Get comfortable building a more complex task, with noisy and imperfect tool outputs
  • Understand how to build a more complex agent that implements dynamic decision-making
  • Observe the failure modes of a more complex agent

4️⃣ Elicitation

Learning Objectives
  • Understand the importance of elicitation in evaluating LLM agents
  • Understand the different methods of elicitation
  • Understand how to improve prompting, tools, history storage, and information access in LLM agents

Setup code

import json
import math
import os
import re
import sys
from pathlib import Path
from typing import Any, Literal, Optional

import wikipedia
from anthropic import Anthropic
from dotenv import load_dotenv
from openai import BadRequestError, OpenAI
from openai.types.chat.chat_completion_message import ChatCompletionMessage
from openai.types.chat.chat_completion_message_tool_call import (
    ChatCompletionMessageToolCall,
)
from wikipedia import DisambiguationError, PageError, WikipediaPage

# Make sure exercises are in the path
chapter = "chapter3_llm_evals"
section = "part4_llm_agents"
root_dir = next(p for p in Path.cwd().parents if (p / chapter).exists())
exercises_dir = root_dir / chapter / "exercises"
section_dir = exercises_dir / section
if str(exercises_dir) not in sys.path:
    sys.path.append(str(exercises_dir))

import part4_llm_agents.tests as tests
from part1_intro_to_evals.solutions import retry_with_exponential_backoff
from utils import countrylist, evaluate_expression, wiki_pairs

MAIN = __name__ == "__main__"

Reminder - how to set up your API keys before running the code below

  • OpenAI: If you haven't already, go to https://platform.openai.com/ to create an account, then create a key in 'Dashboard' → 'API keys'.
  • Anthropic: If you haven't already, go to https://console.anthropic.com/ to create an account, then select 'Get API keys' and create a key.

If you're in Google Colab, you should be able to set API keys from the "Secrets" tab on the left-hand side of the screen (the key icon). If you're in VSCode, you can create a file called ARENA_3.0/.env containing the following:

OPENAI_API_KEY = "your-openai-key"
ANTHROPIC_API_KEY = "your-anthropic-key"

In the latter case, you'll also need to run the load_dotenv() function, which will load the API keys from the .env file & set them as environment variables.

Once you've done this (either the Secrets-tab method for Colab or the .env-based method for VSCode), you can get the keys as os.getenv("OPENAI_API_KEY") and os.getenv("ANTHROPIC_API_KEY") in the code below. Note that both OpenAI() and Anthropic() accept an api_key parameter, but in the absence of this parameter they'll look for environment variables named OPENAI_API_KEY and ANTHROPIC_API_KEY - which is why it's important to get the names exactly right when you save your keys!

load_dotenv()

assert os.getenv("OPENAI_API_KEY") is not None, (
    "You must set your OpenAI API key - see instructions in dropdown"
)

openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
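
If you also plan to use Anthropic models in later parts (this is optional, and assumes you've saved an ANTHROPIC_API_KEY as described above), you can construct the client in the same way:

# Assumption: only needed if you want to run Anthropic models; requires ANTHROPIC_API_KEY.
anthropic_client = Anthropic(api_key=os.getenv("ANTHROPIC_API_KEY"))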