4️⃣ Elicitation
Learning Objectives
- Understand the different methods of elicitation
- Understand how to improve prompting, tools, history storage, and information access in LLM agents
You may have observed that while our initial implementation of WikiAgent succeeds at these relatively challenging games, the agent will fail if we increase the difficulty slightly (one possible example is the game Joinery → Amethyst, which our agent will usually fail). However, this doesn't mean that GPT-4o-mini lacks the capability to perform better on this task; the capability might instead be blocked because we:
- Prompted the model poorly or ineffectively.
- Stored and presented the task history poorly.
- Didn't give the model sufficient tools to accomplish the task.
In general, it is hard to show that a model does not have a certain capability, even if we've failed to demonstrate this capability. For example, it took 3.5 years after the release of GPT-2 (and 2.5 years after the release of GPT-3) for people to discover that chain-of-thought reasoning massively improves model performance, which enabled the same models to complete significantly harder tasks. Dangerous capability evaluations for LLM agents require us to elicit the best capabilities possible, until we feel we've managed to gain evidence of absence, not just absence of evidence.
Broadly speaking, there are two categories of elicitation:
- Narrow elicitation: Task-specific methods that improve model performance on a particular task or small class of tasks, but likely won't impact model performance in general across many tasks.
- E.g. A tool that gives the model access to the content of arbitrary wikipedia articles. This will improve performance on this task significantly, but wouldn't generalize to other tasks.
- General elicitation: Task-agnostic methods that improve model performance on a wide array of possible tasks.
- E.g. Chain-of-thought prompting: This tends to improve model performance on a wide array of tasks. These sorts of elicitation methods are the ones we're most interested in. If researchers find an improvement to models that is roughly as easy and effective as chain-of-thought prompting, then we would see a very rapid increase in risk from AI.
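To make the chain-of-thought example concrete, here is a minimal sketch (assuming an OpenAI client configured as in the earlier sections; the arithmetic question is just an illustrative stand-in for a harder task) showing that general elicitation changes only the prompt, not the model or the task:

```python
from openai import OpenAI

client = OpenAI()
question = "A farmer has 17 sheep and buys 3 more flocks of 12 sheep each. How many sheep does the farmer have now?"

# Baseline: ask the question directly
baseline = client.chat.completions.create(
    model="gpt-4o-mini-2024-07-18",
    messages=[{"role": "user", "content": question}],
)

# General elicitation: same model, same task, but prompt the model to reason first
cot = client.chat.completions.create(
    model="gpt-4o-mini-2024-07-18",
    messages=[
        {"role": "user", "content": question + "\n\nThink step by step before giving your final answer."}
    ],
)

print("Baseline:", baseline.choices[0].message.content)
print("Chain-of-thought:", cot.choices[0].message.content)
```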
The elicitation methods we'll try in this section mostly revolve around prompting for better performance, including chain-of-thought prompting and the ReAct framework, as well as some more exotic methods, like a lookahead tool.
Tip - How to find wikipedia pages to test on
You might start having a hard time coming up with wikipedia pages to test on. Luckily, there are websites which generate random pages for this purpose; one good option is https://wikispeedruns.com/ (you may want to change the "Random Article Generator Settings" to sample from the most popular 100,000 wikipedia pages, as the default setting of 3,000 will generally generate paths that are too easy to usefully test our agent). We've also provided you with a list of 18 wikipedia pairs, stored as wiki_pairs, ordered approximately by increasing difficulty.
To test whether two pages are connected via links, use this free online tool to see the possible paths between pages: https://www.sixdegreesofwikipedia.com/ (be somewhat careful with this though, as the paths that this website believes are accessible may not be accessible to our agent).
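If you want a rough sense of which of these your agent can already handle, you can loop over the provided pairs. A minimal sketch, assuming each element of wiki_pairs is a (start_title, goal_title) pair and using the WikiAgent, wiki_game_tools and agent_loop from the previous section (adjust the unpacking if wiki_pairs is stored differently):

```python
results = {}
for start, goal in wiki_pairs:
    game = WikiGame(start, goal)
    agent = WikiAgent(game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0)
    agent_loop(agent, 30)
    # Assume the task updates current_page as the agent moves, so the game is won
    # when the agent's final page is the goal page
    results[(start, goal)] = game.current_page.title == game.goal_page.title

for (start, goal), solved in results.items():
    print(f"{start} -> {goal}: {'solved' if solved else 'failed'}")
```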
In this section, we'll use the gpt-4o-mini-2024-07-18 model to gauge whether our elicitation methods are effective, since OpenAI occasionally releases small updates to gpt-4o-mini which change its behaviour. However, if you're curious, you can try testing your elicitation methods on the newest gpt-4o-mini model. You will most likely notice that your elicitation methods improve the model significantly less, and that the model performs much better at the task without needing as much elicitation. This is because the most recent models are generally more capable, and so saturate the evaluation of "how well can a model play the Wikipedia game". For a real agent evaluation, you'd want to have increasingly difficult tasks, so that even as models improve, we can find tasks that are too difficult for them to achieve (e.g. the 16-hour tasks in METR's Measuring AI Ability to Complete Long Tasks).
As you should already know, prompting can have a large impact on model performance. There are many changes you could make to the prompts for this task. You should experiment first with more general elicitation methods, such as getting the agent to think more deeply and output plans in different ways. After this, you might try more narrow elicitation methods, such as:
- Telling the agent how many pages it's visited.
- Telling the agent if it's already visited the page it's on (and how many times).
- Scheduling different prompts and planning methods for the "zoom out" and "zoom in" phases of the game (a rough sketch of this idea follows the list below), since we know that a good general strategy for playing the wikipedia game is:
Narrow article (with few links) -> General article (with many links) -> Narrow article (with few links)
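As a rough sketch of that last idea (the class name WikiAgentPhased and the 2-page threshold are arbitrary choices for illustration, not part of the solution), you could vary the on-page prompt based on how far into the game the agent is:

```python
class WikiAgentPhased(WikiAgent):
    """Illustrative sketch: different prompts for the 'zoom out' and 'zoom in' phases."""

    @property
    def on_page_instruction(self):
        # Crude proxy for the phase of the game: the number of pages visited so far
        if len(self.task.page_history) <= 2:
            phase_hint = (
                "You are early in the game: aim for a broad, general article "
                "(e.g. a country or a field of science) with many outgoing links."
            )
        else:
            phase_hint = (
                f"You should now be narrowing in: look for links that move towards "
                f"the specific topic of {self.task.goal_page.title}."
            )
        return {
            "role": "user",
            "content": f"You are currently on page: {self.task.current_page.title}. {phase_hint}",
        }
```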
Exercise - Engineer prompts
```yaml
Difficulty: 🔴🔴⚪⚪⚪
Importance: 🔵🔵🔵⚪⚪

You should spend up to 20-35 mins on this exercise.
```

Try and design prompts that improve the performance of the wikipedia agent. You may have to do a decent amount of experimentation here. Remember that your prompts will have to be robust to:
- Different tasks within the wikipedia game,
- Different states within those tasks,
- Different failure-modes the agent could encounter.
See if you can significantly improve performance. There's a test task below that you should aim to be able to solve with improved prompting.
class WikiAgentPrompting(WikiAgent):
"""
Inherits from WikiAgent and adds improved prompting.
"""
@property
def system_instruction(self):
"""
Provide improved starting instructions for the game.
Returns:
dict: The starting instructions. "role" is "system" for system messages.
"""
raise NotImplementedError("You need to implement a new system_instruction property")
@property
def on_page_instruction(self):
"""
Provide improved instructions for the current page.
Returns:
dict: The instructions for the current page. "role" is "user" for user messages.
"""
raise NotImplementedError("You need to implement a new on_page_instruction property")
@property
def next_step_instruction(self):
"""
Provide improved instructions for the next step.
Returns:
dict: The instructions for the next step. "role" is "user" for user messages.
"""
raise NotImplementedError("You need to implement a new next_step_instruction property")
Solution
This isn't a perfect solution, but is an example of improved prompting compared to that in the original WikiGame class solution code. You may be able to do even better!
class WikiAgentPrompting(WikiAgent):
"""
Inherits from WikiAgent and adds improved prompting.
"""
@property
def system_instruction(self):
"""
Provide improved starting instructions for the game.
Returns:
dict: The starting instructions. "role" is "system" for system messages.
"""
return {
"role": "system",
"content": f"You are a wikipedia-racing AI. Your goal is to reach {self.task.goal_page.title} by accessing links from wikipedia pages. Your current page is {self.task.current_page.title}.",
}
@property
def on_page_instruction(self):
"""
Provide improved instructions for the current page.
Returns:
dict: The instructions for the current page. "role" is "user" for user messages.
"""
return {
"role": "user",
"content": f"""You are currently on page: {self.task.current_page.title}. Make sure you start by reasoning about what steps you should take to get to the article on {self.task.goal_page.title}. When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure, {self.task.goal_page.title} has the following summary:\n\n[Begin Summary]\n{self.task.get_page_summary(self.task.goal_page)}\n[End Summary]\n\nThe path you have taken so far is {" -> ".join(self.task.page_history)}.
""",
}
@property
def next_step_instruction(self):
"""
Provide improved instructions for the next step.
Returns:
dict: The instructions for the next step. "role" is "user" for user messages.
"""
return {
"role": "user",
"content": f"""What's your next step to reach {self.task.goal_page.title}? Make sure to think carefully about what steps you should take to get there.""",
}
LLM agents can be quite random - as you might have noticed - both because the default temperature is 1 and because agents operate over a much longer horizon than is usual for LLMs. So we'll do our testing at temperature = 0. This impacts performance noticeably, but better elicitation methods still have a clear effect.
Your original WikiAgent may not reliably be able to solve the example path Mandate of Heaven -> Doric Greek at temperature 0 (although it may occasionally get lucky). However, with sufficiently improved prompting, you should be able to get the agent to solve this task reliably.
# Run game with original WikiAgent
game = WikiGame("Mandata of Heaven", "Doric Greek")
agent = WikiAgent(game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0)
agent_loop(agent, 30)
# Run game with improved WikiAgentPrompting class
game = WikiGame("Mandate of Heaven", "Doric Greek")
agent = WikiAgentPrompting(
game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0
)
agent_loop(agent, 30)
Exercise - Implement the ReAct framework
```yaml
Difficulty: 🔴🔴⚪⚪⚪
Importance: 🔵🔵🔵⚪⚪

You should spend up to 15-20 mins on this exercise.
```

The ReAct framework is an extension of chain-of-thought reasoning. Instead of prompting the model to think step-by-step, it separates this into two steps, especially designed for agent-based tasks:
- Reasoning: The model is asked to reason about its current situation, and what sort of actions it should consider taking.
- Action: Then, the model is asked to perform an action based on its outputted reasoning.
Note that during the reasoning step, when you're calling the model without tools, OpenAI won't provide the model with a description of the tools. However, we definitely want the model to have information about the available tools when it's reasoning about what actions to take. So, we'll have to ensure that the tool descriptions are in the system_instruction we provide. (This will lead to some redundancy when the model takes an action, but redundancy is usually alright with LLMs). This means that from now on we will have to pass the list of tools to both the task and the agent.
class WikiAgentReAct(WikiAgentPrompting):
"""
Inherits from WikiAgentPrompting and adds the ReAct framework.
Attributes:
model (str): The model used for generating responses (inherited)
client (OpenAI): OpenAI client for API calls (inherited)
task (WikiGame): The current task being executed (inherited)
chat_history (list[dict]): History of interactions (inherited)
tools (list[Any]): List of tools (implemented below)
Methods:
get_response(use_tool: bool = True) -> ChatCompletionMessage: Get response from the model
(inherited)
execute_tool_calls(message: ChatCompletionMessage) -> list[str]: Execute tool calls from the
model's response (inherited)
run(with_tool: bool = True) -> bool: Run one loop of the Wikipedia agent (inherited)
update_history(message): Update self.chat_history and self.full_chat_history with a message
or list of messages. (inherited)
reset_history(): Empty self.chat_history of the agent. (inherited)
handle_tool_calls(response: ChatCompletionMessage): Handles tool_calls in the wikipedia game
context. (inherited)
handle_refusal(response: ChatCompletionMessage): Handles refusals in the wikipedia game
context. (inherited)
start(): A function to put the starting instructions in agent.chat_history when the agent
starts a new page or starts the game. (inherited)
run(): This function runs the agent in the wikipedia game context. (inherited)
"""
@property
def system_instruction(self):
"""
Provide a description of the tools in the system message. When the model is called with
tools this is redundant, but when it is called without tools, this is useful.
Returns:
dict: The starting instructions. "role" is "system" for system messages.
"""
raise NotImplementedError("You need to implement the new system_instruction property")
def generate_reason(self) -> ChatCompletionMessage:
"""
Generate a reason for the agent to take an action. This function should:
- Get the model to reason about the current state of the game (without tools)
- Return the response from the model
Returns:
message (ChatCompletionMessage): The response from the model
"""
raise NotImplementedError("You need to implement the generate_reason method")
def generate_action(self) -> ChatCompletionMessage:
"""
Generate an action for the agent to take. This function should:
- Get the model to generate an action for the agent to take (with tools)
- Return the response from the model
Returns:
message (ChatCompletionMessage): The response from the model
"""
raise NotImplementedError("You need to implement the generate_action method")
def generate_reason_and_action(self) -> ChatCompletionMessage:
"""
Generate a Reason and Action for the agent to take. This function should:
- Generate a Reason
- Add the Reason to the chat history
- Generate an Action
- Return the Action so that tool calls can be handled
Returns:
message (ChatCompletionMessage): The action from the model
"""
raise NotImplementedError("You need to implement the generate_reason_and_action method")
def run(self):
"""
Run one loop of the agent.
This function should:
- Generate a Reason and Action
- Handle the tool calls, refusals, and no tool calls in the model response
"""
raise NotImplementedError("You need to implement the new run method")
Solution
class WikiAgentReAct(WikiAgentPrompting):
"""
Inherits from WikiAgentPrompting and adds the ReAct framework.
Attributes:
model (str): The model used for generating responses (inherited)
client (OpenAI): OpenAI client for API calls (inherited)
task (WikiGame): The current task being executed (inherited)
chat_history (list[dict]): History of interactions (inherited)
tools (list[Any]): List of tools (implemented below)
Methods:
get_response(use_tool: bool = True) -> ChatCompletionMessage: Get response from the model
(inherited)
execute_tool_calls(message: ChatCompletionMessage) -> list[str]: Execute tool calls from the
model's response (inherited)
run(with_tool: bool = True) -> bool: Run one loop of the Wikipedia agent (inherited)
update_history(message): Update self.chat_history and self.full_chat_history with a message
or list of messages. (inherited)
reset_history(): Empty self.chat_history of the agent. (inherited)
handle_tool_calls(response: ChatCompletionMessage): Handles tool_calls in the wikipedia game
context. (inherited)
handle_refusal(response: ChatCompletionMessage): Handles refusals in the wikipedia game
context. (inherited)
start(): A function to put the starting instructions in agent.chat_history when the agent
starts a new page or starts the game. (inherited)
run(): This function runs the agent in the wikipedia game context. (inherited)
"""
@property
def system_instruction(self):
"""
Provide a description of the tools in the system message. When the model is called with
tools this is redundant, but when it is called without tools, this is useful.
Returns:
dict: The starting instructions. "role" is "system" for system messages.
"""
tool_descriptions = "\n".join(
[
tool.description["function"]["name"]
+ ": "
+ tool.description["function"]["description"]
for tool in self.tools
]
)
return {
"role": "system",
"content": f"""You are a wikipedia-racing AI. Your goal is to reach {self.task.goal_page.title} by accessing links from wikipedia pages. You should avoid list pages, as the links that you would expect from the list are not accessible to you. Your current page is {self.task.current_page.title}. You have access to {str(len(self.tools))} tools, which are:\n{tool_descriptions}""",
}
def generate_reason(self) -> ChatCompletionMessage:
"""
Generate a reason for the agent to take an action. This function should:
- Get the model to reason about the current state of the game (without tools)
- Return the response from the model
Returns:
message (ChatCompletionMessage): The response from the model
"""
# Get the model to reason about the current state of the game and add the response to the
# messages (you may not want to give it tools for this).
self.update_history(
{
"role": "user",
"content": f"Think carefully about your current situation and what actions you want to take to get closer to {self.task.goal_page.title}.",
}
)
if self.verbose:
print(
f"\nUSER: Think carefully about your current situation and what actions you want to take to get closer to {self.task.goal_page.title}."
)
response = self.get_response(use_tool=False)
return response
def generate_action(self) -> ChatCompletionMessage:
"""
Generate an action for the agent to take. This function should:
- Get the model to generate an action for the agent to take (with tools)
- Return the response from the model
Returns:
message (ChatCompletionMessage): The response from the model
"""
# Get the model to generate an action based on the reasoning and add the response to the messages
self.update_history(
{"role": "user", "content": "Now, what action will you take based on your reasoning?"}
)
if self.verbose:
print("\nUSER: Now, what action will you take based on your reasoning?")
response = self.get_response(use_tool=True)
return response
def generate_reason_and_action(self) -> ChatCompletionMessage:
"""
Generate a Reason and Action for the agent to take. This function should:
- Generate a Reason
- Add the Reason to the chat history
- Generate an Action
- Return the Action so that tool calls can be handled
Returns:
message (ChatCompletionMessage): The action from the model
"""
reason = self.generate_reason()
self.update_history({"role": "assistant", "content": reason.content})
print("\nModel response ('Reason'):", reason.content)
action = self.generate_action()
return action
def run(self):
"""
Run one loop of the agent.
This function should:
- Generate a Reason and Action
- Handle the tool calls, refusals, and no tool calls in the model response
"""
response = self.generate_reason_and_action()
if response.tool_calls:
self.handle_tool_calls(response)
elif response.refusal:
self.handle_refusal(response)
else:
self.update_history({"role": "assistant", "content": response.content})
print("\nModel response ('Action'):", response.content)
Now run your Wikipedia ReAct agent. We haven't provided tests for this exercise, since your precise implementation may deviate from ours; instead, by running the agent and checking its chat_history, you should be able to tell whether your ReAct framework is operating correctly. You should notice an improved reasoning process each time the model runs, and might notice that on some paths the model performs more effectively (although this is hard to demonstrate conclusively).
However, you'll also likely notice that the difference between effective prompting and the ReAct method we've implemented here isn't massive. Using the ReAct framework is similar to chain-of-thought prompting in this case, and prompting can only make a difference up to a point. That said, ReAct does tend to make the agent more reliable at higher temperatures (although this is impossible to identify from a single run).
# Run the game with WikiAgentPrompting
game = WikiGame("Balto-Slavic languages", "Netscape Navigator 9")
agent = WikiAgentPrompting(
task=game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0
)
agent_loop(agent, 30)
# Run the game with WikiAgentReAct
game = WikiGame("Balto-Slavic languages", "Netscape Navigator 9")
agent = WikiAgentReAct(
task=game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0
)
agent_loop(agent, 30)
Exercise - Let the LLM see its entire chat history
```yaml
Difficulty: 🔴🔴⚪⚪⚪
Importance: 🔵🔵⚪⚪⚪

You should spend up to 10-15 mins on this exercise.
```
You may have noticed that the agent performs significantly worse because we reset the chat history every time the agent moves to a new page. For example, it will occasionally come up with good plans and then fail to follow through on them, since its in-context memory has been erased. We can fix this issue by letting the agent see the entirety of its chat history.
The main obstacle to allowing the agent to see its entire history is the capacity of its context window -- specifically due to the length of wikipedia articles that the agent has to retrieve in order to play the game. However, we can fix this issue by resetting only the outputs of the get_content() function each time the agent moves to a new page, instead of resetting the entire chat history.
When we reset this content, we should still let the agent know that Wikipedia content was output in that location, as otherwise it will confuse the agent about the get_content tool. You should replace the content with "Wikipedia content was output here. Wikipedia page: {page_title}" so that the agent knows that the get_content tool works as expected.
Modify the reset_history function in the WikiAgentReAct class to accomplish this.
class WikiAgentChatHistory(WikiAgentReAct):
"""
Inherits from WikiAgentReAct and adds the ability to store and retrieve chat history.
Attributes:
model (str): The model used for generating responses (inherited)
tools (list[Any]): List of tools (inherited)
client (OpenAI): OpenAI client for API calls (inherited)
task (WikiGame): The current task being executed (inherited)
chat_history (list[dict]): History of interactions (inherited)
full_chat_history (list[dict]): Full history of interactions
Methods:
get_response(use_tool: bool = True) -> ChatCompletionMessage: Get response from the model
(inherited)
execute_tool_calls(message: ChatCompletionMessage) -> list[str]: Execute tool calls from the
model's response (inherited)
run(with_tool: bool = True) -> bool: Run one loop of the Wikipedia agent (inherited)
update_history(message): Update self.chat_history and self.full_chat_history with a message
or list of messages. (inherited)
reset_history(): Empty self.chat_history of the agent. (modified below)
handle_tool_calls(response: ChatCompletionMessage): Handles tool_calls in the wikipedia game
context. (inherited)
handle_refusal(response: ChatCompletionMessage): Handles refusals in the wikipedia game
context. (inherited)
start(): A function to put the starting instructions in agent.chat_history when the agent
starts a new page or starts the game. (inherited)
run(): This function runs the agent in the wikipedia game context. (inherited)
store_chat_history(): Store the current chat history in the full chat history.
retrieve_chat_history(): Retrieve the full chat history.
"""
def reset_history(self):
"""
Replace the output of get_content tool with an indication that wikipedia content was output
when the agent moves to a new page, instead of clearing the entire chat_history.
"""
raise NotImplementedError("You need to implement the new reset_history method")
tests.test_WikiAgentChatHistory(WikiAgentChatHistory)
Solution
class WikiAgentChatHistory(WikiAgentReAct):
"""
Inherits from WikiAgentReAct and adds the ability to store and retrieve chat history.
Attributes:
model (str): The model used for generating responses (inherited)
tools (list[Any]): List of tools (inherited)
client (OpenAI): OpenAI client for API calls (inherited)
task (WikiGame): The current task being executed (inherited)
chat_history (list[dict]): History of interactions (inherited)
full_chat_history (list[dict]): Full history of interactions
Methods:
get_response(use_tool: bool = True) -> ChatCompletionMessage: Get response from the model
(inherited)
execute_tool_calls(message: ChatCompletionMessage) -> list[str]: Execute tool calls from the
model's response (inherited)
run(with_tool: bool = True) -> bool: Run one loop of the Wikipedia agent (inherited)
update_history(message): Update self.chat_history and self.full_chat_history with a message
or list of messages. (inherited)
reset_history(): Empty self.chat_history of the agent. (modified below)
handle_tool_calls(response: ChatCompletionMessage): Handles tool_calls in the wikipedia game
context. (inherited)
handle_refusal(response: ChatCompletionMessage): Handles refusals in the wikipedia game
context. (inherited)
start(): A function to put the starting instructions in agent.chat_history when the agent
starts a new page or starts the game. (inherited)
run(): This function runs the agent in the wikipedia game context. (inherited)
store_chat_history(): Store the current chat history in the full chat history.
retrieve_chat_history(): Retrieve the full chat history.
"""
def reset_history(self):
"""
Replace the output of get_content tool with an indication that wikipedia content was output
when the agent moves to a new page, instead of clearing the entire chat_history.
"""
for message in self.chat_history:
    # Only rewrite tool messages from get_content that haven't already been replaced
    if (
        isinstance(message, dict)
        and message["role"] == "tool"
        and message["name"] == "get_content"
        and not message["content"].startswith("Wikipedia content was")
    ):
        message["content"] = (
            f"Wikipedia content was output here, Wikipedia page: {self.task.current_page.title}"
        )
Now see how your agent performs with access to its entire chat history. Let's see how it compares to our ReAct agent on the more difficult path Blavatnik School of Government -> Free Thai Movement. We find that the ReAct agent occasionally gets stuck in loops when attempting this path, although it may take more than one run to see this behaviour; the ReAct agent succeeds on only 3/10 runs. When the model is provided with the full chat history, the agent avoids loops much more easily, and can accomplish this path very reliably.
# Run the game with the WikiAgentReAct class
game = WikiGame("Blavatnik School of Government", "Free Thai Movement")
agent = WikiAgentReAct(
task=game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0
)
agent_loop(agent, 30)
# Run game with WikiAgentChatHistory class
game = WikiGame("Blavatnik School of Government", "Free Thai Movement")
agent = WikiAgentChatHistory(
task=game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0
)
agent_loop(agent, 30)
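If you want to quantify this difference rather than eyeball single runs, you can repeat the same path several times for each agent class and compare success rates (this is how figures like the 3/10 above are obtained). A minimal sketch, assuming as before that the game is won when the task's current page is the goal page:

```python
def success_rate(agent_class, start, goal, n_runs=10, max_steps=30):
    """Run the same path n_runs times and return the fraction of wins (illustrative sketch)."""
    wins = 0
    for _ in range(n_runs):
        game = WikiGame(start, goal)
        agent = agent_class(task=game, tools=wiki_game_tools, model="gpt-4o-mini-2024-07-18", temperature=0)
        agent_loop(agent, max_steps)
        # Assume the task updates current_page as the agent moves
        wins += game.current_page.title == game.goal_page.title
    return wins / n_runs

for agent_class in [WikiAgentReAct, WikiAgentChatHistory]:
    rate = success_rate(agent_class, "Blavatnik School of Government", "Free Thai Movement")
    print(f"{agent_class.__name__}: {rate:.0%} success")
```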
Exercise - Implement a reflexion tool
```yaml
Difficulty: 🔴🔴🔴⚪⚪
Importance: 🔵🔵🔵⚪⚪

You should spend up to 25-35 mins on this exercise.
```
The Reflexion paper proposes a method that improves performance by getting LLMs to do self-reflection. The original paper looks at LLM agents in an RL set-up, where getting a reward signal on the agent's behaviour is slow and expensive. The key idea is to get quick, cheap feedback from an evaluator on every proposed action, and then to reflect on this feedback before taking the next action, rather than waiting for the final outcome. In their case, the evaluator was a heuristic function that estimated the reward function.
We will borrow and modify this idea by building a tool that allows our agent to perform a lookahead, and then gives feedback on our agent's proposed actions. We allow the agent to suggest candidate paths, then the tool will check if these paths work and inform the model either that the path works, or where the path goes wrong.
We don't want to provide the agent the links or content of every page when it does this lookahead, as then we'd just be reimplementing a smaller version of the game inside the game. Instead, we'll let the agent suggest paths without seeing any content or links, and then let it know if this path works. It's very likely that a suggested link will — at some point — not be accessible from one of the pages, but this tool will still be useful to help the agent plan.
class TestPathTool:
"""
Implements a tool that allows the agent to test paths from the current state of the game.
Attributes:
name (str): The name of the tool
Methods:
execute(task: WikiGame, path: str) -> str: Test if a given path is valid.
description -> dict: Provides the description of the test_path tool for the API
"""
name = "test_path"
def execute(self, task: WikiGame, path: str) -> str:
"""
Test if a given path is valid.
Args:
path (str): A string representing a path, formatted as follows "Barack Obama ->
Indonesia -> India"
task (WikiGame): The current task being run.
Returns:
str: A message indicating whether the path is valid or where it fails.
"""
raise NotImplementedError("You need to implement the execute method for the TestPathTool")
@property
def description(self):
raise NotImplementedError("You need to implement the description property for the TestPathTool")
tests.test_test_path_tool(TestPathTool)
TestPathTool_inst = TestPathTool()
wiki_game_tools = [GetContentTool_inst, MovePageTool_inst, TestPathTool_inst]
Solution
class TestPathTool:
"""
Implements a tool that allows the agent to test paths from the current state of the game.
Attributes:
name (str): The name of the tool
Methods:
execute(task: WikiGame, path: str) -> str: Test if a given path is valid.
description -> dict: Provides the description of the test_path tool for the API
"""
name = "test_path"
def execute(self, task: WikiGame, path: str) -> str:
"""
Test if a given path is valid.
Args:
path (str): A string representing a path, formatted as follows "Barack Obama ->
Indonesia -> India"
task (WikiGame): The current task being run.
Returns:
str: A message indicating whether the path is valid or where it fails.
"""
path_nodes = [node.strip() for node in path.split("->")]
if not path_nodes:
return "ERROR: Empty path provided."
if len(path_nodes) == 1:
return "ERROR: Path should have at least two pages"
if path_nodes[0] != task.current_page.title:
return f"ERROR: The path should start with the current page: {task.current_page.title}"
for i in range(len(path_nodes) - 1):
current_node = path_nodes[i]
next_node = path_nodes[i + 1]
permitted_links = (
link.lower() for link in get_permitted_links(get_page(path_nodes[i]))
)
if next_node.lower() not in permitted_links:
return f"This path works until the page {next_node}, which is not accessible from the page {current_node}"
return "This path is valid."
@property
def description(self):
return {
"type": "function",
"function": {
"name": "test_path",
"description": 'Accepts a test path string in the form "current_page -> page1 -> page2" and if the path does not work, then it returns where the path goes wrong, if the path does work it returns "success." Be careful that path titles can be sensitive to plurals or rephrasings. This tool is especially useful to check longer plans.',
"parameters": {
"type": "object",
"properties": {
"path": {
"type": "string",
"description": 'The path you want to test, formatted as " current_page -> page1 -> page2".',
},
},
"required": ["path"],
},
},
}
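As a quick sanity check of the tool outside the agent loop, you can call execute directly. A minimal sketch (the specific pages are just examples; the exact messages returned are whatever your implementation produces):

```python
game = WikiGame("Drupe", "17th parallel north")
test_tool = TestPathTool()

# A path that doesn't start at the current page should be rejected immediately
print(test_tool.execute(task=game, path="Fruit -> Vietnam"))

# Whether this passes depends on the actual links present on each page: the tool
# either reports that the path is valid or says where the chain breaks
print(test_tool.execute(task=game, path="Drupe -> Vietnam -> 17th parallel north"))
```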
You should also edit the system_instruction and on_page_instruction to include an indication to the model to use the test_path tool (since this tool isn't strictly necessary to accomplish the task, the agent will often not use it at all). You can do this in the code cell below.
def new_system_instruction(self):
raise NotImplementedError("You need to implement the new system_instruction property")
def new_on_page_instruction(self):
raise NotImplementedError("You need to implement the new_user_instruction property")
WikiAgentChatHistory.system_instruction = property(new_system_instruction)
WikiAgentChatHistory.on_page_instruction = property(new_on_page_instruction)
Help! My agent isn't using the TestPathTool
If your agent isn't using the test path tool, you may want to modify your prompting. You could just include a strong indication in the on_page_instruction that the agent should use this tool before moving to a new page. This may lead to overuse of the tool, so you may also want to include clear instructions in the system instruction about how often and how much the model should use the tool.
Solution
def new_system_instruction(self):
tool_descriptions = "\n".join(
[
tool.description["function"]["name"] + ":" + tool.description["function"]["description"]
for tool in self.tools
]
)
return {
"role": "system",
"content": f"""You are a wikipedia-racing AI. Your goal is to reach {self.task.goal_page.title} by accessing links from wikipedia pages. You should avoid list pages, as the links that you would expect from the list are not accessible to you. Your current page is {self.task.current_page.title}. You have access to {str(len(self.tools))} tools, which are:\n{tool_descriptions}.\n\n You always start by getting the content of the current page. You use the test_path tool to help you plan your future path, you always plan exactly 2 pages into the future e.g. current page -> page 1 -> page 2. You use the move_page tool in order to move to a different wikipedia page, and make progress towards the goal page.""",
}
def new_on_page_instruction(self):
return {
"role": "user",
"content": f"""You are currently on page: {self.task.current_page.title}. Make sure you start by reasoning about what steps you should take to get to the article on {self.task.goal_page.title}. When coming up with a strategy, make sure to pay attention to the path you have already taken, and if your current strategy doesn't seem to be working out, try something else. In case you're unsure, {self.task.goal_page.title} has the following summary:\n\n[Begin Summary]\n{self.task.get_page_summary(self.task.goal_page)}\n[End Summary]\n\nThe path you have taken so far is {" -> ".join(self.task.page_history)}.
""",
}
WikiAgentChatHistory.system_instruction = property(new_system_instruction)
WikiAgentChatHistory.on_page_instruction = property(new_on_page_instruction)
Now see how your agent performs with the tool:
game = WikiGame("Drupe", "17th parallel north")
agent = WikiAgentChatHistory(game, model="gpt-4o-mini", tools=wiki_game_tools)
agent_loop(agent, 40)
You'll likely see that the agent often doesn't use this tool effectively, and when it does, it will often make suboptimal decisions based on the tool's output:
- One common failure mode is that the model will try a promising path, be told by the tool that it goes wrong somewhere, and then abandon the entire path for a much less promising path (without doing any further testing).
- Another common issue is that the agent will only use the tool to test whether it is possible to move a single page ahead, which is not the intended use of the tool (as the agent should be able to work out which pages it can move to in one step by looking at the page's content).
Although it may be tempting to keep adding tools to agents, if they're not capable of using them correctly and effectively, then these tools may actually harm performance. There are tasks where a 'lookahead' tool could be used effectively; however, it turns out that the Wikipedia game isn't one of them.