
Commit aaf129c

[agents] remove agents 🧹 (#37368)
1 parent 69e6ddf commit aaf129c


55 files changed: +4 -10916 lines

SECURITY.md (-7)

@@ -27,13 +27,6 @@ These models require the `trust_remote_code=True` parameter to be set when using
 the content of the modeling files when using this argument. We recommend setting a revision in order to ensure you
 protect yourself from updates on the repository.
 
-#### Tools
-
-Through the `Agent` framework, remote tools can be downloaded to be used by the Agent. You're to specify these tools
-yourself, but please keep in mind that their code will be run on your machine if the Agent chooses to run them.
-
-Please inspect the code of the tools before passing them to the Agent to protect your runtime and local setup.
-
 ## Reporting a Vulnerability
 
 Feel free to submit vulnerability reports to [[email protected]](mailto:[email protected]), where someone from the HF security team will review and recommend next steps. If reporting a vulnerability specific to open source, please note [Huntr](https://huntr.com) is a vulnerability disclosure program for open source software.

conftest.py (-2)

@@ -66,7 +66,6 @@
     "ModelTester::test_pipeline_",
     "/repo_utils/",
     "/utils/",
-    "/agents/",
 }
 
 # allow having multiple repository checkouts and not needing to remember to rerun
@@ -83,7 +82,6 @@ def pytest_configure(config):
     config.addinivalue_line("markers", "is_pipeline_test: mark test to run only when pipelines are tested")
     config.addinivalue_line("markers", "is_staging_test: mark test to run only in the staging environment")
     config.addinivalue_line("markers", "accelerate_tests: mark test that require accelerate")
-    config.addinivalue_line("markers", "agent_tests: mark the agent tests that are run on their specific schedule")
     config.addinivalue_line("markers", "not_device_test: mark the tests always running on cpu")
 

docs/source/ar/_toctree.yml (-4)

(Titles translated from Arabic.)

@@ -23,8 +23,6 @@
   title: Loading and training custom models with 🤗 PEFT
 - local: model_sharing
   title: Share your model
-- local: agents
-  title: Agents
 - local: llm_tutorial
   title: Generation with LLMs
 - local: conversations
@@ -252,8 +250,6 @@
   title: Conceptual guides
 # - sections:
 #   - sections:
-#     - local: main_classes/agent
-#       title: Agents and Tools
 #     - local: model_doc/auto
 #       title: Dynamically created classes
 #     - local: main_classes/backbones

docs/source/ar/agents.md (-539)

This file was deleted.

docs/source/de/_toctree.yml (+1, -3)

(Titles translated from German.)

@@ -23,8 +23,6 @@
   title: Loading and training adapters with 🤗 PEFT
 - local: model_sharing
   title: Share a model
-- local: transformers_agents
-  title: Agents
 - local: llm_tutorial
   title: Generation with LLMs
 title: Tutorials
@@ -39,4 +37,4 @@
   title: Testing
 - local: pr_checks
   title: Checking a pull request
-title: Contribute
+title: Contribute

docs/source/de/transformers_agents.md (-323)

This file was deleted.

docs/source/en/_toctree.yml (-2)

@@ -308,8 +308,6 @@
 - isExpanded: false
   sections:
   - sections:
-    - local: main_classes/agent
-      title: Agents and Tools
     - local: model_doc/auto
       title: Auto Classes
     - local: main_classes/backbones

docs/source/en/agents.md (+1, -280)

@@ -15,283 +15,4 @@ rendered properly in your Markdown viewer.

Removed (old lines 18–297):

> [!WARNING]
> Agents and tools are being spun out into the standalone [smolagents](https://huggingface.co/docs/smolagents/index) library. These docs will be deprecated in the future!

# Agents

[[open-in-colab]]

An agent is a system where a large language model (LLM) can execute more complex tasks through *planning* and the use of *tools*.

- Planning helps an LLM reason its way through a task by breaking it down into smaller subtasks. For example, [`CodeAgent`] plans a series of actions to take and then generates Python code to execute all the actions at once.

  Another planning method is self-reflection and refinement of previous actions to improve performance. The [`ReactJsonAgent`] is an example of this type of planning, and it's based on the [ReAct](https://hf.co/papers/2210.03629) framework. This agent plans and executes actions one at a time based on the feedback it receives from each action.

- Tools give an LLM access to external functions or APIs that it can use to help it complete a task. For example, [gradio-tools](https://github.com/freddyaboulton/gradio-tools) gives an LLM access to any of the [Gradio](https://www.gradio.app/) apps available on Hugging Face [Spaces](https://hf.co/spaces). These apps can be used for a wide range of tasks such as image generation, video generation, audio transcription, and more.

To use agents in Transformers, make sure you have the extra `agents` dependencies installed.

```bash
!pip install transformers[agents]
```

Create an agent instance (refer to the [Agents](./main_classes/agent#agents) API for supported agents in Transformers) and a list of tools available for it to use, then [`~ReactAgent.run`] the agent on your task. The example below demonstrates how a ReAct agent reasons through a task.

```py
from transformers import ReactCodeAgent

agent = ReactCodeAgent(tools=[])
agent.run(
    "How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?",
)
```

```bash
======== New task ========
How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?
==== Agent is executing the code below:
bert_layers = 12  # BERT base encoder has 12 layers
attention_layers = 6  # Encoder in Attention is All You Need has 6 layers
layer_diff = bert_layers - attention_layers
print("The difference in layers between BERT base encoder and Attention is All You Need is", layer_diff)
====
Print outputs:
The difference in layers between BERT base encoder and Attention is All You Need is 6

==== Agent is executing the code below:
final_answer("BERT base encoder has {} more layers than the encoder from Attention is All You Need.".format(layer_diff))
====
Print outputs:

>>> Final answer:
BERT base encoder has 6 more layers than the encoder from Attention is All You Need.
```

This guide walks you through how to initialize an agent in more detail.

## LLM

An agent uses an LLM to plan and execute a task; it is the engine that powers the agent. To choose and build your own LLM engine, you need a method that:

1. accepts input in the [chat template](./chat_templating) format, `List[Dict[str, str]]`, and returns a string
2. stops generating outputs when it encounters the sequences in `stop_sequences`

```py
def llm_engine(messages, stop_sequences=["Task"]) -> str:
    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
    answer = response.choices[0].message.content
    return answer
```

Next, initialize an engine to load a model. To run an agent locally, create a [`TransformersEngine`] to load a preinitialized [`Pipeline`].

However, you could also leverage Hugging Face's powerful inference infrastructure, [Inference API](https://hf.co/docs/api-inference/index) or [Inference Endpoints](https://hf.co/docs/inference-endpoints/index), to run your model. This is useful for loading larger models that are typically required for agentic behavior. In this case, load the [`HfApiEngine`] to run the agent.

The agent requires a list of tools it can use to complete a task. If you aren't using any additional tools, pass an empty list. The default tools provided by Transformers are loaded automatically, but you can optionally set `add_base_tools=True` to explicitly enable them, as in the short sketch below.
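A minimal sketch of explicitly enabling the base toolbox, reusing the `llm_engine` defined above (placing `add_base_tools` in the constructor follows this guide's description; verify it against your installed version):

```py
from transformers import CodeAgent

# Start with no custom tools but explicitly pull in the default
# Transformers toolbox; `llm_engine` is the engine defined above.
agent = CodeAgent(tools=[], llm_engine=llm_engine, add_base_tools=True)
```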
<hfoptions id="engine">
<hfoption id="TransformersEngine">

```py
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, TransformersEngine, CodeAgent

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct").to("cuda")
pipeline = pipeline("text-generation", model=model, tokenizer=tokenizer)
llm_engine = TransformersEngine(pipeline)
agent = CodeAgent(tools=[], llm_engine=llm_engine)
agent.run(
    "What causes bread to rise?",
)
```

</hfoption>
<hfoption id="HfApiEngine">

```py
from transformers import CodeAgent, HfApiEngine

llm_engine = HfApiEngine(model="meta-llama/Meta-Llama-3-70B-Instruct")
agent = CodeAgent(tools=[], llm_engine=llm_engine)
agent.run(
    "Could you translate this sentence from French, say it out loud and return the audio.",
    sentence="Où est la boulangerie la plus proche?",
)
```

</hfoption>
</hfoptions>

The agent supports [constrained generation](https://hf.co/docs/text-generation-inference/conceptual/guidance) for generating outputs according to a specific structure with the `grammar` parameter. The `grammar` parameter should be specified in the `llm_engine` method, or you can set it when initializing an agent; a short sketch of the former follows.
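As a rough sketch of the first option, assuming the engine wraps a `huggingface_hub` [`InferenceClient`] whose `chat_completion` accepts a `response_format` argument (adapt the keyword to your inference backend):

```py
from huggingface_hub import InferenceClient

client = InferenceClient("meta-llama/Meta-Llama-3-70B-Instruct")

def llm_engine(messages, stop_sequences=["Task"], grammar=None) -> str:
    # Forward the grammar (structured-output spec) to the underlying client;
    # `response_format` is an assumption about the client API.
    response = client.chat_completion(
        messages, stop=stop_sequences, max_tokens=1000, response_format=grammar
    )
    return response.choices[0].message.content
```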
Lastly, an agent accepts additional inputs such as text and audio. In the [`HfApiEngine`] example above, the agent accepted a sentence to translate. But you could also pass a path to a local or remote file for the agent to access. The example below demonstrates how to pass a path to an audio file.

```py
from transformers import ReactCodeAgent

agent = ReactCodeAgent(tools=[], llm_engine=llm_engine)
agent.run("Why doesn't he know many people in New York?", audio="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3")
```

## System prompt

A system prompt describes how an agent should behave, the tools it can use, and the expected output format.

Tools are defined by the `<<tool_descriptions>>` token, which is dynamically replaced at runtime with the actual tool. The tool description is derived from the tool name, description, inputs, output type, and a Jinja2 template. Refer to the [Tools](./tools) guide for more information about how to describe tools.

The example below is the system prompt for [`ReactCodeAgent`].

```py
You will be given a task to solve as best you can.
You have access to the following tools:
<<tool_descriptions>>

To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.

At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task, then the tools that you want to use.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '/End code' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need.
These print outputs will then be available in the 'Observation:' field, for using this information as input for the next step.

In the end you have to return a final answer using the `final_answer` tool.

Here are a few examples using notional tools:
---
{examples}

Above example were using notional tools that might not exist for you. You only have access to those tools:
<<tool_names>>
You also can perform computations in the python code you generate.

Always provide a 'Thought:' and a 'Code:\n```py' sequence ending with '```<end_code>' sequence. You MUST provide at least the 'Code:' sequence to move forward.

Remember to not perform too many operations in a single code block! You should split the task into intermediate code blocks.
Print results at the end of each step to save the intermediate results. Then use final_answer() to return the final result.

Remember to make sure that variables you use are all defined.

Now Begin!
```

The system prompt can be tailored to the intended task. For example, you can add a better explanation of the output format, or you can overwrite the system prompt template entirely with your own custom prompt as shown below.

> [!WARNING]
> If you're writing a custom system prompt, make sure to include `<<tool_descriptions>>` in the template so the agent is aware of the available tools.

```py
from transformers import ReactJsonAgent
from transformers.agents import PythonInterpreterTool

agent = ReactJsonAgent(tools=[PythonInterpreterTool()], system_prompt="{your_custom_prompt}")
```

## Code execution

For safety, only the tools you provide (and the default Transformers tools) and the `print` function can be executed. The interpreter doesn't allow importing modules that aren't on a safe list.

To import modules that aren't on the list, add them as a list to the `additional_authorized_imports` parameter when initializing an agent.

```py
from transformers import ReactCodeAgent

agent = ReactCodeAgent(tools=[], additional_authorized_imports=['requests', 'bs4'])
agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?")
```

Code execution stops if a tool isn't on the safe list or isn't authorized, or if the code generated by the agent raises a Python error.

> [!WARNING]
> An LLM can generate arbitrary code that will be executed, so don't add any unsafe imports!

## Multi-agent

[Multi-agent](https://hf.co/papers/2308.08155) refers to multiple agents working together to solve a task. Performance is typically better because each agent is specialized for a particular subtask.

Multi-agent systems are created through the [`ManagedAgent`] class, where a *manager agent* oversees how other agents work together. The manager agent requires an agent along with its name and description. These are added to the manager agent's system prompt, which lets it know how to call and use the managed agents.

The multi-agent example below creates a web search agent that is managed by another [`ReactCodeAgent`].

```py
from transformers.agents import ReactCodeAgent, HfApiEngine, DuckDuckGoSearchTool, ManagedAgent

llm_engine = HfApiEngine()
web_agent = ReactCodeAgent(tools=[DuckDuckGoSearchTool()], llm_engine=llm_engine)
managed_web_agent = ManagedAgent(
    agent=web_agent,
    name="web_search",
    description="Runs web searches for you. Give it your query as an argument."
)
manager_agent = ReactCodeAgent(
    tools=[], llm_engine=llm_engine, managed_agents=[managed_web_agent]
)
manager_agent.run("Who is the CEO of Hugging Face?")
```

## Gradio integration

[Gradio](https://www.gradio.app/) is a library for quickly creating and sharing machine learning apps. The [gradio.Chatbot](https://www.gradio.app/docs/gradio/chatbot) component supports chatting with a Transformers agent via the [`stream_to_gradio`] function.

Load a tool and an LLM into an agent, and then create a Gradio app. The key is to use [`stream_to_gradio`] to stream the agent's messages and display how it reasons through a task.

```py
import gradio as gr
from transformers import (
    load_tool,
    ReactCodeAgent,
    HfApiEngine,
    stream_to_gradio,
)

# Import tool from Hub
image_generation_tool = load_tool("m-ric/text-to-image")
llm_engine = HfApiEngine("meta-llama/Meta-Llama-3-70B-Instruct")

# Initialize the agent with the image generation tool
agent = ReactCodeAgent(tools=[image_generation_tool], llm_engine=llm_engine)

def interact_with_agent(task):
    messages = []
    messages.append(gr.ChatMessage(role="user", content=task))
    yield messages
    for msg in stream_to_gradio(agent, task):
        messages.append(msg)
        yield messages + [
            gr.ChatMessage(role="assistant", content="⏳ Task not finished yet!")
        ]
    yield messages

with gr.Blocks() as demo:
    text_input = gr.Textbox(lines=1, label="Chat Message", value="Make me a picture of the Statue of Liberty.")
    submit = gr.Button("Run illustrator agent!")
    chatbot = gr.Chatbot(
        label="Agent",
        type="messages",
        avatar_images=(
            None,
            "https://em-content.zobj.net/source/twitter/53/robot-face_1f916.png",
        ),
    )
    submit.click(interact_with_agent, [text_input], [chatbot])

if __name__ == "__main__":
    demo.launch()
```

## Troubleshoot

For a better idea of what is happening when you call an agent, it is always a good idea to check the system prompt template first.

```py
print(agent.system_prompt_template)
```

If the agent is behaving unexpectedly, remember to explain the task you want to perform as clearly as possible. Every [`~Agent.run`] is different, and minor variations in your system prompt may yield completely different results.

To find out what happened after a run, check the following agent attributes; a short sketch of inspecting them follows the list.

- `agent.logs` stores the fine-grained agent logs. At every step of the agent's run, everything is stored in a dictionary and appended to `agent.logs`.
- `agent.write_inner_memory_from_logs` only stores a high-level overview of the agent's run. For example, at each step, it stores the LLM output as a message and the tool call output as a separate message. Not every detail from a step is transcribed by `write_inner_memory_from_logs`.
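For instance, a rough sketch of dumping `agent.logs` after a run (nothing beyond the list-of-dictionaries shape described above is assumed; the exact keys in each step vary by version):

```py
# Walk the per-step dictionaries appended to agent.logs during the run.
# Print whatever keys are present rather than assuming a fixed schema.
for i, step in enumerate(agent.logs):
    print(f"=== Step {i} ===")
    for key, value in step.items():
        print(f"{key}: {str(value)[:200]}")  # truncate long values
```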
## Resources

Learn more about ReAct agents in the [Open-source LLMs as LangChain Agents](https://hf.co/blog/open-source-llms-as-agents) blog post.

Added (new line 18):

> [!WARNING]
> Agents and tools were spun out into the standalone [smolagents](https://huggingface.co/docs/smolagents/index) library. They were removed from `transformers` in v4.52.
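For anyone migrating, a minimal smolagents counterpart of the first example in the removed guide might look like the sketch below; `CodeAgent` and `InferenceClientModel` follow the smolagents docs, but treat the exact names and signatures as assumptions to verify against the installed release.

```py
# Rough smolagents equivalent of the removed ReactCodeAgent example.
from smolagents import CodeAgent, InferenceClientModel

model = InferenceClientModel(model_id="meta-llama/Meta-Llama-3-70B-Instruct")
agent = CodeAgent(tools=[], model=model)
agent.run(
    "How many more blocks (also denoted as layers) are in the BERT base "
    "encoder than in the encoder proposed in Attention is All You Need?"
)
```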
