SECURITY.md (-7 lines)

@@ -27,13 +27,6 @@ These models require the `trust_remote_code=True` parameter to be set when using
the content of the modeling files when using this argument. We recommend setting a revision in order to ensure you protect yourself from updates on the repository.

#### Tools

Through the `Agent` framework, remote tools can be downloaded to be used by the Agent. You specify these tools yourself, but please keep in mind that their code will be run on your machine if the Agent chooses to run them.

Please inspect the code of the tools before passing them to the Agent to protect your runtime and local setup.

## Reporting a Vulnerability
Feel free to submit vulnerability reports to [[email protected]](mailto:[email protected]), where someone from the HF security team will review and recommend next steps. If reporting a vulnerability specific to open source, please note [Huntr](https://huntr.com) is a vulnerability disclosure program for open source software.

docs/source/en/agents.md (+1 line, -280 lines)

@@ -15,283 +15,4 @@ rendered properly in your Markdown viewer.
-->

> [!WARNING]
> Agents and tools are being spun out into the standalone [smolagents](https://huggingface.co/docs/smolagents/index) library. These docs will be deprecated in the future!

# Agents

[[open-in-colab]]

An agent is a system in which a large language model (LLM) can execute more complex tasks through *planning* and using *tools*.

- Planning helps an LLM reason its way through a task by breaking it down into smaller subtasks. For example, [`CodeAgent`] plans a series of actions to take and then generates Python code to execute all the actions at once.

   Another planning method is self-reflection and refinement of previous actions to improve performance. The [`ReactJsonAgent`] is an example of this type of planning, and it's based on the [ReAct](https://hf.co/papers/2210.03629) framework. This agent plans and executes actions one at a time based on the feedback it receives from each action.

- Tools give an LLM access to external functions or APIs that it can use to help it complete a task. For example, [gradio-tools](https://github.com/freddyaboulton/gradio-tools) gives an LLM access to any of the [Gradio](https://www.gradio.app/) apps available on Hugging Face [Spaces](https://hf.co/spaces). These apps can be used for a wide range of tasks such as image generation, video generation, audio transcription, and more.

To use agents in Transformers, make sure you have the extra `agents` dependencies installed.

```bash
!pip install transformers[agents]
```

Create an agent instance (refer to the [Agents](./main_classes/agent#agents) API for supported agents in Transformers) and a list of tools available for it to use, then [`~ReactAgent.run`] the agent on your task. The example below demonstrates how a ReAct agent reasons through a task.

```py
from transformers import ReactCodeAgent

agent = ReactCodeAgent(tools=[])
agent.run(
    "How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?",
)
```

```bash
======== New task ========
How many more blocks (also denoted as layers) in BERT base encoder than the encoder from the architecture proposed in Attention is All You Need?
==== Agent is executing the code below:
bert_layers = 12  # BERT base encoder has 12 layers
attention_layers = 6  # Encoder in Attention is All You Need has 6 layers
layer_diff = bert_layers - attention_layers
print("The difference in layers between BERT base encoder and Attention is All You Need is", layer_diff)
====
Print outputs:
The difference in layers between BERT base encoder and Attention is All You Need is 6

==== Agent is executing the code below:
final_answer("BERT base encoder has {} more layers than the encoder from Attention is All You Need.".format(layer_diff))
====
Print outputs:

>>> Final answer:
BERT base encoder has 6 more layers than the encoder from Attention is All You Need.
```

This guide walks you through initializing an agent in more detail.

## LLM

An agent uses an LLM to plan and execute a task; it is the engine that powers the agent. To choose and build your own LLM engine, you need a method that:

1. accepts input in the [chat template](./chat_templating) format, `List[Dict[str, str]]`, and returns a string
2. stops generating outputs when it encounters the sequences in `stop_sequences`
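
A minimal sketch of such an engine wraps `InferenceClient` from `huggingface_hub`; the model name and token limit below are example choices, not requirements:

```py
from huggingface_hub import InferenceClient

# Example model; any chat-capable model served by the Inference API works here.
client = InferenceClient(model="meta-llama/Meta-Llama-3-70B-Instruct")

def llm_engine(messages, stop_sequences=["Task"]) -> str:
    # `messages` uses the chat template format: List[Dict[str, str]]
    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
    return response.choices[0].message.content
```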
Next, initialize an engine to load a model. To run an agent locally, create a [`TransformersEngine`] to load a preinitialized [`Pipeline`].

However, you could also leverage Hugging Face's powerful inference infrastructure, [Inference API](https://hf.co/docs/api-inference/index) or [Inference Endpoints](https://hf.co/docs/inference-endpoints/index), to run your model. This is useful for loading larger models that are typically required for agentic behavior. In this case, load the [`HfApiEngine`] to run the agent.

The agent requires a list of tools it can use to complete a task. If you aren't using any additional tools, pass an empty list. The default tools provided by Transformers are loaded automatically, but you can optionally set `add_base_tools=True` to explicitly enable them.

<hfoptions id="engine">
<hfoption id="TransformersEngine">

```py
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, TransformersEngine, CodeAgent

# Example checkpoint; any chat-capable model you can run locally works.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
llm_engine = TransformersEngine(pipe)
agent = CodeAgent(tools=[], llm_engine=llm_engine)
agent.run("What causes bread to rise?")
```

</hfoption>
<hfoption id="HfApiEngine">

```py
from transformers import CodeAgent, HfApiEngine

# Example hosted model; the default tools (enabled with add_base_tools=True)
# give the agent translation and text-to-speech for this task.
llm_engine = HfApiEngine(model="meta-llama/Meta-Llama-3-70B-Instruct")
agent = CodeAgent(tools=[], llm_engine=llm_engine, add_base_tools=True)
agent.run(
    "Could you translate this sentence from French, say it out loud and return the audio.",
    sentence="Où est la boulangerie la plus proche?",
)
```

</hfoption>
</hfoptions>

The agent supports [constrained generation](https://hf.co/docs/text-generation-inference/conceptual/guidance) for generating outputs according to a specific structure with the `grammar` parameter. The `grammar` parameter should be specified in the `llm_engine` method, or you can set it when initializing an agent.
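
As a sketch of what this could look like, the snippet below passes a TGI-style regex grammar at initialization; treat the exact schema your engine accepts as an assumption to verify:

```py
from transformers import ReactCodeAgent

# A TGI-style guidance grammar constraining outputs to the Thought/Code format
# (assumed schema; check what your llm_engine supports).
code_grammar = {
    "type": "regex",
    "value": r"Thought: .+?\nCode:\n```(?:py|python)?\n(?:.|\s)+?\n```<end_code>",
}
agent = ReactCodeAgent(tools=[], llm_engine=llm_engine, grammar=code_grammar)
```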

Lastly, an agent accepts additional inputs such as text and audio. In the [`HfApiEngine`] example above, the agent accepted a sentence to translate. But you could also pass a path to a local or remote file for the agent to access. The example below demonstrates how to pass a path to an audio file.
agent.run("Why doesn't he know many people in New York?", audio="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/recording.mp3")
```

## System prompt

A system prompt describes how the agent should behave, the tools available to it, and the expected output format.

Tools are defined by the `<<tool_descriptions>>` token, which is dynamically replaced at runtime with the actual tools. The tool description is derived from the tool name, description, inputs, output type, and a Jinja2 template. Refer to the [Tools](./tools) guide for more information about how to describe tools.

The example below is the system prompt for [`ReactCodeAgent`].

```py
You will be given a task to solve as best you can.
You have access to the following tools:
<<tool_descriptions>>

To solve the task, you must plan forward to proceed in a series of steps, in a cycle of 'Thought:', 'Code:', and 'Observation:' sequences.

At each step, in the 'Thought:' sequence, you should first explain your reasoning towards solving the task, then the tools that you want to use.
Then in the 'Code:' sequence, you should write the code in simple Python. The code sequence must end with '/End code' sequence.
During each intermediate step, you can use 'print()' to save whatever important information you will then need.
These print outputs will then be available in the 'Observation:' field, for using this information as input for the next step.

In the end you have to return a final answer using the `final_answer` tool.

Here are a few examples using notional tools:
---
{examples}

Above example were using notional tools that might not exist for you. You only have access to those tools:
<<tool_names>>
You also can perform computations in the python code you generate.

Always provide a 'Thought:' and a 'Code:\n```py' sequence ending with '```<end_code>' sequence. You MUST provide at least the 'Code:' sequence to move forward.

Remember to not perform too many operations in a single code block! You should split the task into intermediate code blocks.
Print results at the end of each step to save the intermediate results. Then use final_answer() to return the final result.

Remember to make sure that variables you use are all defined.

Now Begin!
```

The system prompt can be tailored to the intended task. For example, you can add a better explanation of the output format, or you can overwrite the system prompt template entirely with your own custom system prompt as shown below.

> [!WARNING]
> If you're writing a custom system prompt, make sure to include `<<tool_descriptions>>` in the template so the agent is aware of the available tools.

```py
from transformers import ReactJsonAgent
from transformers.agents import PythonInterpreterTool

# Pass your own template through `system_prompt`; keep <<tool_descriptions>> in it.
agent = ReactJsonAgent(tools=[PythonInterpreterTool()], system_prompt="{your_custom_prompt}")
```

## Code execution

For safety, only the tools you provide (and the default Transformers tools) and the `print` function are executed. The interpreter doesn't allow importing modules that aren't on a safe list.

To import modules that aren't on the list, add them as a list to the `additional_authorized_imports` parameter when initializing an agent.

```py
from transformers import ReactCodeAgent

# 'requests' and 'bs4' are example imports the agent may need for web tasks.
agent = ReactCodeAgent(tools=[], additional_authorized_imports=['requests', 'bs4'])
agent.run("Could you get me the title of the page at url 'https://huggingface.co/blog'?")
```

Code execution stops if a tool isn't on the safe list, if it isn't authorized, or if the code generated by the agent returns a Python error.

> [!WARNING]
> An LLM can generate arbitrary code that will be executed, so don't add any unsafe imports!

## Multi-agent

[Multi-agent](https://hf.co/papers/2308.08155) refers to multiple agents working together to solve a task. Performance is typically better because each agent is specialized for a particular subtask.

Multi-agent systems are created through the [`ManagedAgent`] class, where a *manager agent* oversees how other agents work together. The manager agent requires an agent along with its name and description. These are added to the manager agent's system prompt, which lets it know how to call and use them.

The multi-agent example below creates a web search agent that is managed by another [`ReactCodeAgent`].

```py
from transformers.agents import ReactCodeAgent, HfApiEngine, DuckDuckGoSearchTool, ManagedAgent

llm_engine = HfApiEngine()

# Agent dedicated to web search, wrapped with a name and description for the manager.
web_agent = ReactCodeAgent(tools=[DuckDuckGoSearchTool()], llm_engine=llm_engine)
managed_web_agent = ManagedAgent(
    agent=web_agent,
    name="web_search",
    description="Runs web searches for you. Give it your query as an argument.",
)

# The manager agent sees the managed agent in its system prompt and can call it.
manager_agent = ReactCodeAgent(tools=[], llm_engine=llm_engine, managed_agents=[managed_web_agent])
manager_agent.run("Who is the CEO of Hugging Face?")
```

## Gradio integration

[Gradio](https://www.gradio.app/) is a library for quickly creating and sharing machine learning apps. The [gradio.Chatbot](https://www.gradio.app/docs/gradio/chatbot) supports chatting with a Transformers agent through the [`stream_to_gradio`] function.

Load a tool and LLM with an agent, and then create a Gradio app. The key is to use [`stream_to_gradio`] to stream the agent's messages and display how it's reasoning through a task.
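
Below is a hedged sketch of such an app; the tool id, model name, and UI wiring are illustrative choices rather than the only way to do it:

```py
import gradio as gr
from transformers import ReactCodeAgent, HfApiEngine, load_tool, stream_to_gradio

# Example tool and model; any Hub tool and chat-capable model should work.
image_generation_tool = load_tool("m-ric/text-to-image")
llm_engine = HfApiEngine(model="meta-llama/Meta-Llama-3-70B-Instruct")
agent = ReactCodeAgent(tools=[image_generation_tool], llm_engine=llm_engine)

def interact_with_agent(task):
    messages = [gr.ChatMessage(role="user", content=task)]
    yield messages
    # Stream each intermediate step (thoughts, tool calls, outputs) to the UI.
    for msg in stream_to_gradio(agent, task):
        messages.append(msg)
        yield messages

with gr.Blocks() as demo:
    text_input = gr.Textbox(label="Task")
    chatbot = gr.Chatbot(label="Agent", type="messages")
    text_input.submit(interact_with_agent, [text_input], [chatbot])

demo.launch()
```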

## Troubleshooting

For a better idea of what is happening when you call an agent, it is always a good idea to check the system prompt template first.

```py
print(agent.system_prompt_template)
```

If the agent is behaving unexpectedly, remember to explain the task you want to perform as clearly as possible. Every [`~Agent.run`] is different, and minor variations in your system prompt may yield completely different results.

To find out what happened after a run, check the following agent attributes.

- `agent.logs` stores the fine-grained agent logs. At every step of the agent's run, everything is stored in a dictionary and appended to `agent.logs`.
- `agent.write_inner_memory_from_logs` only stores a high-level overview of the agent's run. For example, at each step, it stores the LLM output as a message and the tool call output as a separate message. Not every detail from a step is transcribed by `write_inner_memory_from_logs`.
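
For example, since each step is stored as a dictionary, a quick sketch for dumping the per-step logs after a run:

```py
# Walk the per-step log dictionaries appended during the last run.
for i, step_log in enumerate(agent.logs):
    print(f"--- Step {i} ---")
    for key, value in step_log.items():
        print(f"{key}: {value}")
```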

## Resources

Learn more about ReAct agents in the [Open-source LLMs as LangChain Agents](https://hf.co/blog/open-source-llms-as-agents) blog post.

The single line this diff adds replaces the warning at the top of the file:

> [!WARNING]
> Agents and tools were spun out into the standalone [smolagents](https://huggingface.co/docs/smolagents/index) library. They were removed from `transformers` in v4.52.