Guardrails Langchain integration with streamable AgentExecutor #651

Closed

adreo00 opened this issue Mar 15, 2024 · Discussed in #644 · 7 comments

Comments

adreo00 commented Mar 15, 2024

Discussed in #644

Description

Unable to integrate Guardrails with a LangChain AgentExecutor. The issue can be reproduced by installing the RegexMatch validator (guardrails hub install hub://guardrails/regex_match) and then running the following code:
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.utils.function_calling import convert_to_openai_tool
from langchain.tools import tool
from langchain_openai import ChatOpenAI
from langchain_core.documents.base import Document
from guardrails.hub import RegexMatch
from guardrails import Guard

from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser
prompt = hub.pull("hwchase17/openai-tools-agent")

@tool
def get_retriever_docs(query: str) -> list[Document]:
    """Returns a list of documents from the retriever."""
    return [Document(page_content="# test file\n\nThis is a test file with a secret code of 'blue-green-apricot-brownie-cake-mousepad'.", metadata={'source': './test.md'})]

# Set up a Guard
topic = "apricot"
guard = Guard().use(RegexMatch(topic, match_type="search", on_fail="filter"))

llm = ChatOpenAI(temperature=0, streaming=True)
# chaining "| guard" here causes: TypeError: RunnableSequence._transform() got an unexpected keyword argument 'tools'
tools = [get_retriever_docs]


############################################ this is a copy-paste from langchain.agents.create_openai_tools_agent
llm_with_tools = llm.bind(tools=[convert_to_openai_tool(tool) for tool in tools])

agent = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        )
    )
    | prompt 
    | llm_with_tools
    # | guard # adding this here causes ValueError: API must be provided.
    | OpenAIToolsAgentOutputParser()
)
############################################

agent_executor = AgentExecutor(agent=agent, tools=tools).with_config(
    {"run_name": "Agent"}
)
query="call get_retriever_docs and tell me a secret from the docs"
agent_executor.invoke({'input':query})

Partial system info (let me know if a complete minimal example would be helpful):

python==3.11.7
langchain==0.1.12
langchain-community==0.0.28
langchain-core==0.1.31
langchain-openai==0.0.8
langchainhub==0.1.15
guardrails-ai==0.4.1

zsimjee commented Mar 19, 2024

Hi, I'm looking into this one now.

zsimjee commented Mar 19, 2024

I have reason to believe this is actually not a Guardrails-specific problem but a chaining one in general. When I swap out the | guard line for this:

topic = "*apricot*"
guard = Guard().use(RegexMatch(topic, match_type="search", on_fail="filter"))
model = ChatOpenAI(temperature=0, streaming=False)
llm = model | StrOutputParser()

I get the same RunnableSequence error.

I'm diving deeper to see what the actual issue is.

zsimjee commented Mar 20, 2024

OK, the second spot seems like the right place to chain in (where we got the 'API must be provided' error). I attached a debugger and found that when the guard's invoke function is called, it's passed an empty input. I think something else is going on here: LCEL is executing the guard's validation before the results are ready.
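
For anyone following along, one way to see exactly what a guard chained at that spot would receive is to splice a passthrough tap into the sequence. A minimal sketch, reusing the names from the repro above (debug_tap is a hypothetical helper, not part of either library):

from langchain_core.runnables import RunnableLambda

def debug_tap(value):
    # Print whatever the previous runnable emitted, then pass it through unchanged.
    print(f"value between steps: {value!r}")
    return value

agent = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_to_openai_tool_messages(x["intermediate_steps"])
    )
    | prompt
    | llm_with_tools
    | RunnableLambda(debug_tap)  # a guard chained here would receive exactly this value
    | OpenAIToolsAgentOutputParser()
)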

zsimjee commented Mar 20, 2024

When I chain the guard after OpenAIToolsAgentOutputParser(), a value finally reaches the guard; however, that value includes the function calls. The problem is that the agent has not yet been executed by the time these results are passed to the guard. Figuring out if there's a way to fix that.

[OpenAIToolAgentAction(tool='get_retriever_docs', tool_input={'query': 'secret'}, log="\nInvoking: get_retriever_docs with {'query': 'secret'}\n\n\n", message_log=[AIMessageChunk(content='', additional_kwargs={'tool_calls': [{'index': 0, 'id': 'call_3WbE44YBc9F7snOpUwLjMhgK', 'function': {'arguments': '{"query":"secret"}', 'name': 'get_retriever_docs'}, 'type': 'function'}]}, response_metadata={'finish_reason': 'tool_calls'})], tool_call_id='call_3WbE44YBc9F7snOpUwLjMhgK')]
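
That list of OpenAIToolAgentAction objects is not something a regex validator can meaningfully check; the guard wants the final text. A rough sketch of the distinction, assuming the Guard.validate API in guardrails-ai 0.4.x (which validates a plain string):

# The guard defined above should pass final text that contains the topic...
outcome = guard.validate("the secret code contains 'apricot'")
print(outcome.validation_passed)  # True: re.search finds "apricot"

# ...and fail (filtering the output, per on_fail="filter") when it doesn't.
outcome = guard.validate("no fruit mentioned here")
print(outcome.validation_passed)  # False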

zsimjee commented Mar 21, 2024

Where we chain is very important: the place to chain a guard that validates output is after the agent executes. See the code sample below.

from langchain import hub
from langchain.agents import AgentExecutor
from langchain_core.utils.function_calling import convert_to_openai_tool
from langchain.tools import tool
from langchain_openai import ChatOpenAI
from langchain_core.documents.base import Document
from guardrails.hub import RegexMatch
from guardrails import Guard

from langchain_core.runnables import RunnablePassthrough
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser

prompt = hub.pull("hwchase17/openai-tools-agent")


@tool
def get_retriever_docs(query: str) -> list[Document]:
    """Returns a list of documents from the retriever."""
    return [
        Document(
            page_content="# test file\n\nThis is a test file with a secret code of 'blue-green-apricot-brownie-cake-mousepad'.",
            metadata={"source": "./test.md"},
        )
    ]


# Set up a Guard
topic = "apricot"
guard = Guard().use(RegexMatch(topic, match_type="search", on_fail="filter"))
model = ChatOpenAI(temperature=0, streaming=False)
llm = model
tools = [get_retriever_docs]


############################################ this is a copy-paste from langchain.agents.create_openai_tools_agent
llm_with_tools = llm.bind(tools=[convert_to_openai_tool(tool) for tool in tools])



agent = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        )
    )
    | prompt
    | llm_with_tools
    | OpenAIToolsAgentOutputParser()
)
############################################

agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True).with_config(
    {"run_name": "Agent"}
)

chain = agent_executor | guard

query = "call get_retriever_docs and tell me a secret from the docs"
print(chain.invoke({"input": query}))

@alex-dreo-persefoni

@zsimjee Confirmed that chaining the guard after the agent_executor works in your minimal example. Intuitively, this makes a lot of sense: we want the agent's output to go through Guardrails. For that reason alone, it makes sense to me to chain the guard at the very end rather than at an intermediate step, as I had tried to do in my code above.

zsimjee commented Mar 21, 2024

Awesome! I wish there were a way to chain the actual agent and [type]AgentOutputParser() directly into the AgentExecutor, but I couldn't get that to work. If it did work, it would be cool to have a chain like

  RunnablePassthrough.assign(
      agent_scratchpad=lambda x: format_to_openai_tool_messages(
          x["intermediate_steps"]
      )
  )
  | prompt
  | llm_with_tools
  | OpenAIToolsAgentOutputParser()
  | AgentExecutor().with_config({"run_name": "Agent"})
  | guard
  | [other output parsers]

Closing for now

zsimjee closed this as completed Mar 21, 2024