-
Notifications
You must be signed in to change notification settings - Fork 105
Feature: Web Search #1276
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature: Web Search #1276
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1276 +/- ##
==========================================
- Coverage 14.47% 14.46% -0.01%
==========================================
Files 2345 2346 +1
Lines 204298 204411 +113
Branches 184662 184775 +113
==========================================
+ Hits 29573 29574 +1
- Misses 173267 173379 +112
Partials 1458 1458 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
26a3171
to
7270068
Compare
7270068
to
4901434
Compare
@@ -175,6 +183,7 @@ impl ToolPermissions { | |||
"execute_bash" => "trust read-only commands".dark_grey(), | |||
"use_aws" => "trust read-only commands".dark_grey(), | |||
"report_issue" => "trusted".dark_green().bold(), | |||
"web_search" => "trusted".dark_green().bold(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think we should have this as trusted by default, the model could make requests to arbitrary websites that can track the user's activity in unwanted ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. For ref, Claude Code's similar WebFetch requires permissions instead of being trusted by default: https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview#tools-available-to-claude
@@ -147,6 +147,33 @@ | |||
] | |||
} | |||
}, | |||
"web_search": { | |||
"name": "web_search", | |||
"description": "Search/retrieving the web for the specified query. Currently only supports retrieving.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe hold off on including any search related functionality until it's supported? Model might hallucinate
"Retrieving" | ||
}, | ||
if self.mode == WebSearchMode::Search { | ||
self.query.as_ref().unwrap_or(&"".to_string()).clone().dark_green() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit - unwrap_or(&"".to_string()).clone()
seems strange, shouldn't need to clone
here right?
); | ||
|
||
let user_agent = "AmazonQCLI/1.0"; | ||
let client = reqwest::Client::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use fig_request::client()
instead, should be fine to add as a workspace dependency for this. It's used as part of checking for updates anyways. The user agent will be set appropriately
.map_err(|e| eyre::eyre!("Failed to read response body: {}", e))?; | ||
|
||
// Convert HTML to Markdown | ||
let converter = HtmlToMarkdown::builder().skip_tags(vec!["script", "style"]).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reasoning behind this? Does the model not understand html well enough, or is this intended to primarily save on token usage?
Just asking since I would debate whether or not it's worth adding this dependency
} | ||
|
||
// Simple function to check if a path is allowed by robots.txt | ||
fn is_allowed_by_robots_txt(robots_content: &str, user_agent: &str, path: &str) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably have some tests to verify robots txt parsing is correct
Currently only supports retrieving. We need to have more discussion to align on what search engine to use
websearch.mp4