Skip to content

Feature: Web Search #1276

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Feature: Web Search #1276

wants to merge 1 commit into from

Conversation

GoodluckH
Copy link
Contributor

@GoodluckH GoodluckH commented Apr 17, 2025

Currently only supports retrieving. We need to have more discussion to align on what search engine to use

websearch.mp4

@GoodluckH GoodluckH linked an issue Apr 17, 2025 that may be closed by this pull request
@GoodluckH GoodluckH self-assigned this Apr 17, 2025
@codecov-commenter
Copy link

codecov-commenter commented Apr 17, 2025

Codecov Report

Attention: Patch coverage is 0.88496% with 112 lines in your changes missing coverage. Please review.

Project coverage is 14.46%. Comparing base (8d6bbf9) to head (4901434).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
crates/q_chat/src/tools/web_search.rs 0.00% 106 Missing ⚠️
crates/q_chat/src/tools/mod.rs 16.66% 5 Missing ⚠️
crates/q_chat/src/tool_manager.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1276      +/-   ##
==========================================
- Coverage   14.47%   14.46%   -0.01%     
==========================================
  Files        2345     2346       +1     
  Lines      204298   204411     +113     
  Branches   184662   184775     +113     
==========================================
+ Hits        29573    29574       +1     
- Misses     173267   173379     +112     
  Partials     1458     1458              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@GoodluckH GoodluckH force-pushed the xipu/web-search branch 3 times, most recently from 26a3171 to 7270068 Compare April 21, 2025 23:31
@GoodluckH GoodluckH marked this pull request as ready for review April 21, 2025 23:32
@GoodluckH GoodluckH requested a review from a team April 21, 2025 23:32
@@ -175,6 +183,7 @@ impl ToolPermissions {
"execute_bash" => "trust read-only commands".dark_grey(),
"use_aws" => "trust read-only commands".dark_grey(),
"report_issue" => "trusted".dark_green().bold(),
"web_search" => "trusted".dark_green().bold(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think we should have this as trusted by default, the model could make requests to arbitrary websites that can track the user's activity in unwanted ways.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. For ref, Claude Code's similar WebFetch requires permissions instead of being trusted by default: https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/overview#tools-available-to-claude

@@ -147,6 +147,33 @@
]
}
},
"web_search": {
"name": "web_search",
"description": "Search/retrieving the web for the specified query. Currently only supports retrieving.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe hold off on including any search related functionality until it's supported? Model might hallucinate

"Retrieving"
},
if self.mode == WebSearchMode::Search {
self.query.as_ref().unwrap_or(&"".to_string()).clone().dark_green()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - unwrap_or(&"".to_string()).clone() seems strange, shouldn't need to clone here right?

);

let user_agent = "AmazonQCLI/1.0";
let client = reqwest::Client::new();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use fig_request::client() instead, should be fine to add as a workspace dependency for this. It's used as part of checking for updates anyways. The user agent will be set appropriately

.map_err(|e| eyre::eyre!("Failed to read response body: {}", e))?;

// Convert HTML to Markdown
let converter = HtmlToMarkdown::builder().skip_tags(vec!["script", "style"]).build();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning behind this? Does the model not understand html well enough, or is this intended to primarily save on token usage?

Just asking since I would debate whether or not it's worth adding this dependency

}

// Simple function to check if a path is allowed by robots.txt
fn is_allowed_by_robots_txt(robots_content: &str, user_agent: &str, path: &str) -> bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have some tests to verify robots txt parsing is correct

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Web search
4 participants