Skip to content

Commit 4be9cfa

Browse files
Rollup merge of #109422 - notriddle:notriddle/impl-disambiguate-search, r=GuillaumeGomez
rustdoc-search: add impl disambiguator to duplicate assoc items Preview (to see the difference, click the link and pay attention to the specific function that comes up): | Before | After | |--|--| | [`simd<i64>, simd<i64> -> simd<i64>`](https://doc.rust-lang.org/nightly/std/?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) | [`simd<i64>, simd<i64> -> simd<i64>`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) | | [`cow, vec -> bool`](https://doc.rust-lang.org/nightly/std/?search=cow%2C%20vec%20-%3E%20bool) | [`cow, vec -> bool`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=cow%2C%20vec%20-%3E%20bool) Helps with #90929 This changes the search results, specifically, when there's more than one impl with an associated item with the same name. For example, the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>` don't link to the same function, but most of the functions have the same names. This change should probably be FCP-ed, especially since it adds a new anchor link format for `main.js` to handle, so that URLs like `struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there are a few reasons for it: * I'd like to avoid making the HTML bigger. Obviously, fixing this bug is going to add at least a little more data to the search index, but adding more HTML penalises viewers for the benefit of searchers. * Breaking `struct.Vec.html#method.len` would also be a disappointment. On the other hand: * The path-style anchors might be less prone to link rot than the numbered anchors. It's definitely less likely to have URLs that appear to "work", but silently point at the wrong thing. * This commit arranges the path-style anchor to redirect to the numbered anchor. Nothing stops rustdoc from doing the opposite, making path-style anchors the default and redirecting the "legacy" numbered ones. ### The bug On the "Before" links, this example search calls for `i64`: ![image](https://github.com/rust-lang/rust/assets/1593513/9431d89d-41dc-4f68-bbb1-3e2704a973d2) But if I click any of the results, I get `f64` instead. ![image](https://github.com/rust-lang/rust/assets/1593513/6d89c692-1847-421a-84d9-22e359d9cf82) The PR fixes this problem by adding enough information to the search result `href` to disambiguate methods with different types but the same name. More detailed description of the problem at: #109422 (comment) > When a struct/enum/union has multiple impls with different type parameters, it can have multiple methods that have the same name, but which are on different impls. Besides Simd, [Any](https://doc.rust-lang.org/nightly/std/any/trait.Any.html?search=any%3A%3Adowncast) also demonstrates this pattern. It has three methods named `downcast`, on three different impls. > > When that happens, it presents a challenge in linking to the method. Normally we link like `#method.foo`. When there are multiple `foo`, we number them like `#method.foo`, `#method.foo-1`, `#method.foo-2`, etc. > > It also presents a challenge for our search code. Currently we store all the variants in the index, but don’t have any way to generate unambiguous URLs in the results page, or to distinguish them in the SERP. > > To fix this, we need three things: > > 1. A fragment format that fully specifies the impl type parameters when needed to disambiguate (`#impl-SimdOrd-for-Simd<i64,+LANES>/method.simd_max`) > 2. A search index that stores methods with enough information to disambiguate the impl they were on. > 3. A search results interface that can display multiple methods on the same type with the same name, when appropriate OR a disambiguation landing section on item pages? > > For reviewers: it can be hard to see the new fragment format in action since it immediately gets rewritten to the numbered form.
2 parents 5b88d65 + 2a4c9d0 commit 4be9cfa

20 files changed

+350
-63
lines changed

src/librustdoc/formats/cache.rs

+42-31
Original file line numberDiff line numberDiff line change
@@ -223,17 +223,16 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {
223223

224224
// If the impl is from a masked crate or references something from a
225225
// masked crate then remove it completely.
226-
if let clean::ImplItem(ref i) = *item.kind {
227-
if self.cache.masked_crates.contains(&item.item_id.krate())
226+
if let clean::ImplItem(ref i) = *item.kind &&
227+
(self.cache.masked_crates.contains(&item.item_id.krate())
228228
|| i.trait_
229229
.as_ref()
230230
.map_or(false, |t| self.cache.masked_crates.contains(&t.def_id().krate))
231231
|| i.for_
232232
.def_id(self.cache)
233-
.map_or(false, |d| self.cache.masked_crates.contains(&d.krate))
234-
{
235-
return None;
236-
}
233+
.map_or(false, |d| self.cache.masked_crates.contains(&d.krate)))
234+
{
235+
return None;
237236
}
238237

239238
// Propagate a trait method's documentation to all implementors of the
@@ -334,33 +333,37 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {
334333
// A crate has a module at its root, containing all items,
335334
// which should not be indexed. The crate-item itself is
336335
// inserted later on when serializing the search-index.
337-
if item.item_id.as_def_id().map_or(false, |idx| !idx.is_crate_root()) {
336+
if item.item_id.as_def_id().map_or(false, |idx| !idx.is_crate_root())
337+
&& let ty = item.type_()
338+
&& (ty != ItemType::StructField
339+
|| u16::from_str_radix(s.as_str(), 10).is_err())
340+
{
338341
let desc =
339342
short_markdown_summary(&item.doc_value(), &item.link_names(self.cache));
340-
let ty = item.type_();
341-
if ty != ItemType::StructField
342-
|| u16::from_str_radix(s.as_str(), 10).is_err()
343-
{
344-
// In case this is a field from a tuple struct, we don't add it into
345-
// the search index because its name is something like "0", which is
346-
// not useful for rustdoc search.
347-
self.cache.search_index.push(IndexItem {
348-
ty,
349-
name: s,
350-
path: join_with_double_colon(path),
351-
desc,
352-
parent,
353-
parent_idx: None,
354-
search_type: get_function_type_for_search(
355-
&item,
356-
self.tcx,
357-
clean_impl_generics(self.cache.parent_stack.last()).as_ref(),
358-
self.cache,
359-
),
360-
aliases: item.attrs.get_doc_aliases(),
361-
deprecation: item.deprecation(self.tcx),
362-
});
363-
}
343+
// In case this is a field from a tuple struct, we don't add it into
344+
// the search index because its name is something like "0", which is
345+
// not useful for rustdoc search.
346+
self.cache.search_index.push(IndexItem {
347+
ty,
348+
name: s,
349+
path: join_with_double_colon(path),
350+
desc,
351+
parent,
352+
parent_idx: None,
353+
impl_id: if let Some(ParentStackItem::Impl { item_id, .. }) = self.cache.parent_stack.last() {
354+
item_id.as_def_id()
355+
} else {
356+
None
357+
},
358+
search_type: get_function_type_for_search(
359+
&item,
360+
self.tcx,
361+
clean_impl_generics(self.cache.parent_stack.last()).as_ref(),
362+
self.cache,
363+
),
364+
aliases: item.attrs.get_doc_aliases(),
365+
deprecation: item.deprecation(self.tcx),
366+
});
364367
}
365368
}
366369
(Some(parent), None) if is_inherent_impl_item => {
@@ -371,6 +374,13 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {
371374
parent,
372375
item: item.clone(),
373376
impl_generics,
377+
impl_id: if let Some(ParentStackItem::Impl { item_id, .. }) =
378+
self.cache.parent_stack.last()
379+
{
380+
item_id.as_def_id()
381+
} else {
382+
None
383+
},
374384
});
375385
}
376386
_ => {}
@@ -541,6 +551,7 @@ impl<'a, 'tcx> DocFolder for CacheBuilder<'a, 'tcx> {
541551

542552
pub(crate) struct OrphanImplItem {
543553
pub(crate) parent: DefId,
554+
pub(crate) impl_id: Option<DefId>,
544555
pub(crate) item: clean::Item,
545556
pub(crate) impl_generics: Option<(clean::Type, clean::Generics)>,
546557
}

src/librustdoc/html/render/mod.rs

+28-13
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@ use rustc_data_structures::fx::{FxHashMap, FxHashSet};
5454
use rustc_hir::def_id::{DefId, DefIdSet};
5555
use rustc_hir::Mutability;
5656
use rustc_middle::middle::stability;
57-
use rustc_middle::ty::TyCtxt;
57+
use rustc_middle::ty::{self, TyCtxt};
5858
use rustc_span::{
5959
symbol::{sym, Symbol},
6060
BytePos, FileName, RealFileName,
@@ -102,6 +102,7 @@ pub(crate) struct IndexItem {
102102
pub(crate) desc: String,
103103
pub(crate) parent: Option<DefId>,
104104
pub(crate) parent_idx: Option<isize>,
105+
pub(crate) impl_id: Option<DefId>,
105106
pub(crate) search_type: Option<IndexItemFunctionType>,
106107
pub(crate) aliases: Box<[Symbol]>,
107108
pub(crate) deprecation: Option<Deprecation>,
@@ -1877,7 +1878,7 @@ pub(crate) fn render_impl_summary(
18771878
aliases: &[String],
18781879
) {
18791880
let inner_impl = i.inner_impl();
1880-
let id = cx.derive_id(get_id_for_impl(&inner_impl.for_, inner_impl.trait_.as_ref(), cx));
1881+
let id = cx.derive_id(get_id_for_impl(cx.tcx(), i.impl_item.item_id));
18811882
let aliases = if aliases.is_empty() {
18821883
String::new()
18831884
} else {
@@ -1994,21 +1995,35 @@ pub(crate) fn small_url_encode(s: String) -> String {
19941995
}
19951996
}
19961997

1997-
fn get_id_for_impl(for_: &clean::Type, trait_: Option<&clean::Path>, cx: &Context<'_>) -> String {
1998-
match trait_ {
1999-
Some(t) => small_url_encode(format!("impl-{:#}-for-{:#}", t.print(cx), for_.print(cx))),
2000-
None => small_url_encode(format!("impl-{:#}", for_.print(cx))),
2001-
}
1998+
fn get_id_for_impl<'tcx>(tcx: TyCtxt<'tcx>, impl_id: ItemId) -> String {
1999+
use rustc_middle::ty::print::with_forced_trimmed_paths;
2000+
let (type_, trait_) = match impl_id {
2001+
ItemId::Auto { trait_, for_ } => {
2002+
let ty = tcx.type_of(for_).skip_binder();
2003+
(ty, Some(ty::TraitRef::new(tcx, trait_, [ty])))
2004+
}
2005+
ItemId::Blanket { impl_id, .. } | ItemId::DefId(impl_id) => {
2006+
match tcx.impl_subject(impl_id).skip_binder() {
2007+
ty::ImplSubject::Trait(trait_ref) => {
2008+
(trait_ref.args[0].expect_ty(), Some(trait_ref))
2009+
}
2010+
ty::ImplSubject::Inherent(ty) => (ty, None),
2011+
}
2012+
}
2013+
};
2014+
with_forced_trimmed_paths!(small_url_encode(if let Some(trait_) = trait_ {
2015+
format!("impl-{trait_}-for-{type_}", trait_ = trait_.print_only_trait_path())
2016+
} else {
2017+
format!("impl-{type_}")
2018+
}))
20022019
}
20032020

20042021
fn extract_for_impl_name(item: &clean::Item, cx: &Context<'_>) -> Option<(String, String)> {
20052022
match *item.kind {
2006-
clean::ItemKind::ImplItem(ref i) => {
2007-
i.trait_.as_ref().map(|trait_| {
2008-
// Alternative format produces no URLs,
2009-
// so this parameter does nothing.
2010-
(format!("{:#}", i.for_.print(cx)), get_id_for_impl(&i.for_, Some(trait_), cx))
2011-
})
2023+
clean::ItemKind::ImplItem(ref i) if i.trait_.is_some() => {
2024+
// Alternative format produces no URLs,
2025+
// so this parameter does nothing.
2026+
Some((format!("{:#}", i.for_.print(cx)), get_id_for_impl(cx.tcx(), item.item_id)))
20122027
}
20132028
_ => None,
20142029
}

src/librustdoc/html/render/search_index.rs

+31-2
Original file line numberDiff line numberDiff line change
@@ -12,7 +12,7 @@ use crate::formats::cache::{Cache, OrphanImplItem};
1212
use crate::formats::item_type::ItemType;
1313
use crate::html::format::join_with_double_colon;
1414
use crate::html::markdown::short_markdown_summary;
15-
use crate::html::render::{IndexItem, IndexItemFunctionType, RenderType, RenderTypeId};
15+
use crate::html::render::{self, IndexItem, IndexItemFunctionType, RenderType, RenderTypeId};
1616

1717
/// Builds the search index from the collected metadata
1818
pub(crate) fn build_index<'tcx>(
@@ -26,7 +26,8 @@ pub(crate) fn build_index<'tcx>(
2626

2727
// Attach all orphan items to the type's definition if the type
2828
// has since been learned.
29-
for &OrphanImplItem { parent, ref item, ref impl_generics } in &cache.orphan_impl_items {
29+
for &OrphanImplItem { impl_id, parent, ref item, ref impl_generics } in &cache.orphan_impl_items
30+
{
3031
if let Some((fqp, _)) = cache.paths.get(&parent) {
3132
let desc = short_markdown_summary(&item.doc_value(), &item.link_names(cache));
3233
cache.search_index.push(IndexItem {
@@ -36,6 +37,7 @@ pub(crate) fn build_index<'tcx>(
3637
desc,
3738
parent: Some(parent),
3839
parent_idx: None,
40+
impl_id,
3941
search_type: get_function_type_for_search(item, tcx, impl_generics.as_ref(), cache),
4042
aliases: item.attrs.get_doc_aliases(),
4143
deprecation: item.deprecation(tcx),
@@ -222,6 +224,29 @@ pub(crate) fn build_index<'tcx>(
222224
})
223225
.collect();
224226

227+
// Find associated items that need disambiguators
228+
let mut associated_item_duplicates = FxHashMap::<(isize, ItemType, Symbol), usize>::default();
229+
230+
for &item in &crate_items {
231+
if item.impl_id.is_some() && let Some(parent_idx) = item.parent_idx {
232+
let count = associated_item_duplicates
233+
.entry((parent_idx, item.ty, item.name))
234+
.or_insert(0);
235+
*count += 1;
236+
}
237+
}
238+
239+
let associated_item_disambiguators = crate_items
240+
.iter()
241+
.enumerate()
242+
.filter_map(|(index, item)| {
243+
let impl_id = ItemId::DefId(item.impl_id?);
244+
let parent_idx = item.parent_idx?;
245+
let count = *associated_item_duplicates.get(&(parent_idx, item.ty, item.name))?;
246+
if count > 1 { Some((index, render::get_id_for_impl(tcx, impl_id))) } else { None }
247+
})
248+
.collect::<Vec<_>>();
249+
225250
struct CrateData<'a> {
226251
doc: String,
227252
items: Vec<&'a IndexItem>,
@@ -230,6 +255,8 @@ pub(crate) fn build_index<'tcx>(
230255
//
231256
// To be noted: the `usize` elements are indexes to `items`.
232257
aliases: &'a BTreeMap<String, Vec<usize>>,
258+
// Used when a type has more than one impl with an associated item with the same name.
259+
associated_item_disambiguators: &'a Vec<(usize, String)>,
233260
}
234261

235262
struct Paths {
@@ -382,6 +409,7 @@ pub(crate) fn build_index<'tcx>(
382409
crate_data.serialize_field("f", &functions)?;
383410
crate_data.serialize_field("c", &deprecated)?;
384411
crate_data.serialize_field("p", &paths)?;
412+
crate_data.serialize_field("b", &self.associated_item_disambiguators)?;
385413
if has_aliases {
386414
crate_data.serialize_field("a", &self.aliases)?;
387415
}
@@ -398,6 +426,7 @@ pub(crate) fn build_index<'tcx>(
398426
items: crate_items,
399427
paths: crate_paths,
400428
aliases: &aliases,
429+
associated_item_disambiguators: &associated_item_disambiguators,
401430
})
402431
.expect("failed serde conversion")
403432
// All these `replace` calls are because we have to go through JS string for JSON content.

src/librustdoc/html/render/sidebar.rs

+1-2
Original file line numberDiff line numberDiff line change
@@ -503,8 +503,7 @@ fn sidebar_render_assoc_items(
503503
.iter()
504504
.filter_map(|it| {
505505
let trait_ = it.inner_impl().trait_.as_ref()?;
506-
let encoded =
507-
id_map.derive(super::get_id_for_impl(&it.inner_impl().for_, Some(trait_), cx));
506+
let encoded = id_map.derive(super::get_id_for_impl(cx.tcx(), it.impl_item.item_id));
508507

509508
let prefix = match it.inner_impl().polarity {
510509
ty::ImplPolarity::Positive | ty::ImplPolarity::Reservation => "",

src/librustdoc/html/static/js/main.js

+28
Original file line numberDiff line numberDiff line change
@@ -354,6 +354,34 @@ function preLoadCss(cssUrl) {
354354
expandSection(pageId);
355355
}
356356
}
357+
if (savedHash.startsWith("impl-")) {
358+
// impl-disambiguated links, used by the search engine
359+
// format: impl-X[-for-Y]/method.WHATEVER
360+
// turn this into method.WHATEVER[-NUMBER]
361+
const splitAt = savedHash.indexOf("/");
362+
if (splitAt !== -1) {
363+
const implId = savedHash.slice(0, splitAt);
364+
const assocId = savedHash.slice(splitAt + 1);
365+
const implElem = document.getElementById(implId);
366+
if (implElem && implElem.parentElement.tagName === "SUMMARY" &&
367+
implElem.parentElement.parentElement.tagName === "DETAILS") {
368+
onEachLazy(implElem.parentElement.parentElement.querySelectorAll(
369+
`[id^="${assocId}"]`),
370+
item => {
371+
const numbered = /([^-]+)-([0-9]+)/.exec(item.id);
372+
if (item.id === assocId || (numbered && numbered[1] === assocId)) {
373+
openParentDetails(item);
374+
item.scrollIntoView();
375+
// Let the section expand itself before trying to highlight
376+
setTimeout(() => {
377+
window.location.replace("#" + item.id);
378+
}, 0);
379+
}
380+
}
381+
);
382+
}
383+
}
384+
}
357385
}
358386

359387
function onHashChange(ev) {

src/librustdoc/html/static/js/search.js

+16-3
Original file line numberDiff line numberDiff line change
@@ -1752,6 +1752,7 @@ function initSearch(rawSearchIndex) {
17521752
type: item.type,
17531753
is_alias: true,
17541754
deprecated: item.deprecated,
1755+
implDisambiguator: item.implDisambiguator,
17551756
};
17561757
}
17571758

@@ -2218,7 +2219,7 @@ function initSearch(rawSearchIndex) {
22182219
href = ROOT_PATH + name + "/index.html";
22192220
} else if (item.parent !== undefined) {
22202221
const myparent = item.parent;
2221-
let anchor = "#" + type + "." + name;
2222+
let anchor = type + "." + name;
22222223
const parentType = itemTypes[myparent.ty];
22232224
let pageType = parentType;
22242225
let pageName = myparent.name;
@@ -2232,16 +2233,19 @@ function initSearch(rawSearchIndex) {
22322233
const enumName = item.path.substr(enumNameIdx + 2);
22332234
path = item.path.substr(0, enumNameIdx);
22342235
displayPath = path + "::" + enumName + "::" + myparent.name + "::";
2235-
anchor = "#variant." + myparent.name + ".field." + name;
2236+
anchor = "variant." + myparent.name + ".field." + name;
22362237
pageType = "enum";
22372238
pageName = enumName;
22382239
} else {
22392240
displayPath = path + "::" + myparent.name + "::";
22402241
}
2242+
if (item.implDisambiguator !== null) {
2243+
anchor = item.implDisambiguator + "/" + anchor;
2244+
}
22412245
href = ROOT_PATH + path.replace(/::/g, "/") +
22422246
"/" + pageType +
22432247
"." + pageName +
2244-
".html" + anchor;
2248+
".html#" + anchor;
22452249
} else {
22462250
displayPath = item.path + "::";
22472251
href = ROOT_PATH + item.path.replace(/::/g, "/") +
@@ -2727,6 +2731,10 @@ ${item.displayPath}<span class="${type}">${name}</span>\
27272731
* Types are also represented as arrays; the first item is an index into the `p`
27282732
* array, while the second is a list of types representing any generic parameters.
27292733
*
2734+
* b[i] contains an item's impl disambiguator. This is only present if an item
2735+
* is defined in an impl block and, the impl block's type has more than one associated
2736+
* item with the same name.
2737+
*
27302738
* `a` defines aliases with an Array of pairs: [name, offset], where `offset`
27312739
* points into the n/t/d/q/i/f arrays.
27322740
*
@@ -2746,6 +2754,7 @@ ${item.displayPath}<span class="${type}">${name}</span>\
27462754
* i: Array<Number>,
27472755
* f: Array<RawFunctionSearchType>,
27482756
* p: Array<Object>,
2757+
* b: Array<[Number, String]>,
27492758
* c: Array<Number>
27502759
* }}
27512760
*/
@@ -2766,6 +2775,7 @@ ${item.displayPath}<span class="${type}">${name}</span>\
27662775
id: id,
27672776
normalizedName: crate.indexOf("_") === -1 ? crate : crate.replace(/_/g, ""),
27682777
deprecated: null,
2778+
implDisambiguator: null,
27692779
};
27702780
id += 1;
27712781
searchIndex.push(crateRow);
@@ -2789,6 +2799,8 @@ ${item.displayPath}<span class="${type}">${name}</span>\
27892799
const itemFunctionSearchTypes = crateCorpus.f;
27902800
// an array of (Number) indices for the deprecated items
27912801
const deprecatedItems = new Set(crateCorpus.c);
2802+
// an array of (Number) indices for the deprecated items
2803+
const implDisambiguator = new Map(crateCorpus.b);
27922804
// an array of [(Number) item type,
27932805
// (String) name]
27942806
const paths = crateCorpus.p;
@@ -2849,6 +2861,7 @@ ${item.displayPath}<span class="${type}">${name}</span>\
28492861
id: id,
28502862
normalizedName: word.indexOf("_") === -1 ? word : word.replace(/_/g, ""),
28512863
deprecated: deprecatedItems.has(i),
2864+
implDisambiguator: implDisambiguator.has(i) ? implDisambiguator.get(i) : null,
28522865
};
28532866
id += 1;
28542867
searchIndex.push(row);

0 commit comments

Comments
 (0)