Commit 254c1b2

[Docs] Clarify behaviour of Pattern Capture Token Filter during search (#26278)
There was some confusion about the fact that tokens emitted from a Pattern Capture Token Filter are treated as synonyms when used to analyze a search query. This commit adds an explanation to the note in the docs to emphasize this behaviour. Closes #25746
1 parent 181e881 commit 254c1b2

1 file changed: +6 -4 lines changed

docs/reference/analysis/tokenfilters/pattern-capture-tokenfilter.asciidoc

Lines changed: 6 additions & 4 deletions
@@ -131,10 +131,12 @@ Multiple patterns are required to allow overlapping captures, but also
 means that patterns are less dense and easier to understand.

 *Note:* All tokens are emitted in the same position, and with the same
-character offsets, so when combined with highlighting, the whole
-original token will be highlighted, not just the matching subset. For
-instance, querying the above email address for `"smith"` would
-highlight:
+character offsets. This means, for example, that a `match` query for
+`[email protected]` that uses this analyzer will return documents
+containing any of these tokens, even when using the `and` operator.
+Also, when combined with highlighting, the whole original token will
+be highlighted, not just the matching subset. For instance, querying
+the above email address for `"smith"` would highlight:

 [source,html]
 --------------------------------------------------
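
The synonym behaviour described in the note can be illustrated with a small Python sketch. It is a simplified model, not Elasticsearch or Lucene code: the patterns, the sample address `jane-doe_42@example.org`, and the helper names `pattern_capture`, `analyze`, and `match` are all assumptions made for illustration. It assumes that the `and` operator requires every token position in the query to match, and that tokens sharing a position count as alternatives for that position.

```python
import re

# Hypothetical capture patterns, loosely modelled on an email-splitting setup.
PATTERNS = [r"([^@]+)", r"([a-z]+)", r"(\d+)", r"@(.+)"]


def pattern_capture(token, patterns=PATTERNS, preserve_original=True):
    """Return the original token plus every capture-group match.

    In this model all returned tokens share a single position.
    """
    out = [token] if preserve_original else []
    for pattern in patterns:
        for m in re.finditer(pattern, token):
            for group in m.groups():
                if group and group not in out:
                    out.append(group)
    return out


def analyze(text):
    """Whitespace-tokenize, lowercase, and return one token set per position."""
    return [set(pattern_capture(word.lower())) for word in text.split()]


def match(doc_text, query_text, operator="or"):
    """Simplified match query: a position matches if ANY of its tokens matches."""
    doc_terms = set().union(*analyze(doc_text))
    per_position = [bool(alts & doc_terms) for alts in analyze(query_text)]
    return all(per_position) if operator == "and" else any(per_position)


# A document containing only "doe" still matches the full address with `and`,
# because the query analyzes to a single position holding several alternatives.
print(match("doe", "jane-doe_42@example.org", operator="and"))
# With a second query word, `and` applies across the two positions.
print(match("doe", "jane-doe_42@example.org missing", operator="and"))
```

In this model the `and` operator only has an effect across positions, which is why a single analyzed address behaves like a group of synonyms rather than a list of required terms.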
