Commit a9fdcad

[DOCS] Added documentation for the keep word token filter
1 parent 5f170cb commit a9fdcad

3 files changed: +53 -0 lines changed

docs/reference/analysis/tokenfilters.asciidoc

Lines changed: 2 additions & 0 deletions

@@ -70,3 +70,5 @@ include::tokenfilters/common-grams-tokenfilter.asciidoc[]
 include::tokenfilters/normalization-tokenfilter.asciidoc[]
 
 include::tokenfilters/delimited-payload-tokenfilter.asciidoc[]
+
+include::tokenfilters/keep-words-tokenfilter.asciidoc[]
docs/reference/analysis/tokenfilters/keep-words-tokenfilter.asciidoc

Lines changed: 49 additions & 0 deletions

@@ -0,0 +1,49 @@
+[[analysis-keep-words-tokenfilter]]
+=== Keep Words Token Filter
+
+A token filter of type `keep` that only keeps tokens with text contained in a
+predefined set of words. The set of words can be defined in the settings or
+loaded from a text file containing one word per line.
+
+
+[float]
+=== Options
+[horizontal]
+keep_words:: a list of words to keep
+keep_words_path:: a path to a words file
+keep_words_case:: a boolean indicating whether to lower case the words (defaults to `false`)
+
+
+
+[float]
+=== Settings example
+
+[source,js]
+--------------------------------------------------
+{
+    "index" : {
+        "analysis" : {
+            "analyzer" : {
+                "my_analyzer" : {
+                    "tokenizer" : "standard",
+                    "filter" : ["standard", "lowercase", "words_till_three"]
+                },
+                "my_analyzer1" : {
+                    "tokenizer" : "standard",
+                    "filter" : ["standard", "lowercase", "words_on_file"]
+                }
+            },
+            "filter" : {
+                "words_till_three" : {
+                    "type" : "keep",
+                    "keep_words" : [ "one", "two", "three" ]
+                },
+                "words_on_file" : {
+                    "type" : "keep",
+                    "keep_words_path" : "/path/to/word/file"
+                }
+            }
+        }
+    }
+}
+--------------------------------------------------
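
As an aside (not part of the committed docs): the behavior of the settings above can be checked with the standard analyze API. The sketch below assumes a hypothetical index named `test` created with these settings, and a words file for `words_on_file` that simply lists one word per line (e.g. `one`, `two`, `three`, matching the `keep_words` list):

[source,js]
--------------------------------------------------
# Hypothetical verification call; "test" is an assumed index name.
# With "keep_words" : ["one", "two", "three"], the token "four" is dropped.
curl -XGET 'localhost:9200/test/_analyze?analyzer=my_analyzer&pretty' -d 'one two four'
--------------------------------------------------

The response should contain only the tokens `one` and `two`; `four` is filtered out because it is not in the keep list.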

docs/reference/analysis/tokenfilters/normalization-tokenfilter.asciidoc

Lines changed: 2 additions & 0 deletions

@@ -11,3 +11,5 @@ http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/
 or the
 http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/fa/PersianNormalizer.html[PersianNormalizer]
 documentation.
+
+*Note:* These filters are available since `0.90.2`.
