Commit 77ecac6

[DOCS] Reformat decimal digit token filter docs (#48722)
1 parent 6f15c12 commit 77ecac6

1 file changed, 87 additions and 2 deletions

@@ -1,4 +1,89 @@
 [[analysis-decimal-digit-tokenfilter]]
-=== Decimal Digit Token Filter
+=== Decimal digit token filter
+++++
+<titleabbrev>Decimal digit</titleabbrev>
+++++
 
-The `decimal_digit` token filter folds unicode digits to `0-9`
+Converts all digits in the Unicode `Decimal_Number` General Category to `0-9`.
+For example, the filter changes the Bengali numeral `৩` to `3`.
+
+This filter uses Lucene's
+https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/core/DecimalDigitFilter.html[DecimalDigitFilter].
+
+[[analysis-decimal-digit-tokenfilter-analyze-ex]]
+==== Example
+
+The following <<indices-analyze,analyze API>> request uses the `decimal_digit`
+filter to convert Devanagari numerals to `0-9`:
+
+[source,console]
+--------------------------------------------------
+GET /_analyze
+{
+  "tokenizer" : "whitespace",
+  "filter" : ["decimal_digit"],
+  "text" : "१-one two-२ ३"
+}
+--------------------------------------------------
+
+The filter produces the following tokens:
+
+[source,text]
+--------------------------------------------------
+[ 1-one, two-2, 3]
+--------------------------------------------------
+
+/////////////////////
+[source,console-result]
+--------------------------------------------------
+{
+  "tokens" : [
+    {
+      "token" : "1-one",
+      "start_offset" : 0,
+      "end_offset" : 5,
+      "type" : "word",
+      "position" : 0
+    },
+    {
+      "token" : "two-2",
+      "start_offset" : 6,
+      "end_offset" : 11,
+      "type" : "word",
+      "position" : 1
+    },
+    {
+      "token" : "3",
+      "start_offset" : 12,
+      "end_offset" : 13,
+      "type" : "word",
+      "position" : 2
+    }
+  ]
+}
+--------------------------------------------------
+/////////////////////
+
+[[analysis-decimal-digit-tokenfilter-analyzer-ex]]
+==== Add to an analyzer
+
+The following <<indices-create-index,create index API>> request uses the
+`decimal_digit` filter to configure a new
+<<analysis-custom-analyzer,custom analyzer>>.
+
+[source,console]
+--------------------------------------------------
+PUT /decimal_digit_example
+{
+  "settings" : {
+    "analysis" : {
+      "analyzer" : {
+        "whitespace_decimal_digit" : {
+          "tokenizer" : "whitespace",
+          "filter" : ["decimal_digit"]
+        }
+      }
+    }
+  }
+}
+--------------------------------------------------
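
To make the folding behavior described in the changed docs concrete, here is a minimal Python sketch of the same idea. It is not Lucene's `DecimalDigitFilter` and is not part of this commit. The function names `fold_decimal_digits` and `analyze` are hypothetical, and `str.split()` is only a rough stand-in for the `whitespace` tokenizer. The sketch assumes that Python's `Nd` category matches the `Decimal_Number` General Category named in the docs.

```python
import unicodedata


def fold_decimal_digits(text):
    """Replace each character in Unicode category Nd with its ASCII digit.

    Illustrative only: approximates what the decimal_digit filter is
    documented to do, using Python's own Unicode database.
    """
    folded = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            # unicodedata.decimal() returns the integer value (0-9) of the digit
            folded.append(str(unicodedata.decimal(ch)))
        else:
            folded.append(ch)
    return "".join(folded)


def analyze(text):
    """Split on whitespace (stand-in for the whitespace tokenizer), then fold."""
    return [fold_decimal_digits(token) for token in text.split()]


if __name__ == "__main__":
    # Same sample text as the analyze API request in the diff
    print(analyze("१-one two-२ ३"))
```

Run on the sample text from the analyze request, the sketch yields the same three tokens the docs list. Offsets, token types, and positions from the real API response are not modeled here.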
