 <titleabbrev>Simple</titleabbrev>

 ++++

-The `simple` analyzer breaks text into terms whenever it encounters a
-character which is not a letter. All terms are lower cased.
+The `simple` analyzer breaks text into tokens at any non-letter character, such
+as numbers, spaces, hyphens and apostrophes, discards non-letter characters,
+and changes uppercase to lowercase.

-[float]
-=== Example output
+[[analysis-simple-analyzer-ex]]
+==== Example

 [source,console]
-----------------------------
+----
 POST _analyze
 {
   "analyzer": "simple",
   "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
 }
-----------------------------
-
-/////////////////////
+----

+////
 [source,console-result]
-----------------------------
+----
 {
   "tokens": [
     {
@@ -104,52 +104,47 @@ POST _analyze
     }
   ]
 }
-----------------------------
-
-/////////////////////
-
+----
+////

-The above sentence would produce the following terms:
+The `simple` analyzer parses the sentence and produces the following
+tokens:

 [source,text]
---------------------------
+----
 [ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ]
---------------------------
+----

-[float]
-=== Configuration
+[[analysis-simple-analyzer-definition]]
+==== Definition

-The `simple` analyzer is not configurable.
-
-[float]
-=== Definition
-
-The `simple` analzyer consists of:
+The `simple` analyzer is defined by one tokenizer:

 Tokenizer::
-* <<analysis-lowercase-tokenizer,Lower Case Tokenizer>>
+* <<analysis-lowercase-tokenizer, Lowercase Tokenizer>>
+
+[[analysis-simple-analyzer-customize]]
+==== Customize

-If you need to customize the `simple` analyzer then you need to recreate
-it as a `custom` analyzer and modify it, usually by adding token filters.
-This would recreate the built-in `simple` analyzer and you can use it as
-a starting point for further customization:
+To customize the `simple` analyzer, duplicate it to create the basis for
+a custom analyzer. This custom analyzer can be modified as required, usually by
+adding token filters.

 [source,console]
----------------------------------------------------
-PUT /simple_example
+----
+PUT /my_index
 {
   "settings": {
     "analysis": {
       "analyzer": {
-        "rebuilt_simple": {
+        "my_custom_simple_analyzer": {
           "tokenizer": "lowercase",
-          "filter": [ <1>
+          "filter": [ <1>
           ]
         }
       }
     }
   }
 }
----------------------------------------------------
-// TEST[s/\n$/\nstartyaml\n - compare_analyzers: {index: simple_example, first: simple, second: rebuilt_simple}\nendyaml\n/]
-<1> You'd add any token filters here.
+----
+<1> Add token filters here.
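As a reviewer's aside, the behavior the revised text describes (split on any non-letter character, discard the separators, lowercase the rest) can be sketched in Python. This is an illustrative approximation only, not Elasticsearch's implementation: the real Lowercase Tokenizer recognizes Unicode letters, while this sketch uses an ASCII letter class for simplicity.

```python
import re

def simple_analyze(text):
    """Approximate the `simple` analyzer: split on runs of non-letter
    characters, drop the separators, and lowercase each token.
    (ASCII-only sketch; Elasticsearch uses Unicode letter classes.)"""
    return [t.lower() for t in re.split(r"[^A-Za-z]+", text) if t]

tokens = simple_analyze("The 2 QUICK Brown-Foxes jumped over the lazy dog's bone.")
print(tokens)
# ['the', 'quick', 'brown', 'foxes', 'jumped', 'over',
#  'the', 'lazy', 'dog', 's', 'bone']
```

Note how `2`, the hyphen, and the apostrophe all act as token boundaries and are discarded, matching the token list shown in the example output above.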