Commit 0839cc0

[DOCS] Backporting API ref reformatting for document APIs (#47631) (#47684)
* [DOCS] Backporting API ref reformatting for document APIs (#47631)
* [DOCS] Reformats bulk API. (#47479)
* Reformats bulk API.
* Update docs/reference/docs/bulk.asciidoc
  Co-Authored-By: James Rodewig <[email protected]>
* Reformats mget API (#47477)
* Reformats mget API
* Update docs/reference/docs/get.asciidoc
  Co-Authored-By: James Rodewig <[email protected]>
* Incorporated feedback.
* Reformats reindex API (#47483)
* Reformats reindex API
* Incorporated review feedback.
* Reformats term vectors APIs (#47484)
* Reformat termvectors APIs
* Reformats mtermvectors
* Apply suggestions from code review
  Co-Authored-By: James Rodewig <[email protected]>
* Incorporated review feedback.
* Fixed console snippets.
1 parent 6644e51 commit 0839cc0

7 files changed: +1274, -1147 lines changed

docs/reference/docs/bulk.asciidoc

+180-118
@@ -1,28 +1,37 @@
11
[[docs-bulk]]
22
=== Bulk API
3+
++++
4+
<titleabbrev>Bulk</titleabbrev>
5+
++++
36

4-
The bulk API makes it possible to perform many index/delete operations
5-
in a single API call. This can greatly increase the indexing speed.
7+
Performs multiple indexing or delete operations in a single API call.
8+
This reduces overhead and can greatly increase indexing speed.
69

7-
.Client support for bulk requests
8-
*********************************************
9-
10-
Some of the officially supported clients provide helpers to assist with
11-
bulk requests and reindexing of documents from one index to another:
10+
[source,console]
11+
--------------------------------------------------
12+
POST _bulk
13+
{ "index" : { "_index" : "test", "_id" : "1" } }
14+
{ "field1" : "value1" }
15+
{ "delete" : { "_index" : "test", "_id" : "2" } }
16+
{ "create" : { "_index" : "test", "_id" : "3" } }
17+
{ "field1" : "value3" }
18+
{ "update" : {"_id" : "1", "_index" : "test"} }
19+
{ "doc" : {"field2" : "value2"} }
20+
--------------------------------------------------
1221

13-
Perl::
22+
[[docs-bulk-api-request]]
23+
==== {api-request-title}
1424

15-
See https://metacpan.org/pod/Search::Elasticsearch::Client::5_0::Bulk[Search::Elasticsearch::Client::5_0::Bulk]
16-
and https://metacpan.org/pod/Search::Elasticsearch::Client::5_0::Scroll[Search::Elasticsearch::Client::5_0::Scroll]
25+
`POST /_bulk`
1726

18-
Python::
27+
`POST /<index>/_bulk`
1928

20-
See http://elasticsearch-py.readthedocs.org/en/master/helpers.html[elasticsearch.helpers.*]
29+
[[docs-bulk-api-desc]]
30+
==== {api-description-title}
2131

22-
*********************************************
32+
Provides a way to perform multiple `index`, `create`, `delete`, and `update` actions in a single request.
2333

24-
The REST API endpoint is `/_bulk`, and it expects the following newline delimited JSON
25-
(NDJSON) structure:
34+
The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:
2635

2736
[source,js]
2837
--------------------------------------------------
@@ -36,19 +45,70 @@ optional_source\n
3645
--------------------------------------------------
3746
// NOTCONSOLE
3847

39-
*NOTE*: The final line of data must end with a newline character `\n`. Each newline character
40-
may be preceded by a carriage return `\r`. When sending requests to this endpoint the
41-
`Content-Type` header should be set to `application/x-ndjson`.
48+
The `index` and `create` actions expect a source on the next line,
49+
and have the same semantics as the `op_type` parameter in the standard index API:
50+
create fails if a document with the same ID already exists in the index,
51+
index adds or replaces a document as necessary.
52+
53+
`update` expects that the partial doc, upsert,
54+
and script and its options are specified on the next line.
55+
56+
`delete` does not expect a source on the next line and
57+
has the same semantics as the standard delete API.
58+
59+
[NOTE]
60+
====
61+
The final line of data must end with a newline character `\n`.
62+
Each newline character may be preceded by a carriage return `\r`.
63+
When sending requests to the `_bulk` endpoint,
64+
the `Content-Type` header should be set to `application/x-ndjson`.
65+
====
66+
67+
Because this format uses literal `\n`'s as delimiters,
68+
make sure that the JSON actions and sources are not pretty printed.
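
As a minimal sketch of the framing rules above (one compact JSON object per line, and a newline after the final line), the following Python helper serializes a list of operations into an NDJSON body. The function name `build_bulk_body` and the `(action, metadata, source)` tuple layout are assumptions made for illustration; they are not part of any Elasticsearch client.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) tuples into an NDJSON bulk body.

    `source` is None for actions that take no source line, such as `delete`.
    """
    lines = []
    for action, metadata, source in operations:
        # Compact separators and no indent keep each JSON object on one line.
        lines.append(json.dumps({action: metadata}, separators=(",", ":")))
        if source is not None:
            lines.append(json.dumps(source, separators=(",", ":")))
    # The final line of data must also end with a newline character.
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("index", {"_index": "test", "_id": "1"}, {"field1": "value1"}),
    ("delete", {"_index": "test", "_id": "2"}, None),
    ("update", {"_index": "test", "_id": "1"}, {"doc": {"field2": "value2"}}),
])
```

The resulting string would be sent as the request body with the `Content-Type` header set to `application/x-ndjson`.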
69+
70+
If you specify an index in the request URI,
71+
it is used for any actions that don't explicitly specify an index.
72+
73+
A note on the format: The idea here is to make processing of this as
74+
fast as possible. As some of the actions are redirected to other
75+
shards on other nodes, only `action_meta_data` is parsed on the
76+
receiving node side.
77+
78+
Client libraries using this protocol should try and strive to do
79+
something similar on the client side, and reduce buffering as much as
80+
possible.
81+
82+
The response to a bulk action is a large JSON structure with
83+
the individual results of each action performed,
84+
in the same order as the actions that appeared in the request.
85+
The failure of a single action does not affect the remaining actions.
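
Because results come back in request order and one failed action does not stop the others, a caller usually has to walk the `items` array to find what failed. The sketch below assumes that a failed item carries an `error` object; `failed_items` is a hypothetical helper, and the sample response is hand-written for illustration rather than captured from a server.

```python
def failed_items(response):
    """Return (position, action, result) for each failed item in a bulk response."""
    failures = []
    for position, item in enumerate(response.get("items", [])):
        # Each item is a single-key object keyed by the action name.
        for action, result in item.items():
            if "error" in result:
                failures.append((position, action, result))
    return failures


# Hand-written sample shaped like a bulk response (assumption, not real output).
sample = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "test", "_id": "1", "status": 201}},
        {
            "create": {
                "_index": "test",
                "_id": "3",
                "status": 409,
                "error": {"type": "version_conflict_engine_exception"},
            }
        },
        {"delete": {"_index": "test", "_id": "2", "status": 200}},
    ],
}
failures = failed_items(sample)
```

The `position` value maps each failure back to the action at the same position in the request, which is what makes a targeted retry possible.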
86+
87+
There is no "correct" number of actions to perform in a single bulk request.
88+
Experiment with different settings to find the optimal size for your particular workload.
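
One way to experiment with request size is to split the operations into fixed-size batches and time each setting. The helper below is a rough sketch; `chunked` and the batch size of 2 are illustrative assumptions. It batches whole operations rather than raw lines, so an action is never separated from its source line.

```python
def chunked(operations, size):
    """Yield successive batches of at most `size` operations."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(operations), size):
        yield operations[start:start + size]


# Five placeholder operations split into batches of two.
batches = list(chunked(["op1", "op2", "op3", "op4", "op5"], 2))
```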
89+
90+
When using the HTTP API, make sure that the client does not send HTTP chunks,
91+
as this will slow things down.
92+
93+
[float]
94+
[[bulk-clients]]
95+
===== Client support for bulk requests
96+
97+
Some of the officially supported clients provide helpers to assist with
98+
bulk requests and reindexing of documents from one index to another:
99+
100+
Perl::
101+
102+
See https://metacpan.org/pod/Search::Elasticsearch::Client::5_0::Bulk[Search::Elasticsearch::Client::5_0::Bulk]
103+
and https://metacpan.org/pod/Search::Elasticsearch::Client::5_0::Scroll[Search::Elasticsearch::Client::5_0::Scroll]
104+
105+
Python::
106+
107+
See http://elasticsearch-py.readthedocs.org/en/master/helpers.html[elasticsearch.helpers.*]
42108

43-
The possible actions are `index`, `create`, `delete`, and `update`.
44-
`index` and `create` expect a source on the next
45-
line, and have the same semantics as the `op_type` parameter to the
46-
standard index API (i.e. create will fail if a document with the same
47-
index exists already, whereas index will add or replace a
48-
document as necessary). `delete` does not expect a source on the
49-
following line, and has the same semantics as the standard delete API.
50-
`update` expects that the partial doc, upsert and script and its options
51-
are specified on the next line.
109+
[float]
110+
[[bulk-curl]]
111+
===== Submitting bulk requests with cURL
52112

53113
If you're providing text file input to `curl`, you *must* use the
54114
`--data-binary` flag instead of plain `-d`. The latter doesn't preserve
@@ -65,9 +125,97 @@ $ curl -s -H "Content-Type: application/x-ndjson" -XPOST localhost:9200/_bulk --
65125
// NOTCONSOLE
66126
// Not converting to console because this shows how curl works
67127

68-
Because this format uses literal `\n`'s as delimiters, please be sure
69-
that the JSON actions and sources are not pretty printed. Here is an
70-
example of a correct sequence of bulk commands:
128+
[float]
129+
[[bulk-optimistic-concurrency-control]]
130+
===== Optimistic Concurrency Control
131+
132+
Each `index` and `delete` action within a bulk API call may include the
133+
`if_seq_no` and `if_primary_term` parameters in their respective action
134+
and meta data lines. The `if_seq_no` and `if_primary_term` parameters control
135+
how operations are executed, based on the last modification to existing
136+
documents. See <<optimistic-concurrency-control>> for more details.
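
To illustrate where these parameters go, the sketch below builds the action and metadata line for a conditional `index` action. The helper name `conditional_index_action` and the sample values are assumptions for illustration; only the `if_seq_no` and `if_primary_term` parameter names come from the text above.

```python
import json


def conditional_index_action(index, doc_id, seq_no, primary_term):
    """Build the action line for an index action guarded by seq_no and primary term."""
    metadata = {
        "_index": index,
        "_id": doc_id,
        # These sit in the action and metadata line, not in the source line.
        "if_seq_no": seq_no,
        "if_primary_term": primary_term,
    }
    return json.dumps({"index": metadata})


action_line = conditional_index_action("test", "1", 3, 1)
```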
137+
138+
139+
[float]
140+
[[bulk-versioning]]
141+
===== Versioning
142+
143+
Each bulk item can include the version value using the
144+
`version` field. It automatically follows the behavior of the
145+
index / delete operation based on the `_version` mapping. It also
146+
supports the `version_type` (see <<index-versioning, versioning>>).
147+
148+
[float]
149+
[[bulk-routing]]
150+
===== Routing
151+
152+
Each bulk item can include the routing value using the
153+
`routing` field. It automatically follows the behavior of the
154+
index / delete operation based on the `_routing` mapping.
155+
156+
[float]
157+
[[bulk-wait-for-active-shards]]
158+
===== Wait For Active Shards
159+
160+
When making bulk calls, you can set the `wait_for_active_shards`
161+
parameter to require a minimum number of shard copies to be active
162+
before starting to process the bulk request. See
163+
<<index-wait-for-active-shards,here>> for further details and a usage
164+
example.
165+
166+
[float]
167+
[[bulk-refresh]]
168+
===== Refresh
169+
170+
Control when the changes made by this request are visible to search. See
171+
<<docs-refresh,refresh>>.
172+
173+
NOTE: Only the shards that receive the bulk request will be affected by
174+
`refresh`. Imagine a `_bulk?refresh=wait_for` request with three
175+
documents in it that happen to be routed to different shards in an index
176+
with five shards. The request will only wait for those three shards to
177+
refresh. The other two shards that make up the index do not
178+
participate in the `_bulk` request at all.
179+
180+
[float]
181+
[[bulk-security]]
182+
===== Security
183+
184+
See <<url-access-control>>.
185+
186+
[float]
187+
[[bulk-partial-responses]]
188+
===== Partial responses
189+
To ensure fast responses, the bulk API will respond with partial results if one or more shards fail.
190+
See <<shard-failures, Shard failures>> for more information.
191+
192+
[[docs-bulk-api-path-params]]
193+
==== {api-path-parms-title}
194+
195+
`<index>`::
196+
(Optional, string) Name of the index to perform the bulk actions against.
197+
198+
[[docs-bulk-api-query-params]]
199+
==== {api-query-parms-title}
200+
201+
include::{docdir}/rest-api/common-parms.asciidoc[tag=pipeline]
202+
203+
include::{docdir}/rest-api/common-parms.asciidoc[tag=refresh]
204+
205+
include::{docdir}/rest-api/common-parms.asciidoc[tag=routing]
206+
207+
include::{docdir}/rest-api/common-parms.asciidoc[tag=source]
208+
209+
include::{docdir}/rest-api/common-parms.asciidoc[tag=source_excludes]
210+
211+
include::{docdir}/rest-api/common-parms.asciidoc[tag=source_includes]
212+
213+
include::{docdir}/rest-api/common-parms.asciidoc[tag=timeout]
214+
215+
include::{docdir}/rest-api/common-parms.asciidoc[tag=wait_for_active_shards]
216+
217+
[[docs-bulk-api-example]]
218+
==== {api-examples-title}
71219

72220
[source,js]
73221
--------------------------------------------------
@@ -82,7 +230,7 @@ POST _bulk
82230
--------------------------------------------------
83231
// CONSOLE
84232

85-
The result of this bulk operation is:
233+
The API returns the following result:
86234

87235
[source,js]
88236
--------------------------------------------------
@@ -172,85 +320,9 @@ The result of this bulk operation is:
172320
// TESTRESPONSE[s/"_seq_no" : 3/"_seq_no" : $body.items.3.update._seq_no/]
173321
// TESTRESPONSE[s/"_primary_term" : 4/"_primary_term" : $body.items.3.update._primary_term/]
174322

175-
The endpoints are `/_bulk` and `/{index}/_bulk`. When the index is provided, it
176-
will be used by default on bulk items that don't provide it explicitly.
177-
178-
A note on the format. The idea here is to make processing of this as
179-
fast as possible. As some of the actions will be redirected to other
180-
shards on other nodes, only `action_meta_data` is parsed on the
181-
receiving node side.
182-
183-
Client libraries using this protocol should try and strive to do
184-
something similar on the client side, and reduce buffering as much as
185-
possible.
186-
187-
The response to a bulk action is a large JSON structure with the individual
188-
results of each action that was performed in the same order as the actions that
189-
appeared in the request. The failure of a single action does not affect the
190-
remaining actions.
191-
192-
There is no "correct" number of actions to perform in a single bulk
193-
call. You should experiment with different settings to find the optimum
194-
size for your particular workload.
195-
196-
If using the HTTP API, make sure that the client does not send HTTP
197-
chunks, as this will slow things down.
198-
199-
[float]
200-
[[bulk-optimistic-concurrency-control]]
201-
==== Optimistic Concurrency Control
202-
203-
Each `index` and `delete` action within a bulk API call may include the
204-
`if_seq_no` and `if_primary_term` parameters in their respective action
205-
and meta data lines. The `if_seq_no` and `if_primary_term` parameters control
206-
how operations are executed, based on the last modification to existing
207-
documents. See <<optimistic-concurrency-control>> for more details.
208-
209-
210-
[float]
211-
[[bulk-versioning]]
212-
==== Versioning
213-
214-
Each bulk item can include the version value using the
215-
`version` field. It automatically follows the behavior of the
216-
index / delete operation based on the `_version` mapping. It also
217-
support the `version_type` (see <<index-versioning, versioning>>).
218-
219-
[float]
220-
[[bulk-routing]]
221-
==== Routing
222-
223-
Each bulk item can include the routing value using the
224-
`routing` field. It automatically follows the behavior of the
225-
index / delete operation based on the `_routing` mapping.
226-
227-
[float]
228-
[[bulk-wait-for-active-shards]]
229-
==== Wait For Active Shards
230-
231-
When making bulk calls, you can set the `wait_for_active_shards`
232-
parameter to require a minimum number of shard copies to be active
233-
before starting to process the bulk request. See
234-
<<index-wait-for-active-shards,here>> for further details and a usage
235-
example.
236-
237-
[float]
238-
[[bulk-refresh]]
239-
==== Refresh
240-
241-
Control when the changes made by this request are visible to search. See
242-
<<docs-refresh,refresh>>.
243-
244-
NOTE: Only the shards that receive the bulk request will be affected by
245-
`refresh`. Imagine a `_bulk?refresh=wait_for` request with three
246-
documents in it that happen to be routed to different shards in an index
247-
with five shards. The request will only wait for those three shards to
248-
refresh. The other two shards that make up the index do not
249-
participate in the `_bulk` request at all.
250-
251323
[float]
252324
[[bulk-update]]
253-
==== Update
325+
===== Bulk update example
254326

255327
When using the `update` action, `retry_on_conflict` can be used as a field in
256328
the action itself (not in the extra payload line), to specify how many
@@ -278,13 +350,3 @@ POST _bulk
278350
// CONSOLE
279351
// TEST[continued]
280352

281-
[float]
282-
[[bulk-security]]
283-
==== Security
284-
285-
See <<url-access-control>>.
286-
287-
[float]
288-
[[bulk-partial-responses]]
289-
==== Partial responses
290-
To ensure fast responses, the bulk API will respond with partial results if one or more shards fail. See <<shard-failures, Shard failures>> for more information.

docs/reference/docs/get.asciidoc

+12-17
@@ -6,6 +6,12 @@
66

77
Retrieves the specified JSON document from an index.
88

9+
[source,console]
10+
--------------------------------------------------
11+
GET twitter/_doc/0
12+
--------------------------------------------------
13+
// TEST[setup:twitter]
14+
915
[[docs-get-api-request]]
1016
==== {api-request-title}
1117

@@ -156,32 +162,21 @@ deleted documents in the background as you continue to index more data.
156162
[[docs-get-api-query-params]]
157163
==== {api-query-parms-title}
158164

159-
`preference`::
160-
(Optional, string) Specify the node or shard the operation should
161-
be performed on (default: random).
165+
include::{docdir}/rest-api/common-parms.asciidoc[tag=preference]
162166

163-
`realtime`::
164-
(Optional, boolean) Set to `false` to disable real time GET
165-
(default: `true`). See <<realtime>>.
167+
include::{docdir}/rest-api/common-parms.asciidoc[tag=realtime]
166168

167169
include::{docdir}/rest-api/common-parms.asciidoc[tag=refresh]
168170

169171
include::{docdir}/rest-api/common-parms.asciidoc[tag=routing]
170172

171-
`stored_fields`::
172-
(Optional, boolean) Set to `true` to retrieve the document fields stored in the
173-
index rather than the document `_source` (default: `false`).
173+
include::{docdir}/rest-api/common-parms.asciidoc[tag=stored_fields]
174174

175-
`_source`::
176-
(Optional, list) Set to `false` to disable source retrieval (default: `true`).
177-
You can also specify a comma-separated list of the fields
178-
you want to retrieve.
175+
include::{docdir}/rest-api/common-parms.asciidoc[tag=source]
179176

180-
`_source_excludes`::
181-
(Optional, list) Specify the source fields you want to exclude.
177+
include::{docdir}/rest-api/common-parms.asciidoc[tag=source_excludes]
182178

183-
`_source_includes`::
184-
(Optional, list) Specify the source fields you want to retrieve.
179+
include::{docdir}/rest-api/common-parms.asciidoc[tag=source_includes]
185180

186181
include::{docdir}/rest-api/common-parms.asciidoc[tag=doc-version]
187182
