Commit c38252d

clintongormley authored and areek committed
Documented the query cache module
Related to #7161 and #7167
1 parent 5710c22 commit c38252d

6 files changed

+233
-51
lines changed

docs/reference/index-modules.asciidoc

+2
@@ -72,6 +72,8 @@ include::index-modules/translog.asciidoc[]
7272

7373
include::index-modules/cache.asciidoc[]
7474

75+
include::index-modules/query-cache.asciidoc[]
76+
7577
include::index-modules/fielddata.asciidoc[]
7678

7779
include::index-modules/codec.asciidoc[]

docs/reference/index-modules/query-cache.asciidoc

+145
@@ -0,0 +1,145 @@
1+
[[index-modules-shard-query-cache]]
2+
== Shard query cache
3+
4+
coming[1.4.0]
5+
6+
When a search request is run against an index or against many indices, each
7+
involved shard executes the search locally and returns its local results to
8+
the _coordinating node_, which combines these shard-level results into a
9+
``global'' result set.
10+
11+
The shard-level query cache module caches the local results on each shard.
12+
This allows frequently used (and potentially heavy) search requests to return
13+
results almost instantly. The query cache is a very good fit for the logging
14+
use case, where only the most recent index is being actively updated --
15+
results from older indices will be served directly from the cache.
16+
17+
[IMPORTANT]
18+
==================================
19+
20+
For now, the query cache will only cache the results of search requests
21+
where <<count,`?search_type=count`>>, so it will not cache `hits`,
22+
but it will cache `hits.total`, <<search-aggregations,aggregations>>, and
23+
<<search-suggesters,suggestions>>.
24+
25+
Queries that use `now` (see <<date-math>>) cannot be cached.
26+
==================================
27+
28+
[float]
29+
=== Cache invalidation
30+
31+
The cache is smart -- it keeps the same _near real-time_ promise as uncached
32+
search.
33+
34+
Cached results are invalidated automatically whenever the shard refreshes, but
35+
only if the data in the shard has actually changed. In other words, you will
36+
always get the same results from the cache as you would for an uncached search
37+
request.
38+
39+
The longer the refresh interval, the longer cached entries will remain
40+
valid. If the cache is full, the least recently used cache keys will be
41+
evicted.
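
The invalidation and eviction behaviour described above can be modelled with a small sketch. This is a toy illustration, not Elasticsearch's implementation: the class `ToyShardQueryCache`, its `refresh` and `get` methods, and the `max_entries` limit (the real cache is sized by memory, not entry count) are all hypothetical names chosen for the example.

```python
from collections import OrderedDict


class ToyShardQueryCache:
    """Toy LRU cache of per-shard results, keyed by the request body."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def refresh(self, data_changed):
        # A refresh only invalidates cached results if the shard's data
        # actually changed; otherwise the entries stay valid.
        if data_changed:
            self.entries.clear()

    def get(self, body, compute):
        if body in self.entries:
            self.entries.move_to_end(body)  # mark as most recently used
            return self.entries[body]
        result = compute()  # cache miss: run the search on the shard
        self.entries[body] = result
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict least recently used
        return result
```

In this model, a refresh that saw no new data leaves every entry in place, which is why older, read-only indices in the logging use case keep serving results from the cache.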
42+
43+
The cache can be expired manually with the <<indices-clearcache,`clear-cache` API>>:
44+
45+
[source,json]
46+
------------------------
47+
curl -XPOST 'localhost:9200/kimchy,elasticsearch/_cache/clear?query_cache=true'
48+
------------------------
49+
50+
[float]
51+
=== Enabling caching by default
52+
53+
The cache is not enabled by default, but can be enabled when creating a new
54+
index as follows:
55+
56+
[source,json]
57+
-----------------------------
58+
curl -XPUT localhost:9200/my_index -d'
59+
{
60+
"settings": {
61+
"index.cache.query.enable": true
62+
}
63+
}
64+
'
65+
-----------------------------
66+
67+
It can also be enabled or disabled dynamically on an existing index with the
68+
<<indices-update-settings,`update-settings`>> API:
69+
70+
[source,json]
71+
-----------------------------
72+
curl -XPUT localhost:9200/my_index/_settings -d'
73+
{ "index.cache.query.enable": true }
74+
'
75+
-----------------------------
76+
77+
[float]
78+
=== Enabling caching per request
79+
80+
The `query_cache` query-string parameter can be used to enable or disable
81+
caching on a *per-query* basis. If set, it overrides the index-level setting:
82+
83+
[source,json]
84+
-----------------------------
85+
curl 'localhost:9200/my_index/_search?search_type=count&query_cache=true' -d'
86+
{
87+
"aggs": {
88+
"popular_colors": {
89+
"terms": {
90+
"field": "colors"
91+
}
92+
}
93+
}
94+
}
95+
'
96+
-----------------------------
97+
98+
IMPORTANT: If your query uses a script whose result is not deterministic (e.g.
99+
it uses a random function or references the current time) you should set the
100+
`query_cache` flag to `false` to disable caching for that request.
101+
102+
[float]
103+
=== Cache key
104+
105+
The whole JSON body is used as the cache key. This means that if the JSON
106+
changes -- for instance if keys are output in a different order -- then the
107+
cache key will change and the cached result will not be reused.
108+
109+
TIP: Most JSON libraries support a _canonical_ mode which ensures that JSON
110+
keys are always emitted in the same order. This canonical mode can be used in
111+
the application to ensure that a request is always serialized in the same way.
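
As a minimal sketch of the tip above, assuming a Python client, the standard `json` module can emit keys in sorted order so that two logically identical requests serialize to the same bytes. The helper name `canonical_body` and the sample request bodies are illustrative only.

```python
import json


def canonical_body(body):
    """Serialize a request body with sorted keys and fixed separators."""
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


# The same request, built with its top-level keys in a different order.
first = {
    "aggs": {"popular_colors": {"terms": {"field": "colors"}}},
    "query": {"match_all": {}},
}
second = {
    "query": {"match_all": {}},
    "aggs": {"popular_colors": {"terms": {"field": "colors"}}},
}

# Both bodies serialize to the same string, so they share one cache key.
print(canonical_body(first) == canonical_body(second))
```

Without `sort_keys=True`, the two dictionaries would serialize in insertion order and produce different request bodies.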
112+
113+
[float]
114+
=== Cache settings
115+
116+
The cache is managed at the node level, and has a default maximum size of `1%`
117+
of the heap. This can be changed in the `config/elasticsearch.yml` file with:
118+
119+
[source,yaml]
120+
--------------------------------
121+
indices.cache.query.size: 2%
122+
--------------------------------
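
To get a feel for what a percentage of the heap means in bytes, the arithmetic can be sketched as follows. The 4 GiB heap is an assumed example value, the helper `query_cache_bytes` is hypothetical, and the way Elasticsearch itself rounds the result is not specified here.

```python
def query_cache_bytes(heap_bytes, percent):
    """Return the given whole-number percentage of the heap, in bytes."""
    return heap_bytes * percent // 100


heap = 4 * 1024 ** 3  # assumed example: a 4 GiB heap

default_size = query_cache_bytes(heap, 1)  # default limit of 1%
doubled_size = query_cache_bytes(heap, 2)  # with indices.cache.query.size: 2%

print(default_size, doubled_size)
```

On the assumed 4 GiB heap, the default works out to roughly 41 MiB and the `2%` setting to roughly 82 MiB.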
123+
124+
Also, you can use the +indices.cache.query.expire+ setting to specify a TTL
125+
for cached results, but there should be no reason to do so. Remember that
126+
stale results are automatically invalidated when the index is refreshed. This
127+
setting is provided for completeness' sake only.
128+
129+
[float]
130+
=== Monitoring cache usage
131+
132+
The size of the cache (in bytes) and the number of evictions can be viewed
133+
by index, with the <<indices-stats,`indices-stats`>> API:
134+
135+
[source,json]
136+
------------------------
137+
curl 'localhost:9200/_stats/query_cache?pretty&human'
138+
------------------------
139+
140+
or by node with the <<cluster-nodes-stats,`nodes-stats`>> API:
141+
142+
[source,json]
143+
------------------------
144+
curl 'localhost:9200/_nodes/stats/indices/query_cache?pretty&human'
145+
------------------------

docs/reference/indices/clearcache.asciidoc

+3-3
@@ -9,9 +9,9 @@ associated with one ore more indices.
99
$ curl -XPOST 'http://localhost:9200/twitter/_cache/clear'
1010
--------------------------------------------------
1111

12-
The API, by default, will clear all caches. Specific caches can be
13-
cleaned explicitly by setting `filter`, `field_data` or `id_cache` to
14-
`true`.
12+
The API, by default, will clear all caches. Specific caches can be cleaned
13+
explicitly by setting `filter`, `field_data`, `query_cache` coming[1.4.0],
14+
or `id_cache` to `true`.
1515

1616
All caches relating to a specific field(s) can also be cleared by
1717
specifying `fields` parameter with a comma delimited list of the

docs/reference/indices/stats.asciidoc

+24-12
@@ -39,20 +39,32 @@ specified as well in the URI. Those stats can be any of:
3939
groups). The `groups` parameter accepts a comma separated list of group names.
4040
Use `_all` to return statistics for all groups.
4141

42-
`warmer`:: Warmer statistics.
43-
`merge`:: Merge statistics.
44-
`fielddata`:: Fielddata statistics.
45-
`flush`:: Flush statistics.
46-
`completion`:: Completion suggest statistics.
47-
`refresh`:: Refresh statistics.
48-
`suggest`:: Suggest statistics.
49-
50-
Some statistics allow per field granularity which accepts a list comma-separated list of included fields. By default all fields are included:
42+
`completion`:: Completion suggest statistics.
43+
`fielddata`:: Fielddata statistics.
44+
`flush`:: Flush statistics.
45+
`merge`:: Merge statistics.
46+
`query_cache`:: <<index-modules-shard-query-cache,Shard query cache>> statistics. coming[1.4.0]
47+
`refresh`:: Refresh statistics.
48+
`suggest`:: Suggest statistics.
49+
`warmer`:: Warmer statistics.
50+
51+
Some statistics allow per-field granularity, which accepts a
52+
comma-separated list of included fields. By default all fields are included:
5153

5254
[horizontal]
53-
`fields`:: List of fields to be included in the statistics. This is used as the default list unless a more specific field list is provided (see below).
54-
`completion_fields`:: List of fields to be included in the Completion Suggest statistics
55-
`fielddata_fields`:: List of fields to be included in the Fielddata statistics
55+
`fields`::
56+
57+
List of fields to be included in the statistics. This is used as the
58+
default list unless a more specific field list is provided (see below).
59+
60+
`completion_fields`::
61+
62+
List of fields to be included in the Completion Suggest statistics.
63+
64+
`fielddata_fields`::
65+
66+
List of fields to be included in the Fielddata statistics.
67+
5668

5769
Here are some samples:
5870

docs/reference/search/aggregations.asciidoc

+15-3
@@ -104,9 +104,9 @@ are being aggregated. The values are typically extracted from the fields of the
104104
can also be generated using scripts.
105105

106106
Numeric metrics aggregations are a special type of metrics aggregation which output numeric values. Some aggregations output
107-
a single numeric metric (e.g. `avg`) and are called `single-value numeric metrics aggregation`, others generate multiple
108-
metrics (e.g. `stats`) and are called `multi-value numeric metrics aggregation`. The distinction between single-value and
109-
multi-value numeric metrics aggregations plays a role when these aggregations serve as direct sub-aggregations of some
107+
a single numeric metric (e.g. `avg`) and are called `single-value numeric metrics aggregation`, others generate multiple
108+
metrics (e.g. `stats`) and are called `multi-value numeric metrics aggregation`. The distinction between single-value and
109+
multi-value numeric metrics aggregations plays a role when these aggregations serve as direct sub-aggregations of some
110110
bucket aggregations (some bucket aggregations enable you to sort the returned buckets based on the numeric metrics in each bucket).
111111

112112

@@ -125,6 +125,18 @@ aggregated for the buckets created by their "parent" bucket aggregation.
125125
There are different bucket aggregators, each with a different "bucketing" strategy. Some define a single bucket, some
126126
define fixed number of multiple buckets, and others dynamically create the buckets during the aggregation process.
127127

128+
[float]
129+
=== Caching heavy aggregations
130+
131+
coming[1.4.0]
132+
133+
Frequently used aggregations (e.g. for display on the home page of a website)
134+
can be cached for faster responses. These cached results are the same results
135+
that would be returned by an uncached aggregation -- you will never get stale
136+
results.
137+
138+
See <<index-modules-shard-query-cache>> for more details.
139+
128140

129141
include::aggregations/metrics.asciidoc[]
130142

docs/reference/search/request-body.asciidoc

+44-33
@@ -46,39 +46,50 @@ And here is a sample response:
4646
[float]
4747
=== Parameters
4848

49-
[cols="<,<",options="header",]
50-
|=======================================================================
51-
|Name |Description
52-
|`timeout` |A search timeout, bounding the search request to be executed
53-
within the specified time value and bail with the hits accumulated up to
54-
that point when expired. Defaults to no timeout. See <<time-units>>.
55-
56-
|`from` |The starting from index of the hits to return. Defaults to `0`.
57-
58-
|`size` |The number of hits to return. Defaults to `10`.
59-
60-
|`search_type` |The type of the search operation to perform. Can be
61-
`dfs_query_then_fetch`, `dfs_query_and_fetch`, `query_then_fetch`,
62-
`query_and_fetch`. Defaults to `query_then_fetch`. See
63-
<<search-request-search-type,_Search Type_>> for
64-
more details on the different types of search that can be performed.
65-
66-
|coming[1.4.0] `terminate_after` |The maximum number of documents to collect for
67-
each shard, upon reaching which the query execution will terminate early.
68-
If set, the response will have a boolean field `terminated_early` to
69-
indicate whether the query execution has actually terminated_early.
70-
Defaults to no terminate_after.
71-
|=======================================================================
72-
73-
Out of the above, the `search_type` is the one that can not be passed
74-
within the search request body, and in order to set it, it must be
75-
passed as a request REST parameter.
76-
77-
The rest of the search request should be passed within the body itself.
78-
The body content can also be passed as a REST parameter named `source`.
79-
80-
Both HTTP GET and HTTP POST can be used to execute search with body.
81-
Since not all clients support GET with body, POST is allowed as well.
49+
[horizontal]
50+
`timeout`::
51+
52+
A search timeout, bounding the search request to be executed within the
53+
specified time value and bail with the hits accumulated up to that point
54+
when expired. Defaults to no timeout. See <<time-units>>.
55+
56+
`from`::
57+
58+
The starting from index of the hits to return. Defaults to `0`.
59+
60+
`size`::
61+
62+
The number of hits to return. Defaults to `10`.
63+
64+
`search_type`::
65+
66+
The type of the search operation to perform. Can be
67+
`dfs_query_then_fetch`, `dfs_query_and_fetch`, `query_then_fetch`,
68+
`query_and_fetch`. Defaults to `query_then_fetch`. See
69+
<<search-request-search-type,_Search Type_>> for more.
70+
71+
`query_cache`::
72+
73+
coming[1.4.0] Set to `true` or `false` to enable or disable the caching
74+
of search results for requests where `?search_type=count`, i.e.
75+
aggregations and suggestions. See <<index-modules-shard-query-cache>>.
76+
77+
`terminate_after`::
78+
79+
coming[1.4.0] The maximum number of documents to collect for each shard,
80+
upon reaching which the query execution will terminate early. If set, the
81+
response will have a boolean field `terminated_early` to indicate whether
82+
the query execution has actually terminated early. Defaults to no
83+
terminate_after.
84+
85+
86+
Out of the above, the `search_type` and the `query_cache` must be passed as
87+
query-string parameters. The rest of the search request should be passed
88+
within the body itself. The body content can also be passed as a REST
89+
parameter named `source`.
90+
91+
Both HTTP GET and HTTP POST can be used to execute search with body. Since not
92+
all clients support GET with body, POST is allowed as well.
8293

8394

8495
include::request/query.asciidoc[]
