Return cursor id before row or send cursor as header in any format of sql response #31819
Comments
Pinging @elastic/es-search-aggs |
Could you explain why you'd prefer having the cursor before the columns? Can you just add [...]? The header option isn't viable, since the cursor can grow in size and exceed the maximum header limit (see #16993). |
I can open the index request in a separate thread and stream data from the response HTTP entity into the request. This avoids unnecessary memory allocation: only a few reusable buffers are needed to queue entries between the response and the request.
It could be just a boolean header indicating whether this response is the last one for the current cursor (i.e. whether there will be a cursor field at the end). |
I don't think we can do this consistently. It is possible in some queries for us to know that there won't be other batches but for others we can't know until we pull that batch. Since we don't store state in SQL and we don't want to store state in SQL we'd only ever be able to tell you when we're sure that there isn't another row. But if we tell you there is another row there might not be one.
I don't think this'll work in all cases. You might be better off making all of the bulk writes to Elasticsearch asynchronously and adding |
Thanks, that is important. It would be good to update the ?refresh and _bulk documentation.
I'm not asking to change any system behaviour. I accept the "But if we tell you there is another row there might not be one" behaviour, because if you say there is no other row, then there is no other row. And if I get this information before fully parsing 1000 records, I can use it for some optimisations. An additional header is better than changing the JSON field order, because JSON semantics ignore field order and it is bad practice to rely on JSON field order in application logic. |
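For anyone who needs this today, a client-side look-ahead can approximate the "last batch" signal: fetch the next page before writing the current one, and only treat a batch as final once the following page comes back empty or without a cursor. The sketch below is a rough illustration under several assumptions, not an official client: `post` is a hypothetical transport callable (path, body) returning parsed JSON, `copy_sql_to_index` is an invented name, the SQL endpoint is assumed to be `/_xpack/sql`, and the bulk action lines omit `_type`, which older versions may require.

```python
import json

def copy_sql_to_index(post, query, dest_index, sql_path="/_xpack/sql"):
    """Page through a SQL cursor and bulk-index each page.

    `post(path, body)` is a hypothetical transport returning parsed JSON.
    Only the final bulk request uses refresh=wait_for.
    Returns the number of bulk requests sent.
    """
    def bulk_body(columns, rows):
        # Build an NDJSON bulk body: one action line plus one source line per row.
        names = [c["name"] for c in columns]
        lines = []
        for row in rows:
            lines.append(json.dumps({"index": {"_index": dest_index}}))
            lines.append(json.dumps(dict(zip(names, row))))
        return "\n".join(lines) + "\n"

    page = post(sql_path, {"query": query})
    columns = page["columns"]          # column metadata from the first page
    pending = page.get("rows", [])
    cursor = page.get("cursor")
    bulks = 0

    while pending:
        # Look ahead: a cursor may still lead to an empty page, so keep
        # pulling until rows arrive or the cursor runs out.
        next_rows = []
        while cursor and not next_rows:
            nxt = post(sql_path, {"cursor": cursor})
            next_rows = nxt.get("rows", [])
            cursor = nxt.get("cursor")

        is_last = not next_rows
        path = "/_bulk?refresh=wait_for" if is_last else "/_bulk"
        post(path, bulk_body(columns, pending))
        bulks += 1
        pending = next_rows

    return bulks
```

The cost is holding one extra page in memory, so this does not give the fully streaming behaviour described above; it only avoids relying on field order or a new header.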
Only the shards that receive the bulk request will be affected by `refresh`. Imagine a `_bulk?refresh=wait_for` request with three documents in it that happen to be routed to different shards in an index with five shards. The request will only wait for those three shards to refresh. The other two shards that make up the index do not participate in the `_bulk` request at all. Relates to elastic#31819
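To make the shard point concrete, here is a toy sketch of how a bulk request touches only some shards. It assumes the simple documented routing idea (hash of the routing value modulo the number of primary shards) and uses `zlib.crc32` purely as a stand-in hash, so the shard numbers it produces will not match a real cluster; `shards_touched` is a hypothetical helper.

```python
import zlib

def shards_touched(doc_ids, num_primary_shards):
    """Return the set of shard numbers a bulk request with these ids would hit.

    crc32 is a stand-in for the real routing hash (an assumption made for
    illustration); only the modulo-by-shard-count idea is the point here.
    """
    return {
        zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards
        for doc_id in doc_ids
    }

touched = shards_touched(["doc-1", "doc-2", "doc-3"], 5)
idle = set(range(5)) - touched  # shards that refresh=wait_for would not wait on
```

With three documents and five shards, at most three shards are touched, so at least two never take part in the request.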
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Superseded by ES|QL |
In the current version, the _xpack SQL REST API sends the cursor field in the response after the rows, but in some cases it is useful to know whether this response is the last batch of data. For example, if I want to store the result in another index using the bulk API, I would like to set refresh=wait_for only for the last batch.
Please send the cursor before the rows, or send a boolean header indicating whether the cursor is exhausted, for all response formats.