SQL: Fix issues with format=txt when paging through result sets and in mixed node environments (#83833)
Resolves #83581
Resolves #83788
SQL REST requests using `format=txt` stand out from the other formats because the cursor needs to carry the formatter state from the initial request to subsequent scroll requests. The state is needed to be able to format the subsequent pages with column widths that match the widths in the first page.
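To make the role of the formatter state concrete, here is a minimal sketch of the idea and not the plugin's actual formatter. The class `FixedWidthFormatterSketch`, its methods, the `|` separator, and the decision to truncate over-long values are all hypothetical choices made for illustration. It derives column widths from the first page and reuses them for later pages, which is why those widths have to travel with the cursor.

```java
import java.util.List;

/** Hypothetical illustration of width-preserving text formatting. */
final class FixedWidthFormatterSketch {
    // The "formatter state": one width per column, fixed after the first page.
    private final int[] widths;

    private FixedWidthFormatterSketch(int[] widths) {
        this.widths = widths;
    }

    /** Derives the column widths from the header and the rows of the first page. */
    static FixedWidthFormatterSketch fromFirstPage(List<String> header, List<List<String>> rows) {
        int[] widths = new int[header.size()];
        for (int i = 0; i < widths.length; i++) {
            widths[i] = header.get(i).length();
        }
        for (List<String> row : rows) {
            for (int i = 0; i < widths.length; i++) {
                widths[i] = Math.max(widths[i], row.get(i).length());
            }
        }
        return new FixedWidthFormatterSketch(widths);
    }

    /** Formats a row of any page using the stored widths (assumption: longer values are truncated). */
    String formatRow(List<String> row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < widths.length; i++) {
            if (i > 0) {
                sb.append('|');
            }
            String cell = row.get(i);
            if (cell.length() > widths[i]) {
                cell = cell.substring(0, widths[i]);
            }
            sb.append(cell);
            for (int p = cell.length(); p < widths[i]; p++) {
                sb.append(' ');
            }
        }
        return sb.toString();
    }
}
```

A scroll request that does not have the widths from the first page would have to recompute them from its own rows, and the columns of consecutive pages would no longer line up.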
Currently, this is solved by wrapping cursor objects together with the formatter state in `TextFormatterCursor`. Hence, a query might return a `ListCursor` and `TextFormat.PLAIN_TEXT` adds the necessary state when formatting the output. This approach is handy because a `TextFormatterCursor` is a `Cursor` and delegates calls to the wrapped cursor. Unfortunately, it also has some downsides that have been revealed when looking into #83581 and #83788:
- MediaType formatting is a concern of the REST layer of the plugin, which does not have access to the serialization logic usually accessible through the `NamedWriteablesRegistry`. Because formatting required (de)serializing the cursors for the wrapping, the `Cursor` deserialization could only use a subset of `NamedWriteable`s specific to cursors (namely the ones returned by `Cursors.getNamedWriteables`). This subset of writeables is not enough to deserialize `CompositeAggCursor`, which led to the surprising design that `CompositeAggCursor` has a `nextQuery` member consisting of the serialized `SearchSourceBuilder`, whose deserialization is deferred until the call to `Cursor.nextPage`.
- `TextFormat.PLAIN_TEXT` had to deserialize the complete cursor to do its work, not just the information relevant to the formatter. Because of the `min_compatible_version` redirects that occur during rolling upgrades, an upgraded node may need to deserialize a cursor from a redirected SQL query that was written by an older node. Because we do not offer any form of bwc for cursors, this causes an error and leads to #83581.
In this PR, I propose to change the design such that the REST layer can add state to the cursor strings without having to read or write instances of `Cursor`. This makes it possible to address the issues described above and also made fixing #83788 very straightforward.
Attaching state to the cursor is achieved by extending the format of encoded cursors. Previously, a base64-encoded SQL cursor had the wire format `<version><zoneId><cursor>`. Now, there are two different cursor types:
* `<version><zoneId><CursorType.NO_STATE><cursor>` for most cases
* `<version><ZoneOffset.UTC><CursorType.WITH_STATE><formatterState><base64encodedCursor>` for txt format cursors. Note that `<ZoneOffset.UTC>` could be any zone id because consumers will use the one encoded in `<base64encodedCursor>`.
This serialization format ensures that every version of ES can read and check the cursor version and produce an appropriate error in case of a mismatch.
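The two layouts can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code in this PR: the class `CursorEnvelopeSketch`, its method names, the numeric type tags, and the use of `DataOutputStream` with an `int` version, a UTF string zone id, and an `int[]` of column widths as formatter state are all hypothetical stand-ins for the real stream classes. What it tries to show is that the outer layer can attach and detach state while treating the inner cursor as an opaque string, and that the version is always the first field read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;

/** Hypothetical sketch of a versioned cursor envelope with optional formatter state. */
final class CursorEnvelopeSketch {
    static final byte NO_STATE = 0;   // assumed tag values, for illustration only
    static final byte WITH_STATE = 1;

    /** Result of detaching state: widths is null when the cursor carried no state. */
    static final class Unwrapped {
        final int[] widths;
        final String innerCursor;

        Unwrapped(int[] widths, String innerCursor) {
            this.widths = widths;
            this.innerCursor = innerCursor;
        }
    }

    /** Layout: version, zoneId, NO_STATE, cursor payload. */
    static String encodePlain(int version, String zoneId, String payload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(version);
            out.writeUTF(zoneId);
            out.writeByte(NO_STATE);
            out.writeUTF(payload);
            out.flush();
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Layout: version, UTC placeholder, WITH_STATE, formatter state, inner cursor (kept opaque). */
    static String attachState(int version, int[] widths, String innerCursor) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(version);
            out.writeUTF("Z"); // placeholder zone; consumers use the one inside innerCursor
            out.writeByte(WITH_STATE);
            out.writeInt(widths.length);
            for (int w : widths) {
                out.writeInt(w);
            }
            out.writeUTF(innerCursor);
            out.flush();
            return Base64.getEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Checks the version first, then splits off the state without decoding the inner cursor. */
    static Unwrapped detachState(String encoded, int expectedVersion) {
        try {
            DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(Base64.getDecoder().decode(encoded)));
            int version = in.readInt();
            if (version != expectedVersion) {
                throw new IllegalArgumentException("Unsupported cursor version " + version);
            }
            in.readUTF(); // zone id, not needed at this layer
            if (in.readByte() != WITH_STATE) {
                return new Unwrapped(null, encoded); // nothing attached; pass through unchanged
            }
            int[] widths = new int[in.readInt()];
            for (int i = 0; i < widths.length; i++) {
                widths[i] = in.readInt();
            }
            return new Unwrapped(widths, in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the version comes first in both layouts, a node can reject a cursor from a different version before it touches the formatter state or the inner cursor.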