Skip to content

SQL: incorrect column resolution from parameter #56013

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bpintea opened this issue Apr 30, 2020 · 1 comment · Fixed by #63710
Closed

SQL: incorrect column resolution from parameter #56013

bpintea opened this issue Apr 30, 2020 · 1 comment · Fixed by #63710
Labels
:Analytics/SQL SQL querying >bug Team:QL (Deprecated) Meta label for query languages team

Comments

@bpintea
Copy link
Contributor

bpintea commented Apr 30, 2020

POST _sql
{
  "query": "SELECT ?, ? FROM library",
  "params": [
    {
      "type": "INTEGER",
      "value": 1
    },
    {
      "type": "INTEGER",
      "value": 2
    }
  ],
  "mode": "jdbc",
  "version": "7.7.0"
}

will fail with Unknown output attribute ?{r}#9796.

Leaving the index out or changing the parameter type (i.e. no type repetition in the list) will work as expected.

Stacktrace:

    java.sql.SQLException: Server encountered an error [Unknown output attribute ?{r}#27]. [org.elasticsearch.xpack.sql.SqlIllegalArgumentException: Unknown output attribute ?{r}#27
        at org.elasticsearch.xpack.sql.querydsl.container.QueryContainer.asFieldExtraction(QueryContainer.java:478)
        at org.elasticsearch.xpack.sql.querydsl.container.QueryContainer.addColumn(QueryContainer.java:445)
        at org.elasticsearch.xpack.sql.planner.QueryFolder$PlanOutputToQueryRef.rule(QueryFolder.java:781)
        at org.elasticsearch.xpack.sql.planner.QueryFolder$PlanOutputToQueryRef.rule(QueryFolder.java:770)
        at org.elasticsearch.xpack.ql.tree.Node.lambda$transformUp$11(Node.java:197)
        at org.elasticsearch.xpack.ql.tree.Node.transformUp(Node.java:191)
        at org.elasticsearch.xpack.ql.tree.Node.transformUp(Node.java:197)
        at org.elasticsearch.xpack.sql.planner.QueryFolder$FoldingRule.apply(QueryFolder.java:875)
        at org.elasticsearch.xpack.sql.planner.QueryFolder$FoldingRule.apply(QueryFolder.java:871)
        at org.elasticsearch.xpack.ql.rule.RuleExecutor$Transformation.<init>(RuleExecutor.java:82)
        at org.elasticsearch.xpack.ql.rule.RuleExecutor.executeWithInfo(RuleExecutor.java:158)
        at org.elasticsearch.xpack.ql.rule.RuleExecutor.execute(RuleExecutor.java:130)
        at org.elasticsearch.xpack.sql.planner.QueryFolder.fold(QueryFolder.java:111)
        at org.elasticsearch.xpack.sql.planner.Planner.foldPlan(Planner.java:38)
        at org.elasticsearch.xpack.sql.planner.Planner.plan(Planner.java:28)
        at org.elasticsearch.xpack.sql.session.SqlSession.lambda$physicalPlan$4(SqlSession.java:160)
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63)
        at org.elasticsearch.xpack.sql.session.SqlSession.lambda$optimizedPlan$3(SqlSession.java:156)
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63)
        at org.elasticsearch.xpack.sql.session.SqlSession.lambda$preAnalyze$2(SqlSession.java:144)
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63)
        at org.elasticsearch.xpack.ql.index.IndexResolver.lambda$resolveAsMergedMapping$3(IndexResolver.java:288)
        at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63)
        at org.elasticsearch.client.node.NodeClient.lambda$executeLocally$0(NodeClient.java:91)
        at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:158)
        at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:151)
        at org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction.lambda$doExecute$0(TransportFieldCapabilitiesAction.java:88)
        at org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:101)
        at org.elasticsearch.action.fieldcaps.TransportFieldCapabilitiesAction$1.onResponse(TransportFieldCapabilitiesAction.java:97)
        at org.elasticsearch.client.node.NodeClient.lambda$executeLocally$0(NodeClient.java:91)
        at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:158)
        at org.elasticsearch.tasks.TaskManager$1.onResponse(TaskManager.java:151)
        at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:261)
        at org.elasticsearch.action.support.single.shard.TransportSingleShardAction$AsyncSingleAction$2.handleResponse(TransportSingleShardAction.java:247)
        at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1103)
        at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1181)
        at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1161)
        at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54)
        at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:47)
        at org.elasticsearch.action.support.ChannelActionListener.onResponse(ChannelActionListener.java:30)
        at org.elasticsearch.action.ActionRunnable.lambda$supply$0(ActionRunnable.java:58)
        at org.elasticsearch.action.ActionRunnable$2.doRun(ActionRunnable.java:73)
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:691)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
        at java.base/java.lang.Thread.run(Thread.java:832)
@rjernst rjernst added the Team:QL (Deprecated) Meta label for query languages team label May 4, 2020
palesz pushed a commit to palesz/elasticsearch that referenced this issue Oct 15, 2020
Parameters in the SQL query marked as `?` turn into an `UnresolvedAlias`
in the parsed tree, aliasing (without a name) the `Literal` representing
the passed in parameter value.

For example in the query `SELECT ? FROM test` with a single integer
(value = 100) parameter the ? turns into an `UnresolvedAlias
(child=Literal(100))` or if we try to print it back into SQL, `SELECT
100 as ? FROM test`.

A problem arises when multiple `?` is in the query, since the `SELECT ?,
? FROM test` query with integer parameters (value=100, 200) will turn
into `SELECT 100 as ?, 200 as ? FROM test`. You guessed it correctly,
name conflict is the underlying issue.

The `UnresolvedAlias`es will be assigned the same name and are turned
into `Alias(name="?")` objects by the `Analyzer`.
The problem ultimately comes down to the fact that `Alias`es in the
`QueryFolder`. The `Alias`es eventually put into an `AttributeMap`,
but not directly. First the collection of `Alias`es gets turned into a
`java.util.LinkedHashMap<Attribute, Expression>` which later get's
turned an `AttributeMap`. The key in the map gets created using `Alias
.toAttribute()`, but the `LinkedHashMap<>` uses the `Attribute.equals()`
 and `Attribute.hashCode()` methods which only take into the account the
`name` field of the attribute vs `Attribute.semanticEquals()` and
`Attribute.semanticHash()` that take into account the `name` and `id`
fields (used by the `AttributeMap`).
The name-based equality check in the `LinkedHashMap` essentially
de-duplicates the `Alias`es based on the name and that is why one of the
attributes cannot be resolved (resolution uses both `name` and `id`).

This fix deprecates the `AttributeMap(Map)` constructor, makes sure that
the `AttributeMap`s are populated from `Stream`s or `Iterable`s
directly, so no deduplication happens based on the names of the
`Attribute`s.

Note: The `SELECT 100 as ?, 200 as ? FROM test` query will result in two
columns named as `?` in the results. Alternatively I could have just
generated different names for the `UnresolvedAlias` objects, but that
would just hide the above mentioned bug in the code.

Fix elastic#56013
palesz pushed a commit to palesz/elasticsearch that referenced this issue Oct 27, 2020
For example in the query `SELECT ? FROM test` with a single integer
(value = 100) parameter the ? turns into an `UnresolvedAlias
(child=Literal(100))` or if we try to print it back into SQL, `SELECT
100 as ? FROM test`.

A problem arises when multiple `?` is in the query with the same type.
For example the `SELECT ?, ? FROM test` query with integer parameters
(value=100, 200) will turn into `SELECT 100 as ?, 200 as ? FROM test`.

This commit changes the way aliases are assigned: the first unaliased
parameter will stay `?`, but from the 2nd unaliased parameter literal
a counter will be added to the alias (`?2`, `?3`, ...).

Callout: This will introduce a change for the users: a query with
multiple unaliased parameters (but each parameter with different
types) successfully executed before and returned the results with
multiple `?` columns (although must have accessed via index from the
result set instead of alias).

Fix elastic#56013
palesz pushed a commit that referenced this issue Nov 4, 2020
* SQL: Fix incorrect parameter resolution

Summary of the issue and the root cause:

```
(1) SELECT 100, 100 -> success
(2) SELECT ?, ? (with params: 100, 100) -> success
(3) SELECT 100, 100 FROM test -> Unknown output attribute exception for the second 100
(4) SELECT ?, ? FROM test (params: 100, 100) -> Unknown output attribute exception for the second ?
(5) SELECT field1 as "x", field1 as "x" FROM test -> Unknown output attribute exception for the second "x"
```

There are two separate issues at play here:
1. Construction of `AttributeMap`s keeps only one of the `Attribute`s with the same name even if the `id`s are different (see the `AttributeMapTests` in this PR). This should be fixed no matter what, we should not overwrite attributes with one another during the construction of the `AttributeMap`.
2. The `id` on the `Alias`es is not the same in case the `Alias`es have the same `name` and same `child`

It was considered to simpy fix the second issue by just reassigning the same `id`s to the `Alias`es with the same name and child, but it would not solve the `unknown output attribute exception` (see notes below). This PR covers the fix for the first issue.

Fix #56013
palesz pushed a commit to palesz/elasticsearch that referenced this issue Nov 4, 2020
* SQL: Fix incorrect parameter resolution

Summary of the issue and the root cause:

```
(1) SELECT 100, 100 -> success
(2) SELECT ?, ? (with params: 100, 100) -> success
(3) SELECT 100, 100 FROM test -> Unknown output attribute exception for the second 100
(4) SELECT ?, ? FROM test (params: 100, 100) -> Unknown output attribute exception for the second ?
(5) SELECT field1 as "x", field1 as "x" FROM test -> Unknown output attribute exception for the second "x"
```

There are two separate issues at play here:
1. Construction of `AttributeMap`s keeps only one of the `Attribute`s with the same name even if the `id`s are different (see the `AttributeMapTests` in this PR). This should be fixed no matter what, we should not overwrite attributes with one another during the construction of the `AttributeMap`.
2. The `id` on the `Alias`es is not the same in case the `Alias`es have the same `name` and same `child`

It was considered to simpy fix the second issue by just reassigning the same `id`s to the `Alias`es with the same name and child, but it would not solve the `unknown output attribute exception` (see notes below). This PR covers the fix for the first issue.

Fix elastic#56013
palesz pushed a commit to palesz/elasticsearch that referenced this issue Nov 4, 2020
* SQL: Fix incorrect parameter resolution

Summary of the issue and the root cause:

```
(1) SELECT 100, 100 -> success
(2) SELECT ?, ? (with params: 100, 100) -> success
(3) SELECT 100, 100 FROM test -> Unknown output attribute exception for the second 100
(4) SELECT ?, ? FROM test (params: 100, 100) -> Unknown output attribute exception for the second ?
(5) SELECT field1 as "x", field1 as "x" FROM test -> Unknown output attribute exception for the second "x"
```

There are two separate issues at play here:
1. Construction of `AttributeMap`s keeps only one of the `Attribute`s with the same name even if the `id`s are different (see the `AttributeMapTests` in this PR). This should be fixed no matter what, we should not overwrite attributes with one another during the construction of the `AttributeMap`.
2. The `id` on the `Alias`es is not the same in case the `Alias`es have the same `name` and same `child`

It was considered to simpy fix the second issue by just reassigning the same `id`s to the `Alias`es with the same name and child, but it would not solve the `unknown output attribute exception` (see notes below). This PR covers the fix for the first issue.

Fix elastic#56013
palesz pushed a commit that referenced this issue Nov 5, 2020
Summary of the issue and the root cause:

```
(1) SELECT 100, 100 -> success
(2) SELECT ?, ? (with params: 100, 100) -> success
(3) SELECT 100, 100 FROM test -> Unknown output attribute exception for the second 100
(4) SELECT ?, ? FROM test (params: 100, 100) -> Unknown output attribute exception for the second ?
(5) SELECT field1 as "x", field1 as "x" FROM test -> Unknown output attribute exception for the second "x"
```

There are two separate issues at play here:
1. Construction of `AttributeMap`s keeps only one of the `Attribute`s with the same name even if the `id`s are different (see the `AttributeMapTests` in this PR). This should be fixed no matter what, we should not overwrite attributes with one another during the construction of the `AttributeMap`.
2. The `id` on the `Alias`es is not the same in case the `Alias`es have the same `name` and same `child`

It was considered to simpy fix the second issue by just reassigning the same `id`s to the `Alias`es with the same name and child, but it would not solve the `unknown output attribute exception` (see notes below). This PR covers the fix for the first issue.

Relates to #56013
palesz pushed a commit that referenced this issue Nov 5, 2020
Summary of the issue and the root cause:

```
(1) SELECT 100, 100 -> success
(2) SELECT ?, ? (with params: 100, 100) -> success
(3) SELECT 100, 100 FROM test -> Unknown output attribute exception for the second 100
(4) SELECT ?, ? FROM test (params: 100, 100) -> Unknown output attribute exception for the second ?
(5) SELECT field1 as "x", field1 as "x" FROM test -> Unknown output attribute exception for the second "x"
```

There are two separate issues at play here:
1. Construction of `AttributeMap`s keeps only one of the `Attribute`s with the same name even if the `id`s are different (see the `AttributeMapTests` in this PR). This should be fixed no matter what, we should not overwrite attributes with one another during the construction of the `AttributeMap`.
2. The `id` on the `Alias`es is not the same in case the `Alias`es have the same `name` and same `child`

It was considered to simpy fix the second issue by just reassigning the same `id`s to the `Alias`es with the same name and child, but it would not solve the `unknown output attribute exception` (see notes below). This PR covers the fix for the first issue.

Relates to #56013
@palesz
Copy link
Contributor

palesz commented Nov 5, 2020

Detailed description of the root cause can be found here: #63710 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/SQL SQL querying >bug Team:QL (Deprecated) Meta label for query languages team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants