Regression in 8.6.0: Not returning empty arrays when source filtering inside them #109668

Closed
Wim-De-Clercq opened this issue Jun 13, 2024 · 6 comments · Fixed by #112250
Labels: >bug, priority:normal, :Search Foundations/Search, :Search/Search, Team:Search Foundations, Team:Search

Wim-De-Clercq commented Jun 13, 2024

Elasticsearch Version

from 8.6.0 until current

Installed Plugins

No response

Java Version

openjdk version "19.0.1" 2022-10-18

OS Version

Linux e3b996519639 6.5.0-35-generic #35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC

Problem Description

Exclude filtering causes empty arrays to disappear from the response entirely.

This is probably very similar to these issues:
#29622
#23796

But the thing is that this WAS working fine in 8.3.x, 8.4.x, 8.5.x, ...

8.6.0 broke this feature again.

Steps to Reproduce

setup data

PUT /test
PUT /test/_mapping
{
  "properties": {
    "myArray": {
      "properties": {
        "myField": {
          "type": "keyword"
        }
      }
    }
  }
}
PUT /test/_doc/1
{
  "myArray": [{"myField": "value", "other": "value"}]
}
PUT /test/_doc/2
{
  "myArray": []
}

search

POST /test/_search
{
  "_source": {
    "includes": ["myArray"],
    "excludes": ["myArray.myField"]
  }
}

result

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_id": "2",
        "_score": 1,
        "_source": {}
      },
      {
        "_index": "test",
        "_id": "1",
        "_score": 1,
        "_source": {
          "myArray": [
            {
              "other": "value"
            }
          ]
        }
      }
    ]
  }
}

The `_source` of document 2 should contain "myArray" as an empty array.
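
A quick way to spot affected hits on the client side is to scan a search response for documents whose `_source` lost the field. The following is only a sketch: `ids_missing_field` is a hypothetical helper, not part of any Elasticsearch client, and `response` is a trimmed copy of the result shown above.

```python
from typing import Any, Dict, List

# Trimmed copy of the search response reported above (only the fields used here).
response: Dict[str, Any] = {
    "hits": {
        "hits": [
            {"_index": "test", "_id": "2", "_source": {}},
            {"_index": "test", "_id": "1", "_source": {"myArray": [{"other": "value"}]}},
        ]
    }
}


def ids_missing_field(search_response: Dict[str, Any], field: str) -> List[str]:
    """Return the _id of every hit whose _source lacks the top-level field."""
    return [
        hit["_id"]
        for hit in search_response["hits"]["hits"]
        if field not in hit.get("_source", {})
    ]


# Document 2 was indexed with "myArray": [], yet the field is absent from its _source.
print(ids_missing_field(response, "myArray"))
```

With the response above, only document 2 is reported, which matches the regression described in this issue.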

Logs (if relevant)

No response

Wim-De-Clercq added the >bug and needs:triage labels Jun 13, 2024
Wim-De-Clercq changed the title from "Regression in 8.6: Not returning empty arrays when source filtering inside them" to "Regression in 8.6.0: Not returning empty arrays when source filtering inside them" Jun 13, 2024
jakelandis added the :Search/Search label and removed the needs:triage label Jun 13, 2024
elasticsearchmachine added the Team:Search label Jun 13, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

javanna added the priority:normal label Jun 18, 2024
martijnvg added the :StorageEngine/Mapping label and removed the :Search/Search and Team:Search labels Jun 18, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@lkts
Contributor

lkts commented Jun 18, 2024

It looks like this change in behaviour was caused by #91127.

@kved96

kved96 commented Aug 14, 2024

+1 to this. We rely on the old behavior quite heavily in all of our clients, and it would be a huge undertaking to build compatibility. Is this something that can be prioritized for a fix? @elasticsearchmachine

mccheah added a commit to mccheah/elasticsearch that referenced this issue Aug 27, 2024
Fixes elastic#109668.

This fixes the specific case outlined in the above linked Github issue -
namely this:

* Document: `{ "myArray": [] }`
* Source filter: `myArray.myField`
* Expected filtered result: `{ "myArray": [] }`
* Actual filtered result: `{}`

This broke when we switched from map-based source filtering to
Jackson-streaming-based source filtering in some code paths as part of
elastic#91127. Map-based source filtering (using the automata, etc.)
correctly included the empty array from the parent. Unfortunately, it's
not clear when these two code paths for source filtering diverge in behavior.

Note also the long comment in `FilterPathBasedFilter`. It's not clear what
the correct behavior is when we are filtering out all contents of arrays
via source filter path exclusion.
mccheah added a commit to mccheah/elasticsearch that referenced this issue Aug 27, 2024
This fixes the specific case outlined in the above linked Github issue - namely this:

Document: { "myArray": [] }
Source filter: myArray.myField
Expected filtered result: { "myArray": [] }
Actual filtered result: {}

This broke when we switched from map-based source filtering to Jackson-streaming-based
source filtering in some code paths - see elastic#91127. Map-based source filtering (using the
automata, etc.) correctly included the empty array from the parent. Unfortunately, it's
not clear when these two code paths for source filtering diverge in behavior.

Note also the long comment in FilterPathBasedFilter. It's not clear what the correct
behavior is when we are filtering out all contents of arrays via source filter path
exclusion.
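
To make the expected result concrete, here is a minimal sketch of map-based exclude filtering that keeps an empty parent array. It is not Elasticsearch's implementation: `exclude_path` and `filter_source` are hypothetical names, wildcards are not handled, and keeping an object that becomes empty after filtering is an assumption made here (the commit message notes that the correct behavior in that case is unclear).

```python
from typing import Any, Dict, List


def exclude_path(node: Any, path: List[str]) -> Any:
    """Return a copy of node with the field at the given path parts removed."""
    if isinstance(node, list):
        # Arrays are transparent to the path: filter each element and keep
        # the array itself, even when it is empty.
        return [exclude_path(item, path) for item in node]
    if not isinstance(node, dict) or not path:
        return node
    head, rest = path[0], path[1:]
    result: Dict[str, Any] = {}
    for key, value in node.items():
        if key != head:
            result[key] = value
        elif rest:
            result[key] = exclude_path(value, rest)
        # key == head with no remaining path parts: drop the field.
    return result


def filter_source(source: Dict[str, Any], excludes: List[str]) -> Dict[str, Any]:
    """Apply each dotted exclude pattern (no wildcards) to the source map."""
    filtered = source
    for pattern in excludes:
        filtered = exclude_path(filtered, pattern.split("."))
    return filtered


# The case from this issue: the empty array survives the exclude.
print(filter_source({"myArray": []}, ["myArray.myField"]))
```

Under these assumptions, document 1 from the issue filters down to the same `_source` the search already returns, and document 2 keeps its empty `myArray`.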
carlosdelest added the Team:Search Relevance label Aug 30, 2024
elasticsearchmachine removed the Team:Search Relevance label Aug 30, 2024
lkts added the :Search/Search and :Search Foundations/Search labels and removed the Team:StorageEngine and :StorageEngine/Mapping labels Oct 3, 2024
elasticsearchmachine added the Team:Search and Team:Search Foundations labels Oct 3, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-foundations (Team:Search Foundations)

@lkts
Contributor

lkts commented Oct 3, 2024

re-tagged based on the tags in #112250
