
Support for all Elasticsearch field types #1813


Open
2 of 22 tasks
masseyke opened this issue Dec 7, 2021 · 4 comments

Comments

@masseyke
Member

masseyke commented Dec 7, 2021

There are quite a few field types that have been added to Elasticsearch that are not currently supported in es-hadoop.

Unsupported field types

I have not tested all of them, but at least some of these cause failures in es-hadoop if you read from an index that uses them.

@rseldner

Looks like the dense_vector type needs to be added as well.
https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html

@sbhatm1213

Has the dense_vector field type been added? Or does anyone know of another way to pull that field into PySpark?

@mouhagaye

mouhagaye commented Feb 29, 2024

I recently tested Spark reading from an index that contains most of the Elasticsearch field types.
Versions used:

  • Spark: 3.3.1
  • Elasticsearch: 8.12.1
  • elasticsearch-spark: 8.12.1

Spark read the index successfully, but the following field types do not appear in the resulting DataFrame:

  • constant_keyword
  • unsigned_long
  • flattened
  • integer_range
  • float_range
  • long_range
  • double_range
  • date_range
  • ip_range
  • version
  • agg_metric
  • histogram
  • match_only_text
  • completion
  • search_as_you_type
  • dense_vector
  • sparse_vector
  • rank_feature
  • rank_features
  • point
  • shape
  • percolator
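For anyone who wants to repeat this check on their own index, here is a minimal sketch of the comparison described above. It assumes you already have the `properties` section of the index mapping as a Python dict (for example from `GET <index>/_mapping`) and the list of column names from the DataFrame (`df.columns`). The helper name `missing_fields` and the sample mapping are hypothetical and are not part of es-hadoop; only top-level fields are compared.

```python
from typing import Dict, Iterable, List, Tuple


def missing_fields(
    mapping_properties: Dict[str, dict],
    dataframe_columns: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (field name, Elasticsearch type) for every top-level mapped
    field that has no matching column in the DataFrame.

    Hypothetical helper for illustration; it only inspects top-level fields.
    """
    present = set(dataframe_columns)
    return sorted(
        # Object fields carry no explicit "type", so report them as "object".
        (name, spec.get("type", "object"))
        for name, spec in mapping_properties.items()
        if name not in present
    )


# Made-up mapping and column list, standing in for a real index and df.columns.
mapping_properties = {
    "title": {"type": "text"},
    "embedding": {"type": "dense_vector", "dims": 3},
    "price_range": {"type": "float_range"},
    "author": {"properties": {"name": {"type": "keyword"}}},
}
df_columns = ["title", "author"]

print(missing_fields(mapping_properties, df_columns))
# [('embedding', 'dense_vector'), ('price_range', 'float_range')]
```

Running this against the real mapping and `df.columns` gives a list that can be pasted into a report like the one above, including the type of each dropped field.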

@berglh

berglh commented Jan 6, 2025

I hit this issue with a match_only_text field, using Spark 3.4.1 and elasticsearch-spark-30_2.12-8.7.1.jar against Elasticsearch 8.11. Ouch. I just want to return the message field, which is mapped by the default dynamic template, and that template seems to default to this type.

Edit: I also tried adding a keyword field to message via the index template and rolled over the data stream, but Elasticsearch Spark still seems to ignore the keyword value.
