Synthetic _source: support dense_vector #89840

Merged

Member

@nik9000 nik9000 commented Sep 6, 2022

This adds support for synthetic _source to dense_vector fields.

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine elasticsearchmachine added the Team:Search Meta label for search team label Sep 6, 2022
@elasticsearchmachine
Collaborator

Hi @nik9000, I've created a changelog YAML for you.

}

private class IndexedSyntheticFieldLoader implements SourceLoader.SyntheticFieldLoader {
    private VectorValues values;
Member Author

We've been using the Lucene APIs directly for SyntheticFieldLoader subclasses. I believe that saves allocating a float[] in the doc values version, which is nice but not a huge thing. Mostly I did it to line up with the other implementations.
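
To illustrate the allocation point, here is a minimal sketch contrasting a decode-to-array path with a streaming path. It is not the Elasticsearch or Lucene code: `VectorBytes`, `FloatSink`, and the byte layout (consecutive 4-byte big-endian floats) are all assumptions made for the example.

```java
import java.nio.ByteBuffer;

/** Hypothetical callback that receives one vector component at a time. */
interface FloatSink {
    void accept(float value);
}

/** Hypothetical helper; the layout (consecutive big-endian floats) is assumed. */
final class VectorBytes {
    private VectorBytes() {}

    /** Encodes a vector as consecutive 4-byte floats. */
    static byte[] encode(float[] vector) {
        ByteBuffer buffer = ByteBuffer.allocate(vector.length * Float.BYTES);
        for (float v : vector) {
            buffer.putFloat(v);
        }
        return buffer.array();
    }

    /** Decode path: materializes an intermediate float[] for every document. */
    static float[] decodeToArray(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        float[] out = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < out.length; i++) {
            out[i] = buffer.getFloat();
        }
        return out;
    }

    /** Streaming path: passes each value to the sink without allocating a float[]. */
    static void forEachValue(byte[] bytes, FloatSink sink) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        while (buffer.remaining() >= Float.BYTES) {
            sink.accept(buffer.getFloat());
        }
    }
}
```

A loader that only needs to write the values back out (for example into a JSON array) can use the streaming form and skip the per-document array.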


@Override
public List<SyntheticSourceInvalidExample> invalidExample() throws IOException {
    return List.of();
Member Author

As far as I can tell, this field type has nothing like ignore_above, ignore_malformed, or doc_values: false, so there are no invalid examples to return here.

This adds support for synthetic _source to `dense_vector` fields.
Contributor

@romseygeek romseygeek left a comment

One comment, LGTM otherwise

@Override
public DocValuesLoader docValuesLoader(LeafReader leafReader, int[] docIdsInLeaf) throws IOException {
    values = DocValues.getBinary(leafReader, name());
    if (values == null) {
Contributor

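This needs to be leafReader.getBinaryDocValues(name()), because DocValues.getBinary() never returns null (it returns an empty instance when the field has no doc values), so the null check below would never fire.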
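
The difference between the two lookups can be sketched without Lucene on the classpath. Everything below is a hypothetical stand-in (`NullCheckSketch`, `BinaryValues`, and a `Map` playing the role of a segment); it only models the contract described in the comment above, where one accessor returns null for a missing field and the other substitutes an empty instance.

```java
import java.util.Map;

/** Hypothetical stand-ins modelling a nullable lookup and a never-null lookup. */
final class NullCheckSketch {
    private NullCheckSketch() {}

    /** Stand-in for a per-field doc values iterator. */
    interface BinaryValues {
        boolean isEmpty();
    }

    /** Shared empty instance, returned in place of null by the never-null lookup. */
    static final BinaryValues EMPTY = () -> true;

    /** Models a reader-style lookup: null when the segment has no values for the field. */
    static BinaryValues nullableLookup(Map<String, BinaryValues> segment, String field) {
        return segment.get(field);
    }

    /** Models a convenience lookup: substitutes EMPTY, so the result is never null. */
    static BinaryValues neverNullLookup(Map<String, BinaryValues> segment, String field) {
        BinaryValues values = segment.get(field);
        return values == null ? EMPTY : values;
    }

    /** The early-exit check from the diff: only meaningful with the nullable lookup. */
    static boolean canSkipSegment(BinaryValues values) {
        return values == null;
    }
}
```

With the never-null lookup, `canSkipSegment` is always false, so the early exit for segments without vectors becomes dead code. That is the reason for switching to the nullable reader call.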

Contributor

@mayya-sharipova mayya-sharipova left a comment

@nik9000 Thanks, this LGTM!

@nik9000 nik9000 mentioned this pull request Sep 7, 2022
50 tasks
@nik9000 nik9000 added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 7, 2022
@nik9000
Member Author

nik9000 commented Sep 7, 2022

run
elasticsearch-ci/part-1

@nik9000
Member Author

nik9000 commented Sep 7, 2022

run elasticsearch-ci/part-1

@elasticsearchmachine elasticsearchmachine merged commit c4a77d5 into elastic:main Sep 8, 2022
@nik9000 nik9000 deleted the synthetic_source_dense_vector branch September 8, 2022 15:25
weizijun added a commit to weizijun/elasticsearch that referenced this pull request Sep 9, 2022
* main: (34 commits)
  Make sure ivy repo directory exists before downloading artifacts
  Use 'file://' scheme for local repository URL
  Use DRA artifacts for release build CI jobs
  Log unsuccessful attempts to get credentials from web identity tokens (elastic#88241)
  Script: Write Field API path manipulation (elastic#89889)
  Fetch health info action (elastic#89820)
  Fix memory leak in TransportDeleteExpiredDataAction (elastic#89935)
  [ML] Performance improvements for categorization jobs (elastic#89824)
  [DOCS] Revert changes for ES_JAVA_OPTS (elastic#89931)
  Fix deadlock bug exposed by a test (elastic#89934)
  [Downsampling] Remove `FieldValueFetcher` validator (elastic#89497)
  Fix segment stats in tsdb (elastic#89754)
  Synthetic _source: support dense_vector (elastic#89840)
  REST tests fetching fields with synthetic _source (elastic#89888)
  Do not deserialize back BytesTransportRequest to clone a request in MockTransportService (elastic#89926)
  Add SDK request logging to debug failures of S3BlobStoreRepositoryTests#testRequestStats (elastic#89912)
  Fix SnapshotStatusApisIT.testGetSnapshotsWithSnapshotInProgress (elastic#89925)
  Document synthetic source for text and keyword (elastic#89893)
  Fix CloneSnapshotIT.testRemoveFailedCloneFromCSWithQueuedSnapshotInProgress (elastic#89914)
  Add missing index.mapping.total_fields.limit setting to the target index (elastic#89875)
  ...
Labels
auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!)
>feature
:Search Relevance/Vectors (Vector search)
Team:Search (Meta label for search team)
v8.5.0
4 participants