Better behavior and documentation on ingest pipelines and update operations #104941
Labels:
- :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP)
- >enhancement
- Team:Data Management (Meta label for data/management team)
Using `_bulk` to index documents against an index supports a variety of operations for the documents in the bulk request. Some of those operations are well supported in the context of running ingest pipelines, and others are unsupported in a not-at-all-surprising way. Unfortunately, `update` operations leave a lot to be desired in both behavior and documentation.

- `create` and `index` operations are very well supported. They're the bread and butter of bulk indexing, and we run ingest pipelines against those documents in a way that works.
- `delete` operations do not run ingest pipelines, of course, but that's not especially surprising.
- `update` operations are a mixed bag. There's update with a script, there's update with a partial doc, there's upsert with a `doc` to-be-created and a `script` for updates, there's upsert with `doc_as_upsert`, etc.

Technically we document that `doc_as_upsert` isn't supported with ingest pipelines (see #57649), but I'm not sure what we actually mean by that. I think you can actually run ingest pipelines against `doc_as_upsert` requests (that is, I don't think we throw an `UnsupportedOperationException` or the like). Do we just mean that it's buggy and has bad semantics and we don't want you to do it?

Here's a small sample of some issues that have been reported in this area: #36745, #72108, #81764, #89194. I imagine there are more that I haven't immediately found.
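For concreteness, here's a rough sketch of the bulk operation shapes discussed above, written as a small Python script that only builds the NDJSON request body and doesn't send it anywhere. The index name, document IDs, field names, and script source are all made up for illustration, and the comments just restate what this issue says about each operation; they are not a statement of actual behavior.

```python
import json

# Hypothetical index name, used only for illustration.
INDEX = "my-index"


def bulk_actions():
    """Return (action, payload) pairs, one per operation shape mentioned in this issue."""
    increment = {"source": "ctx._source.counter += 1"}  # made-up script
    return [
        # index / create: described above as well supported with ingest pipelines.
        ({"index": {"_index": INDEX, "_id": "1"}}, {"message": "indexed"}),
        ({"create": {"_index": INDEX, "_id": "2"}}, {"message": "created"}),
        # delete: no document body, so there is nothing for a pipeline to process.
        ({"delete": {"_index": INDEX, "_id": "3"}}, None),
        # update variants: the "mixed bag" this issue is about.
        ({"update": {"_index": INDEX, "_id": "4"}}, {"doc": {"message": "partial"}}),
        ({"update": {"_index": INDEX, "_id": "5"}}, {"script": increment}),
        ({"update": {"_index": INDEX, "_id": "6"}}, {"script": increment, "upsert": {"counter": 1}}),
        ({"update": {"_index": INDEX, "_id": "7"}}, {"doc": {"counter": 1}, "doc_as_upsert": True}),
    ]


def to_ndjson(actions):
    """Serialize actions as newline-delimited JSON with the trailing newline _bulk expects."""
    lines = []
    for action, payload in actions:
        lines.append(json.dumps(action))
        if payload is not None:
            lines.append(json.dumps(payload))
    return "\n".join(lines) + "\n"


body = to_ndjson(bulk_actions())
```

Sending `body` to `_bulk` with a pipeline configured (for example via the `pipeline` request parameter) would be one way to compare how each shape is handled; I haven't verified any particular outcome here.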
Basically, I'd like to see:
Related to #36746 which brought this to the top of my mind.