-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Format doc values fields. #22146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Format doc values fields. #22146
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea. I like anything that makes it clear that these fields have been processed somehow and we aren't just returning them directly from source.
* @param name The field to get from the docvalue | ||
* @param format How to format the field, {@code null} to use defaults. | ||
*/ | ||
public SearchRequestBuilder addDocValueField(String name, String format) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're usually OK with making breaking changes to the Transport API in a minor release it is probably ok to add the format parameter to this method if you think it makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know Kibana uses this feature to retrieve dates in millis since Epoch, so with this option they will be able to pass format: epoch_millis
when we format docvalue fields. I also plan to use it for the transition period: users will be able in 5.x to enable the 6.x behaviour by using the special format use_field_defaults
.
* @param name name of the field | ||
* @param format how to format the field | ||
*/ | ||
public InnerHitBuilder addDocValueField(String name, String format) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make format
@Nullable
and describe what null
means here?
+1 as well. I think it's useful for dates, it also fixes problems we can have with binary doc_values. Not sure about the name of the BWC param though, in the code you use |
Woooooops, good catch. I like |
|
||
@Override | ||
public void writeTo(StreamOutput out) throws IOException { | ||
out.writeString(name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this up under the reading constructor? It helps to have them on the same screen.
-------------------------------------------------- | ||
GET _search | ||
{ | ||
"docvalue_fields" : [ { "name": "my_date", "format": "epoch_millis" } ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep this as the default if possible in 5.x so we aren't making breaking changes, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
44bbf06
to
bf181d9
Compare
I'm removing the discuss label since we agreed to do this in FixitFriday. I also opened another PR for 5.x to better show the bw compat layer: #22354. |
"docvalue_fields" : [ { "name": "my_date", "format": "epoch_millis" } ] | ||
} | ||
-------------------------------------------------- | ||
// CONSOLE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optionally, you may want to do something like // TEST[setup:twitter]
to create a small index so the test generated from the snippet will find anything an exercise the snippet more fully. It certainly isn't required but I've been trying to do it lately.
bf181d9
to
e782af4
Compare
Currently `docvalues_fields` return the values of the fields as they are stored in doc values. I don't like that it exposes implementation details, but there are also user-facing issues like the fact it cannot work with binary fields. This change will also make it easier for users to reindex if they do not store the source, since `docvalues_fields` will return data is such a format that it can be put in an indexing request with the same mappings. The hard part of the change is backward compatibility, since it is breaking. The approach taken here is that 5.x will keep exposing the internal representation, with a special format name called `use_field_format` which will format the field depending on how it is mapped. This will become the default in 6.0, and this hardcoded format name will be removed in 7.0 to ease the transition from 5.x to 6.x.
e782af4
to
d1490e1
Compare
@jpountz Should this PR (and the related 5.x PR) be brought up to date and merged? |
@jpountz ping on this issue again for updating and merging (or closing if it's not needed any more) |
this PR needs sync with Kibana in order to be merged so I'll do it after 6.0 is out |
Currently
docvalues_fields
return the values of the fields as they are storedin doc values. I don't like that it exposes implementation details, but there
are also user-facing issues like the fact it cannot work with binary fields.
This change will also make it easier for users to reindex if they do not store
the source, since
docvalues_fields
will return data is such a format that itcan be put in an indexing request with the same mappings.
The hard part of the change is backward compatibility, since it is breaking.
The approach taken here is that 5.x will keep exposing the internal
representation, with a special format name called
use_field_format
whichwill format the field depending on how it is mapped. This will become the
default in 6.0, and this hardcoded format name will be removed in 7.0 to ease
the transition from 5.x to 6.x.