Field Aliases #17511

Closed
nik9000 opened this issue Apr 4, 2016 · 20 comments
Labels
discuss >enhancement :Search Foundations/Mapping Index mappings, including merging and defining field types Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch

Comments

@nik9000
Member

nik9000 commented Apr 4, 2016

Describe the feature:
Say you have documents like

{
  "description": "lorem ipsum",
  "customer": {
    "id": 2134,
    "name": "Test Enterprises"
  }
}

But you really want to search like customer:2134 to find this document. It'd be useful to be able to define customer as an alias for customer.id or maybe customer.id^10, customer.name.

Traditionally this has been something Elasticsearch has left to the user but maybe it is time to talk about implementing this in Elasticsearch?

@nik9000 nik9000 added discuss :Search Foundations/Mapping Index mappings, including merging and defining field types >enhancement labels Apr 4, 2016
@clintongormley
Contributor

Hi @nik9000

I closed a very similar issue recently (#17163). I'd much prefer not to clutter our mappings with something that seems to fit nicely into the application layer. Why do you feel this is needed?

@nik9000
Member Author

nik9000 commented Apr 6, 2016

Sorry, I hadn't searched the issues before opening this. I should have done that. 20 days ago is super recent.

I agree it isn't a thing you'd expect from a database. It is more a feature you expect from a search interface which is why we've said "make it the application's problem". But now that we are doing ingest node, it feels like we are more willing to do things to make the "elasticsearch not wrapped in application code" experience nicer. This feels like an extension of that.

@clintongormley
Contributor

With the ingest node, you can now rename the fields yourself. This is why I'm wondering why we also need field aliases. Also, boost in the mappings is now implemented as a query time boost instead of an index time boost. If #16920 makes progress, we could even allow updating the boost parameter on existing fields, in which case your foo^10 example wouldn't be necessary either.
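As an illustration of the rename approach mentioned here, a minimal sketch of an ingest pipeline body that uses the rename processor. The pipeline name flatten-customer and the target field customer_id are hypothetical, and the source field is taken from the example document at the top of this issue. The body would be sent with PUT _ingest/pipeline/flatten-customer.

```python
import json

# Hypothetical pipeline body: moves the nested customer.id value to a
# top-level field at index time. "customer_id" is an assumed name.
rename_pipeline = {
    "description": "Expose customer.id under a shorter name",
    "processors": [
        {"rename": {"field": "customer.id", "target_field": "customer_id"}}
    ],
}

print(json.dumps(rename_pipeline, indent=2))
```

Note that this changes the document that is stored, not just how it is queried.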

@nik9000
Member Author

nik9000 commented Apr 6, 2016

now rename

That is fine if Elasticsearch is just your search engine. But if it is your document store then renaming fields gets really confusing really fast.

I certainly wasn't thinking of aliases as a way to get updatable boosts. More of a way to get multiple fields. Kind of like a query time brother to copy_to.

@ejsmith

ejsmith commented Apr 6, 2016

In our app we are exposing a search box that is just sent directly to query_string. We don't want to modify our documents because ES is our document store, but we want to give end users a friendly experience where they can type company:blah instead of data.company.id:blah

I understand that you guys wanted to make things simpler and got rid of just_name because it was confusing, but I think adding alias support could be intuitive and work great for things like this.

Alternatively, I will need to parse the user's query myself and rewrite it. I feel like this has got to be a very common thing.

What is the downside of aliases?

@clintongormley
Contributor

but we want to give end users a friendly experience where they can type company:blah instead of data.company.id:blah

Makes sense.

What is the downside of aliases?

Code complexity. The mapping code is seriously complex and prone to bugs. You just need to search for all the issues labelled :Mapping to get an idea of how problematic they have been. For this reason, I'm loath to add further complexity.

For instance, possibly the most obvious way of adding field aliases would be to add a field of type alias, which points to the real field. But now you want to change or remove those aliases. This is a huge change from the way we deal with mappings today: fields never change type and fields never get removed. What happens if you index into an alias field? etc etc

I could see field aliases having all sorts of unintended consequences - this is why they concern me. Yet they are really easy to implement application side. There are so many other truly useful things that we'd like to add to Elasticsearch - things that can't easily be done in the application - that something like field aliases doesn't make the cut in my opinion.

@ejsmith

ejsmith commented Apr 7, 2016

What if aliases were their own feature, separate from mapping, and were only applied to queries? The aliases would be listed separately from the field mappings, so it would be a lot more straightforward to see what is going on vs using something like just_name, which I could see would be very confusing.

It wouldn't be the end of the world if I had to implement this in my app, but it just feels like something that should be solved by ElasticSEARCH and seems like it would be a very common thing.

@niemyjski
Contributor

I agree with @ejsmith

@gmoskovicz
Contributor

@ejsmith

I think that this can be really useful as well. However, it shouldn't be too hard to preprocess the search box input before searching, translating each alias in the query into its real field name.

If we add this to any structure, mapping or not, we should keep the aliases for each field, document and index in either the mapping or the cluster state. I believe that the information related to structures is only stored in the mapping, which is why, if we add this, it doesn't make sense to use another structure: it would break the concept of mappings and why they exist. @clintongormley am I right?

The good thing about this is that there is an easy workaround: translate this at the app level. You just keep the structure on your end, and if you have an alias a: a.id and a query string a:id_search, you translate this into a.id:id_search.
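A minimal sketch of that app-level translation, assuming the alias table is a plain dict kept by the application. The function name rewrite_aliases is hypothetical. It only rewrites a name that starts a field reference and skips quoted phrases; it is not a full query_string parser and does not handle escaped colons or other syntax.

```python
import re
from typing import Dict


def rewrite_aliases(query: str, aliases: Dict[str, str]) -> str:
    """Replace alias field names with real field names in a query string."""
    if not aliases:
        return query
    # Longest names first so that a longer alias wins over its own prefix.
    names = "|".join(
        re.escape(name) for name in sorted(aliases, key=len, reverse=True)
    )
    # First alternative consumes quoted phrases so they are left untouched.
    # Second alternative matches "alias:" only when it is not preceded by a
    # word character, a dot, or a colon (i.e. not part of another token).
    pattern = re.compile(r'"(?:\\.|[^"\\])*"|(?<![\w.:])(' + names + r'):')

    def substitute(match):
        field = match.group(1)
        if field is None:
            return match.group(0)  # quoted phrase, keep as-is
        return aliases[field] + ":"

    return pattern.sub(substitute, query)


print(rewrite_aliases("a:id_search", {"a": "a.id"}))  # a.id:id_search
```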

Another, more complex workaround (I believe) could be creating a small plugin, but that can be tricky since you will need to find a way to store this.

Finally, you could use the Mapping Metadata to save this. But then I believe you should load the information at app start time, to avoid GET operations only to get this information from the mapping. @clintongormley couldn't the meta field be an option to store this information about field "aliases"? That could be a way to avoid putting this directly in the mapping itself.
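A rough sketch of the mapping metadata idea, assuming the aliases are stored under the mapping's _meta section and read once at application start. The field_aliases key, the index name my-index, and the helper load_aliases are all hypothetical, and the response shape assumed here is the typeless GET <index>/_mapping layout of recent Elasticsearch versions.

```python
from typing import Dict

# Hypothetical mapping body: "_meta" is free-form, so the "field_aliases"
# key and its layout are assumptions, not an Elasticsearch convention.
mapping_body = {
    "_meta": {"field_aliases": {"customer": "customer.id"}},
    "properties": {
        "customer": {
            "properties": {"id": {"type": "long"}, "name": {"type": "text"}}
        }
    },
}


def load_aliases(get_mapping_response: dict, index: str) -> Dict[str, str]:
    """Extract the alias table from a GET <index>/_mapping style response."""
    mappings = get_mapping_response.get(index, {}).get("mappings", {})
    return dict(mappings.get("_meta", {}).get("field_aliases", {}))


# Simulated response; a real application would fetch this once at startup.
response = {"my-index": {"mappings": mapping_body}}
aliases = load_aliases(response, "my-index")
print(aliases)
```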

@JackRyanson

Gentlemen, I see a very important use case for this: on-the-fly mapping of one index to another.

Say I have LOG1, which contains IP addresses under a field named e.g. SRC_IP, and LOG2, which contains IP addresses under a completely different name, e.g. LOGGEDIP. Would this feature not allow one to create an alias (index alias) which maps the two indexes into one, and then maps the two fields into what would look like the same field, therefore allowing me to ask for the "top 10 IPs" etc. across the two indexes?

I think this would be tremendously powerful and enabling. Thoughts? Thanks for considering.
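Without field aliases, one application-side approximation of this is to run a terms aggregation per index against its own field and merge the buckets by key. The sketch below assumes that approach; FIELD_BY_INDEX, top_terms_request and merge_buckets are hypothetical names, and merging per-index top-N lists is only approximate because a value outside one index's top N is not counted.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical per-index table: which concrete field holds the IP address.
FIELD_BY_INDEX: Dict[str, str] = {"LOG1": "SRC_IP", "LOG2": "LOGGEDIP"}


def top_terms_request(field: str, size: int = 10) -> dict:
    """Build a terms aggregation body for one index's concrete field."""
    return {"size": 0, "aggs": {"top": {"terms": {"field": field, "size": size}}}}


def merge_buckets(bucket_lists: List[List[dict]], size: int = 10) -> List[Tuple[str, int]]:
    """Sum doc_count per key across indices and keep the largest totals."""
    totals: Counter = Counter()
    for buckets in bucket_lists:
        for bucket in buckets:
            totals[bucket["key"]] += bucket["doc_count"]
    return totals.most_common(size)


# One request body per index, each using that index's own field name.
bodies = {index: top_terms_request(field) for index, field in FIELD_BY_INDEX.items()}

# Made-up bucket lists standing in for the two aggregation responses.
merged = merge_buckets([
    [{"key": "10.0.0.1", "doc_count": 5}, {"key": "10.0.0.2", "doc_count": 1}],
    [{"key": "10.0.0.2", "doc_count": 7}],
])
print(merged)
```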

@clintongormley
Contributor

We discussed this in FixItFriday today. The comment in #17511 (comment) made me think that perhaps this could be a solution for #18195 (comment)

Turns out that adding field aliases would be massively complex, and wouldn't provide a proper solution. This is more than just queries. Field rewriting would need to happen in queries, aggs, highlighting, stored fields, docvalue_fields, suggesters, source filtering, security etc, both in requests and responses. It wouldn't be clean or obvious, we would leak the real field names all over the place.

We have decided against doing this.

@niemyjski
Contributor

@clintongormley I don't think anyone here expects you guys to rewrite the response anywhere, we just want an easier way to query. The results should stay the same!

@ejsmith

ejsmith commented Aug 26, 2016

Exactly. Really just for query string filters that users would type in directly

@luckydonald

luckydonald commented Jun 6, 2017

You can write rather complex queries in there, and

query_string.replace("customer:", "customer.id:")

feels dirty for a reason.
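A short demonstration of why the plain string replace is fragile. The field name vip_customer and the sample queries are made up for illustration.

```python
def naive_rewrite(query: str) -> str:
    """The plain string replacement from the comment above."""
    return query.replace("customer:", "customer.id:")


# Works for the simple case.
print(naive_rewrite("customer:2134"))
# customer.id:2134

# Corrupts a different field whose name merely ends in "customer".
print(naive_rewrite("vip_customer:true"))
# vip_customer.id:true

# Rewrites text inside a quoted phrase, changing what is searched for.
print(naive_rewrite('description:"our customer: Test Enterprises"'))
# description:"our customer.id: Test Enterprises"
```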

One reason for doing this on the Elasticsearch side is that you are already doing some user input verification there, and it probably parses this far better than the user application would anyway.

So maybe just a rewrite for the single query, as part of query_string?

@edudar

edudar commented Aug 15, 2017

Sounds like it was way over-engineered in the meeting compared to what was asked. Handling this task in an application requires a proper query parser, and Elasticsearch does not expose one at the moment. We use Lucene's directly, but it is not tied to Elasticsearch's per-field analyzers, so it ends up being error-prone.

@arakelian

I agree with @edudar.

I think the use case being described is simple and limited in scope. When parsing queries, allow the caller to use a field alias, which gets mapped to a field name. Period. Everything else works the same.

The value of this feature is that it enables us to rename fields in the index without breaking backwards compatibility for clients. The old field names are then deprecated, and over time we can migrate clients off of the old field names.

In our case, we had fronted ES with our own service anyway -- to add security features -- so we had a path for adding field aliases as described above. But I would much rather have this in the index, than external, if possible.

@luckydonald

@clintongormley, can you please reconsider this issue, now that people have expressed a need for it (only) in queries?

As Elasticsearch already has a parser for queries, building a hopefully identical parser for the query string in the frontend application, just to be able to hopefully safely replace a field's name for that query (not anywhere else), seems like a really bad idea.

{
  "query": {
    "query_string": {
      "default_field": "content",
      "query": "foo:foo: OR foobar:lol AND c_id:1234",
      "aliases": {"foo": "foobar", "c_id": "customer.id"}
    }
  }
}

This should search for the field foobar being either foo: [1] or lol, and the customer being the one with id 1234 [2].

[1] Here a simple string replace would break things.

[2] Additionally, if you named the alias customer, it could indeed be resolved first, so that customer:1234 would be possible too?

@clintongormley
Contributor

See #23714

@rretter

rretter commented Nov 9, 2017

It seems they've decided that their complicated interpretation of the simple request is complicated, and therefore not worth doing. We need to remember: the product is for the implementors, not the users.

@arakelian

arakelian commented Nov 10, 2017

@rretter Why don't you submit a PR, or better yet, create an open source project of your own to share with others, before you impugn the motivations of @clintongormley or characterize the request as "simple"? FWIW, this request of mine is simple too.

@javanna javanna added the Team:Search Foundations Meta label for the Search Foundations team in Elasticsearch label Jul 16, 2024