-
Notifications
You must be signed in to change notification settings - Fork 25.2k
add ipv6 field support #5758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add ipv6 field support #5758
Conversation
Thank you @kzwang , I think this is the right approach in general for managing ipv6 addresses. However, I'd like to be careful here since this decides on how we are going to index these addresses, and it won't be easy to change the format in the future since we need to maintain backward compatibility. For that reason, maybe the index-time range support for types that are greater than 64 bits should be contributed to Lucene first in order to have ideas/feedback from other Lucene committers? For example, I'm wondering if we should support any fixed length data like you did or if we could just write something that would support variable lengths. Regarding the field mapper, I'm wondering if we need a dedicated field mapper for ipv6 addresses or if we should try to write a field mapper that would both support ipv4 and ipv6 addresses. I think the latter option would be more convenient from the user perspective, but maybe it would also be a problem because it would require us to be able to distinguish ipv4 and ipv6 addresses at both query and indexing time? |
+1 on pushing index formatting to Lucene and not tie it to ES
I'd like to see us pursue adding this support to the ip mapper we already have. Note that the type name doesn't indicate the version of the ip... so it can be confusing and inconsistent to some degree now that we'll support both. @jpountz what problems do you expect at index/query time with having a single mapper? I believe we can have a setting on the mapper for the version where the default is set to IPv4 (for bwc) |
The scenario I was thinking about is a user indexing Apache access logs who has a field in Elasticsearch that is used to store ip addresses. I don't think we should expect clients to use different fields depending on the version of the ip address, so I guess we either need:
I think the 2nd option would make bw compatibility easier to maintain but maybe it would also raise issues (eg. currently field mappers can only expose a single |
I've created a issue in lucene (https://issues.apache.org/jira/browse/LUCENE-5596) and I'll move those code to there |
Hi, we started to discuss on the Lucene issue already, The proposal in Lucene is to not provide Big numeric support at all, just allow the precision step stuff also work on binary terms. We also try to allow indexing "binary" terms (which is supported under the hood by lucene) more easily. In that case, ES is repsonsible to create a byte[] out of the network addresse (or whatever type) and index it. DocValues and Stored fields work out of the box already, just the part that takes care of indexing and range-querying the values may need improvements in Lucene (to support fast ranges). Out of the box, you can even do a TermRangeQuery on a binary term easily, if you indexed it as binary term! It is just not using prefix encoded terms for range speedup. |
any update on this issue ? |
@clintongormley I think we are awaiting: https://issues.apache.org/jira/browse/LUCENE-5879 |
@jrideout https://issues.apache.org/jira/browse/LUCENE-5879 got fixed recently in trunk and 5.2 of Lucene. Is it possible to provide an update now for ipv6 support in ip field type? |
@vvaradhan Auto-prefixed terms in Lucene has been exposed as an experimental postings format, so it's not safe for us to use until the feature makes it to the default postings format (which is backward compatible) |
I think we could alternatively use the new (just released in Lucene 5.3.0) NumericRangeTree to implement this? See #5683 (comment) for some ideas on how it would work for BigInteger/Decimal ... I think the basic idea would be similar: any value that can be converted into a "same sort order" byte[] should work. |
Regarding #5683: Converting of the ipv6 addresses to BigDecimal would be a waste here? Just use the byte[] directly: http://docs.oracle.com/javase/7/docs/api/java/net/Inet6Address.html#getAddress() All signs are correct from the beginning, because its just 16 bytes (0-255), highest byte first. So order is correct by default. Theoretically we can index it as is (if we could use the new AutoPrefix terms), but for NumericRangeTree it should also be simple. |
@uschindler right, we should go straight to the byte[]! It ought to work very well. |
@kimchy @jpountz @dadoonet @kzwang Hi, I have used the above fix in my 1.4.4 code to support big integer by changing the IPV6 Mapper. Search and range queries works fine. Our application needs support for Bigdecimal too. Could you please provide me pointers about how can i implement big decimal support with range functionality as well.. |
Closing in favour of #17007 |
What I've done is copied the
NumericTokenStream
,NumericRangeFilter
,NumericRangeQuery
andNumericUtils
from lucene and change those to supportBigInteger
the term query and range query/filter are both working, but there are still other things like field data need to do
@kimchy @jpountz @dadoonet can you have a quick look just to make sure I've done the correct thing so far?
closes #3714