add ipv6 field support #5758

kzwang · 2014-04-10T08:29:18Z

What I've done is copy NumericTokenStream, NumericRangeFilter, NumericRangeQuery and NumericUtils from Lucene and change them to support BigInteger.

The term query and range query/filter are both working, but other things, such as field data, still need to be done.

@kimchy @jpountz @dadoonet can you have a quick look just to make sure I've done the correct thing so far?

closes #3714

jpountz · 2014-04-10T13:48:47Z

Thank you @kzwang , I think this is the right approach in general for managing ipv6 addresses.

However, I'd like to be careful here since this decides on how we are going to index these addresses, and it won't be easy to change the format in the future since we need to maintain backward compatibility. For that reason, maybe the index-time range support for types that are greater than 64 bits should be contributed to Lucene first in order to have ideas/feedback from other Lucene committers? For example, I'm wondering if we should support any fixed length data like you did or if we could just write something that would support variable lengths.

Regarding the field mapper, I'm wondering if we need a dedicated field mapper for ipv6 addresses or if we should try to write a field mapper that would both support ipv4 and ipv6 addresses. I think the latter option would be more convenient from the user perspective, but maybe it would also be a problem because it would require us to be able to distinguish ipv4 and ipv6 addresses at both query and indexing time?

uboness · 2014-04-10T14:23:31Z

+1 on pushing the index format to Lucene and not tying it to ES

> Regarding the field mapper, I'm wondering if we need a dedicated field mapper for ipv6 addresses or if we should try to write a field mapper that would both support ipv4 and ipv6 addresses. I think the latter option would be more convenient from the user perspective, but maybe it would also be a problem because it would require us to be able to distinguish ipv4 and ipv6 addresses at both query and indexing time?

I'd like to see us pursue adding this support to the ip mapper we already have. Note that the type name doesn't indicate the version of the ip... so it can be confusing and inconsistent to some degree now that we'll support both. @jpountz what problems do you expect at index/query time with having a single mapper? I believe we can have a setting on the mapper for the version where the default is set to IPv4 (for bwc)

jpountz · 2014-04-10T17:10:16Z

> @jpountz what problems do you expect at index/query time with having a single mapper? I believe we can have a setting on the mapper for the version where the default is set to IPv4 (for bwc)

The scenario I was thinking about is a user indexing Apache access logs who has a field in Elasticsearch that is used to store ip addresses. I don't think we should expect clients to use different fields depending on the version of the ip address, so I guess we either need:

1. to be able to support both v4 and v6 addresses in the same index field,
2. or have two index fields (one for v4 and one for v6) and add a bridge on top of them that would redirect to the appropriate field based on the version of the ip that is being indexed/searched.

I think the 2nd option would make bw compatibility easier to maintain but maybe it would also raise issues (eg. currently field mappers can only expose a single FieldMappers.Names.indexName()).
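The bridge described in the second option could be sketched roughly as follows. This is plain Java, not Elasticsearch mapper code: the `IpFieldBridge` class and the field names `client_ip.v4` and `client_ip.v6` are hypothetical, and the sketch only shows the routing decision that would have to run at both index and query time.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

final class IpFieldBridge {
    // Hypothetical sub-field names, chosen only for this illustration.
    static final String V4_FIELD = "client_ip.v4";
    static final String V6_FIELD = "client_ip.v6";

    // Parse an IP literal. getByName does not hit DNS for literals,
    // but it would for host names, so callers should pass literals only.
    static InetAddress parse(String literal) {
        try {
            return InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + literal, e);
        }
    }

    // Pick the underlying index field from the address version.
    static String fieldFor(String literal) {
        return parse(literal) instanceof Inet4Address ? V4_FIELD : V6_FIELD;
    }

    public static void main(String[] args) {
        System.out.println(fieldFor("192.0.2.10"));
        System.out.println(fieldFor("2001:db8::1"));
    }
}
```

One caveat of this sketch: Java parses IPv4-mapped literals such as `::ffff:192.0.2.10` into an `Inet4Address`, so they would be routed to the v4 field.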

kzwang · 2014-04-11T01:06:16Z

I've created an issue in Lucene (https://issues.apache.org/jira/browse/LUCENE-5596) and I'll move the code there.

uschindler · 2014-04-15T08:43:45Z

Hi, we have already started discussing this on the Lucene issue.
To me it looks wrong to use BigInteger at all for IPv6 addresses. IP addresses in Elasticsearch should use the raw bytes returned by InetAddress#getAddress(), which are in network byte order. Network byte order sorts the way we want, without any signedness issues.
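A minimal sketch of that point, using only the JDK: the `Ipv6ByteOrder` class and its helpers are illustrative names, and the comparison assumes the bytes are compared as unsigned values (as a byte-wise term comparison would do), since Java's `byte` type is itself signed.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

final class Ipv6ByteOrder {
    // Raw address bytes in network byte order (16 bytes for IPv6).
    static byte[] raw(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + literal, e);
        }
    }

    // Lexicographic comparison treating each byte as 0-255.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] low = raw("2001:db8::1");
        byte[] high = raw("2001:db8::ff00"); // contains a 0xFF byte, negative as a signed byte
        System.out.println("length: " + low.length);
        System.out.println("low sorts first: " + (compareUnsigned(low, high) < 0));
    }
}
```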

The proposal in Lucene is to not provide big-numeric support at all, and instead to allow the precision step machinery to also work on binary terms. We are also trying to make it easier to index "binary" terms (which Lucene already supports under the hood).

In that case, ES is responsible for creating a byte[] out of the network address (or whatever type) and indexing it. DocValues and stored fields already work out of the box; only the part that takes care of indexing and range-querying the values may need improvements in Lucene (to support fast ranges). Out of the box, you can even do a TermRangeQuery on a binary term easily, if you indexed it as a binary term! It just does not use prefix-encoded terms to speed up ranges.
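To illustrate why a plain range over binary terms already gives correct results, here is a JDK-only stand-in. It is not Lucene code: a `TreeSet` ordered by unsigned byte comparison plays the role of a sorted term dictionary, and `BinaryTermRange`, `index` and `countInRange` are hypothetical names for this sketch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Comparator;
import java.util.NavigableSet;
import java.util.TreeSet;

final class BinaryTermRange {
    // Unsigned lexicographic order over byte[] terms.
    static final Comparator<byte[]> UNSIGNED = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    static byte[] raw(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + literal, e);
        }
    }

    // "Index" the addresses as sorted binary terms.
    static NavigableSet<byte[]> index(String... literals) {
        NavigableSet<byte[]> terms = new TreeSet<>(UNSIGNED);
        for (String literal : literals) {
            terms.add(raw(literal));
        }
        return terms;
    }

    // Count terms in the inclusive range [lower, upper].
    static int countInRange(NavigableSet<byte[]> terms, String lower, String upper) {
        return terms.subSet(raw(lower), true, raw(upper), true).size();
    }

    public static void main(String[] args) {
        NavigableSet<byte[]> terms =
            index("2001:db8::1", "2001:db8::ffff", "2001:db9::1", "fe80::1");
        // All addresses inside 2001:db8::/32
        System.out.println(countInRange(terms,
            "2001:db8:0:0:0:0:0:0", "2001:db8:ffff:ffff:ffff:ffff:ffff:ffff"));
    }
}
```

The range here visits every term between the bounds, which is the cost that prefix-encoded terms are meant to reduce.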

clintongormley · 2014-07-11T09:41:09Z

Depends on https://issues.apache.org/jira/browse/LUCENE-5596

gurvindersingh · 2014-10-03T08:35:57Z

Any update on this issue?

jrideout · 2014-10-06T18:19:29Z

@clintongormley I think we are awaiting: https://issues.apache.org/jira/browse/LUCENE-5879

vvaradhan · 2015-05-05T18:04:10Z

@jrideout https://issues.apache.org/jira/browse/LUCENE-5879 got fixed recently in trunk and 5.2 of Lucene. Is it possible to provide an update now for ipv6 support in ip field type?

clintongormley · 2015-05-07T18:58:48Z

@vvaradhan Auto-prefix terms in Lucene have been exposed as an experimental postings format, so it's not safe for us to use them until the feature makes it into the default postings format (which is backward compatible).

mikemccand · 2015-09-01T23:29:02Z

I think we could alternatively use the new (just released in Lucene 5.3.0) NumericRangeTree to implement this?

See #5683 (comment) for some ideas on how it would work for BigInteger/Decimal ... I think the basic idea would be similar: any value that can be converted into a "same sort order" byte[] should work.
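As a rough illustration of a "same sort order" byte[] for signed BigInteger values, the sketch below sign-extends the two's complement form to a fixed width and flips the sign bit. The class name, method names and the fixed `width` parameter are assumptions made for this example, not the encoding proposed in #5683 or used by Lucene.

```java
import java.math.BigInteger;
import java.util.Arrays;

final class SortableBigInteger {
    // Encode into exactly `width` bytes so that unsigned byte order
    // matches numeric order.
    static byte[] toSortableBytes(BigInteger value, int width) {
        byte[] twos = value.toByteArray(); // big-endian two's complement, minimal length
        if (twos.length > width) {
            throw new IllegalArgumentException("value does not fit in " + width + " bytes");
        }
        byte[] out = new byte[width];
        // Sign-extend: pad with 0xFF for negative values, 0x00 otherwise.
        byte fill = (byte) (value.signum() < 0 ? 0xFF : 0x00);
        int pad = width - twos.length;
        Arrays.fill(out, 0, pad, fill);
        System.arraycopy(twos, 0, out, pad, twos.length);
        // Flip the sign bit so negatives sort before non-negatives.
        out[0] ^= (byte) 0x80;
        return out;
    }

    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[] minusOne = toSortableBytes(BigInteger.valueOf(-1), 16);
        byte[] zero = toSortableBytes(BigInteger.ZERO, 16);
        byte[] big = toSortableBytes(BigInteger.ONE.shiftLeft(100), 16);
        System.out.println(compareUnsigned(minusOne, zero) < 0);
        System.out.println(compareUnsigned(zero, big) < 0);
    }
}
```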

uschindler · 2015-09-02T05:23:13Z

Regarding #5683: wouldn't converting the IPv6 addresses to BigDecimal be a waste here? Just use the byte[] directly: http://docs.oracle.com/javase/7/docs/api/java/net/Inet6Address.html#getAddress()

All signs are correct from the beginning, because it's just 16 bytes (0-255), highest byte first. So the order is correct by default. Theoretically we can index it as is (if we could use the new AutoPrefix terms), but for NumericRangeTree it should also be simple.
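A small JDK-only sketch of what going through a big-number type costs compared with the raw bytes. `RawBytesVsBigInteger` and its helper names are illustrative; the only library calls are `InetAddress.getAddress()` and the standard `BigInteger` constructors.

```java
import java.math.BigInteger;
import java.net.InetAddress;
import java.net.UnknownHostException;

final class RawBytesVsBigInteger {
    static byte[] raw(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not an IP literal: " + literal, e);
        }
    }

    // Sign of the address when its bytes are read as two's complement.
    static int signedSignum(String literal) {
        return new BigInteger(raw(literal)).signum();
    }

    // Length of the byte[] obtained after a round trip through an
    // unsigned (magnitude) BigInteger; it is not always 16.
    static int roundTripLength(String literal) {
        return new BigInteger(1, raw(literal)).toByteArray().length;
    }

    public static void main(String[] args) {
        // fe80::1 has its top bit set, so the signed reading is negative.
        System.out.println(signedSignum("fe80::1"));
        // The magnitude constructor avoids that, but the round trip
        // no longer yields a fixed 16-byte value.
        System.out.println(roundTripLength("fe80::1"));
        System.out.println(roundTripLength("2001:db8::1"));
    }
}
```

The raw 16 bytes need neither the sign handling nor the re-padding, which is the argument for indexing them directly.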

mikemccand · 2015-09-02T10:08:44Z

@uschindler right, we should go straight to the byte[]! It ought to work very well.

SKumarMN · 2015-10-13T10:05:18Z

@kimchy @jpountz @dadoonet @kzwang

Hi,

I have used the above fix in my 1.4.4 code to support big integers by changing the IPv6 mapper. Search and range queries work fine. Our application needs BigDecimal support too. Could you please give me some pointers on how I can implement BigDecimal support with range functionality as well?

clintongormley · 2016-03-08T14:28:22Z

Closing in favour of #17007

add ipv6 support

ea50b84

jpountz self-assigned this Apr 10, 2014

jeffbryner mentioned this pull request Apr 14, 2014

ElasticSearch 'ip' field doesn't support ipv6 mozilla/MozDef#41

Closed

clintongormley added the stalled label Jul 11, 2014

clintongormley added the :Search Foundations/Mapping Index mappings, including merging and defining field types label Nov 11, 2014

drewr force-pushed the master branch from dcc3da0 to 7c20a8a Compare February 20, 2015 16:48

mikemccand removed the stalled label Sep 1, 2015

SKumarMN mentioned this pull request Oct 19, 2015

BigInteger/BigDecimal support #5683

Closed

clintongormley closed this Mar 8, 2016

asfimport mentioned this pull request Mar 9, 2016

Support for index/search large numeric field [LUCENE-5596] apache/lucene#6658

Closed
