Update template to use ES 5.x mapping #462

Merged 11 commits on Aug 17, 2016
4 changes: 2 additions & 2 deletions .travis.yml
@@ -3,9 +3,9 @@ sudo: false
jdk: oraclejdk8
env:
- INTEGRATION=false
- INTEGRATION=true ES_VERSION=2.2.0
- INTEGRATION=true ES_VERSION=1.7.5
- INTEGRATION=true ES_VERSION=5.0.0-alpha1
- INTEGRATION=true ES_VERSION=2.3.4
- INTEGRATION=true ES_VERSION=5.0.0-alpha4
Contributor

Shouldn't this be alpha5 now?

Contributor

bump version to alpha5?

Contributor Author

Will do

Member

This should be increased to alpha5 since it has been released.

Contributor

Hahahaha. All of us saw the same thing 👍

language: ruby
cache: bundler
rvm:
14 changes: 13 additions & 1 deletion CHANGELOG.md
@@ -1,5 +1,17 @@
## 5.0.0
- Breaking Change: Index template for 5.0 has been changed to reflect Elasticsearch's mapping changes. Most importantly,
the subfield for string multi-fields has changed from `.raw` to `.keyword` to match ES default behavior. ([#386](https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/386))

** Users installing ES 5.x and LS 5.x **
This change will not affect you and you will continue to use the ES defaults.

** Users upgrading from LS 2.x to LS 5.x with ES 5.x **
LS will not force an upgrade of the template if a `logstash` template already exists. This means you will still use
`.raw` for sub-fields coming from 2.x. If you choose to use the new template, you will have to reindex your data after
the new template is installed.
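
The two upgrade cases above can be modelled in a few lines of plain Ruby. This is only a sketch: `install_template?` and `exact_match_subfield` are hypothetical helpers written for illustration, not plugin code. The only name taken from the plugin is the `template_overwrite` setting, which appears in the integration specs in this change.

```ruby
# Hypothetical helpers illustrating the upgrade notes above; not plugin code.

# The template is written when none exists yet, or when the user opts in
# by setting `template_overwrite => true` on the output.
def install_template?(template_exists, template_overwrite)
  !template_exists || template_overwrite
end

# Name of the exact-match sub-field that queries should target.
def exact_match_subfield(field, new_template_installed)
  suffix = new_template_installed ? "keyword" : "raw"
  "#{field}.#{suffix}"
end

# Upgrading from LS 2.x with an existing `logstash` template and default settings:
installed = install_template?(true, false)
puts exact_match_subfield("country", installed)
```

Under these assumptions, an upgraded installation keeps querying `country.raw` until the user opts in to the new template and reindexes, while a fresh installation gets `country.keyword` from the start.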

## 4.1.3
- Relax constraint on logstash-core-plugin-api to >= 1.60 <= 2.99
- Relax constraint on logstash-core-plugin-api to >= 1.60 <= 2.99

## 4.1.2

13 changes: 13 additions & 0 deletions lib/logstash/outputs/elasticsearch.rb
@@ -20,6 +20,19 @@
#
# You can learn more about Elasticsearch at <https://www.elastic.co/products/elasticsearch>
#
# ==== Template management for Elasticsearch 5.x
# Index template for this version (Logstash 5.0) has been changed to reflect Elasticsearch's mapping changes in version 5.0.
# Most importantly, the subfield for string multi-fields has changed from `.raw` to `.keyword` to match ES default
# behavior.
#
# ** Users installing ES 5.x and LS 5.x **
# This change will not affect you and you will continue to use the ES defaults.
#
# ** Users upgrading from LS 2.x to LS 5.x with ES 5.x **
# LS will not force an upgrade of the template if a `logstash` template already exists. This means you will still use
# `.raw` for sub-fields coming from 2.x. If you choose to use the new template, you will have to reindex your data after
# the new template is installed.
#
# ==== Retry Policy
#
# The retry policy has changed significantly in the 2.2.0 release.
@@ -5,13 +5,13 @@
},
"mappings" : {
"_default_" : {
"_all" : {"enabled" : true, "omit_norms" : true},
"_all" : {"enabled" : true, "norms" : false},
"dynamic_templates" : [ {
"message_field" : {
"match" : "message",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"type" : "string", "index" : "analyzed", "norms" : false,
"fielddata" : { "format" : "disabled" }
}
}
@@ -20,24 +20,23 @@
"match" : "*",
"match_mapping_type" : "string",
"mapping" : {
"type" : "string", "index" : "analyzed", "omit_norms" : true,
"fielddata" : { "format" : "disabled" },
"type" : "text", "norms" : false,
"fields" : {
"raw" : {"type": "string", "index" : "not_analyzed", "ignore_above" : 256}
"keyword" : { "type": "keyword" }
Contributor

I thought the idea was to let ES apply its defaults when a string field is dynamically mapped.

Contributor Author

Yes, but we need `"norms" : false` to be set, which ES does not do by default. That's why I had to explicitly add this rule for strings.

Contributor Author

See 2 lines above.

Contributor

ah, I see. sounds good

}
}
}
} ],
"properties" : {
"@timestamp": { "type": "date" },
"@version": { "type": "string", "index": "not_analyzed" },
"@timestamp": { "type": "date", "include_in_all": false },
"@version": { "type": "keyword", "include_in_all": false },
"geoip" : {
"dynamic": true,
"properties" : {
"ip": { "type": "ip" },
"location" : { "type" : "geo_point" },
"latitude" : { "type" : "float" },
"longitude" : { "type" : "float" }
"latitude" : { "type" : "half_float" },
"longitude" : { "type" : "half_float" }
}
}
}
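
Because the old and new lines are interleaved in the hunk above, it may help to see the catch-all string rule as it reads after the change. The Ruby sketch below rebuilds only the keys visible in the diff. The constant `STRING_RULE` and the helper `norms_disabled?` are names invented for this example; the rule's own name in the template is not shown in the hunk and is left out.

```ruby
require "json"

# Catch-all rule for dynamically mapped strings, as it reads after this change.
# Only keys visible in the diff are included; the names below are illustrative.
STRING_RULE = {
  "match" => "*",
  "match_mapping_type" => "string",
  "mapping" => {
    "type" => "text",
    "norms" => false,
    "fields" => {
      "keyword" => { "type" => "keyword" }
    }
  }
}.freeze

# True when the rule explicitly turns norms off, which is the reason
# given in the review thread for keeping an explicit rule at all.
def norms_disabled?(rule)
  rule.dig("mapping", "norms") == false
end

puts JSON.pretty_generate(STRING_RULE)
puts "norms disabled: #{norms_disabled?(STRING_RULE)}"
```

The explicit rule exists only to carry `"norms" : false`; the `text` type with a `keyword` sub-field otherwise follows the ES 5.x default behavior described in the changelog.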
2 changes: 1 addition & 1 deletion logstash-output-elasticsearch.gemspec
@@ -1,7 +1,7 @@
Gem::Specification.new do |s|

s.name = 'logstash-output-elasticsearch'
s.version = '4.1.3'
s.version = '5.0.0'
s.licenses = ['apache-2.0']
s.summary = "Logstash Output to Elasticsearch"
s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
4 changes: 2 additions & 2 deletions spec/integration/outputs/pipeline_spec.rb
@@ -1,6 +1,6 @@
require_relative "../../../spec/es_spec_helper"

describe "Ingest pipeline execution behavior", :integration => true, :version_5x => true do
describe "Ingest pipeline execution behavior", :integration => true, :version_greater_than_5x => true do
Contributor

greater_than? and equal to?

Contributor Author

good call

subject! do
require "logstash/outputs/elasticsearch"
settings = {
@@ -19,7 +19,7 @@
{
"grok": {
"field": "message",
"pattern": "%{COMBINEDAPACHELOG}"
"patterns": ["%{COMBINEDAPACHELOG}"]
}
}
]
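
The hunk above switches the grok processor from a single `pattern` string to a `patterns` array. A minimal Ruby sketch of building such a pipeline body follows; `grok_pipeline_body` is a hypothetical helper and the description text is a placeholder, since the full pipeline definition is not visible in the diff.

```ruby
require "json"

# Builds an ingest pipeline body that uses the array-valued `patterns` key.
# The method name and description text are placeholders for this example.
def grok_pipeline_body(patterns)
  {
    "description" => "placeholder description",
    "processors" => [
      { "grok" => { "field" => "message", "patterns" => Array(patterns) } }
    ]
  }
end

puts JSON.pretty_generate(grok_pipeline_body("%{COMBINEDAPACHELOG}"))
```

Wrapping the argument in `Array(...)` lets callers pass either one pattern or several while always emitting the array form.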
93 changes: 93 additions & 0 deletions spec/integration/outputs/templates_5x_spec.rb
@@ -0,0 +1,93 @@
require_relative "../../../spec/es_spec_helper"

# This file is a copy of the template test for 2.x. We can DRY this up later.
describe "index template expected behavior for 5.x", :integration => true, :version_greater_than_5x => true do
subject! do
require "logstash/outputs/elasticsearch"
settings = {
"manage_template" => true,
"template_overwrite" => true,
"hosts" => "#{get_host_port()}"
}
next LogStash::Outputs::ElasticSearch.new(settings)
end

before :each do
# Delete all templates first.
require "elasticsearch"

# Clean ES of data before we start.
@es = get_client
@es.indices.delete_template(:name => "*")

# This can fail if there are no indexes, ignore failure.
@es.indices.delete(:index => "*") rescue nil

subject.register

subject.multi_receive([
LogStash::Event.new("message" => "sample message here"),
LogStash::Event.new("somevalue" => 100),
LogStash::Event.new("somevalue" => 10),
LogStash::Event.new("somevalue" => 1),
LogStash::Event.new("country" => "us"),
LogStash::Event.new("country" => "at"),
LogStash::Event.new("geoip" => { "location" => [ 0.0, 0.0 ] })
])

@es.indices.refresh

# Wait or fail until everything's indexed.
Stud::try(20.times) do
r = @es.search
insist { r["hits"]["total"] } == 7
end
end

it "permits phrase searching on string fields" do
results = @es.search(:q => "message:\"sample message\"")
insist { results["hits"]["total"] } == 1
insist { results["hits"]["hits"][0]["_source"]["message"] } == "sample message here"
end

it "numbers dynamically map to a numeric type and permit range queries" do
results = @es.search(:q => "somevalue:[5 TO 105]")
insist { results["hits"]["total"] } == 2

values = results["hits"]["hits"].collect { |r| r["_source"]["somevalue"] }
insist { values }.include?(10)
insist { values }.include?(100)
reject { values }.include?(1)
end

it "does not create .keyword field for the message field" do
results = @es.search(:q => "message.keyword:\"sample message here\"")
insist { results["hits"]["total"] } == 0
end

it "creates .keyword field from any string field which is not_analyzed" do
results = @es.search(:q => "country.keyword:\"us\"")
insist { results["hits"]["total"] } == 1
insist { results["hits"]["hits"][0]["_source"]["country"] } == "us"

# Partial terms should not match.
results = @es.search(:q => "country.keyword:\"u\"")
insist { results["hits"]["total"] } == 0
end

it "make [geoip][location] a geo_point" do
expect(@es.indices.get_template(name: "logstash")["logstash"]["mappings"]["_default_"]["properties"]["geoip"]["properties"]["location"]["type"]).to eq("geo_point")
end

it "aggregate .keyword results correctly " do
results = @es.search(:body => { "aggregations" => { "my_agg" => { "terms" => { "field" => "country.keyword" } } } })["aggregations"]["my_agg"]
terms = results["buckets"].collect { |b| b["key"] }

insist { terms }.include?("us")

# 'at' is a stopword, make sure stopwords are not ignored.
insist { terms }.include?("at")
end
end


2 changes: 1 addition & 1 deletion spec/integration/outputs/templates_spec.rb
@@ -1,6 +1,6 @@
require_relative "../../../spec/es_spec_helper"

describe "index template expected behavior", :integration => true do
describe "index template expected behavior", :integration => true, :version_less_than_5x => true do
subject! do
require "logstash/outputs/elasticsearch"
settings = {
2 changes: 1 addition & 1 deletion spec/integration/outputs/update_spec.rb
@@ -1,6 +1,6 @@
require_relative "../../../spec/es_spec_helper"

describe "Update actions", :integration => true, :version_2x_plus => true do
describe "Update actions", :integration => true, :version_greater_than_2x => true do
require "logstash/outputs/elasticsearch"

def get_es_output( options={} )
11 changes: 7 additions & 4 deletions travis-run.sh
@@ -35,15 +35,18 @@ if [[ "$INTEGRATION" != "true" ]]; then
else
if [[ "$ES_VERSION" == 5.* ]]; then
setup_es https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/$ES_VERSION/elasticsearch-$ES_VERSION.tar.gz
start_es -Ees.script.inline=true -Ees.script.indexed=true -Ees.script.file=true
bundle exec rspec -fd spec --tag integration --tag version_5x --tag integration_2x_plus
start_es -Escript.inline=true -Escript.stored=true -Escript.file=true
# Run all tests which are for versions > 5 but don't run ones tagged < 5.x. Include ingest, new template
bundle exec rspec -fd spec --tag integration --tag version_greater_than_5x --tag ~version_less_than_5x
elif [[ "$ES_VERSION" == 2.* ]]; then
setup_es https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-$ES_VERSION.tar.gz
start_es -Des.script.inline=on -Des.script.indexed=on -Des.script.file=on
bundle exec rspec -fd spec --tag integration --tag ~version_5x
# Run all tests which are for versions < 5 but don't run ones tagged 5.x and above. Skip ingest, new template
bundle exec rspec -fd spec --tag integration --tag version_less_than_5x --tag ~version_greater_than_5x
else
setup_es https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-$ES_VERSION.tar.gz
start_es -Des.script.inline=on -Des.script.indexed=on -Des.script.file=on
bundle exec rspec -fd spec --tag integration --tag ~version_5x --tag ~version_2x_plus
# Still have to support ES versions < 2.x so run tests for those.
bundle exec rspec -fd spec --tag integration --tag ~version_greater_than_5x --tag ~version_greater_than_2x
fi
fi
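
The tag selection in `travis-run.sh` can be restated as a small Ruby function, which makes the three version branches easier to compare. This is a sketch only: `rspec_tags_for` and `rspec_args` are hypothetical names, and the tag strings are copied from the new lines of the script above.

```ruby
# Hypothetical Ruby rendering of the tag selection in travis-run.sh.
# A leading "~" excludes examples carrying that tag, as in the script.
def rspec_tags_for(es_version)
  case es_version
  when /\A5\./
    ["integration", "version_greater_than_5x", "~version_less_than_5x"]
  when /\A2\./
    ["integration", "version_less_than_5x", "~version_greater_than_5x"]
  else
    # ES versions below 2.x
    ["integration", "~version_greater_than_5x", "~version_greater_than_2x"]
  end
end

# Expands the tags into the argument list passed to `rspec`.
def rspec_args(es_version)
  rspec_tags_for(es_version).flat_map { |tag| ["--tag", tag] }
end

puts rspec_args("5.0.0-alpha5").join(" ")
```

Keeping the mapping in one place like this would also make it simple to rename the tags if the "greater than or equal to" naming raised in review is adopted.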