Commit d907249

andsel and jsvd authored
Raise the default value of `decode_size_limit_bytes` up to 512 MB (#46)
Every parse of incoming data should be size-limited to avoid OOM. The original 20MB limit may be too low in some circumstances. To avoid generating noise for users who legitimately parse big JSON lines, the limit is raised to 512MB. Updates the default value of the `decode_size_limit_bytes` setting from 20MB to 512MB, and prints a deprecation log to inform the user that the default value will be lowered in a future version.
Co-authored-by: João Duarte <[email protected]>
1 parent aed3db3 commit d907249
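
For reference, the byte values behind the old and new defaults can be worked out as below. This is a minimal sketch that assumes binary megabytes (1024 * 1024 bytes), as the plugin source in this commit does; the constant names `MEGABYTE`, `OLD_DEFAULT_BYTES` and `NEW_DEFAULT_BYTES` are illustrative and do not appear in the plugin.

```ruby
# Illustrative constants, not part of the plugin.
MEGABYTE = 1024 * 1024

OLD_DEFAULT_BYTES = 20 * MEGABYTE    # default before this commit
NEW_DEFAULT_BYTES = 512 * MEGABYTE   # default after this commit

puts OLD_DEFAULT_BYTES   # 20971520
puts NEW_DEFAULT_BYTES   # 536870912
```

The first value is the one suggested in the deprecation message added by this commit (20971520).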

5 files changed

+22
-4
lines changed

CHANGELOG.md

+3
@@ -1,3 +1,6 @@
+## 3.2.1
+- Raise the default value of `decode_size_limit_bytes` up to 512 MB. [#46](https://github.com/logstash-plugins/logstash-codec-json_lines/pull/46)
+
 ## 3.2.0
 - Add decode_size_limit_bytes option to limit the maximum size of each JSON line that can be parsed.[#43](https://github.com/logstash-plugins/logstash-codec-json_lines/pull/43)
 

docs/index.asciidoc

+9
@@ -35,6 +35,7 @@ Therefore this codec cannot work with line oriented inputs.
 |=======================================================================
 |Setting |Input type|Required
| <<plugins-{type}s-{plugin}-charset>> |<<string,string>>, one of `["ASCII-8BIT", "UTF-8", "US-ASCII", "Big5", "Big5-HKSCS", "Big5-UAO", "CP949", "Emacs-Mule", "EUC-JP", "EUC-KR", "EUC-TW", "GB2312", "GB18030", "GBK", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-10", "ISO-8859-11", "ISO-8859-13", "ISO-8859-14", "ISO-8859-15", "ISO-8859-16", "KOI8-R", "KOI8-U", "Shift_JIS", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "Windows-31J", "Windows-1250", "Windows-1251", "Windows-1252", "IBM437", "IBM737", "IBM775", "CP850", "IBM852", "CP852", "IBM855", "CP855", "IBM857", "IBM860", "IBM861", "IBM862", "IBM863", "IBM864", "IBM865", "IBM866", "IBM869", "Windows-1258", "GB1988", "macCentEuro", "macCroatian", "macCyrillic", "macGreek", "macIceland", "macRoman", "macRomania", "macThai", "macTurkish", "macUkraine", "CP950", "CP951", "IBM037", "stateless-ISO-2022-JP", "eucJP-ms", "CP51932", "EUC-JIS-2004", "GB12345", "ISO-2022-JP", "ISO-2022-JP-2", "CP50220", "CP50221", "Windows-1256", "Windows-1253", "Windows-1255", "Windows-1254", "TIS-620", "Windows-874", "Windows-1257", "MacJapanese", "UTF-7", "UTF8-MAC", "UTF-16", "UTF-32", "UTF8-DoCoMo", "SJIS-DoCoMo", "UTF8-KDDI", "SJIS-KDDI", "ISO-2022-JP-KDDI", "stateless-ISO-2022-JP-KDDI", "UTF8-SoftBank", "SJIS-SoftBank", "BINARY", "CP437", "CP737", "CP775", "IBM850", "CP857", "CP860", "CP861", "CP862", "CP863", "CP864", "CP865", "CP866", "CP869", "CP1258", "Big5-HKSCS:2008", "ebcdic-cp-us", "eucJP", "euc-jp-ms", "EUC-JISX0213", "eucKR", "eucTW", "EUC-CN", "eucCN", "CP936", "ISO2022-JP", "ISO2022-JP2", "ISO8859-1", "ISO8859-2", "ISO8859-3", "ISO8859-4", "ISO8859-5", "ISO8859-6", "CP1256", "ISO8859-7", "CP1253", "ISO8859-8", "CP1255", "ISO8859-9", "CP1254", "ISO8859-10", "ISO8859-11", "CP874", "ISO8859-13", "CP1257", "ISO8859-14", "ISO8859-15", "ISO8859-16", "CP878", "MacJapan", "ASCII", "ANSI_X3.4-1968", "646", "CP65000", "CP65001", 
"UTF-8-MAC", "UTF-8-HFS", "UCS-2BE", "UCS-4BE", "UCS-4LE", "CP932", "csWindows31J", "SJIS", "PCK", "CP1250", "CP1251", "CP1252", "external", "locale"]`|No
+| <<plugins-{type}s-{plugin}-decode_size_limit_bytes>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-delimiter>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-ecs_compatibility>> |<<string,string>>|No
 | <<plugins-{type}s-{plugin}-target>> |<<string,string>>|No
@@ -58,6 +59,14 @@ actual encoding of the text and logstash will convert it for you.
 
 For nxlog users, you'll want to set this to `CP1252`
 
+[id="plugins-{type}s-{plugin}-decode_size_limit_bytes"]
+===== `decode_size_limit_bytes`
+
+* Value type is <<string,string>>
+* Default value is 512 MB
+
+Maximum number of bytes for a single line before stop processing.
+
 [id="plugins-{type}s-{plugin}-delimiter"]
 ===== `delimiter`
 

lib/logstash/codecs/json_lines.rb

+8-3
@@ -28,6 +28,8 @@ class LogStash::Codecs::JSONLines < LogStash::Codecs::Base
 
   config_name "json_lines"
 
+  DEFAULT_DECODE_SIZE_LIMIT_BYTES = 512 * (1024 * 1024)
+
   # The character encoding used in this codec. Examples include `UTF-8` and
   # `CP1252`
   #
@@ -43,9 +45,9 @@ class LogStash::Codecs::JSONLines < LogStash::Codecs::Base
   config :delimiter, :validate => :string, :default => "\n"
 
   # Maximum number of bytes for a single line before a fatal exception is raised
-  # which will stop Logsash.
-  # The default is 20MB which is quite large for a JSON document
-  config :decode_size_limit_bytes, :validate => :number, :default => 20 * (1024 * 1024) # 20MB
+  # which will stop Logstash.
+  # The default is 512MB which is quite large for a JSON document
+  config :decode_size_limit_bytes, :validate => :number, :default => DEFAULT_DECODE_SIZE_LIMIT_BYTES # 512MB
 
   # Defines a target field for placing decoded fields.
   # If this setting is omitted, data gets stored at the root (top level) of the event.
@@ -55,6 +57,9 @@ class LogStash::Codecs::JSONLines < LogStash::Codecs::Base
   public
 
   def register
+    if decode_size_limit_bytes == DEFAULT_DECODE_SIZE_LIMIT_BYTES
+      deprecation_logger.deprecated "The default value for `decode_size_limit_bytes`, currently at 512Mb, will be lowered in a future version to prevent Out of Memory errors from abnormally large messages or missing delimiters. Please set a value that reflects the largest expected message size (e.g. 20971520 for 20Mb)"
+    end
     @buffer = FileWatch::BufferedTokenizer.new(@delimiter, @decode_size_limit_bytes)
     @converter = LogStash::Util::Charset.new(@charset)
     @converter.logger = @logger
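
The check added to `register` compares the effective value against the constant, so it cannot tell an unset option from one explicitly set to 536870912. The sketch below isolates that comparison in plain Ruby. The helper `default_limit_notice` and its message text are hypothetical stand-ins for the `deprecation_logger.deprecated` call and are not part of the plugin.

```ruby
# Same value as the constant introduced in the diff: 512 binary megabytes.
DEFAULT_DECODE_SIZE_LIMIT_BYTES = 512 * (1024 * 1024)

# Hypothetical helper (not part of the plugin): mirrors the comparison in
# `register` and returns a notice string, or nil when no notice is due.
def default_limit_notice(decode_size_limit_bytes)
  return nil unless decode_size_limit_bytes == DEFAULT_DECODE_SIZE_LIMIT_BYTES

  "decode_size_limit_bytes is at its default of " \
    "#{DEFAULT_DECODE_SIZE_LIMIT_BYTES} bytes; consider setting it to the " \
    "largest expected message size"
end

notice_at_default = default_limit_notice(512 * 1024 * 1024)  # a String
notice_at_20mb    = default_limit_notice(20 * 1024 * 1024)   # nil

puts notice_at_default.inspect
puts notice_at_20mb.inspect
```

Under this comparison, any explicit value other than the default, including a larger one, suppresses the notice.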

logstash-codec-json_lines.gemspec

+1-1
@@ -1,7 +1,7 @@
 Gem::Specification.new do |s|
 
   s.name = 'logstash-codec-json_lines'
-  s.version = '3.2.0'
+  s.version = '3.2.1'
   s.licenses = ['Apache License (2.0)']
   s.summary = "Reads and writes newline-delimited JSON"
   s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"

spec/codecs/json_lines_spec.rb

+1
@@ -119,6 +119,7 @@
   end
 
   describe "decode_size_limits_bytes" do
+    let(:codec_options) { { "decode_size_limit_bytes" => 20 * 1024 * 1024 } } # lower the default to avoid OOM errors in tests
     let(:maximum_payload) { "a" * subject.decode_size_limit_bytes }
 
     it "should not raise an error if the number of bytes is not exceeded" do
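
To illustrate why a per-line byte limit bounds memory use, here is a minimal sketch of a size-limited line tokenizer. It is an assumption-laden stand-in, not the implementation of `FileWatch::BufferedTokenizer`: the class name `SizeLimitedTokenizer`, the error message, and the choice to check only the unterminated remainder are all hypothetical.

```ruby
# Illustrative only: a hypothetical tokenizer, NOT FileWatch::BufferedTokenizer.
class SizeLimitedTokenizer
  def initialize(delimiter = "\n", size_limit = nil)
    @delimiter = delimiter
    @size_limit = size_limit
    @buffer = +""   # unfrozen, mutable empty string
  end

  # Appends data to the buffer and returns the complete lines found so far.
  # Raises when the unterminated remainder grows past the limit, which is
  # what happens when a delimiter never arrives.
  def extract(data)
    @buffer << data
    lines = @buffer.split(@delimiter, -1)   # -1 keeps a trailing empty piece
    @buffer = lines.pop || +""              # last piece is still incomplete
    if @size_limit && @buffer.bytesize > @size_limit
      raise "input buffer full"
    end
    lines
  end
end

tokenizer = SizeLimitedTokenizer.new("\n", 8)
first  = tokenizer.extract("ab\ncd")   # "cd" stays buffered
second = tokenizer.extract("ef\n")     # completes the buffered line
puts first.inspect    # ["ab"]
puts second.inspect   # ["cdef"]
```

With a 512MB limit, a payload at the maximum size must exist in memory at least once, which is consistent with the spec lowering the limit to 20MB to avoid OOM errors in tests.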
