#270 Support periodic manual commits #275
@@ -212,7 +212,7 @@ class LogStash::Inputs::Kafka < LogStash::Inputs::Base
   # `key`: A ByteBuffer containing the message key
   # `timestamp`: The timestamp of this message
   config :decorate_events, :validate => :boolean, :default => false

+  config :manual_commit_interval_ms, :validate => :string

   public
   def register
@@ -221,6 +221,7 @@ def register

   public
   def run(logstash_queue)
+    @manual_commit_interval_ms = manual_commit_interval_ms.to_i
     @runner_consumers = consumer_threads.times.map { |i| create_consumer("#{client_id}-#{i}") }
     @runner_threads = @runner_consumers.map { |consumer| thread_runner(logstash_queue, consumer) }
     @runner_threads.each { |t| t.join }
@@ -247,6 +248,7 @@ def thread_runner(logstash_queue, consumer)
         else
           consumer.subscribe(topics);
         end
+        last_commit_time = timestamp_ms
         codec_instance = @codec.clone
         while !stop?
           records = consumer.poll(poll_timeout_ms)
@@ -266,8 +268,9 @@ def thread_runner(logstash_queue, consumer)
             end
           end
           # Manual offset commit
-          if @enable_auto_commit == "false"
+          if has_to_commit?(last_commit_time)
             consumer.commitSync
+            last_commit_time = timestamp_ms
           end
         end
       rescue org.apache.kafka.common.errors.WakeupException => e
@@ -354,4 +357,16 @@ def set_sasl_config(props)

     props.put("sasl.kerberos.service.name",sasl_kerberos_service_name) unless sasl_kerberos_service_name.nil?
   end
+
+  def timestamp_ms
+    (Time.now.to_f * 1000).to_i
+  end
+
+  def has_to_commit?(last_commit_time)
+    # If auto_commit is enable we just leave the commit to the client library on poll and close actions
+    return false if @enable_auto_commit == "false"
this should be
+
+    # If auto_commit is disable, we need to commit, we will do it depending on the manual_commit_interval option
+    @manual_commit_interval_ms <= 0 || (last_commit_time + @manual_commit_interval_ms) < timestamp_ms
for clarity, we can change the big conditional into two operations:
def has_to_commit?(last_commit_time)
  # auto_commit is enabled so we just leave the commit to the client library on poll and close actions
  return false if @enable_auto_commit == "true"
  # auto_commit is disabled but interval committing is disabled as well, so commit on every poll
  return true if @manual_commit_interval_ms <= 0
  # auto_commit is disabled and an interval is set, so let's check if enough time passed since last commit
  (last_commit_time + @manual_commit_interval_ms) < current_timestamp_ms
end
+  end
 end #class LogStash::Inputs::Kafka
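The commit decision discussed in the inline comments above can be sketched as a standalone function. This is only an illustration, not the plugin's code: the four-argument signature (with the auto-commit flag, interval, and clock values passed in explicitly) is an assumption made so the logic can be exercised without a Logstash instance. It uses the `"true"` comparison that the review asks for.

```ruby
# Hypothetical standalone version of the helper from the diff.
# All state is passed as arguments (an assumption for this sketch);
# the plugin itself reads instance variables and calls timestamp_ms.
def has_to_commit?(enable_auto_commit, interval_ms, last_commit_ms, now_ms)
  # Auto commit enabled: leave committing to the client library.
  return false if enable_auto_commit == "true"
  # No interval configured: commit after every poll.
  return true if interval_ms <= 0
  # Interval configured: commit only once it has elapsed.
  (last_commit_ms + interval_ms) < now_ms
end

# In Ruby, nil.to_i is 0, so an option left unset and converted
# with to_i falls into the "commit after every poll" branch.
unset_interval = nil.to_i
```

For example, `has_to_commit?("false", 5000, 1000, 6001)` is true because more than 5000 ms have passed since the last commit, while the same call with `now_ms` of 5000 is false.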
by not committing anymore on all `poll` operations we now have two issues that must be addressed:
1. […] `has_to_commit?` returns true and then no other events arrive, we'll never commit the offset because we have a guard at the start of the loop to skip if no records are returned from `poll`.
2. […] `stop` operation doesn't do it explicitly. Currently it relies on either commit per poll or auto commit.

True, I only took into account my case, in which I have a pretty stable flow.
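
One way the two issues raised in this thread could be handled is sketched below. It is a simplified model, not the plugin's implementation: `FakeConsumer`, `run_loop`, the injected `clock`, and the `uncommitted` flag are all hypothetical names introduced for illustration. The idea is to check the interval on every loop iteration, including iterations where `poll` returns nothing, and to commit once more when the loop exits.

```ruby
# Hypothetical stand-in for the Kafka consumer: returns canned batches
# from poll and counts how often commitSync is called.
class FakeConsumer
  attr_reader :commits

  def initialize(batches)
    @batches = batches
    @commits = 0
  end

  def poll(_timeout_ms)
    @batches.shift || []
  end

  def commitSync
    @commits += 1
  end
end

# Runs a fixed number of polls with an injected clock (a callable
# returning milliseconds), so the timing can be tested without sleeping.
def run_loop(consumer, interval_ms, clock, polls)
  last_commit = clock.call
  uncommitted = false
  polls.times do
    records = consumer.poll(100)
    uncommitted = true unless records.empty?
    # Check the interval even when poll returned no records, so a
    # trailing batch is still committed once the interval elapses.
    if uncommitted && (interval_ms <= 0 || last_commit + interval_ms < clock.call)
      consumer.commitSync
      last_commit = clock.call
      uncommitted = false
    end
  end
ensure
  # Final commit on shutdown for anything processed since the last commit.
  consumer.commitSync if uncommitted
end
```

The `uncommitted` flag avoids issuing commits while the topic is idle; whether that optimization is wanted in the plugin is a design choice this sketch does not settle.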