-
Notifications
You must be signed in to change notification settings - Fork 100
Add better support for file rotation #192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I am not sure about Windows and io.stat. I need to test. |
@colinsurprenant |
lib/filewatch/discoverer.rb
Outdated
new_discovery = true | ||
watched_file = WatchedFile.new(pathname, pathname.stat, @settings) | ||
logger.debug("Discoverer new discovery: #{watched_file.path}, inode: #{watched_file.sincedb_key.inode}, discover_files: #{path} (exclude is #{@exclude.inspect})") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: you are using the structured logs notations in most places, maybe here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started to change the logging to structured but its a PR all of its own.
I just did a shallow review and changes LGTM. I haven't run the tests nor tried to more deeply analyse because there are lots of changes and complexity is high. I will trust your testing and spec coverage. |
@guyboertje ah damn ... I did not see your last comment before starting. oh well. |
After some investigation in GCP on a Windows Server 2008 box I now know:
|
to better handle when a file is rotated that has never been active. Change the way @previous_inode_size is cached, it needs to be set only when an inode change is detected. Change discover associate behaviour if the matched watched_file is open. Add spec for when a file is rotated that has never been active. Alter specs because we can rely on ordering with "file_sort_by"
especially in the case where the rotated files are discoverable - in this case we were not the moving the state and the sincedb record correctly. Next - abstract the *nix and Windows stat calls and structure into two classes with the same API.
Move all tail processing state iteration into its own method Clean up rspec sequencing usage. Remove all Mutex use in favour of AtomicBoolean, AtomicArray, AtomicHash Add wait for completely_stopped before removing sincedb in file_tail spec
Arrrrrggghhhhhhh!!!! @£$%$%$ So I backed out the IO stat code. Now travis is green, biggest yay evar! Now to run the tests on Windows to make sure the windows execution path is not broken. @colinsurprenant Nearly there. |
@colinsurprenant |
ENV LOGSTASH_SOURCE 1 | ||
RUN jruby -S gem install bundler | ||
RUN jruby -S bundle install --jobs=3 --retry=3 | ||
RUN jruby -S bundle exec rake vendor |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we be using Gradle ? also why is this needed here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copied from @jsvd work after the jruby S3 bucket went AWOL on the weekend.
Behind the scenes it does use gradle so I suppose we could run gradle here.
I don't want to touch it now, it works, we can circle back after confirming with Joao about this. It is a pattern we need to apply to all (most) plugins - probably scripted.
Step 11/11 : RUN jruby -S bundle exec rake vendor
---> Running in a4c0913880da
Downloading https://services.gradle.org/distributions/gradle-4.5.1-bin.zip
.....................................................................
Unzipping /usr/share/logstash/.gradle/wrapper/dists/gradle-4.5.1-bin/a5vbgfvpwtoqz8v2cdivxz28k/gradle-4.5.1-bin.zip to /usr/share/logstash/.gradle/wrapper/dists/gradle-4.5.1-bin/a5vbgfvpwtoqz8v2cdivxz28k
Set executable permissions for: /usr/share/logstash/.gradle/wrapper/dists/gradle-4.5.1-bin/a5vbgfvpwtoqz8v2cdivxz28k/gradle-4.5.1/bin/gradle
Starting a Gradle Daemon (subsequent builds will be faster)
:cleanGemjar UP-TO-DATE
:clean UP-TO-DATE
:compileJava
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-complete/9.1.13.0/jruby-complete-9.1.13.0.pom
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-artifacts/9.1.13.0/jruby-artifacts-9.1.13.0.pom
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-parent/9.1.13.0/jruby-parent-9.1.13.0.pom
Download https://repo.maven.apache.org/maven2/org/sonatype/oss/oss-parent/7/oss-parent-7.pom
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-complete/9.1.13.0/jruby-complete-9.1.13.0.jar
:processResources NO-SOURCE
:classes
:jar
:sourcesJar
:copyGemjar
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I see.
I just created elastic/logstash/issues/9823 - we can move that discussion over there (or open a specific issue about it and link it there)
lib/filewatch/bootstrap.rb
Outdated
require "java" | ||
require_relative "../../lib/jars/filewatch-#{jar_version}.jar" | ||
fullpath = gem_root_dir.join("lib/jars/filewatch-#{jar_version}.jar").expand_path.to_path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't we rely on jar-dependencies to load this jar file? I believe this is what filter-date does to load its own compiles jar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will look but TBH jar-dependencies is so opaque/obscure to me that I think it is fine for external deps from maven but I trust the current mechanism more.
@@ -0,0 +1,8 @@ | |||
# encoding: utf-8 | |||
|
|||
ENV["LOG_AT"].tap do |level| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this documented somewhere so that future devs know about this if useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really. I suppose it should be in devutils but until now I have been reluctant to make changes in that direction.
It is useful in this plugin with --fail-fast and trace logging to see what is going on. Before, the tests might fail in the suite but not when run individually.
I am not sure how useful it is to other plugins that don't have extensive trace logging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure - please don't touch devutils if only for getting rid of it. What I was saying is that since this could be useful to other developers that will work on this plugin and will need to run tests, it could be documented somewhere like a local CONTRIBUTING file or unsure where else but the idea is that if it can be useful to other, they must know about it in some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand now. I will add something when building the final changelog PR.
Same disclaimer: for the exception of a few minor comments, given the amount and complexity of the changes this is shallow review LGTM and I will rely on @guyboertje extensive testing. |
Ok. I'm going to squash merge this. |
Great news @guyboertje ! |
TIL The difference in Ruby between
IO#stat
andIO.stat
, former takes a stat on the io (file) object and the latter takes a stat on a path string (as doesFile::Stat.new
and Pathname#stat).When an open file is rotated the watched_file
path
may be pointing at a different inode with new content, and if so, the opened file is still pointing to the original inode and content. We have usedIO.stat
orFile::Stat.new
(since 2012) on the path regardless of whether the file is open or closed - this is not ideal.This PR adds the ability to do path based stat on closed files and path plus io based stat on open files. By incrementing
pending_inode_count
when we detect inode changes between the two kinds of stat calls and we also cache the previous size of the old content and the new content.When the inode is different between the two stat structures we know that the file is rotated. When we detect this we can fully read the old content as we still have an open file handle. At the end of the full read we close the file io and reset the watched_file and sincedb.
NOTE: we do detect inode changes for watched_files that are not
active
(i.e. closed, ignored, watched) but we can't know the renamed path of the old content in order to read any content we may not have processed while the watched_file was notactive
. A WARN message is logged in this case. In practice this only likely to occur on notactive
files when tailing more than 4095 (ormax_open_files
if set) files AND the file is rotated before active files can be put into theclosed
state - files become eligible for close when the last read was more thanclose_older
ago AND the file is fully read to the last stat size.Handle race conditions better:
stat
is taken bang in the middle of the rotation rename cascade,f -> f.1
|stat
|tmp -> f
path
glob pattern- rare).Most of this code has been tested by a discuss user who had problems with rotations.
Fixes #188