Add better support for file rotation #192

Merged
merged 30 commits into from
Jul 5, 2018
Commits
7622fc7
Add better support for file rotation
May 14, 2018
257577f
Move common_restat for watched and active to one iteration
May 24, 2018
60ab865
Ensure single file `path` option is ok, reverts earlier change in thi…
May 25, 2018
fb48cc5
Improve and expand our use of FFI on Windows
Jun 3, 2018
f0720e1
jnr and ffi interop, yay
Jun 4, 2018
f25c2b1
put self back in
Jun 4, 2018
fb4ae28
use fieldId label
Jun 5, 2018
2c846d3
handle rotations better.
Jun 19, 2018
f2b8555
fixes for rebase from master
Jun 19, 2018
fe96100
Abstract Stat part 1
Jun 20, 2018
d127aa0
Abstract Stat part 2
Jun 20, 2018
58a2518
Finally have (all) the kinks worked out
Jun 21, 2018
124e318
Try to fix travis failures.
Jun 22, 2018
05498b4
Try to fix travis failures 2
Jun 24, 2018
487c641
Try fix travis 3
Jun 24, 2018
167eecf
Try fix travis 4
Jun 24, 2018
6c82057
Try fix travis 5
Jun 24, 2018
168f451
Try fix travis 6
Jun 27, 2018
87b4e8d
Remove io based stat reliance. travis jruby 1.7.27 should pass now.
Jun 29, 2018
ab409ed
Some windows fixes
Jun 30, 2018
a176a35
more windows fixes
Jun 30, 2018
62dccf8
windows changes 2
Jun 30, 2018
8504125
rename rspec run tag in ci/build.sh
Jul 1, 2018
6e36c89
move one trace logging line
Jul 2, 2018
ff3bb5a
add first run discovery methods
Jul 2, 2018
ac62656
fix regression on files seen after inital run, travis 2 use docker.
Jul 2, 2018
257687e
add execute permissions
Jul 2, 2018
ee0855b
fix path ordering travis failures
Jul 2, 2018
a86893e
fix jar loading so it works for tests and when installed in LS
Jul 4, 2018
242bad3
reorder the jar require statements
Jul 5, 2018
26 changes: 12 additions & 14 deletions .travis.yml
@@ -1,18 +1,16 @@
---
sudo: false
language: ruby
cache: bundler
sudo: required
services: docker
addons:
apt:
packages:
- docker-ce
matrix:
include:
- rvm: jruby-9.1.13.0
env: LOGSTASH_BRANCH=master
- rvm: jruby-9.1.13.0
env: LOGSTASH_BRANCH=6.x
- rvm: jruby-9.1.13.0
env: LOGSTASH_BRANCH=6.0
- rvm: jruby-1.7.27
env: LOGSTASH_BRANCH=5.6
- env: ELASTIC_STACK_VERSION=5.6.10
- env: ELASTIC_STACK_VERSION=6.3.0
- env: ELASTIC_STACK_VERSION=6.4.0-SNAPSHOT
- env: ELASTIC_STACK_VERSION=7.0.0-alpha1-SNAPSHOT
fast_finish: true
install: true
script: ci/build.sh
jdk: oraclejdk8
install: ci/unit/docker-setup.sh
script: ci/unit/docker-run.sh
2 changes: 1 addition & 1 deletion JAR_VERSION
@@ -1 +1 @@
1.0.0
1.0.1
3 changes: 0 additions & 3 deletions README.md
@@ -2,9 +2,6 @@
Travis Build
[![Travis Build Status](https://travis-ci.org/logstash-plugins/logstash-input-file.svg)](https://travis-ci.org/logstash-plugins/logstash-input-file)

Jenkins Build
[![Travis Build Status](https://travis-ci.org/logstash-plugins/logstash-input-file.svg)](https://travis-ci.org/logstash-plugins/logstash-input-file)

This is a plugin for [Logstash](https://github.com/elastic/logstash).

It is fully free and fully open source. The license is Apache 2.0, meaning you are pretty much free to use it however you want in whatever way.
21 changes: 0 additions & 21 deletions ci/build.sh

This file was deleted.

26 changes: 0 additions & 26 deletions ci/setup.sh

This file was deleted.

11 changes: 11 additions & 0 deletions ci/unit/Dockerfile
@@ -0,0 +1,11 @@
ARG ELASTIC_STACK_VERSION
FROM docker.elastic.co/logstash/logstash:$ELASTIC_STACK_VERSION
WORKDIR /usr/share/logstash/logstash-core
RUN cp versions-gem-copy.yml ../logstash-core-plugin-api/versions-gem-copy.yml
COPY --chown=logstash:logstash . /usr/share/plugins/this
WORKDIR /usr/share/plugins/this
ENV PATH=/usr/share/logstash/vendor/jruby/bin:${PATH}
ENV LOGSTASH_SOURCE 1
RUN jruby -S gem install bundler
RUN jruby -S bundle install --jobs=3 --retry=3
RUN jruby -S bundle exec rake vendor
Contributor

Shouldn't we be using Gradle? Also, why is this needed here?

Contributor Author

This is copied from @jsvd's work after the jruby S3 bucket went AWOL over the weekend.
Behind the scenes it does use Gradle, so I suppose we could run Gradle here directly.
I don't want to touch it now since it works; we can circle back after confirming with Joao. It is a pattern we need to apply to all (most) plugins, probably scripted.

Step 11/11 : RUN jruby -S bundle exec rake vendor
 ---> Running in a4c0913880da
Downloading https://services.gradle.org/distributions/gradle-4.5.1-bin.zip
.....................................................................
Unzipping /usr/share/logstash/.gradle/wrapper/dists/gradle-4.5.1-bin/a5vbgfvpwtoqz8v2cdivxz28k/gradle-4.5.1-bin.zip to /usr/share/logstash/.gradle/wrapper/dists/gradle-4.5.1-bin/a5vbgfvpwtoqz8v2cdivxz28k
Set executable permissions for: /usr/share/logstash/.gradle/wrapper/dists/gradle-4.5.1-bin/a5vbgfvpwtoqz8v2cdivxz28k/gradle-4.5.1/bin/gradle
Starting a Gradle Daemon (subsequent builds will be faster)
:cleanGemjar UP-TO-DATE
:clean UP-TO-DATE
:compileJava
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-complete/9.1.13.0/jruby-complete-9.1.13.0.pom
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-artifacts/9.1.13.0/jruby-artifacts-9.1.13.0.pom
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-parent/9.1.13.0/jruby-parent-9.1.13.0.pom
Download https://repo.maven.apache.org/maven2/org/sonatype/oss/oss-parent/7/oss-parent-7.pom
Download https://repo.maven.apache.org/maven2/org/jruby/jruby-complete/9.1.13.0/jruby-complete-9.1.13.0.jar
:processResources NO-SOURCE
:classes
:jar
:sourcesJar
:copyGemjar

Contributor

OK, I see.
I just created elastic/logstash/issues/9823; we can move that discussion over there (or open a specific issue about it and link it there).

17 changes: 17 additions & 0 deletions ci/unit/docker-compose.yml
@@ -0,0 +1,17 @@
version: '3'

# run tests: docker-compose -f ci/unit/docker-compose.yml up --build --force-recreate
# only set up: docker-compose -f ci/unit/docker-compose.yml up --build --no-start --force-recreate
# start manually: docker-compose -f ci/unit/docker-compose.yml run logstash
services:
logstash:
build:
context: ../../
dockerfile: ci/unit/Dockerfile
args:
- ELASTIC_STACK_VERSION=$ELASTIC_STACK_VERSION
command: /usr/share/plugins/this/ci/unit/run.sh
environment:
LS_JAVA_OPTS: "-Xmx256m -Xms256m"
OSS: "true"
tty: true
7 changes: 7 additions & 0 deletions ci/unit/docker-run.sh
@@ -0,0 +1,7 @@
#!/bin/bash

# This is intended to be run from the plugin's root directory: `ci/unit/docker-run.sh`
# Ensure you have Docker installed locally and set the ELASTIC_STACK_VERSION environment variable.
set -e

docker-compose -f ci/unit/docker-compose.yml run logstash
31 changes: 31 additions & 0 deletions ci/unit/docker-setup.sh
@@ -0,0 +1,31 @@
#!/bin/bash

# This is intended to be run from the plugin's root directory: `ci/unit/docker-setup.sh`
# Ensure you have Docker installed locally and set the ELASTIC_STACK_VERSION environment variable.
set -e

if [ "$ELASTIC_STACK_VERSION" ]; then
echo "Testing against version: $ELASTIC_STACK_VERSION"

if [[ "$ELASTIC_STACK_VERSION" = *"-SNAPSHOT" ]]; then
cd /tmp
wget https://snapshots.elastic.co/docker/logstash-"$ELASTIC_STACK_VERSION".tar.gz
tar xfvz logstash-"$ELASTIC_STACK_VERSION".tar.gz repositories
echo "Loading docker image: "
cat repositories
docker load < logstash-"$ELASTIC_STACK_VERSION".tar.gz
rm logstash-"$ELASTIC_STACK_VERSION".tar.gz
cd -
fi

if [ -f Gemfile.lock ]; then
rm Gemfile.lock
fi

docker-compose -f ci/unit/docker-compose.yml down
docker-compose -f ci/unit/docker-compose.yml up --no-start --build --force-recreate logstash
else
echo "Please set the ELASTIC_STACK_VERSION environment variable"
echo "For example: export ELASTIC_STACK_VERSION=6.2.4"
exit 1
fi
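
The snapshot branch in `docker-setup.sh` hinges on a suffix test and a download URL built from the version string. A minimal Ruby sketch of that decision follows; the method name `snapshot_image_url` and the constant `SNAPSHOT_BASE` are hypothetical, and only the URL shape is taken from the script above.

```ruby
# Base URL copied from the wget line in docker-setup.sh.
SNAPSHOT_BASE = "https://snapshots.elastic.co/docker"

# Hypothetical helper: returns the tarball URL for a snapshot version,
# or nil for a released version (which is pulled from the registry instead).
def snapshot_image_url(version)
  # Mirrors the bash test [[ "$ELASTIC_STACK_VERSION" = *"-SNAPSHOT" ]]
  return nil unless version.end_with?("-SNAPSHOT")
  "#{SNAPSHOT_BASE}/logstash-#{version}.tar.gz"
end

release_url = snapshot_image_url("6.3.0")
snapshot_url = snapshot_image_url("6.4.0-SNAPSHOT")
puts snapshot_url
```

Only versions ending in `-SNAPSHOT` need the manual `docker load`; the other matrix entries resolve through the `FROM` line in the Dockerfile.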
6 changes: 6 additions & 0 deletions ci/unit/run.sh
@@ -0,0 +1,6 @@
#!/bin/bash

# This is intended to be run inside the Docker container as the `command` of the docker-compose service.
set -ex

bundle exec rspec -fd --pattern spec/**/*_spec.rb,spec/**/*_specs.rb
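
To see which files the two globs passed to `--pattern` would select, the sketch below runs them through `Dir.glob` against a throwaway directory. The file names are hypothetical, and it assumes RSpec treats the comma-separated value as two separate globs.

```ruby
require "tmpdir"
require "fileutils"

# The two globs from ci/unit/run.sh, split on the comma.
PATTERNS = ["spec/**/*_spec.rb", "spec/**/*_specs.rb"].freeze

# Hypothetical layout, for illustration only.
FILES = %w[
  spec/file_spec.rb
  spec/filewatch/tailing_spec.rb
  spec/helpers/spec_helper.rb
  spec/inputs/file_specs.rb
].freeze

matched = Dir.mktmpdir do |root|
  FILES.each do |relative|
    full = File.join(root, relative)
    FileUtils.mkdir_p(File.dirname(full))
    File.write(full, "")
  end
  # "**/" matches zero or more directories, so top-level specs are included.
  Dir.glob(PATTERNS, base: root).sort
end

puts matched
```

`spec/helpers/spec_helper.rb` is skipped because neither glob matches a name that merely starts with `spec_`.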
30 changes: 9 additions & 21 deletions lib/filewatch/bootstrap.rb
@@ -1,7 +1,5 @@
# encoding: utf-8
require "rbconfig"
require "pathname"
# require "logstash/environment"

## Common setup
# all the required constants and files
@@ -13,36 +11,26 @@ module FileWatch
# this is used in the read loop e.g.
# @opts[:file_chunk_count].times do
# where file_chunk_count defaults to this constant
FIXNUM_MAX = (2**(0.size * 8 - 2) - 1)
MAX_ITERATIONS = (2**(0.size * 8 - 2) - 2) / 32768

require_relative "helper"

module WindowsInode
def prepare_inode(path, stat)
fileId = Winhelper.GetWindowsUniqueFileIdentifier(path)
[fileId, 0, 0] # dev_* doesn't make sense on Windows
end
end

module UnixInode
def prepare_inode(path, stat)
[stat.ino.to_s, stat.dev_major, stat.dev_minor]
end
end

jar_version = Pathname.new(__FILE__).dirname.join("../../JAR_VERSION").realpath.read.strip

gem_root_dir = Pathname.new(__FILE__).dirname.join("../../").realpath
jar_version = gem_root_dir.join("JAR_VERSION").read.strip
fullpath = gem_root_dir.join("lib/jars/filewatch-#{jar_version}.jar").expand_path.to_path
require "java"
require_relative "../../lib/jars/filewatch-#{jar_version}.jar"
require fullpath
require "jruby_file_watch"

if LogStash::Environment.windows?
require_relative "winhelper"
require_relative "stat/windows_path"
PathStatClass = Stat::WindowsPath
FileOpener = FileExt
InodeMixin = WindowsInode
else
require_relative "stat/generic"
PathStatClass = Stat::Generic
FileOpener = ::File
InodeMixin = UnixInode
end

# Structs can be used as hash keys because they compare by value
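
The closing comment about Structs as hash keys can be illustrated in a few lines. The struct name `InodeKey` and its field names are hypothetical; they only echo the `[inode, major, minor]` triple that `prepare_inode` returned before this change.

```ruby
# Hypothetical struct, for illustration only.
InodeKey = Struct.new(:inode, :dev_major, :dev_minor)

positions = {}
positions[InodeKey.new("4242", 8, 1)] = 1024

# A separate object with the same field values finds the same entry,
# because Struct defines eql? and hash in terms of its members.
same_file = InodeKey.new("4242", 8, 1)
other_file = InodeKey.new("4243", 8, 1)

found = positions[same_file]
missing = positions[other_file]
puts found.inspect
puts missing.inspect
```

A plain object without value-based `eql?` and `hash` would miss here, since two instances with identical fields would still be distinct keys.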
63 changes: 35 additions & 28 deletions lib/filewatch/discoverer.rb
@@ -10,8 +10,8 @@ class Discoverer
include LogStash::Util::Loggable

def initialize(watched_files_collection, sincedb_collection, settings)
@watching = []
@exclude = []
@watching = Concurrent::Array.new
@exclude = Concurrent::Array.new
@watched_files_collection = watched_files_collection
@sincedb_collection = sincedb_collection
@settings = settings
@@ -21,13 +21,13 @@ def initialize(watched_files_collection, sincedb_collection, settings)
def add_path(path)
return if @watching.member?(path)
@watching << path
discover_files(path)
discover_files_new_path(path)
self
end

def discover
@watching.each do |path|
discover_files(path)
discover_files_ongoing(path)
end
end

@@ -37,7 +37,7 @@ def can_exclude?(watched_file, new_discovery)
@exclude.each do |pattern|
if watched_file.pathname.fnmatch?(pattern)
if new_discovery
logger.debug("Discoverer can_exclude?: #{watched_file.path}: skipping " +
logger.trace("Discoverer can_exclude?: #{watched_file.path}: skipping " +
"because it matches exclude #{pattern}")
end
watched_file.unwatch
@@ -47,45 +47,52 @@ def can_exclude?(watched_file, new_discovery)
false
end

def discover_files(path)
globbed = Dir.glob(path)
globbed = [path] if globbed.empty?
logger.debug("Discoverer found files, count: #{globbed.size}")
globbed.each do |file|
logger.debug("Discoverer found file, path: #{file}")
def discover_files_new_path(path)
discover_any_files(path, false)
end

def discover_files_ongoing(path)
discover_any_files(path, true)
end

def discover_any_files(path, ongoing)
fileset = Dir.glob(path).select{|f| File.file?(f) && !File.symlink?(f)}
logger.trace("discover_files", "count" => fileset.size)
fileset.each do |file|
pathname = Pathname.new(file)
next unless pathname.file?
next if pathname.symlink?
new_discovery = false
watched_file = @watched_files_collection.watched_file_by_path(file)
if watched_file.nil?
logger.debug("Discoverer discover_files: #{path}: new: #{file} (exclude is #{@exclude.inspect})")
new_discovery = true
watched_file = WatchedFile.new(pathname, pathname.stat, @settings)
watched_file = WatchedFile.new(pathname, PathStatClass.new(pathname), @settings)
end
# if it is already unwatched or it's excluded then we can skip
next if watched_file.unwatched? || can_exclude?(watched_file, new_discovery)

logger.trace("discover_files handling:", "new discovery"=> new_discovery, "watched_file details" => watched_file.details)

if new_discovery
if watched_file.file_ignorable?
logger.debug("Discoverer discover_files: #{file}: skipping because it was last modified more than #{@settings.ignore_older} seconds ago")
# on discovery ignorable watched_files are put into the ignored state and that
# updates the size from the internal stat
# so the existing contents are not read.
# because, normally, a newly discovered file will
# have a watched_file size of zero
# they are still added to the collection so we know they are there for the next periodic discovery
watched_file.ignore
end
# now add the discovered file to the watched_files collection and adjust the sincedb collections
@watched_files_collection.add(watched_file)
watched_file.initial_completed if ongoing
# initially when the sincedb collection is filled with records from the persistence file
# each value is not associated with a watched file
# a sincedb_value can be:
# unassociated
# associated with this watched_file
# associated with a different watched_file
@sincedb_collection.associate(watched_file)
if @sincedb_collection.associate(watched_file)
if watched_file.file_ignorable?
logger.trace("Discoverer discover_files: #{file}: skipping because it was last modified more than #{@settings.ignore_older} seconds ago")
# on discovery ignorable watched_files are put into the ignored state and that
# updates the size from the internal stat
# so the existing contents are not read.
# because, normally, a newly discovered file will
# have a watched_file size of zero
# they are still added to the collection so we know they are there for the next periodic discovery
watched_file.ignore_as_unread
end
# now add the discovered file to the watched_files collection and adjust the sincedb collections
@watched_files_collection.add(watched_file)
end
end
# at this point the watched file is created, is in the db but not yet opened or being processed
end
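
The filtering that `discover_any_files` now does up front (regular files only, no symlinks, then exclude patterns) can be tried in isolation. This is a sketch under assumptions: `discoverable_files` is a hypothetical helper, not part of the plugin, and it folds the `can_exclude?` check into the same pass.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical helper: keep regular, non-symlink files matching the glob,
# then drop any path that matches an exclude pattern.
def discoverable_files(glob, excludes = [])
  files = Dir.glob(glob).select { |f| File.file?(f) && !File.symlink?(f) }
  files.reject do |f|
    pathname = Pathname.new(f)
    excludes.any? { |pattern| pathname.fnmatch?(pattern) }
  end.sort
end

result = Dir.mktmpdir do |dir|
  File.write(File.join(dir, "a.log"), "line\n")
  File.write(File.join(dir, "b.log.gz"), "")
  Dir.mkdir(File.join(dir, "sub.log"))                               # directory, skipped
  File.symlink(File.join(dir, "a.log"), File.join(dir, "link.log"))  # symlink, skipped
  discoverable_files(File.join(dir, "*.log*"), ["*.gz"]).map { |f| File.basename(f) }
end

puts result.inspect
```

`File.file?` follows symlinks, so the explicit `File.symlink?` check is what keeps `link.log` out of the set.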
3 changes: 2 additions & 1 deletion lib/filewatch/observing_base.rb
@@ -44,7 +44,8 @@ def initialize(opts={})
:exclude => [],
:start_new_files_at => :end,
:delimiter => "\n",
:file_chunk_count => FIXNUM_MAX,
:file_chunk_count => MAX_ITERATIONS,
:file_chunk_size => FILE_READ_SIZE,
:file_sort_by => "last_modified",
:file_sort_direction => "asc",
}.merge(opts)
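
The switch from `FIXNUM_MAX` to `MAX_ITERATIONS` as the default `file_chunk_count` can be checked with a little arithmetic. In the sketch below, `FILE_READ_SIZE = 32768` is an assumption (it is the divisor in the `MAX_ITERATIONS` expression), `build_options` is a hypothetical stand-in for the defaults hash, and the idea that the goal is to keep count times size within the Fixnum range is a reading of the diff, not something the PR states.

```ruby
# Assumed value; the constant's definition is not part of this diff.
FILE_READ_SIZE = 32_768

# 0.size is the byte width of a native integer (8 on a 64-bit build).
FIXNUM_MAX = 2**(0.size * 8 - 2) - 1
MAX_ITERATIONS = (2**(0.size * 8 - 2) - 2) / 32_768

# Hypothetical stand-in for the defaults hash in observing_base.rb.
def build_options(opts = {})
  {
    :file_chunk_count => MAX_ITERATIONS,
    :file_chunk_size => FILE_READ_SIZE,
  }.merge(opts)
end

defaults = build_options
striped = build_options(:file_chunk_count => 4)

# With the new default, the largest byte count a full read loop can
# cover still fits in the Fixnum range; with FIXNUM_MAX it would not.
max_bytes_per_pass = defaults[:file_chunk_count] * defaults[:file_chunk_size]
puts max_bytes_per_pass <= FIXNUM_MAX
```

Caller-supplied options still win over the defaults, since `merge(opts)` is applied last.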