[file input] re-using inodes leads to missing/corrupted data - please implement periodic cleanup of sincedb entries #111

Closed
@purbon

From the issue:

Hi,

The sincedb for the file input keeps references to every file (inode) it has ever tracked, indefinitely.
Under some (frequent) conditions this leads to missing or corrupted data.

For example, if any log rotation mechanism actually deletes even a single source file and new files are created afterwards, you will certainly hit this issue at some point, because a new file will eventually be created with an inode that was already tracked in the past. The issue is even more visible if "start_position" is set to "beginning", as the first parsed message will likely be incomplete.
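To illustrate the failure mode described above, here is a minimal in-memory sketch. It is not the plugin's actual code: the `read_tracked` helper, the dict standing in for the sincedb, and the inode number 4242 are all hypothetical, and the model assumes the reader simply resumes from the offset stored for an inode.

```python
def read_tracked(content: bytes, inode: int, sincedb: dict) -> bytes:
    """Return unread bytes for a file, resuming from the offset stored per inode.

    Simplified model (an assumption, not the real plugin logic): the offset
    is looked up by inode alone, so a new file that reuses an old inode
    inherits the old file's offset.
    """
    offset = sincedb.get(inode, 0)
    data = content[offset:]
    sincedb[inode] = offset + len(data)
    return data


sincedb = {}  # hypothetical stand-in for the on-disk sincedb: inode -> offset

# 1. An old log file is read completely; its offset is remembered.
old_log = b"old line 1\nold line 2\n"
first = read_tracked(old_log, 4242, sincedb)

# 2. Rotation deletes the old file, but the sincedb entry for inode 4242 stays.
# 3. A new file happens to be created with the same inode.
new_log = b"2016-01-01 12:00:00 INFO service started\nsecond line\n"
second = read_tracked(new_log, 4242, sincedb)

# The beginning of the new file is skipped, so the first message is cut off.
print(repr(second))
```

In this model, the stale entry makes the reader skip as many bytes of the new file as the old file contained, which matches the incomplete first message described above.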

The more input files you create, the higher the chance of observing this issue. In my repro case I created and rotated (deleted) around 1k files per day and saw the issue recur a few times a day. My sincedb files currently have over 100k lines.

If possible, please implement dropping sincedb references to files that no longer exist.
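As a rough sketch of the cleanup requested above, the following standalone script shows one way stale entries could be dropped. It is not part of Logstash. It assumes each sincedb line starts with the inode number as its first whitespace-separated field, and the `prune_sincedb` name and its parameters are hypothetical. For brevity it matches on inode only and ignores device numbers.

```python
import os


def prune_sincedb(sincedb_path, watched_paths):
    """Drop sincedb lines whose inode belongs to no currently existing file.

    Assumption: the first whitespace-separated field of each line is the
    inode number. Returns the number of lines removed.
    """
    live_inodes = set()
    for path in watched_paths:
        try:
            live_inodes.add(str(os.stat(path).st_ino))
        except FileNotFoundError:
            pass  # file was rotated away; its entry should be dropped

    with open(sincedb_path) as f:
        lines = f.read().splitlines()

    kept = []
    for line in lines:
        fields = line.split()
        if fields and fields[0] in live_inodes:
            kept.append(line)

    # Write to a temporary file first, then swap it in atomically.
    tmp_path = sincedb_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write("".join(line + "\n" for line in kept))
    os.replace(tmp_path, sincedb_path)

    return len(lines) - len(kept)
```

A call such as `prune_sincedb(path_to_sincedb, glob.glob("/var/log/app/*.log"))` (paths are placeholders) would have to run while Logstash is stopped, since a running instance may overwrite the file with its in-memory state.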

original issue: elastic/logstash#4566
