Notion of State / Variable Expansion in Config #51
Comments
Yup. Back to external scripts, I guess. |
I propose the following -- using the time formatting syntax we already support with events, we could make a special handler within this plugin that allows you to say Example for "now" that maybe uses unix epoch (
Or with today's date only:
All times would be "now", i.e. the moment the HTTP request was initiated. |
@PhaedrusTheGreek @hummingV Thoughts on my proposal above? I worry it may confuse users since it uses the same syntax as event formatting, but at least it's possibly a familiar format? I'm open to other proposals. |
@jordansissel is there any way to have global variables that could be set by a ruby filter? I know, I'm a dreamer - but, addressing only the time issue in a single plugin might result in unwanted technical debt. |
@PhaedrusTheGreek , @jordansissel If the ruby filter were to sit upstream, there would be no events to process yet, since input plugins are the ones responsible for creating events and pushing them down the queue. If the ruby filter were to sit downstream, it could set global variables for the next cycle of events, but that's not a very elegant solution either. All in all, without a redesign of the overall Logstash event flow, forcing a ruby filter into this specific scenario would incur technical debt of its own. My proposal is that this decision should be based on demand. Is there sufficient demand to allow filter plugins to run upstream from input plugins? Or can we get away with minimal changes for this specific requirement? The only limitation I see is that sprintf supports only the current time. What if we want to use other time objects instead? Can we extend the format to be the following: |
@hummingV I see your point - trying to control input state by filters makes no sense. +1 for some standard support beyond date. Has inline scripting already been ruled out? |
I am open to the idea of pre- and post-event hooks. That is the most flexible solution imo. Having them only for this plugin might be a bit awkward though. Not sure if the Logstash team would want an implementation at the base class level. |
For inputs, post-event things can be done in filters. As for "pre-event" hooks, I don't think that solves this problem. The issue described in this ticket is for doing some computation to generate the URL used in the http request before each poll -- this is not necessarily "before the event" but is more "immediately before the next http request". It is possible that the http_poller could produce multiple individual events from a single http request. As for solving this for more than just dates, I don't have any solutions available yet. I believe we can solve the "now" and time formatting concern without needing a general solution. There's no mechanism to provide mutable state to an input for any plugin today. Some plugins will ask for the current time or may ask for a random number, but neither of those things are user-facing configurations. Do you think we could solve the "now" problem without needing a scripting/general solution? |
Formattable %{now} does solve this particular problem for me. |
My use case: I'm trying to hit a URI from a remote provider. The base URL is itself static. It takes some headers (supported), but requires some dynamic time indicators in the query string (start_time, end_time). I would like to send in values like "now-5min" and "now". Based on the other comments, I would like portions of the timestamp to be constant. In my case, I'll probably run every minute, so I would want the seconds to be zeroed out (12:15:00). Depending on whether the remote provider handles dates as inclusive or exclusive, you might also need ":59", e.g. 12:15:59. Having a mechanism to track the last time successfully processed would also be useful if it could be used as the start time for the next request (similar to the timestamp tracked by the S3 input). |
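A minimal sketch of the window described in this use case, written in Python purely for illustration (http_poller has no such feature). The parameter names `start_time` and `end_time` come from the comment; the function names, the `end_second` option, the timestamp format and the base URL are assumptions.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def polling_window(now, minutes_back=5, end_second=0):
    """Return (start, end) with the sub-minute part of `now` fixed.

    end_second=0 gives an end like 12:15:00; end_second=59 gives 12:15:59,
    for providers that treat the end of the range as inclusive.
    """
    end = now.replace(second=end_second, microsecond=0)
    start = end.replace(second=0) - timedelta(minutes=minutes_back)
    return start, end


def build_url(base_url, start, end):
    """Append the window to a static base URL as query parameters."""
    # Assumed format, and it assumes UTC inputs; a real provider may differ.
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    query = urlencode({"start_time": start.strftime(fmt),
                       "end_time": end.strftime(fmt)})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    start, end = polling_window(datetime.now(timezone.utc))
    # Hypothetical endpoint, for illustration only.
    print(build_url("https://api.example.com/events", start, end))
```

Because the seconds are fixed rather than taken from the trigger time, two runs within the same minute produce the same URL.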
Let's try this another way: instead of describing the solution ("I need current time" or "I need now-5min"), can y'all tell a story about what you want to do? Maybe we can find a solution from the information in these stories. |
My story: I need scheduledTime variable expansion to specify date range in the query to the |
Looking at my requirement again, I actually do need to be able to do date math as well. Use case is Fetching jobstats from Yarn.
|
I should also mention that I have similar use cases for other types of |
+1 for this too |
The request sounds fairly complex, so let me try and restate what I am hearing. Tell me if this isn't right:
If this is accurate, then I am in favor of such a thing, but I'm not sure exactly how to solve it just now. It will take some effort to come up with a solution that fits well. |
+1 |
This would be an amazing feature. We get some |
Copying my response from https://discuss.elastic.co/t/response-headers-http-poller/80739/3 here: Having the http_poller input be aware of how to paginate is something we've discussed internally. I personally feel it's not something we can achieve because of the ways that things present pagination, such as:
There are probably more pagination strategies, and I'm not sure we can prepare the http_poller plugin to support them all. In cases such as this, I generally recommend that a new input plugin be created that has specific knowledge of how to read data from a specific data source. In this case, it would be my recommendation to have a custom input plugin that knows how to handle the pagination strategy deployed by the Microsoft Table Service. |
Similar use case like url url => "${URL}&creationdate= |
+1 for this. My use case is pulling logs via API calls that require start and end timestamps. |
If only the poller could work like the JDBC input plugin and have a concept of state. This would allow much easier indexing of large datasets via different APIs. |
That wouldn't work as well because APIs might use all kinds of different ways to track state while SQL only uses standard date format or integer IDs, but it would be considerably better than nothing. |
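To make the idea of state concrete, here is a rough sketch (Python, illustrative only) of a poller that remembers the end of the last successful window and reuses it as the next start, loosely in the spirit of the JDBC input's `sql_last_value`. The `PollState` class, the JSON file layout, the `last_end` key and the `poll_once` helper are all hypothetical; http_poller has no such mechanism.

```python
import json
import os
import tempfile
from datetime import datetime


class PollState:
    """Persists the end of the last successfully processed window."""

    def __init__(self, path):
        self.path = path

    def load(self, default):
        """Return the stored timestamp, or `default` on the first run."""
        try:
            with open(self.path, encoding="utf-8") as fh:
                return datetime.fromisoformat(json.load(fh)["last_end"])
        except FileNotFoundError:
            return default

    def save(self, last_end):
        """Write via a temp file so a crash cannot leave a partial state file."""
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump({"last_end": last_end.isoformat()}, fh)
        os.replace(tmp, self.path)


def poll_once(state, fetch, now, first_start):
    """Fetch the window from the stored start up to `now`.

    The state advances only if `fetch` returns without raising.
    """
    start = state.load(default=first_start)
    result = fetch(start, now)  # an exception here leaves the state untouched
    state.save(now)
    return result
```

This only covers timestamp-based state. As noted above, other APIs may track position with cursors, page tokens or IDs, which would need a different stored value.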
I wrote my own ruby script for this situation (in my case pulling DUO events), but if we could get support in http_poller input, that would make polling time-based APIs a lot simpler. It might just be a matter of adding in a parameter for 'seconds since' or some such thing (I used a hard-coded 900 second buffer in my script).
|
Wonder if they'll let this through: #111 |
Really would like this. I get data from an api and it needs relative times. I can’t figure out a way around it |
In my similar case, I have to extract Dynatrace dashboard data. I'm using the Dynatrace AppMon Server REST interface to generate XML reports per minute. I'm able to parse the response using the XML filter plugin and format it with a custom Ruby script according to my needs. However, I also need a solution inside Logstash to fetch the source data periodically with an HTTP request, and I don't want to depend on a custom shell script or similar running outside of Logstash. I can't use the Last1Min filter because its window is out of my control: it depends on the triggering second and can cause other issues. An example URL for 16:50-16:51 UTC+3 is the following: url => https://:8021/rest/management/reports/create/?filter=tf:CustomTimeframe?1572875400000:1572875460000&type=XML |
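The timeframe in that URL is a pair of Unix epoch timestamps in milliseconds, exactly one minute apart. A small sketch of how such a pair could be computed independently of the triggering second follows (Python, for illustration only). The function names are hypothetical, and "the last complete minute before now" is an assumed reading of the requirement.

```python
from datetime import datetime, timedelta, timezone


def previous_minute_ms(now):
    """Return (start_ms, end_ms) for the last complete minute before `now`.

    `now` must be timezone-aware so the epoch conversion is unambiguous.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    end = now.replace(second=0, microsecond=0)
    start = end - timedelta(minutes=1)
    # Both values are whole seconds here, so integer arithmetic is exact.
    return int(start.timestamp()) * 1000, int(end.timestamp()) * 1000


def timeframe_filter(now):
    """Build the filter value in the shape used by the URL in the comment."""
    start_ms, end_ms = previous_minute_ms(now)
    return f"tf:CustomTimeframe?{start_ms}:{end_ms}"


if __name__ == "__main__":
    print(timeframe_filter(datetime.now(timezone.utc)))
```

Any trigger time between 16:51:00 and 16:51:59 UTC+3 on the day in the example maps to the same pair, 1572875400000:1572875460000.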
For my use case, I was looking for this, but unfortunately the plugin doesn't support it, so I created a dirty workaround. For anyone who would like an alternative: https://stackoverflow.com/a/61259006/3565756 It's not specific to HTTP Poller, but it is relevant to polling. |
Not having this leaves us developing bridge scripts that forward logs into Logstash. We are very frustrated, because we had many push-based logs and a few that required polling. We developed the push-based ones first because it was easier, and did not consider that polling wouldn't have any state awareness at all. Now we end up creating a hack instead of choosing Apache NiFi or a similar tool. Without this feature, collecting more advanced logs is effectively impossible. |
Any update on this? We're trying to fetch logs from an API that requires a timeframe for which to get the logs, so we have to update this timeframe for every request. Currently, changing the API is not an option because it is produced by another company, and the only resort is to create a custom plugin based on this one. |
I have the same concern! My approach is to insert the data through a Python script; nevertheless, this would be a better option. Any updates? Thank you very much! |
Is there any workaround for this? My use case is to use a pull model to stream logs from a Loki server. We cannot use the push model supported by the Loki Grafana output plugin, since that requires firewall changes on the client end. |
Are there any updates on this? I really need this feature. |
As explained in Event Dependent Configuration
Which is fair, but some plugins such as the http_poller run on intervals, and could theoretically access some state.
The end goal would be to say something like this:
With use of some globally maintained variable (not even sure the best way to do this)
Or by maintaining an environment variable outside of Logstash