Add the possibility to flush traces/metrics after a handler loop has finished #691

Closed
purkhusid opened this issue Sep 4, 2023 · 10 comments

purkhusid commented Sep 4, 2023

The problem I currently have is that there is no good way to flush traces/logs/metrics/errors once the runtime has finished handling a message.

Currently the handling code is structured something like this:

let request_span = <create span for the request>;

async {
    <do parsing, call the user-provided handler, catch panics etc.>
}.instrument(request_span)

There are a couple of issues with this approach:

  1. The user-provided handler is the only way for the end user to affect the loop, so they have to e.g. flush spans inside the handler.
  2. Since the handler can panic, the flushing may never happen, because it has to run after the work is done.
  3. The request_span can not be flushed, since it only finishes after the user-provided handler has returned.
  4. If the runtime fails before the handler runs, e.g. on parsing, there is no way for the end user to send the error to their error reporting tool of choice, because their handler was never invoked.

I think it would be very valuable for the runtime to provide a hook so that end users can do cleanup after each invocation.

@purkhusid (Author)

I patched the runtime with this to fix tracing in our lambdas:

diff --git lambda-runtime/src/lib.rs lambda-runtime/src/lib.rs
index e3ffd49..a01d2cb 100644
--- lambda-runtime/src/lib.rs
+++ lambda-runtime/src/lib.rs
@@ -20,7 +20,7 @@ use std::{
     env,
     fmt::{self, Debug, Display},
     future::Future,
-    panic,
+    panic, sync::Arc,
 };
 use tokio::io::{AsyncRead, AsyncWrite};
 use tokio_stream::{Stream, StreamExt};
@@ -101,6 +101,7 @@ where
         &self,
         incoming: impl Stream<Item = Result<http::Response<hyper::Body>, Error>> + Send,
         mut handler: F,
+        callback: Option<Arc<impl Fn()>>
     ) -> Result<(), Error>
     where
         F: Service<LambdaEvent<A>>,
@@ -202,7 +203,13 @@ where
             }
             .instrument(request_span)
             .await?;
+        
+            if let Some(callback) = callback.clone() {
+                callback();
+            }
         }
+
+
         Ok(())
     }
 }
@@ -258,7 +265,52 @@ where
 
     let client = &runtime.client;
     let incoming = incoming(client);
-    runtime.run(incoming, handler).await
+    let callback: Option<Arc<fn()>> = None;
+    runtime.run(incoming, handler, callback).await
+}
+
+/// Starts the Lambda Rust runtime and begins polling for events on the [Lambda
+/// Runtime APIs](https://docs.aws.amazon.com/lambda/latest/dg/runtimes-api.html).
+/// 
+/// The callback function is called at the end of a single invocation of the runtime.
+///
+/// # Example
+/// ```no_run
+/// use lambda_runtime::{Error, service_fn, LambdaEvent};
+/// use serde_json::Value;
+///
+/// #[tokio::main]
+/// async fn main() -> Result<(), Error> {
+///     let func = service_fn(func);
+///     lambda_runtime::run_with_callback(func, std::sync::Arc::new(callback_func)).await?;
+///     Ok(())
+/// }
+///
+/// async fn func(event: LambdaEvent<Value>) -> Result<Value, Error> {
+///     Ok(event.payload)
+/// }
+/// 
+/// fn callback_func() -> () {
+///     println!("Callback function called!");
+///     ()
+/// }
+/// ```
+pub async fn run_with_callback<A, B, F>(handler: F, callback: Arc<impl Fn()>) -> Result<(), Error>
+where
+    F: Service<LambdaEvent<A>>,
+    F::Future: Future<Output = Result<B, F::Error>>,
+    F::Error: fmt::Debug + fmt::Display,
+    A: for<'de> Deserialize<'de>,
+    B: Serialize,
+{
+    trace!("Loading config from env");
+    let config = Config::from_env()?;
+    let client = Client::builder().build().expect("Unable to create a runtime client");
+    let runtime = Runtime { client, config };
+
+    let client = &runtime.client;
+    let incoming = incoming(client);
+    runtime.run(incoming, handler, Some(callback)).await
 }
 
 fn type_name_of_val<T>(_: T) -> &'static str {
@@ -293,7 +345,7 @@ mod endpoint_tests {
     use lambda_runtime_api_client::Client;
     use serde_json::json;
     use simulated::DuplexStreamWrapper;
-    use std::{convert::TryFrom, env};
+    use std::{convert::TryFrom, env, sync::Arc};
     use tokio::{
         io::{self, AsyncRead, AsyncWrite},
         select,
@@ -525,7 +577,8 @@ mod endpoint_tests {
         let runtime = Runtime { client, config };
         let client = &runtime.client;
         let incoming = incoming(client).take(1);
-        runtime.run(incoming, f).await?;
+        let callback: Option<Arc<fn()>> = None;
+        runtime.run(incoming, f, callback).await?;
 
         // shutdown server
         tx.send(()).expect("Receiver has been dropped");
@@ -568,7 +621,9 @@ mod endpoint_tests {
         let runtime = Runtime { client, config };
         let client = &runtime.client;
         let incoming = incoming(client).take(1);
-        runtime.run(incoming, f).await?;
+        let callback: Option<Arc<fn()>> = None;
+        
+        runtime.run(incoming, f, callback).await?;
 
         match server.await {
             Ok(_) => Ok(()),

calavera (Contributor) commented Sep 5, 2023

Can you open a PR so we can review it with more context?

@purkhusid (Author)

> Can you open a PR so we can review it with more context?

Sure thing.

BMorinDrifter (Contributor) commented Sep 5, 2023

Cross-posting what I said in the PR here:

See MetricsService in https://github.com/BMorinDrifter/metrics-cloudwatch-embedded/blob/main/src/lambda.rs for how I did this for the embedded metrics crate.

@bnusunny (Contributor)

An internal extension could be the solution. You could create a library that starts a background thread to interact with the Lambda Extension API and registers for invoke events. The extension gets a notification event when an invoke happens and can perform all kinds of tasks; it can even continue running after the handler returns. That makes it a good place to flush logs/traces.
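
For illustration, here is a minimal sketch of such an internal extension built on the lambda-extension crate's run/service_fn API. flush_telemetry is a hypothetical stand-in for your exporter's flush call, and note that the INVOKE event fires when an invocation starts, so a real implementation would pair this with a signal (e.g. a Tokio channel) from the handler:

use lambda_extension::{service_fn, Error, LambdaEvent, NextEvent};

// Hypothetical stand-in for your exporter's flush call.
fn flush_telemetry() {}

async fn events_extension(event: LambdaEvent) -> Result<(), Error> {
    match event.next {
        // Fires when an invocation *starts*; combine with a channel from the
        // handler if you need to act when the invocation has finished.
        NextEvent::Invoke(_e) => flush_telemetry(),
        // Fires once before the execution environment shuts down.
        NextEvent::Shutdown(_e) => flush_telemetry(),
    }
    Ok(())
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    lambda_extension::run(service_fn(events_extension)).await
}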

@ramosbugs (Contributor)

I added an example in #744 that uses an internal extension to flush telemetry after the handler finishes (once there's a new release of lambda-extension that includes the Extension::register API). Just before the handler returns, it notifies the extension via a Tokio channel.

As OP mentioned, handlers sometimes panic, so for a production use case I would wrap the bulk of the handler in something like futures_util::future::FutureExt::catch_unwind so that you can still clean up and flush telemetry even if the handler panics.
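
For example, a sketch of that pattern, where do_work and flush_telemetry are hypothetical stand-ins for the actual handler body and telemetry pipeline:

use futures_util::future::FutureExt;
use lambda_runtime::{Error, LambdaEvent};
use serde_json::Value;
use std::panic::AssertUnwindSafe;

async fn handler(event: LambdaEvent<Value>) -> Result<Value, Error> {
    // Catch panics from the handler body so cleanup still runs.
    let result = AssertUnwindSafe(do_work(event)).catch_unwind().await;
    flush_telemetry(); // runs even if do_work panicked
    match result {
        Ok(output) => output,
        Err(_panic) => Err("handler panicked".into()),
    }
}

async fn do_work(event: LambdaEvent<Value>) -> Result<Value, Error> {
    Ok(event.payload)
}

fn flush_telemetry() {
    // e.g. force your tracer/meter provider to export buffered spans/metrics
}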

calavera (Contributor) commented Dec 4, 2023

Internal extensions are an option, but I think we can make something simpler for this use case. I'm working on an RFC that I think will benefit the runtime in general by providing a simpler solution.

calavera (Contributor) commented Dec 4, 2023

This is what I had in mind: #747

@calavera (Contributor)

The layering system that I described in #747 has landed in the main branch. See this example for how to flush traces/metrics after the handler finishes: https://github.com/awslabs/aws-lambda-rust-runtime/blob/main/examples/opentelemetry-tracing/src/main.rs

I'll release a new version with those improvements next week, after adding some extra documentation. I'm closing this issue as resolved, since it's now possible to implement what was described here.
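
As an illustration of the kind of middleware the layering system enables, here is a minimal tower Layer sketch that runs a cleanup closure after every invocation. FlushLayer and FlushService are illustrative names, not APIs from the runtime itself:

use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};
use tower::{Layer, Service};

// Wraps a service so that `flush` runs after every call.
pub struct FlushLayer<F> {
    flush: F,
}

pub struct FlushService<S, F> {
    inner: S,
    flush: F,
}

impl<S, F: Clone> Layer<S> for FlushLayer<F> {
    type Service = FlushService<S, F>;

    fn layer(&self, inner: S) -> Self::Service {
        FlushService {
            inner,
            flush: self.flush.clone(),
        }
    }
}

impl<S, F, Req> Service<Req> for FlushService<S, F>
where
    S: Service<Req>,
    S::Future: Send + 'static,
    F: Fn() + Clone + Send + 'static,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = Pin<Box<dyn Future<Output = Result<S::Response, S::Error>> + Send>>;

    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), Self::Error>> {
        self.inner.poll_ready(cx)
    }

    fn call(&mut self, req: Req) -> Self::Future {
        let fut = self.inner.call(req);
        let flush = self.flush.clone();
        Box::pin(async move {
            let result = fut.await;
            flush(); // runs whether the invocation succeeded or failed
            result
        })
    }
}

A layer like this can wrap the handler service before it is handed to the runtime, so the cleanup runs outside the handler on every invocation, including error paths.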

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
