Skip to content

feat: Added flushes/close functionality to logging handlers #917

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Mar 31, 2025

Conversation

gkevinzheng
Copy link
Contributor

For design doc see go/python-logging-background-thread

Fixes #<issue_number_goes_here> 🦕

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: logging Issues related to the googleapis/python-logging API. labels Jul 10, 2024
@gkevinzheng gkevinzheng added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 2, 2024
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 2, 2024
@gkevinzheng gkevinzheng marked this pull request as ready for review August 2, 2024 18:39
@gkevinzheng gkevinzheng requested review from a team as code owners August 2, 2024 18:39
@gkevinzheng gkevinzheng requested review from cindy-peng and removed request for a team August 2, 2024 18:39
if self.transport is None:
self.transport = self._transport_cls(
self.client, self.name, resource=self.resource
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this mean the handler has been closed? Why not just raise an error, instead of re-creating the transport?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I based this off of some logging handlers I saw in the Python standard library, which will reopen a closed connection if that connection is closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think this adds extra complexity, so I'd prefer to keep them permanently closed if we can get away with that

But if you think this behavior would help us integrate with the standard library, it makes sense to keep it.

def close(self):
"""Closes the log handler and cleans up all Transport objects used."""
self.transport.close()
self.transport = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the idea of using transport == None as a way to check if the handler is closed. That makes the types quite a bit more complicated.

Can't we use a new flag for this?

_CLOSE_THREAD_SHUTDOWN_ERROR_MSG = (
"CloudLoggingHandler shutting down, cannot send logs entries to Cloud Logging due to "
"inconsistent threading behavior at shutdown. To avoid this issue, flush the logging handler "
"manually or switch to StructuredLogHandler."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention client.close or handler.close too?

"""Callback that attempts to send pending logs before termination."""
if not self.is_alive:
return

# Print different messages to the user depending on whether or not the
# program is shutting down. This is because this function now handles both
# the atexit handler and the regular close.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we decouple this? Maybe let the caller pass in a message or something?

@product-auto-label product-auto-label bot added the stale: old Pull request is old and needs attention. label Aug 10, 2024
@product-auto-label product-auto-label bot added stale: extraold Pull request is critically old and needs prioritization. and removed stale: old Pull request is old and needs attention. labels Sep 9, 2024
@otorreno
Copy link
Contributor

otorreno commented Mar 8, 2025

I am not fully sure this will solve #850 and #855. I see 2 possible scenarios:

  1. A user calls flush on the logging client and then the program exit. That would call the _close function which would either attempt to send what is in the queue if the main thread has not terminated or it will just not send whatever is in the queue.
  2. Even if a user would call flush and then close, nothing guarantees that between these 2 calls some message got to the queue. I think there will be similar outcomes to the previous point.

Have you considered flushing the queue at termination? Something like the following (conceptually):

  1. Mark as shutting down (nothing else gets queued
  2. Flush the queue
  3. Mark as terminated

I wonder if the atexit hook should remain. I believe the main thread will be in a state where is_alive may return true, but underneath it would still error out because it can’t create new threads (this is a guess). What do you think about letting users explicitly calling the teardown/close?

@gkevinzheng gkevinzheng changed the base branch from experimental-v4 to main March 10, 2025 17:19
@gkevinzheng gkevinzheng changed the base branch from main to experimental-v4 March 10, 2025 17:20
@gkevinzheng gkevinzheng force-pushed the cloud-logging-handler-flush branch from db3be06 to b46779e Compare March 10, 2025 17:49
@gkevinzheng gkevinzheng requested a review from a team as a code owner March 10, 2025 17:49
@gkevinzheng gkevinzheng requested review from parthea and removed request for a team March 10, 2025 17:49
Copy link

snippet-bot bot commented Mar 10, 2025

No region tags are edited in this PR.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@product-auto-label product-auto-label bot removed the size: l Pull request size is large. label Mar 10, 2025
@product-auto-label product-auto-label bot added the size: xl Pull request size is extra large. label Mar 10, 2025
@gkevinzheng gkevinzheng changed the base branch from experimental-v4 to main March 10, 2025 17:49
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: xl Pull request size is extra large. labels Mar 13, 2025
@gkevinzheng
Copy link
Contributor Author

I am not fully sure this will solve #850 and #855. I see 2 possible scenarios:

  1. A user calls flush on the logging client and then the program exit. That would call the _close function which would either attempt to send what is in the queue if the main thread has not terminated or it will just not send whatever is in the queue.
  2. Even if a user would call flush and then close, nothing guarantees that between these 2 calls some message got to the queue. I think there will be similar outcomes to the previous point.

Have you considered flushing the queue at termination? Something like the following (conceptually):

  1. Mark as shutting down (nothing else gets queued
  2. Flush the queue
  3. Mark as terminated

I wonder if the atexit hook should remain. I believe the main thread will be in a state where is_alive may return true, but underneath it would still error out because it can’t create new threads (this is a guess). What do you think about letting users explicitly calling the teardown/close?

Part of this PR is also adding functionality to explicitly let users call teardown/close for clients. Flushing the queue at termination is not an option because in order to send log entries over the wire to Cloud Logging, gRPC threads are needed, which we cannot create new ones at shutdown, causing the two bugs.

self.transport = transport(client, name, resource=resource)
self.transport_open = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be made private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@gkevinzheng gkevinzheng enabled auto-merge (squash) March 31, 2025 20:57
@gkevinzheng gkevinzheng merged commit d179304 into main Mar 31, 2025
15 checks passed
@gkevinzheng gkevinzheng deleted the cloud-logging-handler-flush branch March 31, 2025 21:23
otorreno added a commit to otorreno/python-logging that referenced this pull request Apr 21, 2025
…pened

There was a recent release (3.12.0) that included the changes introduced in
googleapis#917. The newly introduced close method seems to be called by AppEngine
Python runtime at shutdown, so if you would call it explicitly before the runtime
does it, then the close function throws an exception because transport is None.
gkevinzheng pushed a commit that referenced this pull request Apr 21, 2025
…pened (#990)

There was a recent release (3.12.0) that included the changes introduced in
#917. The newly introduced close method seems to be called by AppEngine
Python runtime at shutdown, so if you would call it explicitly before the runtime
does it, then the close function throws an exception because transport is None.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: logging Issues related to the googleapis/python-logging API. size: l Pull request size is large. stale: extraold Pull request is critically old and needs prioritization.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants