-
Notifications
You must be signed in to change notification settings - Fork 123
logging support #227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
logging support #227
Conversation
81e4a60
to
09563c1
Compare
CC @adam-fowler who may want to integrate with this into his AWS SDK? And @tanner0101 for Vapor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, some notes inline.
Can't comment if this is "enough" logging but good to get the pattern in here.
modified[metadataKey: "ahc-prev-request"] = request.map { self.requestInfo($0) } ?? "<unknown>" | ||
return modified | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice pattern, I've not thought of this one but like it 👍
What I'm doing is usually:
context.log.info("bla", metadata: self.metadata)
context.log.info("bla", metadata: related.metadata)
but this pattern you do here is quite interesting 👍 I need to try it out in my work
taskEventLoop: EventLoop, | ||
deadline: NIODeadline?, | ||
setupComplete: EventLoopFuture<Void>, | ||
logger: Logger) -> EventLoopFuture<Connection> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah passing it around sounds okey (the only thing we really can do tbh to preserve metadata).
There's two things to keep in mind, though I'm okey with punting on them for now.
- once we have a "context thing" we're somewhat likely to want to break those APIs and pass around context, rather than the logger. Context would be able to offer a logger though. Are we ok with breaking APIs then, or would we "add them" next to these and deprecate these...?
- I'm not sold on the "context" as last parameter and would want to eventually talk about it being the first parameter... Both have downsides, let's shelf this discussion for now, but as a heads up. More interested in knowing how we'd change those APIs once we have context. ( https://golang.org/pkg/context/ and systems like akka etc... one kind of learns to just ignore the "context" param when reading methods, but the trailing one is a bit weird thb...) anyway, let's bikeshed that at some point I guess -- as we'd want all APIs be consistent
- I guess "last" in those APIs since you want it to be optional... anyway, long bikeshed topic when we'll have context objects and PoCs ready 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktoso yeah, we've discussed first vs. last before, internally, a long time ago.
Here's why we landed on last: Swift's APIs are supposed to be fluent. So for example you want to write
let foreground = Color(red: 32, green: 64, blue: 128)
let newPart = factory.makeWidget(gears: 42, spindles: 14)
let ref = Link(target: destination)
And it's supposed to read like
Factory, make a widget with 42 gears and 14 spindles.
If we now add a logger or a context into the picture, then we can either
let newPart = factory.makeWidget(context: context, gears: 42, spindles: 14)
which will become
Factory, make a widget with context, 42 gears, and 14 spindles.
or
let newPart = factory.makeWidget(gears: 42, spindles: 14, context: context)
Factory, make a widget with 42 gears, 14 spindles, and context.
I think arguably, reading out "context" or "logger" should just be skipped because it's less important to decide what'll be going on. Therefore, I really think the best place is at the very end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktoso yeah, we've discussed first vs. last before, internally, a long time ago.
I know, and I challenge us to rethink this before we enforce guidelines to the wider ecosystem, or at least well-define all the cases and and rationale for the various scenarios.
Notably, the "context last" is not a consistent guideline IMHO and "consistency is king" in any such guidelines.
E.g. if you're any library which offers context to users, today those look like this:
func channelRead(context: ChannelHandlerContext, data: NIOAny) ...
func readHandshakeResponse(context: ChannelHandlerContext, bytes: InboundIn) ...
// or
... = .handleThings { context, things in ... }
func onThingsArrived(context: MyFramework.Context, things: Things) {
// context.baggage // trace context / baggage context
// context.log // logger with pre populated trace / baggage metadata
so it's natural when I delegate to my functions to carry like
// context, thingy
let x = factory.makeWidget(context, thingy: thingy)
makeWidget(context, x: x)
yet the guideline of "context last because it does not matter" goes against this.
I find it to be inconsistent and fatiguing if one has to constantly mentally flip around the parameter order if "library" or "my code" is concerned.
Minor nitpick about the spelling of the snippet:
let newPart = factory.makeWidget(context: context, gears: 42, spindles: 14)
is spelled very verbosely, for APIs intended to be called by library users / their own code, I would expect it spelled as
let newPart = factory.makeWidget(context, gears: 42, spindles: 14)
which makes it shorter (and we already talk about "learn to not read out loud context).
The argument for "context last" goes as you say:
I think arguably, reading out "context" or "logger" should just be skipped because it's less important to decide what'll be going on. Therefore, I really think the best place is at the very end.
so we learn/encourage to not read it out loud anyway (!).
Putting it last however also means it is easier to miss propagating somewhere -- which is a huge problem if we end up with baggage context (trace context) - "dropped" traces are a huge pain in the neck.
I can sympathise with the "context is ugly, let's put it in the end" however the confusion around "but this type of context goes first, but the other context type goes last" adds a lot of mental overhead when building APIs which I don't think is helping arrive at clean and consistent APIs throughout user code and library code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also to keep in mind... in this specific library / example you argue to use lastParam: Logger = noop
implying that if you forget to pass one "it's fine" and it really is low priority and some optional thing one may want to pass sometimes -- in that context it being last sounds fine.
However, if we want to get to a world where we don't accidentally "drop" traces and metadata, passing the context is not optional - whenever you forget to pass around a context you broke tracing. So while "reading it (context) out loud" definitely is not something we care about, "making sure it is being carried around" is super important, as is taking the mental overhead off people wether they use some library or their code and "where" the context should be.
I also understand a lot of this is "because we're used to X" but I'd really want to discuss the tradeoffs and consistency aspects on more examples, before we make a strong recommendation (esp. if we make a guideline that a lot of APIs end up ignoring).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw, I am also in the "context comes first" (or, "first, lets set some context") camp, for many of the same reasons @ktoso highlights
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC @airspeedswift for API advise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's all about consistency (which is 👑). Since today is not really upheld in the libs that's where my problem lies - less about "first vs last", to me at least. As we're "early enough" perhaps it's the right time to make the rule and enforce it...
So, in order to make the consistency/👑 happy... would you say we try all new (server at least) APIs stick to the "last" rule, and if automated tooling to migrate existed, would NIO follow that perhaps as well (more below)?
I'm happy to discover that:
- gRPC sticks to the last rule already, so that's good: https://github.com/grpc/grpc-swift/blob/3109a71355f84881887957cfdabbac41150e0793/Sources/Examples/HelloWorld/Model/helloworld.grpc.swift#L66 we're able to make it such in http-client and
- RediStack does not seem to do context things in its APIs yet
- HTTPClient we're doing now and can do the "last"
- Vapor does not seem to have such thing?
Since I'm mostly concerned about the medium to high level libs all doing the same (which are vetted by the SSWG) -- that's where most of the context"passing around" is happening -- and where most of the mix of various libraries exists if we're able to pick a rule and stick to it that's the most important to me.
I'd be okey with sweeping NIO's current context-first under the rug, BUT we should be explicit about it when we propose the strict guideline about passing around contexts, and it'd be good that NIO addressess this when able to (which means years+ probably, but still worth aligning all server APIs on this IMO, that's what Go does right -- "same style" for these things).
With swift-syntax I believe we can easily provide a reformatter that would shift around parameters in signatures in end user's code.
So...
- existing and ongoing development other libs should switch to "context last" (oh boy, I have 1,000s of places to change now, oh boy 🤣)
- our work in the GSoC context lib should assume and show in examples "context last"
- I'm kind of betting on NIO and related libs could follow suit when they're able to (we're taking multi year here, no rush, but intent matters -- it's annoying to have a "rule" with core libs being exceptions)
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An important thing here also: It is context: Context
and NOT Context?
- this is hugely important.
The optional around Logger?
in this proposal had me quite worried -- dropping a context is disastrous, so we'd not want it be optional and default to nothing (like we do here for the logger).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ktoso I agree that consistency is important and if we could redo NIO, I'm pretty sure we'd do it last. Said that, NIO's ChannelHandlerContext
isn't quite the same as a Context
. In almost all ChannelHandler
method implementation you will have to use context
, or else your program will just drop events, so it won't work correctly.
Regarding Optional<Logger>
and Optional<Context>
: I think non-optional is better here, possibly defaulted to some .disableLogging
/ Context.empty
/ Context()
? But maybe not, can always have the user type Context()
to pass an empty one. If they already got one, they'll just pass it along which is the right thing.
I don't have a strong opinion yet. I do think however Optional<Thing>
can be very dangerous in API design, example here. You need to be very mindful that the user will be (by Xcode) offered a .none
completion. This can be very misleading, for example in AsyncHTTPClient's redirect configuration redirects: .none
means "default redirects", ie. 5 redirects max and not "no redirects". redirects: .disallow
is what makes you have "no redirects". Both of those wouldn't be an issue with Optional<Logger>
and Optional<Context>
because .none
gives you the right idea but we should still be mindful I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok two topics here :)
@ktoso I agree that consistency is important and if we could redo NIO, I'm pretty sure we'd do it last.
Said that, NIO's ChannelHandlerContext isn't quite the same as a Context. In almost all ChannelHandler method implementation you will have to use context, or else your program will just drop events, so it won't work correctly.
I had a long chat about with Johannes now about similar APIs to this but which are not quite as low level as NIO, and what we should do there. Overall consistency is what matters and if we can get most APIs to do "last", and NIO remains with it's historical difference I guess we can live with that -- what matters is that high level tools like http clients, grpc and others follow the same style -- and we have time now to make it so.
I do agree that last reads well in general... and poked around at some APIs where we had it first at how last would look like, seems managable.
Optional / default values for context
Regarding Optional and Optional: I think non-optional is better here,
Yes, absolutely. Optional is actively hurtful for such, especially baggage / trace contexts. The moment it becomes "easy" to drop the context, people will even accidentally and if it's hard to spot where the drop happened then fixing those is a a pain as well. A dropped context / trace means the entire trace becomes dangling and useless.
I don't have a strong opinion yet. I do think however Optional can be very dangerous in API design, [...]
possibly defaulted to some .disableLogging / Context.empty / Context()?
Please no default values for baggage contexts 😱 The logging case here for the time being I could swallow, since nothing "too bad" happens if you don't pass it -- things won't be logged, end of story. But as mentioned above, any time it becomes easy or even automatic to drop context it will be dropped by someone, rendering the entire trace propagation useless (aborting a trace -- no end of a trace -> trace never shows up -> the search beings who/where/how dropped the trace in which library and how to fix/update the lib).
But maybe not, can always have the user type Context() to pass an empty one. If they already got one, they'll just pass it along which is the right thing.
Yes, exactly -- we have to make the right thing, the natural/simple thing. The natural and right thing is most of the time: I got some baggage and I just pass it around. Usually it's the web framework or something that always gives you some baggage context already anyway, so passing it around should not be too hard :)
eventLoop eventLoopPreference: EventLoopPreference, | ||
deadline: NIODeadline? = nil, | ||
logger originalLogger: Logger = HTTPClient.loggingDisabled) -> Task<Delegate.Response> { | ||
let logger = originalLogger.attachingRequestInformation(request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
self.metadata[key] = newValue | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Such thing is part of what we'd like to propose as small packages with @glbrntt https://forums.swift.org/t/metrics-extras/36616/11
(totally fine to have an in place one here, just cross linking)
Also an interesting PR for you to skim @slashmo I think. So today this passes around an explicit logger, but perhaps it would some day rather accept a context propagation object... we'll see where we end up with those APIs soon enough. |
@ktoso Thanks for the mention! I agree that the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start, thanks, @weissi. Added some comments re formatting logs.
Also, I think closing/evicting a connection deserves a log line.
Also, I see a lot of meaningless information in logs, e.g. |
@@ -54,6 +75,8 @@ public class HTTPClient { | |||
var state: State | |||
private let stateLock = Lock() | |||
|
|||
public static let loggingDisabled = Logger(label: "AHC-do-not-log", factory: { _ in DoNotLogger() }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need this? Why not just a new Logger
and let clients decide if they want to log stuff?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, by default I think a HTTP client shouldn't log. I added some more info here: https://github.com/swift-server/async-http-client/pull/227/files#diff-e92f784d7349c391e323afad21d5702e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[the code doesn't reflect the design yet :P ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This design is a bit weird for me. Usually, as a client of a library, I configure logging using levels and configuration. That allows to configure all dependencies in one way and in one place. With this design I will have to modify my code if I want to log. Having default level for different components allows to turn on logging with a config and a restart
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@artemredkin no, you don't have to modify your code. You just hand AHC your logger and it'll log on that one. The loggingDisabled
is just the default if you do not hand us a logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@artemredkin You will have to pass a logger in any system that you want to log if you're using structured logging. If you don't explicitly pass loggers, there's no way we can preserve your metadata like your request ID.
Let's say you have a system that does
logger[metadataKey: "request-id"] = "really cool request"
httpClient.get("https://...", logger: logger)
httpClient.post("https://...", logger: logger)
then you want your "really cool request" request ID be attached to all messages logged throughout the program that have anything to do with your request. AHC must accept a Logger
in order to get hold of the metadata you supply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See here for the unit test that makes sure this works: https://github.com/swift-server/async-http-client/pull/227/files#diff-bd03a2521c79f3da4bd5825512ca9979R1761-R1780
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In RediStack I log by default with my own metadata and label. If you pass me a logger, then I will use your logger with my own metadata being attached to it with the label you gave it.
Right now I'm leaving it up to Logging backends to provide a way to disable RediStack as a system
In this way, RediStack's logs blend into your logs, so you don't necessarily realize that the library is even providing logs, depending on the log level you're set to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mordil but in RediStack at the moment you set the the logging on the client. That won't work if you have two concurrent requests or am I missing something.
As an example
thread1 { logger in
let requestID = UUID()
var logger = logger
logger["my-req-id"] = "\(requestID)"
redisClient.setLogger(to: logger)
redisClient.send(command: foo)
}
thread2 { logger in
let requestID = UUID()
var logger = logger
logger["my-req-id"] = "\(requestID)"
redisClient.setLogger(to: logger)
redisClient.send(command: bar)
}
Now, probably the metadata is actually mixed up if thread1 and 2 run concurrently.
I think you'll need to switch to passing the logger with the request, like so
thread1 { logger in
let requestID = UUID()
var logger = logger
logger["my-req-id"] = "\(requestID)"
redisClient.send(command: foo, logger: logger)
}
thread2 { logger in
let requestID = UUID()
var logger = logger
logger["my-req-id"] = "\(requestID)"
redisClient.send(command: bar, logger: logger)
}
or am I missing somethign?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the logging implementation is a bit naïve, as I was sloppy and rushed an implementation that solved a use case for Vapor that matched how Vapor wanted to do things (at the time).
I have this bug against RediStack that I'm going to use for addressing it - but I was going to wait until seeing the AHC implementation go through so I can take notes from it: https://gitlab.com/Mordil/RediStack/-/issues/79
The point I was making is that my intention is RediStack always logs, but the user can override the label and provide their own metadata that RS will attach 1 or 2 pieces to.
09563c1
to
fec94fd
Compare
One thing to consider when you use metadata in log messages from a framework: you might want to make sure to avoid clashes of the key names with other frameworks or the app itself. This is a common issue for log aggregators, especially when collecting logs from many apps. For example ELK proposes a common schema to follow: https://www.elastic.co/guide/en/ecs/current/index.html
For example “connection-id” could be a very generic name that can easily mean sth Else depending on app or context.
… Am 26.05.2020 um 19:25 schrieb Johannes Weiss ***@***.***>:
@weissi commented on this pull request.
In Sources/AsyncHTTPClient/ConnectionPool.swift:
> switch action {
case .lease(let connection, let waiter):
// if connection is became inactive, we create a new one.
connection.cancelIdleTimeout().whenComplete { _ in
if connection.isActiveEstimation {
+ logger.debug("reusing active connection \(connection)")
@tachyonics that's what I was thinking but on the decision whether to still use static messages everywhere (or not, or have static prefixes, or ...) I was planning to stand on the shoulders of you giants who have more operational expertise than I do.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Fundamentally, AsyncHTTPClient performs a simple task, it makes a HTTP request and communicates the outcome back via its API. In normal usage, we would not expect people to want AsyncHTTPClient to log. In certain scenarios, for example when debugging why a request takes longer than expected it may however be useful to get information about AsyncHTTPClient's connection pool. That is when enabling logging may become useful. | ||
</details> | ||
<details> | ||
<summary>Each request will get a globally unique request ID (`ahc-request-id`) that will be attached (as metadata) to each log message relevant to a request.</summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we offload the burden of setting unique ahc-request-id
's to a client? We already accept a logger for every method, for the client it'll be extremely simple to set a unique uuid in it. UUID creation is not free (and when passed in metadata it's actually materialised unless some custom code is written), I think we'll be fine with just giving client a way to set a unique uuid per http request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Such "at beginning inject some metadata to context (logger metadata)" is what the baggage context infrastructure is being designed for.
Specifically "if no trace header detected, make a new span id and store it" is the most common implementation of a tracer performing an "extract (from headers to context). So that's something a tracer / baggage context "instrument" (we don't have settled names for those yet) would be doing...
If you want to do something ad hoc for now that's also ok, but let's keep an eye out for the overlap here and how the context infra would be used to achieve the same (and more, trace header extraction) in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a really good point - we should design this feature with an eye towards the distributed tracing work we kicked off in GSoC. to be clear, we should be careful not to hold back progress here, but we should design this in a way that can be evolved into full distributed tracing.
one option to explore could be some kind of a "request Id generator" closure that the user can inject, so later we can plug to use the span id from the distributed tracing library, or something along these lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we might be okey with doing "whatever is easiest" here for the time being and once the baggage baggage context matures I'm expecting it'd be medium effort to plug the right things anyway, and likely not "just" change one generator for another.
So let's just keep in mind that this part is very likely to change, and let's do so when the other lib is ready.
// cc @slashmo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may be missing something, but why do we ever want to generate ids inside the library?
It can be achieved with a wrapper (since it can take Logger
, inject some variables and pass it to a delegate). So does the tracing or whatever keys we want to pass inside. Basically, we can provide a wrapper that solves the problem and shelve this work until we know how tracing API will look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@avolokhov / @ktoso / @tomerd The reason I went ahead adding a ahc-request-id
(which is just an atomically incrementing Int
) is that without it, correlating the log messages is impossible. Sure, the user can add an ID but if without adding one in AHC, they have to do it to make sense of the log messages. I thought that's too much effort required to make use of the log messages and therefore went ahead and just generated one. In this case, the ID is very cheap because I just increment a global atomic int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like ad-hoc solving things to be honest... but since we don't have the "real thing" which the context and "middleware" which can inject/extract into it I guess that keeps happening (like myself, also manually injecting some metadata in my libraries...).
I guess it's fine but we do expect to change things in the future if/when context propagation things land.
docs/logging-design.md
Outdated
|
||
As you can see above, the log message doesn't actually contain the request or the network connection. Both of those pieces of information are in the `metadata`. | ||
|
||
The rationale is that many people use log aggregation systems where it may be very useful to aggregate by log message which is much simplified by using a constant string instead of one with the information (such as the connection) in the middle of the log message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may also worth mentioning log filtering, from comments above it looks like more common use case.
For example, AsyncHTTPClient may tell you in two log messages that it `got a connection` (from the connection pool) and a little later that it's `parking connection` (in the connection pool). Just like all messages, both of them will have an associated `ahc-request-id` which makes it possible to correlate the two log messages. The message that logs that we actually got a network connection will also include information about this network connection. The message that we're now parking the connection however _will not_. The information which connection is being parked can be found by filtering all other log messages with the same `ahc-request-id`. | ||
</details> | ||
<details> | ||
<summary>In `trace`, AsyncHTTPClient may log _a lot_.</summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Good
docs/logging-design.md
Outdated
<details> | ||
<summary>In `trace`, AsyncHTTPClient may log _a lot_.</summary> | ||
In the `.trace` log level, AsyncHTTPClient basically logs all the information that it has handily available. The frugality considerations we take in `.debug` do not apply here. We just want to log as much information as possible. This is useful almost exclusively for local debugging and should almost certainly not be sent into a log aggregation system where the information might be persisted for a long time. This also means, handing AsyncHTTPClient a logger in `logLevel = .trace` may have a fairly serious performance impact. | ||
</details> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the past I ran into situations where http clients and proxies had to include log redacting logic to avoid logging potentially sensitive information. usually some heuristics or configuration around headers and metadata keys. this is always nasty to need and do, so something to think about when we decide what information to include when we log, or if we need to offer some kind of extensibility mechanism to allow the user to control such
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tomerd Very good point! What do you propose? Including in the labels if they may contain private info or not? Or having a per-HTTPClient
switch to switch on/off logging of private info?
Thanks @t089 ! I totally agree. For now, I was prefixing (most of) them with
Drawbacks of the explicit nesting:
{ "message": "connection established", "ahc": { "connection-id": "123", "request-id": "456" } } See how the logging backend now needs to merge the two dictionaries in a nested way? I think I'm in favour of doing ad-hoc nesting like |
dfb9158
to
0396669
Compare
7f31474
to
e57bd6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
26baf4f
to
17ef901
Compare
17ef901
to
06eba62
Compare
@@ -421,7 +586,7 @@ public class HTTPClient { | |||
/// `HTTPClient` configuration. | |||
public struct Configuration { | |||
/// TLS configuration, defaults to `TLSConfiguration.forClient()`. | |||
public var tlsConfiguration: TLSConfiguration? | |||
public var tlsConfiguration: Optional<TLSConfiguration> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it some auto formatting went wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@avolokhov No, sadly in classes/structs, Optional<X>
is different to X?
.
Optional<X>
needs to be manually initialised ininit
(or else compiler error)X?
magically gets pre-initialised tonil
first. So if you forget to set it ininit
you will not get a compiler error :'( . Also it's slightly less efficient because if you do set it, it gets initialised twice. I know it doesn't make any sense but it is true: SR-11777
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😜
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sadly, we've got a problem: nicklockwood/SwiftFormat#636
SwiftFormat wants to force me to use the wrong one. I think I need to learn how to configure SwiftFormat to not do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nicklockwood responded super quickly, so I added --disable typeSugar
which fixes this and leaves the Optional<Foo>
s alone.
06eba62
to
5b00fcf
Compare
cf11470
to
1bc8c3f
Compare
Coming back to the
So that's a quite deep topic in context propagation / tracing libs... Merging/deep-merging is required in the general case, but I feel we're too early to make calls on specifics here... What I'd actually love to do and pull off is proper "baggage context" as in the tracingplane papers, where nesting is done, but is done using numeric identifiers and storage is compressed... but we're a far cry away from such things... TL;DR; let's do the simple things and just prefix for now. For what it's worth, I've been nesting keys like: |
@@ -9,5 +9,6 @@ | |||
--patternlet inline | |||
--stripunusedargs unnamed-only | |||
--ranges nospace | |||
--disable typeSugar # https://github.com/nicklockwood/SwiftFormat/issues/636 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Annoying that it did so, but great that quick turnaround on the ticket :)
taskEventLoop: EventLoop, | ||
deadline: NIODeadline?, | ||
setupComplete: EventLoopFuture<Void>, | ||
logger: Logger) -> EventLoopFuture<Connection> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Separate thread, as this is now about the backgroundActivityLogger
I do not think this is the best way to handle separate loggers for separate parts of an app; we lose control over the label which feels quite weird; I would rather expose settings on the client that can configure the desired log level and metadata. @weissi disagrees, since logger is level + metadata
but it also is label IMO, and it also feels somewhat wrong to just keep exposing all loggers we might potentially need -- there can be many loggers in a "subsystem" which share a log level and metadata, but they do not share label.
In this specific http client it's all so "single purpose" and small enough that things kind of work out fine, but I don't agree this is a good general pattern. Perhaps we'll need to allow other constructors to Logger()
, we'll see, but present-day swift-log and "always pass in a logger" for everything I do not think mix well in non trivial cases.
Having that said... we do anticipate bigger API changes in this entire library if/when context propagation is mature enough... we'd want to deprecate all APIs accepting a logger then IMO, with that I believe @weissi agrees. So... whatever we do here is likely to change.
So it's a decision of an API experiment in this library that we know we'll change; or punting and doing nothing.
In this small library there seems to be basically "2" logger patterns, so this does not look too crazy, but in general this is not a pattern I'd endorse in general... Here it's impact is small enough that it matters perhaps less.
To spell out what the alternative is, and more scalable to bigger libs IMHO: the settings/configuration object having a logging.someSubsystem = .debug
etc settings, rather than setting the logger.
In the end... since this likely is not going to be the final shapes of the APIs and we need to build and learn more about baggage
/ context
... if you want to push forward with this but we do expect to change it around in the future once we learn and build context, I'm ok with this PR as is, but I do not buy the general argument that it's a great pattern for all libraries do adopt.
Nitpick, if we end up going ahead with this:
it's a bit inconsistently named here... one can claim the user should "guess it" but if we stick to boringly naming it explicitly in other public API it should be boring named here as well:
func getConnection(..., logger: Logger) -> EventLoopFuture<Connection> {
// I could "guess" if I'm thinking smart here that that'll be the backgroundActivityLogger
// but it's not documented nor the param implying that; (I can "guess" if I know that the background
// one is for the connection pool... but I don't think that should require guesswork)
it's not clear from the name or docs (none for the param) that that's the "background one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To Konrad's point, we're losing an ability to change log level for a subsystem: imagine you have a big project, and some particular part of it misbehaves. With only one logger passed into every library, changing a log level will be all-or-nothing. Given logging is not free, we may make it impossible to enable debug logging for heavily enough loaded production systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been meaning to ask this question as well, are we sure that passing loggers around is a good idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we sure that passing loggers around is a good idea?
I think more context is important to this question (no pun intended), as there's somewhat 2 cases I believe:
- I have some context (baggage / trace id / logger metadata) and want the lib to keep using the same as I have, e.g. a http client should "carry my trace identifiers" that this request is part of
- ^ where there has to be some form of passing around; I hope this would be addressed by the ongoing gsoc work eventually;
- "the (sub-)system" does some stuff and I want it to log things; it's not part of my "workflow" / "trace" / "request hot path", I'd want to enable "background processes logging"
- ^ where I'd see it as a setting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@artemredkin / @avolokhov The question is what is a logger? From SwiftLog's point of view a logger is label+metadata+logLevel+logging backend. Nothing else.
The logging backend is mostly a global constant (the configured LoggingSystem.factory
). So really it's label+metadata+logLevel.
- We all agree (I think) that we need to pass metadata around.
- I think it's also important being able to control the log level of a sub-system that may be logging
- I think we also agree that given that we have the option of using non-standard logging backends, we need to pass that to libraries.
- It seems that many people want
label
to be controlled by the sub-system (because it's used as "source information")
If we unpack this:
- we would like to pass metadata around
- we would like to control the log-level (ie. pass it around)
- we would like to pass the logging backend around
- we would like the sub-system to control the label
The first three is pretty much everything that you achieve passing the Logger
around, it's no different to passing the metadata and the log level around.
The sticking point seems to be label
. In SwiftLog, when passing around a Logger
, you also pass around the label
and you can't change it. The label itself was added quite late in the design of SwiftLog and I never cared about it much because I mostly use one root-logger (label: "my-app-name"
) and the go from there, passing that to everything. Adjusting metadata/logLevel as needed which works fine because Logger
is a value type regarding metadata/log level.
The thing that is missing today is that we can't override the label
. So I'm thinking what would be needed is either:
func subSystemCall(logger: Logger) {
var logger = logger
logger[metadataKey: "sub-system-req-id"] = "\(UUID())"
logger.label = "com.my.sub.system"
}
or
func subSystemCall(logger: Logger) {
var logger = Logger(label: "com.my.sub.system", delegating: Logger)
logger[metadataKey: "sub-system-req-id"] = "\(UUID())"
}
which are kinda equivalent.
Due to what I consider a design mistake in SwiftLog, we're actually handing label
to the logging backend on construction. So we can't actually just change it :(.
What we could do is when the label is being changed, re-construct the logging backend with the changed label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@avolokhov I think we can sample at the log handler level, no? We don't have to update it in lots of places, just say SamplingLogHandler(wrapping: RealLogHandler, sampling: ["label1", "label2", rate: 500])
, or even have a parser so we can configure it in the properties without recompiling anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not if we want to sample a full request trace. We can workaround with setting a special field in metadata and then filter based on this field, but it's not very efficient and looks like a workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@artemredkin (sorry on phone, can’t get quoting to work)
Regarding source
more along the lines of: We’d offer a new type with (tb bikeshed) name LoggerWithSource
which is a struct containing Logger + source. So you’d do
let logger = LoggerWithSource(logger, “com.foo.bar”)
logger.info(“hello”) // would log with source=com.foo.bar instead of MyModule
LoggerWithSource would have the same API as Logger. Obviously you’d never hand a LoggerWithSource to a subsys. If you called a subsys, you’d do subSys(logger: loggerWithSource.logger)
to strip off the source again.
This way you should give you all the benefits of having automatic source as well as correctly passing all information around.
regarding custom backends
I think custom backends are really powerful. For example lets assume I’ve got an annoying library that logs way too much. What I could do is pass it a Logger with an in-memory backend. Then I’d run that annoying library and if it succeeds, I drop all the in-memory messages. But if it fails, I forward all the log messages into my actual log storage. That way, I could get all the messages but only if there was a failure.
regarding label
Label is an identifier for a ‘root logger’. It doesn’t tell you where the log message is originating from (unless you create one Logger per “source”). If you create one logger per source, you lose about every feature SwiftLog has:
- Metadata doesn’t get propagated (unless you manually attach a metadata bundle to every message)
- logLevel cannot be set which is super useful for sampling a whole request
- custom logging backends become impossible
Sure, you can get those features back if you propagate each of those things separately but then your users need to propagate 4 things instead of 1. Logger is literally just a bunch of stuff that can get propagated.
regarding class metadata
Why don’t you attach it to the Logger when you receive it?
regarding file
Yes, it’s super ugly. And it’s brittle. Unfortunately, there is no #module
which would provide a good default value. With SwiftPM however we can kinda find the module name from #file
. It’s not a 100% correct solution but probably works for almost all packages in existence (either take the second last path component or something slightly more clever).
If we do add source
(which is totally not decided or fully designed) we could default it to module (from file path) and if that doesn’t work well for your package the you can change It (with something like LoggerWithSource).
How does that sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's talk about the "source" things on the swift-log repo; it has some potential 👍 I'll have to try it out a bit to see if the new struct gets annoying or not too much, but it could work out.
Not to block this PR though, let's move there and PR can do what it wants for the time being I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That topic moved to: apple/swift-log#135
Fundamentally, AsyncHTTPClient performs a simple task, it makes a HTTP request and communicates the outcome back via its API. In normal usage, we would not expect people to want AsyncHTTPClient to log. In certain scenarios, for example when debugging why a request takes longer than expected it may however be useful to get information about AsyncHTTPClient's connection pool. That is when enabling logging may become useful. | ||
</details> | ||
<details> | ||
<summary>Each request will get a globally unique request ID (`ahc-request-id`) that will be attached (as metadata) to each log message relevant to a request.</summary> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels like ad-hoc solving things to be honest... but since we don't have the "real thing" which the context and "middleware" which can inject/extract into it I guess that keeps happening (like myself, also manually injecting some metadata in my libraries...).
I guess it's fine but we do expect to change things in the future if/when context propagation things land.
1bc8c3f
to
ccc5d13
Compare
Motivation: AsyncHTTPClient is not a simple piece of software and nowadays also quite stateful. To debug issues, the user may want logging. Modification: Support passing a logger to the request methods. Result: Debugging simplified.
ccc5d13
to
cf7f367
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm generally ok with this PR, there are some questions about the Logging patterns, but I'm happy to "go with the flow here" I guess.
For what it's worth I'm ok with this PR because we anticipate that things will be changing in the future (because "context" work and because potential changes in swift-log). Let's continue context and "source" discussions on the swift log repo though 👍 |
@ktoso / @artemredkin / @avolokhov I see you all approved this PR. Should we merge this and iterate from there or are there any reservations still? |
I'm ok with merging and moving forward here; don't mean to block all progress in various repos until tracing work has gotten to a usable state after all. I do think all those API calls will end up deprecated then though, but I believe we're in agreement on that. |
Same here, sorry @weissi for taking so long on this PR! |
Sounds good! So I'm pretty sure once baggage/context/tracing/... lands the
No apology needed, we needed to discuss all this. |
I'm happy with merging. |
Motivation:
AsyncHTTPClient is not a simple piece of software and nowadays also
quite stateful. To debug issues, the user may want logging.
Modification:
Support passing a logger to the request methods.
Result:
Debugging simplified.
Example
.debug
logs:Example
.trace
logs: