-
Notifications
You must be signed in to change notification settings - Fork 37
fix memory buildup from JoinSet #229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Cool 😻 happy to add the dep! A few bigger picture thoughts for this PR:
I'm also tempted to switch out |
sounds good to me :) Only a couple of questions. I see the
cool, will do and consolidate on
👍 |
Yes that's what I was thinking, you create one |
I've added the changes as suggested to replace |
simln-lib/src/lib.rs
Outdated
self.tasks.close(); | ||
self.tasks.wait().await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the only place I called them alone to wait for any tasks and return
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pre-existing from the current design, but WDYT about having no task management in this method at all (except for producer_tasks
which is internal), and then just wrapping it in a thin method that does the task management for us?
Eg:
pub fn run() {
self.internal_run()
self.tasks.close()
self.tasks.wait()
}
fn internal_run() {
<this method logic>
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like it better with everything in run
but I don't have strong opinions so I can do the internal_run
if you prefer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like it better with everything in run
motivation?
My thinking was that it's a little more future proof to have an internal_run
just handle task cleanup, so that we don't run the risk of forgetting to handle tasks if we add a new exit point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
motivation?
no reason really 😅
My thinking was that it's a little more future proof to have an
internal_run
just handle task cleanup,
yeah that's why I thought about maybe putting them in the shutdown
method so that it could handle that cleanup too but shutdown
isn't always called. Anyways, I'm sold, I'll move the cleanup to an internal_run
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good!
Only one major comment about how we're going to clean up run
a bit 👍
simln-lib/src/sim_node.rs
Outdated
} | ||
} | ||
|
||
self.tasks.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we can get rid of this method completely and make a note that SimGraph
expects the caller to wait for TaskTracker
to close + complete? This method makes less sense in a world where we're sharing trackers IMO.
Ah, this is done in the last commit (reviewing one commit at a time). Could we move that into the first commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry I didn't clean up the commits. I was thinking this could all be squashed into one commit as it all encapsulates the change to TaskTracker
from JoinSet
or should I have 3 separate ones as I currently have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking this could all be squashed into one commit as it all encapsulates the change to TaskTracker from JoinSet
That sounds good! I was just looking at them individually because they weren't fixups, but that makes sense.
Why don't you go ahead and squash the next time you push, and then we can do fixups if there are any further changes? Just so it's easier to track in review.
simln-lib/src/lib.rs
Outdated
/// track all tasks spawned for use in the simulation. | ||
tasks: TaskTracker, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitnit: capitalize "track" and make a note that the run
function will wait for all tasks to complete so that people passing a tracker in that they're using elsewhere know that it'll be handled.
I think that we can put that both here and on run
because it's important.
simln-lib/src/lib.rs
Outdated
self.tasks.close(); | ||
self.tasks.wait().await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pre-existing from the current design, but WDYT about having no task management in this method at all (except for producer_tasks
which is internal), and then just wrapping it in a thin method that does the task management for us?
Eg:
pub fn run() {
self.internal_run()
self.tasks.close()
self.tasks.wait()
}
fn internal_run() {
<this method logic>
}
I've squashed everything into one commit and tried to address the comments from your last review (: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last few comments from me, sorry I didn't pick them up on the last review round!
simln-lib/src/lib.rs
Outdated
@@ -518,6 +518,10 @@ pub struct Simulation { | |||
activity: Vec<ActivityDefinition>, | |||
/// Results logger that holds the simulation statistics. | |||
results: Arc<Mutex<PaymentResultLogger>>, | |||
/// Track all tasks spawned for use in the simulation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comments can wrap at 100, here and a few other places
simln-lib/src/sim_node.rs
Outdated
@@ -682,24 +682,10 @@ impl SimGraph { | |||
Ok(SimGraph { | |||
nodes, | |||
channels: Arc::new(Mutex::new(channels)), | |||
tasks: JoinSet::new(), | |||
tasks: TaskTracker::new(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent with Simulation
, let's pass tasks
in here as well?
Even though we'll probably use a different constructor to run with a SimGraph
when we implement this internally, we probably want other use cases that are using sim-ln
as a library to have to provide (and manage) their own tasks
- just gives more flexibility.
simln-lib/src/lib.rs
Outdated
@@ -1351,7 +1341,7 @@ async fn produce_simulation_results( | |||
results: Sender<(Payment, PaymentResult)>, | |||
listener: Listener, | |||
) -> Result<(), SimulationError> { | |||
let mut set = tokio::task::JoinSet::new(); | |||
let set = TaskTracker::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we can pass the top level task set in here as well rather than having a new set?
This is pre-existing, sorry I didn't mention it earlier!
JoinSet was causing memory to build up while running because we were waiting until shutdown to clean those tasks with join_next. Using a TaskTracker because it will free the memory from the tasks immediately after they exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tACK on 3a94d61, thanks for picking this one up!
@@ -638,8 +638,9 @@ pub struct SimGraph { | |||
/// channels maps the scid of a channel to its current simulation state. | |||
channels: Arc<Mutex<HashMap<ShortChannelID, SimulatedChannel>>>, | |||
|
|||
/// track all tasks spawned to process payments in the graph. | |||
tasks: JoinSet<()>, | |||
/// track all tasks spawned to process payments in the graph. Note that handling the shutdown of tasks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: capitalize comment
/// run until the simulation completes or we hit an error. | ||
/// Note that it will wait for the tasks in self.tasks to complete | ||
/// before returning. | ||
pub async fn run(&self) -> Result<(), SimulationError> { | ||
self.internal_run().await?; | ||
// Close our TaskTracker and wait for any background tasks | ||
// spawned during internal_run to complete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: line wrapping on comments
I ended up taking a different approach from what I mentioned here #221 (comment) to fix the memory buildup.
This adds a new dependency
tokio-util
but since it's from tokio which is already used, I though it would be ok. If not, let me know and I can try something else.It uses a
TaskTracker
fromtokio-util
which works very similar to aJoinSet
but with the difference that it will free the memory from tasks when they are finished without having to call something likejoin_next
for aJoinSet
.From their docs https://docs.rs/tokio-util/latest/tokio_util/task/task_tracker/struct.TaskTracker.html#comparison-to-joinset:
I tested it with the scenario from here: https://github.com/carlaKC/sim-ln/tree/review-club-memleak and it's avoiding the memory buildup that was happening with the
JoinSet
.