Skip to content

Don't lock up 256KiB buffers when adding small files #4508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 31, 2017

Conversation

Stebalien
Copy link
Member

@Stebalien Stebalien commented Dec 20, 2017

This is part of #4505.

It's WIP because we still need to use some form of buffer pool. As is, allocating and throwing away 256KiB buffers is killing GC.

@ghost ghost assigned Stebalien Dec 20, 2017
@ghost ghost added the status/in-progress In progress label Dec 20, 2017
License: MIT
Signed-off-by: Steven Allen <[email protected]>
License: MIT
Signed-off-by: Steven Allen <[email protected]>
License: MIT
Signed-off-by: Steven Allen <[email protected]>
@Stebalien Stebalien changed the title [WIP] Don't lock up 256KiB buffers when adding small files Don't lock up 256KiB buffers when adding small files Dec 20, 2017
@Stebalien
Copy link
Member Author

So, this uses the buffer pool from go-msgio. Should we break that into a separate package? It's used all over the place.

I'd use a chunker-specific buffer pool but we allow different sized chunkers...

kevina
kevina previously approved these changes Dec 20, 2017
Copy link
Contributor

@kevina kevina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might cause a slight slow down due to the extra copying, especially when the buffer is almost pull, but in practice I don't think this will be a problem.

Looks like the P.R. was updated when I wrote this. Need to have another look.

@kevina kevina self-requested a review December 20, 2017 03:22
@whyrusleeping
Copy link
Member

It would be great if we could figure out a way to put all these buffers back into the buffer pool in the common case

License: MIT
Signed-off-by: Steven Allen <[email protected]>
@kevina
Copy link
Contributor

kevina commented Dec 20, 2017

( @Stebalien for future reference, I find it annoying when people do a rebase that makes a major change. (In this case using the mpool from go-msgio). At least I would appreciate it if you just add new commits and save rebasing for fixups. )

@Stebalien
Copy link
Member Author

It would be great if we could figure out a way to put all these buffers back into the buffer pool in the common case.

That would be nice but would require some form of refcounting on blocks (at least given the current interfaces).

( @Stebalien for future reference, I find it annoying when people do a rebase that makes a major change. (In this case using the mpool from go-msgio). At least I would appreciate it if you just add new commits and save rebasing for fixups. )

The new commit completely changed my approach so I figured I'd just do it over. I could have opened a new PR but I figured nobody had had looked at this one yet (that's when I start trying to preserve history). Sorry about that.

@kevina
Copy link
Contributor

kevina commented Dec 20, 2017

@Stebalien How is using a memory pool helping here? In the original code before the rebase you reused the 256Kib buffer unless the buffer was full. You are doing the same basic thing here. I don't think it is worth the extra dependency unless we provide (and use) a method to return the returned byte to the pool.

@kevina
Copy link
Contributor

kevina commented Dec 20, 2017

More to the point, it looks like you are just using the mpool as a better allocator, is the default go allocator so bad that it helps to the a custom run?

Or is it the case that NewSizeSplitter is called for each new file, in which case I withdraw my objection.

@kevina kevina dismissed their stale review December 20, 2017 04:01

No Longer Applies.

@Stebalien
Copy link
Member Author

Or is it the case that NewSizeSplitter is called for each new file, in which case I withdraw my objection.

It is. I'm using it for the case where we add a bunch of small files. Without this, memory usage is still unstable because the GC lags a bit.

The alternative is to re-use the splitter somehow. I toyed with a reset function but that would be a more invasive change.

@kevina
Copy link
Contributor

kevina commented Dec 20, 2017

It is. I'm using it for the case where we add a bunch of small files. Without this, memory usage is still unstable because the GC lags a bit.

Ok.

I was going to say what's one more dependency but:

So, this uses the buffer pool from go-msgio. Should we break that into a separate package? It's used all over the place.

Yes i think we should. I am not particularly happy with bringing in a whole package just to use one of its utility sub-packages.

Other than that it LGTM.

@whyrusleeping whyrusleeping merged commit 8f17968 into master Dec 31, 2017
@whyrusleeping whyrusleeping deleted the fix/add-small-files branch December 31, 2017 22:29
@ghost ghost removed the status/in-progress In progress label Dec 31, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants