
MallocMS page accounting #689


Merged · 15 commits · Nov 7, 2022

Conversation

@qinsoon (Member) commented Oct 27, 2022

This PR changes how our malloc mark sweep accounts for memory. We currently account for bytes used, which is wrong; this PR switches to page-based accounting. It addresses issue #649 for malloc mark sweep (further work is still needed for our malloc API). A sketch of the page-accounting idea follows the list below.

  • By default, we do not use bulk XOR for malloc mark sweep (which is known to have issues with page accounting).
  • Allow a different malloc page size for library mallocs.
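For context, page-based accounting charges each allocation for every page it touches rather than for its raw byte size. A minimal sketch of the conversion (constants and helper names here are illustrative, not mmtk-core's actual API):

const LOG_BYTES_IN_PAGE: usize = 12; // 4 KiB pages; the PR makes this configurable per malloc
const BYTES_IN_PAGE: usize = 1 << LOG_BYTES_IN_PAGE;

/// Number of pages an allocation at `addr` spanning `size` bytes touches.
/// Even an object smaller than a page counts as two pages if it crosses
/// a page boundary, which is why byte counting under-reports real usage.
fn pages_touched(addr: usize, size: usize) -> usize {
    let first_page = addr >> LOG_BYTES_IN_PAGE;
    let last_page = (addr + size - 1) >> LOG_BYTES_IN_PAGE;
    last_page - first_page + 1
}

fn main() {
    // 100 bytes straddling a 4 KiB boundary occupy 2 pages, not 100 bytes.
    assert_eq!(pages_touched(BYTES_IN_PAGE - 50, 100), 2);
}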

@qinsoon added the PR-testing label on Oct 27, 2022
@k-sareen (Collaborator) commented Oct 27, 2022

I have a minor fix for a concurrency bug I noticed. Should I make the PR instead?

EDIT: My fix actually simplifies some of the PR as it removes the extra parameter from initialize_object_metadata. See here: k-sareen@ab2c758

@qinsoon (Member, Author) commented Oct 27, 2022

> I have a minor fix for a concurrency bug I noticed. Should I make the PR instead?

Sure. You could also commit to this PR if you would like to.

@k-sareen (Collaborator) commented Oct 27, 2022

> > I have a minor fix for a concurrency bug I noticed. Should I make the PR instead?
>
> Sure. You could also commit to this PR if you would like to.

Okay will do 👍

EDIT: Hmm, @qinsoon, I think the PR is from your personal mmtk-core fork, so I don't know how I can push to it.

Commit: Use a compare_exchange for accessing active page metadata instead of a load followed by a store
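For context, the race this commit avoids: with a separate load and store, two threads can both observe a page as unmarked, both mark it, and both bump the active-page counter. A compare_exchange lets exactly one thread win. A minimal sketch of the pattern (types and names are illustrative, not mmtk-core's API):

use std::sync::atomic::{AtomicU8, AtomicUsize, Ordering};

struct PageAccounting {
    active_pages: AtomicUsize,
}

impl PageAccounting {
    /// Racy version: two threads can both load 0, both store 1, and both
    /// increment the counter, double-counting the page.
    fn mark_page_load_store(&self, mark: &AtomicU8) {
        if mark.load(Ordering::SeqCst) == 0 {
            mark.store(1, Ordering::SeqCst);
            self.active_pages.fetch_add(1, Ordering::SeqCst);
        }
    }

    /// Fixed version: compare_exchange guarantees exactly one thread
    /// transitions the mark from 0 to 1, so the page is counted once.
    fn mark_page_cmpxchg(&self, mark: &AtomicU8) {
        if mark
            .compare_exchange(0, 1, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok()
        {
            self.active_pages.fetch_add(1, Ordering::SeqCst);
        }
    }
}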
@k-sareen changed the title from Malloc ms page accounting to MallocMS page accounting on Oct 27, 2022
@k-sareen (Collaborator) commented

@qinsoon It's an OOM error for JikesRVM as we've changed how the memory accounting works. We should just be able to bump the heap size and it should pass (theoretically).

@qinsoon (Member, Author) commented Oct 28, 2022

> @qinsoon It's an OOM error for JikesRVM as we've changed how the memory accounting works. We should just be able to bump the heap size and it should pass (theoretically).

Right. I am doing that in mmtk/mmtk-jikesrvm#128.

@k-sareen (Collaborator) commented

Ah right -- missed that PR. Thank you 👍

@qinsoon (Member, Author) commented Oct 28, 2022

binding-refs
JIKESRVM_BINDING_REPO=qinsoon/mmtk-jikesrvm
JIKESRVM_BINDING_REF=update-pr-689

@qinsoon removed and re-added the PR-testing label on Oct 28, 2022
@wks (Collaborator) left a review


I identified a performance issue. If an object is large (more than one page), multiple metadata bits need to be set, and the current implementation does this inefficiently. See the in-line comments for possible solutions.

// It is important to go to the end of the object, which may span a page boundary
while page < start + size {
    if compare_exchange_set_page_mark(page) {
        self.active_pages.fetch_add(1, Ordering::SeqCst);
    }
    page += BYTES_IN_PAGE;
}
@wks (Collaborator) commented on the diff


The cmpxchg and the fetch_add happen as two separate operations. One possible optimisation is to work out locally how many page marks are set in this loop, and do a single fetch_add(n) after the loop.

@qinsoon (Member, Author) replied


Fixed. Now I am using a local variable to count the pages and doing one fetch_add at the end (see the sketch below). I also did the same for unset_page_mark.
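A minimal sketch of that batched pattern, with the side-metadata helper stubbed out (compare_exchange_set_page_mark, the constant, and the signature here are illustrative, not the PR's exact code):

use std::sync::atomic::{AtomicUsize, Ordering};

const BYTES_IN_PAGE: usize = 1 << 12;

/// Illustrative stand-in for the side-metadata helper in the diff:
/// returns true iff this call transitioned the page mark from unset to set.
fn compare_exchange_set_page_mark(_page: usize) -> bool {
    true // placeholder; the real helper does an atomic cmpxchg on metadata
}

/// Mark every page overlapping [start, start + size), counting locally,
/// then publish the total with one fetch_add instead of one per page.
fn set_page_marks(active_pages: &AtomicUsize, start: usize, size: usize) {
    let mut newly_marked = 0;
    let mut page = start & !(BYTES_IN_PAGE - 1); // align down to the page start
    while page < start + size {
        if compare_exchange_set_page_mark(page) {
            newly_marked += 1;
        }
        page += BYTES_IN_PAGE;
    }
    if newly_marked > 0 {
        active_pages.fetch_add(newly_marked, Ordering::SeqCst);
    }
}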


let mut page = chunk_start;
while page < chunk_start + BYTES_IN_CHUNK {
    self.unset_page_mark(page);
    page += BYTES_IN_PAGE;
}
@wks (Collaborator) commented on the diff


Same here. You probably don't want to unset one bit at a time.

@qinsoon (Member, Author) replied


I changed it to unset_page_mark(start, size). It does something similar to set_page_mark(), and uses a bulk zero at the end.
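A sketch of what such a bulk unset could look like, with the metadata helpers stubbed out (is_page_marked and bzero_page_marks are illustrative stand-ins, not mmtk-core's actual API):

use std::sync::atomic::{AtomicUsize, Ordering};

const BYTES_IN_PAGE: usize = 1 << 12;

fn is_page_marked(_page: usize) -> bool {
    true // placeholder for a side-metadata load
}

fn bzero_page_marks(_start: usize, _size: usize) {
    // placeholder for bulk-zeroing the page-mark metadata for the range
}

/// Unset the page marks for [start, start + size): count how many were
/// set, clear the whole metadata range in one bulk zero, then subtract
/// the count from the shared counter with a single fetch_sub.
fn unset_page_mark(active_pages: &AtomicUsize, start: usize, size: usize) {
    let mut cleared = 0;
    let mut page = start;
    while page < start + size {
        if is_page_marked(page) {
            cleared += 1;
        }
        page += BYTES_IN_PAGE;
    }
    bzero_page_marks(start, size);
    if cleared > 0 {
        active_pages.fetch_sub(cleared, Ordering::SeqCst);
    }
}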

@wks (Collaborator) left a review


It looks good to me now. I don't see any more problems.

@qinsoon (Member, Author) commented Nov 1, 2022

The JikesRVM binding keeps failing. I will look at that tomorrow.

@k-sareen (Collaborator) commented Nov 2, 2022

> The JikesRVM binding keeps failing. I will look at that tomorrow.

Yeah, I'm not so sure about the bzero for the page metadata. I think it's correct, but it's harder to reason about than setting/unsetting each bit. Since we're going through each page to update the local variable regardless, we may as well change the is_page_marked() to a compare_exchange_unset_page_mark(), which might fix the issue with the JikesRVM binding as well.

@qinsoon (Member, Author) commented Nov 3, 2022

> > The JikesRVM binding keeps failing. I will look at that tomorrow.
>
> Yeah, I'm not so sure about the bzero for the page metadata. I think it's correct, but it's harder to reason about than setting/unsetting each bit. Since we're going through each page to update the local variable regardless, we may as well change the is_page_marked() to a compare_exchange_unset_page_mark(), which might fix the issue with the JikesRVM binding as well.

That seems to have been the problem, though I don't know why the bulk zero does not work. I have reverted that part of the code.
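For illustration, a sketch of the per-page compare-exchange unset that the conversation above converges on (compare_exchange_unset_page_mark is a stand-in for the real side-metadata helper; constants and signatures are assumptions):

use std::sync::atomic::{AtomicUsize, Ordering};

const BYTES_IN_PAGE: usize = 1 << 12;

/// Stand-in for the real side-metadata helper: returns true iff this
/// call atomically transitioned the page mark from set to unset.
fn compare_exchange_unset_page_mark(_page: usize) -> bool {
    true
}

/// Unset marks page by page; only a successful cmpxchg is counted, so
/// two racing unsets can never decrement the counter for the same page.
fn unset_page_marks(active_pages: &AtomicUsize, start: usize, size: usize) {
    let mut cleared = 0;
    let mut page = start;
    while page < start + size {
        if compare_exchange_unset_page_mark(page) {
            cleared += 1;
        }
        page += BYTES_IN_PAGE;
    }
    if cleared > 0 {
        active_pages.fetch_sub(cleared, Ordering::SeqCst);
    }
}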

@qinsoon removed and re-added the PR-testing label on Nov 4, 2022
@qinsoon (Member, Author) commented Nov 7, 2022

After fixing the issue above, I am still seeing frequent failures, but I suspect it is the same problem as mmtk/mmtk-jikesrvm#108 (which is not reproducible on our dev machines). As this PR changes how we account for memory in malloc mark sweep, and we actually increase the min heap size for benchmarks, GCs are triggered more frequently and therefore failures appear more frequently. I changed the heap size we use for malloc mark sweep in JikesRVM to mitigate the issue.

@qinsoon merged commit 83fa8ec into mmtk:master on Nov 7, 2022
qinsoon added a commit to mmtk/mmtk-jikesrvm that referenced this pull request on Nov 7, 2022:

This PR updates mmtk-core to mmtk/mmtk-core#689. With the changes in mmtk-core, mark sweep does page accounting and will require a larger heap size to run some benchmarks.