Skip to content

Fix reference processing crash #309

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 12, 2025
Merged

Fix reference processing crash #309

merged 2 commits into from
May 12, 2025

Conversation

wks
Copy link
Contributor

@wks wks commented May 9, 2025

Fixed a bug that when enqueuing references, it swapped the reference pending list with the last node of the new linked list. It should swap with the first node instead.

Worked around a fact that the list given by the method ReferenceGlue::enqueue_references may contain duplicated elements. This is because the reference processor in mmtk-core was designed to work with both of the two use patterns: (1) registering references when created, and (2) discovering references while tracing. Therefore, the reference processor kept the list of Reference instances discovered in the previous GC (in the from-space). But as the OpenJDK binding scans new Reference instances, it will add to-space addresses of the same Reference instances to the reference processor. After the references are processed, both pointers will be normalized to the two-space copy, but as two different entries. We work around this problem in the OpenJDK binding by simply skipping visited entries.

Fixed a bug that when enqueuing references, it swapped the reference
pending list with the last node of the new linked list.  It should swap
with the *first* node instead.

Worked around a fact that the list given by the method
`ReferenceGlue::enqueue_references` may contain duplicated elements.
This is because the reference processor in mmtk-core was designed to
work with both of the two use patterns: (1) registering references when
created, and (2) discovering references while tracing.  Therefore, the
reference processor kept the list of Reference instances discovered in
the previous GC (in the from-space).  But as the OpenJDK binding scans
new Reference instances, it will add to-space addresses of the same
Reference instances to the reference processor.  After the references
are processed, both pointers will be normalized to the two-space copy,
but as two different entries.  We work around this problem in the
OpenJDK binding by simply skipping visited entries.
@wks
Copy link
Contributor Author

wks commented May 9, 2025

An interesting side effect of this PR is that lusearch in DaCapo Chopin runs faster with reference processing enabled.

Edit: This is not always true. If I increase the heap size, GenImmix will become slower with ref proc. But StickyImmix is still faster with ref proc. The jython benchmark is the opposite: it is always faster without ref proc.

wks@luna ~/o/dacapo> MMTK_NO_REFERENCE_TYPES=false MMTK_PLAN=GenImmix taskset -c 0-15 ~/projects/mmtk-github/parallels/fix/refproc-discover/openjdk/build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseThirdPartyHeap --add-exports java.base/jdk.internal.ref=ALL-UNNAMED -Djava.library.path=$HOME/projects/mmtk-github/probes/out -Dprobes=RustMMTk -cp $HOME/projects/mmtk-github/probes/out/probes.jar:dacapo-23.11-MR2-chopin.jar -Xm{s,x}33M Harness -c probe.DacapoChopinCallback lusearch
Using scaled threading model. 16 processors detected, 16 threads used to drive the workload, in a possible range of [1,2048]
Version: lucene 9.7.0 (use -p to print nominal benchmark stats)
===== DaCapo 23.11-MR2-chopin lusearch starting =====
Starting 524288 requests...
Completing query batches: 100%
Completed requests
============================ MMTk Statistics Totals ============================
GC      time.other      time.stw        majorGC total-work.count        total-work.time.total   total-work.time.max     total-work.time.min
2884    2771.51 1827.13 450     0       0.000   -inf    inf
Total time: 4598.65 ms
------------------------------ End MMTk Statistics -----------------------------
===== DaCapo 23.11-MR2-chopin lusearch PASSED in 4598 msec =====
===== DaCapo processed 524288 requests in 4595 msec, 114099 requests per second =====
===== DaCapo tail latency, simple: 50% 30 usec, 90% 273 usec, 99% 1473 usec, 99.9% 4298 usec, 99.99% 17655 usec, max 52813 usec, measured over 524288 events =====
===== DaCapo tail latency, metered 100ms smoothing: 50% 216 usec, 90% 1857 usec, 99% 8366 usec, 99.9% 23957 usec, 99.99% 36975 usec, max 61184 usec, measured over 524288 events =====
===== DaCapo tail latency, metered full smoothing: 50% 234338 usec, 90% 358797 usec, 99% 373417 usec, 99.9% 376738 usec, 99.99% 377318 usec, max 382308 usec, measured over 524288 events =====
wks@luna ~/o/dacapo> MMTK_NO_REFERENCE_TYPES=true MMTK_PLAN=GenImmix taskset -c 0-15 ~/projects/mmtk-github/parallels/fix/refproc-discover/openjdk/build/linux-x86_64-normal-server-release/jdk/bin/java -XX:+UseThirdPartyHeap --add-exports java.base/jdk.internal.ref=ALL-UNNAMED -Djava.library.path=$HOME/projects/mmtk-github/probes/out -Dprobes=RustMMTk -cp $HOME/projects/mmtk-github/probes/out/probes.jar:dacapo-23.11-MR2-chopin.jar -Xm{s,x}33M Harness -c probe.DacapoChopinCallback lusearch
Using scaled threading model. 16 processors detected, 16 threads used to drive the workload, in a possible range of [1,2048]
Version: lucene 9.7.0 (use -p to print nominal benchmark stats)
===== DaCapo 23.11-MR2-chopin lusearch starting =====
Starting 524288 requests...
Completing query batches: 100%
Completed requests
============================ MMTk Statistics Totals ============================
GC      time.other      time.stw        majorGC total-work.count        total-work.time.total   total-work.time.max     total-work.time.min
3068    2864.54 2866.29 1941    0       0.000   -inf    inf
Total time: 5730.83 ms
------------------------------ End MMTk Statistics -----------------------------
===== DaCapo 23.11-MR2-chopin lusearch PASSED in 5731 msec =====
===== DaCapo processed 524288 requests in 5727 msec, 91546 requests per second =====
===== DaCapo tail latency, simple: 50% 32 usec, 90% 298 usec, 99% 1653 usec, 99.9% 4437 usec, 99.99% 17297 usec, max 47304 usec, measured over 524288 events =====
===== DaCapo tail latency, metered 100ms smoothing: 50% 99 usec, 90% 1670 usec, 99% 6810 usec, 99.9% 26532 usec, 99.99% 35641 usec, max 50804 usec, measured over 524288 events =====
===== DaCapo tail latency, metered full smoothing: 50% 50 usec, 90% 231101 usec, 99% 304568 usec, 99.9% 306834 usec, 99.99% 309198 usec, max 326663 usec, measured over 524288 events =====

Reference processing tests now incclude all test cases in
`ci-test-only-normal.sh`.
@wks wks marked this pull request as ready for review May 12, 2025 06:18
@wks wks requested a review from qinsoon May 12, 2025 06:19
@wks
Copy link
Contributor Author

wks commented May 12, 2025

@qinsoon I marked this as ready. Since it is a correctness fix, and we plan to move to OpenJDK's own ref processing, I'll not do detailed performance analysis now. It should have no performance impact if reference processing is not enabled. This PR does not change the default value of the option MMTK_NO_REFERENCE_TYPES.

Copy link
Member

@qinsoon qinsoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the fix.

@wks wks merged commit cf7879f into mmtk:master May 12, 2025
56 of 57 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants