[native-image] Native image takes more time than regular java application #974
Comments
This is most likely caused by: 1) the slow implementation of array copy that we use at the moment, and 2) that native-image is probably not inlining. We are currently working on a faster array copy. Let's see how well we perform after that fix is made.
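To check whether array copying matters for a given workload, a small comparison like the following can be built both as a regular Java application and as a native image. This is only a rough sketch: the class name `ArrayCopySketch`, the array size, and the round count are made up for illustration, and a hand-rolled timer like this is far less reliable than JMH.

```java
import java.util.function.UnaryOperator;

public class ArrayCopySketch {
    // Copies src one element at a time in a plain loop.
    public static int[] manualCopy(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Copies src in a single System.arraycopy call.
    public static int[] intrinsicCopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // Runs the given copy function `rounds` times and returns the elapsed nanoseconds.
    // Assumes src is non-empty. The checksum keeps the copies from being optimized away.
    public static long timeNanos(UnaryOperator<int[]> copy, int[] src, int rounds) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            int[] dst = copy.apply(src);
            checksum += dst[dst.length - 1];
        }
        long elapsed = System.nanoTime() - start;
        if (checksum == Long.MIN_VALUE) {
            System.out.println(checksum);
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int[] src = new int[1 << 20]; // illustrative size, not taken from the issue
        for (int i = 0; i < src.length; i++) {
            src[i] = i;
        }
        int rounds = 200; // illustrative round count
        System.out.println("manual loop:      "
                + timeNanos(ArrayCopySketch::manualCopy, src, rounds) / 1_000_000 + " ms");
        System.out.println("System.arraycopy: "
                + timeNanos(ArrayCopySketch::intrinsicCopy, src, rounds) / 1_000_000 + " ms");
    }
}
```

If the two timings are far apart in the native image but close on the JVM, array copy is a plausible suspect for that workload; if not, the slowdown probably comes from somewhere else.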
Native image is slower in many more benchmarks, see here: https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/java-substratevm.html (they call Graal's native-image SubstrateVM).
System: openjdk version "11.0.1" 2018-10-16
GraalVM RC12 CE (native-image)
Tomorrow I will provide some perf data. This will show why the native-image is that much slower.
vs.
Yikes!
sudo perf record -F 99 --call-graph fp ~/Development/GitHub/GraalVMTest/oracleIssue/build/graal/hello-world
FlameGraph from recorded native-image data: https://gist.github.com/SergejIsbrecht/83c89c66caa9b06cc3e457db82546b17
It looks like most of the CPU cycles are used for:
Page faults make up 5.3 percent of the captured stacks.
I will also provide a FlameGraph for the JDK, but I need some time to set it up.
Note: Active FP burns lots of CPU cycles, see #916. I could possibly use lbr as the call-graph param to reduce FP overhead.
@vjovanov, do you have a branch on github for this change? I would like to look into it if possible.
No branch yet, for now, I have moved all the For inlining the |
I have observed the same behaviour on multiple benchmarks (GraalVM-EE 1.0.0-RC12). You can find the benchmarks I wrote here: https://github.com/turing85/graal-playground
If you compare the execution time of the JMH benchmarks for the Fibonacci, PrimeNumber and stream examples with the execution time of their native counterparts, you will see that the native image is up to 5x slower.
@turing85, you should probably create a new ticket for said case, because it may not be related to Array.copy, as is the case here. Also, Java is faster by design once C2 gets really hot. The native-image makes sense in an embedded environment or for function as a service, when quick startup is really important.
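The warm-up effect can be made visible by timing the same work several times in one process. The following is only a sketch under my own assumptions: the class name `WarmupSketch`, the Fibonacci input, and the number of rounds are made up, and it is not one of the benchmarks from the linked repository.

```java
public class WarmupSketch {
    // Naive recursive Fibonacci, used only as a CPU-bound workload.
    public static long fib(int n) {
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }

    // Times `rounds` consecutive calls of fib(n) and returns the nanoseconds per round.
    public static long[] timeRounds(int rounds, int n) {
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            long result = fib(n);
            nanos[r] = System.nanoTime() - start;
            // Use the result so the call cannot be dropped as dead code.
            if (result < 0) {
                throw new IllegalStateException("unexpected overflow");
            }
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeRounds(10, 32); // illustrative values
        for (int r = 0; r < nanos.length; r++) {
            System.out.println("round " + r + ": " + nanos[r] / 1_000 + " us");
        }
    }
}
```

On a JIT-compiling JVM the first rounds would typically be slower than the later ones, while an ahead-of-time compiled native image would be expected to show roughly flat timings from the first round. Which of the two is faster overall depends on how long the program runs.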
Latest measurements with PGO:
Closing the ticket as the gap is closed.
The following code, when compiled and run as a Java application, completes in 34 seconds,
whereas when I build the native image and run it, it takes 1 min 30 seconds. Am I missing something here?
I used ./native-image -cp . writer.Test to create the image
$ time java -cp bin writer.Test
real 0m34.762s
user 0m25.064s
sys 0m10.100s
$ time ./writer.test
real 1m30.609s
user 1m9.447s
sys 0m20.056s
package writer;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
public class Test {
}