[native-image] Native image takes more time than regular java application #974


Closed · moorsu opened this issue Feb 12, 2019 · 10 comments

@moorsu
moorsu commented Feb 12, 2019

The following code, when compiled and run as a regular Java application, completes in 34 seconds, whereas the native image built from it takes 1 minute 30 seconds. Am I missing something here?

I used ./native-image -cp . writer.Test to create the image.

$ time java -cp bin writer.Test

real 0m34.762s
user 0m25.064s
sys 0m10.100s

$ time ./writer.test

real 1m30.609s
user 1m9.447s
sys 0m20.056s

package writer;

import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

public class Test {

    public static void main(String[] args) {

        BufferedWriter bw = null;
        FileWriter fw = null;

        int classes = 2500;
        int filesCount = 14;
        int records = 5000;
        int columns = 10;

        try {
            // Writes classes * filesCount (= 35,000) files, each containing
            // records * columns (= 50,000) comma-separated values.
            for (int k = 0; k < classes; k++) {
                for (int j = 0; j < filesCount; j++) {
                    StringBuilder builder = new StringBuilder();
                    for (int i = 0; i < records; i++) {
                        for (int c = 0; c < columns; c++) {
                            builder.append(c);
                            builder.append(",");
                        }
                    }

                    // A fresh writer pair per file, closed immediately after the write.
                    fw = new FileWriter("/tmp/" + k + "_" + j + ".txt");
                    bw = new BufferedWriter(fw);
                    bw.write(builder.toString());
                    bw.close();
                    fw.close();
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (bw != null)
                    bw.close();
                if (fw != null)
                    fw.close();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
}
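An aside on the reproducer: each file body is records × columns × 2 = 100,000 characters, so a default-capacity StringBuilder is regrown, and its backing array copied, several times per file. A variant that sidesteps most of those copies by presizing the builder (the capacity expression is derived from the loop bounds above, not taken from the issue):

// Presized builder: one backing array up front, no growth copies while appending.
// records * columns appends of one digit plus one comma = records * columns * 2 chars.
StringBuilder builder = new StringBuilder(records * columns * 2);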

@vjovanov
Member

This is most likely caused by: (1) the slow implementation of array copy that we use at the moment, and (2) native-image probably not inlining builder.append completely.

We are currently working on a faster array copy. Let's see how well we perform after that fix is in.
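To test that hypothesis in isolation, here is a minimal sketch (an editor's illustration, not code from the issue) that runs the same append loop with the file I/O stripped out; if a native image of this is still far slower than the JVM, the gap lives in append/arraycopy rather than in I/O:

package writer;

// Same loop structure and iteration counts as the reproducer, minus the file I/O.
public class AppendOnly {
    public static void main(String[] args) {
        long totalLength = 0;
        for (int k = 0; k < 2500; k++) {
            for (int j = 0; j < 14; j++) {
                StringBuilder builder = new StringBuilder();
                for (int i = 0; i < 5000; i++) {
                    for (int c = 0; c < 10; c++) {
                        builder.append(c);   // int append: digit conversion + copy
                        builder.append(","); // String append -> String.getChars -> System.arraycopy
                    }
                }
                totalLength += builder.length(); // keep the work observable
            }
        }
        System.out.println(totalLength);
    }
}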

@tarsa

tarsa commented Feb 12, 2019

Native image is slower in many more benchmarks, see here: https://benchmarksgame-team.pages.debian.net/benchmarksgame/faster/java-substratevm.html (they call Graal's native-image "SubstrateVM" there).

@SergejIsbrecht

System:
Linux sergej-P50 4.18.0-13-generic #14-Ubuntu SMP Wed Dec 5 09:04:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment (build 11.0.1+13-Ubuntu-3ubuntu3.18.10.1)
OpenJDK 64-Bit Server VM (build 11.0.1+13-Ubuntu-3ubuntu3.18.10.1, mixed mode, sharing)

sergej@sergej-P50:~/Development/GitHub/GraalVMTest/oracleIssue/build/classes/java/main$ /usr/bin/time -v java -cp . GraalIssue974
        Command being timed: "java -cp . GraalIssue974"
        User time (seconds): 14.24
        System time (seconds): 2.96
        Percent of CPU this job got: 106%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.21
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 375772
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 89396
        Voluntary context switches: 3197
        Involuntary context switches: 1929
        Swaps: 0
        File system inputs: 8
        File system outputs: 7000064
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

GraalVM RC12 CE (native-image)

sergej@sergej-P50:~/Development/GitHub/GraalVMTest/oracleIssue/build/graal$ /usr/bin/time -v ./hello-world 
        Command being timed: "./hello-world"
        User time (seconds): 61.32
        System time (seconds): 8.68
        Percent of CPU this job got: 99%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:10.07
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 271604
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 5579856
        Voluntary context switches: 19
        Involuntary context switches: 576
        Swaps: 0
        File system inputs: 24
        File system outputs: 7000000
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Tomorrow I will provide some perf data, which should show why the native image is that much slower.
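A quick cross-check worked from the two outputs above: both runs report about 7,000,000 file-system output blocks (512-byte units), which matches 35,000 files × 25 four-KiB blocks each, roughly 3.5 GB written in either case. The native image therefore does the same I/O; the extra wall-clock time is CPU-bound.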

@olpaw
Member

olpaw commented Feb 13, 2019

Command being timed: "java -cp . GraalIssue974"
Minor (reclaiming a frame) page faults: 89396

vs.

Command being timed: "./hello-world"
Minor (reclaiming a frame) page faults: 5579856

Yikes!

@SergejIsbrecht

SergejIsbrecht commented Feb 13, 2019

sudo perf record -F 99 --call-graph fp ~/Development/GitHub/GraalVMTest/oracleIssue/build/graal/hello-world

FlameGraph from recorded native-image data: https://gist.github.com/SergejIsbrecht/83c89c66caa9b06cc3e457db82546b17

It looks like most of the CPU cycles are spent in:
String_getChars_c1abef6cde0009d1f55ec852da6e07b19eb1cade -> ArraycopySnippets_doArraycopy_043961c4b8a78c521bb6ac0bcc8243e2da5fd903 (28%)

Page faults make up 5.3 percent of the captured stacks.

I will also provide a FlameGraph for the JDK, but I need some time to set it up.

Note: active frame-pointer walking burns a lot of CPU cycles, see #916. I could possibly use lbr as the call-graph parameter to reduce the FP overhead.
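For reference, the lbr variant of the command above (a sketch; requires a CPU with last-branch-record support):

sudo perf record -F 99 --call-graph lbr ~/Development/GitHub/GraalVMTest/oracleIssue/build/graal/hello-world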

@SergejIsbrecht

@vjovanov, do you have a branch on GitHub for this change? I would like to look into it if possible.

@vjovanov
Member

No branch yet. For now, I have moved all the ArrayCopy snippets from HotSpot into common code. Next, this needs to be used in native-image, and all of the array-copy calls need to be implemented as foreign calls for native-image.

For inlining StringBuilder.append you can try increasing the inlining thresholds: -H:MaxNodesInTrivialMethod, -H:MaxNodesInTrivialLeafMethod, -H:MaxInvokesInTrivialMethod.
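A hypothetical invocation passing those thresholds when rebuilding the image from this issue (the numeric values are illustrative only, not recommended settings):

$ native-image -cp . -H:MaxNodesInTrivialMethod=40 -H:MaxNodesInTrivialLeafMethod=40 -H:MaxInvokesInTrivialMethod=3 writer.Test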

@turing85

turing85 commented Feb 14, 2019

I have observed the same behaviour in multiple benchmarks (GraalVM-EE 1.0.0-RC12). You can find the benchmarks I wrote here: https://github.com/turing85/graal-playground

If you compare the execution time of the JMH benchmarks for the Fibonacci, PrimeNumber, and stream examples with the execution time of their native counterparts, you will see that the native image is up to 5x slower.
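For context, a minimal sketch of the kind of JMH benchmark being compared (a hypothetical example; the actual benchmarks live in the linked repository):

import org.openjdk.jmh.annotations.Benchmark;

public class FibonacciBenchmark {

    @Benchmark
    public long fib30() {
        return fib(30);
    }

    // Naive recursion: CPU-bound and allocation-free, a clean JIT-vs-AOT probe.
    private static long fib(int n) {
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }
}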

@SergejIsbrecht

@turing85, you should probably create a new ticket for your case, because it may not be related to array copy, as it is here. Also, Java on the JVM is faster by design once C2 gets really hot. Native image makes sense in embedded environments or for function-as-a-service workloads, where quick startup is really important.

@vjovanov
Member

Latest measurements with PGO:

$ time java Test

________________________________________________________
Executed in   17.60 secs    fish           external
   usr time   14.05 secs  202.00 micros   14.05 secs
   sys time    3.92 secs  104.00 micros    3.92 secs

$ time ./test

________________________________________________________
Executed in   48.40 secs    fish           external
   usr time   42.91 secs  222.00 micros   42.91 secs
   sys time    5.45 secs  116.00 micros    5.45 secs


$ time ./test-pgo-native-architecture

________________________________________________________
Executed in   20.97 secs    fish           external
   usr time   15.83 secs  275.00 micros   15.83 secs
   sys time    5.11 secs  146.00 micros    5.11 secs

Closing the ticket, as the gap is now closed.
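For reference, the PGO workflow behind the test-pgo measurement looks roughly like this (a sketch of GraalVM EE's profile-guided optimization; default.iprof is the assumed default profile file name): build an instrumented image, run it on the workload to collect a profile, then rebuild using that profile.

$ native-image --pgo-instrument -cp . writer.Test
$ ./writer.test
$ native-image --pgo=default.iprof -cp . writer.Test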
