|
2 | 2 | Benchmark to measure CSV read/write performance
|
3 | 3 | ================================================================================================
|
4 | 4 |
|
5 |
| -OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure |
| 5 | +OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure |
6 | 6 | AMD EPYC 7763 64-Core Processor
|
7 | 7 | Parsing quoted values: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
|
8 | 8 | ------------------------------------------------------------------------------------------------------------------------
|
9 |
| -One quoted string 23353 23432 75 0.0 467067.4 1.0X |
| 9 | +One quoted string 23962 24182 316 0.0 479231.3 1.0X |
10 | 10 |
|
11 |
| -OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure |
| 11 | +OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure |
12 | 12 | AMD EPYC 7763 64-Core Processor
|
13 | 13 | Wide rows with 1000 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
|
14 | 14 | ------------------------------------------------------------------------------------------------------------------------
|
15 |
| -Select 1000 columns 56825 57244 679 0.0 56825.1 1.0X |
16 |
| -Select 100 columns 20482 20568 86 0.0 20481.7 2.8X |
17 |
| -Select one column 16968 17000 36 0.1 16967.7 3.3X |
18 |
| -count() 3366 3378 11 0.3 3366.4 16.9X |
19 |
| -Select 100 columns, one bad input field 28347 28379 30 0.0 28346.6 2.0X |
20 |
| -Select 100 columns, corrupt record field 32401 32450 42 0.0 32401.2 1.8X |
| 15 | +Select 1000 columns 56724 57115 570 0.0 56724.1 1.0X |
| 16 | +Select 100 columns 20740 20855 115 0.0 20739.7 2.7X |
| 17 | +Select one column 17304 17377 114 0.1 17304.3 3.3X |
| 18 | +count() 3719 3740 21 0.3 3719.0 15.3X |
| 19 | +Select 100 columns, one bad input field 24943 24999 69 0.0 24943.2 2.3X |
| 20 | +Select 100 columns, corrupt record field 28306 28341 31 0.0 28306.2 2.0X |
21 | 21 |
|
22 |
| -OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure |
| 22 | +OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure |
23 | 23 | AMD EPYC 7763 64-Core Processor
|
24 | 24 | Count a dataset with 10 columns: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
|
25 | 25 | ------------------------------------------------------------------------------------------------------------------------
|
26 |
| -Select 10 columns + count() 11174 11195 18 0.9 1117.4 1.0X |
27 |
| -Select 1 column + count() 7666 7694 24 1.3 766.6 1.5X |
28 |
| -count() 2042 2048 5 4.9 204.2 5.5X |
| 26 | +Select 10 columns + count() 10977 10982 5 0.9 1097.7 1.0X |
| 27 | +Select 1 column + count() 7406 7554 131 1.4 740.6 1.5X |
| 28 | +count() 1550 1558 9 6.5 155.0 7.1X |
29 | 29 |
|
30 |
| -OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure |
| 30 | +OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure |
31 | 31 | AMD EPYC 7763 64-Core Processor
|
32 | 32 | Write dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
|
33 | 33 | ------------------------------------------------------------------------------------------------------------------------
|
34 |
| -Create a dataset of timestamps 854 882 27 11.7 85.4 1.0X |
35 |
| -to_csv(timestamp) 6166 6174 13 1.6 616.6 0.1X |
36 |
| -write timestamps to files 6480 6575 158 1.5 648.0 0.1X |
37 |
| -Create a dataset of dates 948 949 1 10.6 94.8 0.9X |
38 |
| -to_csv(date) 4471 4474 3 2.2 447.1 0.2X |
39 |
| -write dates to files 4599 4616 15 2.2 459.9 0.2X |
| 34 | +Create a dataset of timestamps 845 847 3 11.8 84.5 1.0X |
| 35 | +to_csv(timestamp) 5546 5597 57 1.8 554.6 0.2X |
| 36 | +write timestamps to files 5760 5768 8 1.7 576.0 0.1X |
| 37 | +Create a dataset of dates 1053 1064 9 9.5 105.3 0.8X |
| 38 | +to_csv(date) 4115 4122 9 2.4 411.5 0.2X |
| 39 | +write dates to files 4102 4108 5 2.4 410.2 0.2X |
40 | 40 |
|
41 |
| -OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure |
| 41 | +OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure |
42 | 42 | AMD EPYC 7763 64-Core Processor
|
43 | 43 | Read dates and timestamps: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
|
44 | 44 | -----------------------------------------------------------------------------------------------------------------------------------------------------
|
45 |
| -read timestamp text from files 1200 1213 12 8.3 120.0 1.0X |
46 |
| -read timestamps from files 11576 11601 22 0.9 1157.6 0.1X |
47 |
| -infer timestamps from files 23234 23253 16 0.4 2323.4 0.1X |
48 |
| -read date text from files 1115 1162 44 9.0 111.5 1.1X |
49 |
| -read date from files 10978 11006 43 0.9 1097.8 0.1X |
50 |
| -infer date from files 22588 22604 13 0.4 2258.8 0.1X |
51 |
| -timestamp strings 1224 1236 21 8.2 122.4 1.0X |
52 |
| -parse timestamps from Dataset[String] 13566 13595 41 0.7 1356.6 0.1X |
53 |
| -infer timestamps from Dataset[String] 25057 25094 36 0.4 2505.7 0.0X |
54 |
| -date strings 1618 1626 7 6.2 161.8 0.7X |
55 |
| -parse dates from Dataset[String] 12784 12816 34 0.8 1278.4 0.1X |
56 |
| -from_csv(timestamp) 12008 12088 69 0.8 1200.8 0.1X |
57 |
| -from_csv(date) 11930 11938 12 0.8 1193.0 0.1X |
58 |
| -infer error timestamps from Dataset[String] with default format 14366 14394 35 0.7 1436.6 0.1X |
59 |
| -infer error timestamps from Dataset[String] with user-provided format 14380 14412 52 0.7 1438.0 0.1X |
60 |
| -infer error timestamps from Dataset[String] with legacy format 14439 14453 21 0.7 1443.9 0.1X |
| 45 | +read timestamp text from files 1107 1119 16 9.0 110.7 1.0X |
| 46 | +read timestamps from files 9511 9553 49 1.1 951.1 0.1X |
| 47 | +infer timestamps from files 19084 19114 27 0.5 1908.4 0.1X |
| 48 | +read date text from files 1036 1046 14 9.7 103.6 1.1X |
| 49 | +read date from files 8299 8309 15 1.2 829.9 0.1X |
| 50 | +infer date from files 17290 17294 4 0.6 1729.0 0.1X |
| 51 | +timestamp strings 1188 1197 7 8.4 118.8 0.9X |
| 52 | +parse timestamps from Dataset[String] 11442 11458 14 0.9 1144.2 0.1X |
| 53 | +infer timestamps from Dataset[String] 21076 21116 39 0.5 2107.6 0.1X |
| 54 | +date strings 1651 1659 10 6.1 165.1 0.7X |
| 55 | +parse dates from Dataset[String] 10181 10186 5 1.0 1018.1 0.1X |
| 56 | +from_csv(timestamp) 10023 10062 34 1.0 1002.3 0.1X |
| 57 | +from_csv(date) 9335 9351 15 1.1 933.5 0.1X |
| 58 | +infer error timestamps from Dataset[String] with default format 11187 11205 16 0.9 1118.7 0.1X |
| 59 | +infer error timestamps from Dataset[String] with user-provided format 11201 11216 13 0.9 1120.1 0.1X |
| 60 | +infer error timestamps from Dataset[String] with legacy format 11210 11227 17 0.9 1121.0 0.1X |
61 | 61 |
|
62 |
| -OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1018-azure |
| 62 | +OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure |
63 | 63 | AMD EPYC 7763 64-Core Processor
|
64 | 64 | Filters pushdown: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
|
65 | 65 | ------------------------------------------------------------------------------------------------------------------------
|
66 |
| -w/o filters 4302 4383 137 0.0 43020.6 1.0X |
67 |
| -pushdown disabled 4206 4220 13 0.0 42058.8 1.0X |
68 |
| -w/ filters 776 784 10 0.1 7756.3 5.5X |
| 66 | +w/o filters 4365 4377 13 0.0 43653.8 1.0X |
| 67 | +pushdown disabled 4348 4370 22 0.0 43477.7 1.0X |
| 68 | +w/ filters 695 713 29 0.1 6950.2 6.3X |
| 69 | + |
| 70 | +OpenJDK 64-Bit Server VM 21.0.3+9-LTS on Linux 6.5.0-1022-azure |
| 71 | +AMD EPYC 7763 64-Core Processor |
| 72 | +Interval: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative |
| 73 | +------------------------------------------------------------------------------------------------------------------------ |
| 74 | +Read as Intervals 7089 7096 7 0.4 2362.1 1.0X |
| 75 | +Read Raw Strings 2071 2075 6 1.4 690.1 3.4X |
69 | 76 |
|
70 | 77 |
|
0 commit comments