Skip to content

Commit 557ac03

Browse files
newrengitster
authored andcommitted
merge-ort: begin performance work; instrument with trace2_region_* calls
Add some timing instrumentation for both merge-ort and diffcore-rename; I used these to measure and optimize performance in both, and several future patch series will build on these to reduce the timings of some select testcases. === Setup === The primary testcase I used involved rebasing a random topic in the linux kernel (consisting of 35 patches) against an older version. I added two variants, one where I rename a toplevel directory, and another where I only rebase one patch instead of the whole topic. The setup is as follows: $ git clone git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git $ git branch hwmon-updates fd8bdb23b91876ac1e624337bb88dc1dcc21d67e $ git branch hwmon-just-one fd8bdb23b91876ac1e624337bb88dc1dcc21d67e~34 $ git branch base 4703d9119972bf586d2cca76ec6438f819ffa30e $ git switch -c 5.4-renames v5.4 $ git mv drivers pilots # Introduce over 26,000 renames $ git commit -m "Rename drivers/ to pilots/" $ git config merge.renameLimit 30000 $ git config merge.directoryRenames true === Testcases === Now with REBASE standing for either "git rebase [--merge]" (using merge-recursive) or "test-tool fast-rebase" (using merge-ort), the testcases are: Testcase #1: no-renames $ git checkout v5.4^0 $ REBASE --onto HEAD base hwmon-updates Note: technically the name is misleading; there are some renames, but very few. Rename detection only takes about half the overall time. Testcase #2: mega-renames $ git checkout 5.4-renames^0 $ REBASE --onto HEAD base hwmon-updates Testcase #3: just-one-mega $ git checkout 5.4-renames^0 $ REBASE --onto HEAD base hwmon-just-one === Timing results === Overall timings, using hyperfine (1 warmup run, 3 runs for mega-renames, 10 runs for the other two cases): merge-recursive merge-ort no-renames: 18.912 s ± 0.174 s 14.263 s ± 0.053 s mega-renames: 5964.031 s ± 10.459 s 5504.231 s ± 5.150 s just-one-mega: 149.583 s ± 0.751 s 158.534 s ± 0.498 s A single re-run of each with some breakdowns: --- no-renames --- merge-recursive merge-ort overall runtime: 19.302 s 14.257 s inexact rename detection: 7.603 s 7.906 s everything else: 11.699 s 6.351 s --- mega-renames --- merge-recursive merge-ort overall runtime: 5950.195 s 5499.672 s inexact rename detection: 5746.309 s 5487.120 s everything else: 203.886 s 17.552 s --- just-one-mega --- merge-recursive merge-ort overall runtime: 151.001 s 158.582 s inexact rename detection: 143.448 s 157.835 s everything else: 7.553 s 0.747 s === Timing observations === 0) Maximum speedup The "everything else" row represents the maximum speedup we could achieve if we were to somehow infinitely parallelize inexact rename detection, but leave everything else alone. The fact that this is so much smaller than the real runtime (even in the case with virtually no renames) makes it clear just how overwhelmingly large the time spent on rename detection can be. 1) no-renames 1a) merge-ort is faster than merge-recursive, which is nice. However, this still should not be considered good enough. Although the "merge" backend to rebase (merge-recursive) is sometimes faster than the "apply" backend, this is one of those cases where it is not. In fact, even merge-ort is slower. The "apply" backend can complete this testcase in 6.940 s ± 0.485 s which is about 2x faster than merge-ort and 3x faster than merge-recursive. One goal of the merge-ort performance work will be to make it faster than git-am on this (and similar) testcases. 2) mega-renames 2a) Obviously rename detection is a huge cost; it's where most the time is spent. We need to cut that down. If we could somehow infinitely parallelize it and drive its time to 0, the merge-recursive time would drop to about 204s, and the merge-ort time would drop to about 17s. I think this particular stat shows I've subtly baked a couple performance improvements into merge-ort and into fast-rebase already. 3) just-one-mega 3a) not much to say here, it just gives some flavor for how rebasing only one patch compares to rebasing 35. === Goals === This patch is obviously just the beginning. Here are some of my goals that this measurement will help us achieve: * Drive the cost of rename detection down considerably for merges * After the above has been achieved, see if there are other slowness factors (which would have previously been overshadowed by rename detection costs) which we can then focus on and also optimize. * Ensure our rebase testcase that requires little rename detection is noticeably faster with merge-ort than with apply-based rebase. Signed-off-by: Elijah Newren <[email protected]> Acked-by: Taylor Blau <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 5ced7c3 commit 557ac03

File tree

2 files changed

+65
-0
lines changed

2 files changed

+65
-0
lines changed

diffcore-rename.c

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -465,6 +465,7 @@ void diffcore_rename(struct diff_options *options)
465465
int num_destinations, dst_cnt;
466466
struct progress *progress = NULL;
467467

468+
trace2_region_enter("diff", "setup", options->repo);
468469
if (!minimum_score)
469470
minimum_score = DEFAULT_RENAME_SCORE;
470471

@@ -510,14 +511,17 @@ void diffcore_rename(struct diff_options *options)
510511
register_rename_src(p);
511512
}
512513
}
514+
trace2_region_leave("diff", "setup", options->repo);
513515
if (rename_dst_nr == 0 || rename_src_nr == 0)
514516
goto cleanup; /* nothing to do */
515517

518+
trace2_region_enter("diff", "exact renames", options->repo);
516519
/*
517520
* We really want to cull the candidates list early
518521
* with cheap tests in order to avoid doing deltas.
519522
*/
520523
rename_count = find_exact_renames(options);
524+
trace2_region_leave("diff", "exact renames", options->repo);
521525

522526
/* Did we only want exact renames? */
523527
if (minimum_score == MAX_SCORE)
@@ -545,6 +549,7 @@ void diffcore_rename(struct diff_options *options)
545549
break;
546550
}
547551

552+
trace2_region_enter("diff", "inexact renames", options->repo);
548553
if (options->show_rename_progress) {
549554
progress = start_delayed_progress(
550555
_("Performing inexact rename detection"),
@@ -600,11 +605,13 @@ void diffcore_rename(struct diff_options *options)
600605
if (detect_rename == DIFF_DETECT_COPY)
601606
rename_count += find_renames(mx, dst_cnt, minimum_score, 1);
602607
free(mx);
608+
trace2_region_leave("diff", "inexact renames", options->repo);
603609

604610
cleanup:
605611
/* At this point, we have found some renames and copies and they
606612
* are recorded in rename_dst. The original list is still in *q.
607613
*/
614+
trace2_region_enter("diff", "write back to queue", options->repo);
608615
DIFF_QUEUE_CLEAR(&outq);
609616
for (i = 0; i < q->nr; i++) {
610617
struct diff_filepair *p = q->queue[i];
@@ -680,5 +687,6 @@ void diffcore_rename(struct diff_options *options)
680687
strintmap_clear(break_idx);
681688
FREE_AND_NULL(break_idx);
682689
}
690+
trace2_region_leave("diff", "write back to queue", options->repo);
683691
return;
684692
}

merge-ort.c

Lines changed: 57 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -752,7 +752,9 @@ static int collect_merge_info(struct merge_options *opt,
752752
init_tree_desc(t + 1, side1->buffer, side1->size);
753753
init_tree_desc(t + 2, side2->buffer, side2->size);
754754

755+
trace2_region_enter("merge", "traverse_trees", opt->repo);
755756
ret = traverse_trees(NULL, 3, t, &info);
757+
trace2_region_leave("merge", "traverse_trees", opt->repo);
756758

757759
return ret;
758760
}
@@ -2105,9 +2107,12 @@ static void detect_regular_renames(struct merge_options *opt,
21052107
diff_opts.show_rename_progress = opt->show_rename_progress;
21062108
diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
21072109
diff_setup_done(&diff_opts);
2110+
2111+
trace2_region_enter("diff", "diffcore_rename", opt->repo);
21082112
diff_tree_oid(&merge_base->object.oid, &side->object.oid, "",
21092113
&diff_opts);
21102114
diffcore_std(&diff_opts);
2115+
trace2_region_leave("diff", "diffcore_rename", opt->repo);
21112116

21122117
if (diff_opts.needed_rename_limit > renames->needed_limit)
21132118
renames->needed_limit = diff_opts.needed_rename_limit;
@@ -2206,9 +2211,12 @@ static int detect_and_process_renames(struct merge_options *opt,
22062211

22072212
memset(&combined, 0, sizeof(combined));
22082213

2214+
trace2_region_enter("merge", "regular renames", opt->repo);
22092215
detect_regular_renames(opt, merge_base, side1, MERGE_SIDE1);
22102216
detect_regular_renames(opt, merge_base, side2, MERGE_SIDE2);
2217+
trace2_region_leave("merge", "regular renames", opt->repo);
22112218

2219+
trace2_region_enter("merge", "directory renames", opt->repo);
22122220
need_dir_renames =
22132221
!opt->priv->call_depth &&
22142222
(opt->detect_directory_renames == MERGE_DIRECTORY_RENAMES_TRUE ||
@@ -2230,8 +2238,11 @@ static int detect_and_process_renames(struct merge_options *opt,
22302238
&renames->dir_renames[1],
22312239
&renames->dir_renames[2]);
22322240
QSORT(combined.queue, combined.nr, compare_pairs);
2241+
trace2_region_leave("merge", "directory renames", opt->repo);
22332242

2243+
trace2_region_enter("merge", "process renames", opt->repo);
22342244
clean &= process_renames(opt, &combined);
2245+
trace2_region_leave("merge", "process renames", opt->repo);
22352246

22362247
/* Free memory for renames->pairs[] and combined */
22372248
for (s = MERGE_SIDE1; s <= MERGE_SIDE2; s++) {
@@ -2913,20 +2924,30 @@ static void process_entries(struct merge_options *opt,
29132924
STRING_LIST_INIT_NODUP,
29142925
NULL, 0 };
29152926

2927+
trace2_region_enter("merge", "process_entries setup", opt->repo);
29162928
if (strmap_empty(&opt->priv->paths)) {
29172929
oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
29182930
return;
29192931
}
29202932

29212933
/* Hack to pre-allocate plist to the desired size */
2934+
trace2_region_enter("merge", "plist grow", opt->repo);
29222935
ALLOC_GROW(plist.items, strmap_get_size(&opt->priv->paths), plist.alloc);
2936+
trace2_region_leave("merge", "plist grow", opt->repo);
29232937

29242938
/* Put every entry from paths into plist, then sort */
2939+
trace2_region_enter("merge", "plist copy", opt->repo);
29252940
strmap_for_each_entry(&opt->priv->paths, &iter, e) {
29262941
string_list_append(&plist, e->key)->util = e->value;
29272942
}
2943+
trace2_region_leave("merge", "plist copy", opt->repo);
2944+
2945+
trace2_region_enter("merge", "plist special sort", opt->repo);
29282946
plist.cmp = string_list_df_name_compare;
29292947
string_list_sort(&plist);
2948+
trace2_region_leave("merge", "plist special sort", opt->repo);
2949+
2950+
trace2_region_leave("merge", "process_entries setup", opt->repo);
29302951

29312952
/*
29322953
* Iterate over the items in reverse order, so we can handle paths
@@ -2937,6 +2958,7 @@ static void process_entries(struct merge_options *opt,
29372958
* (because it allows us to know whether the directory is still in
29382959
* the way when it is time to process the file at the same path).
29392960
*/
2961+
trace2_region_enter("merge", "processing", opt->repo);
29402962
for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) {
29412963
char *path = entry->string;
29422964
/*
@@ -2955,7 +2977,9 @@ static void process_entries(struct merge_options *opt,
29552977
process_entry(opt, path, ci, &dir_metadata);
29562978
}
29572979
}
2980+
trace2_region_leave("merge", "processing", opt->repo);
29582981

2982+
trace2_region_enter("merge", "process_entries cleanup", opt->repo);
29592983
if (dir_metadata.offsets.nr != 1 ||
29602984
(uintptr_t)dir_metadata.offsets.items[0].util != 0) {
29612985
printf("dir_metadata.offsets.nr = %d (should be 1)\n",
@@ -2970,6 +2994,7 @@ static void process_entries(struct merge_options *opt,
29702994
string_list_clear(&plist, 0);
29712995
string_list_clear(&dir_metadata.versions, 0);
29722996
string_list_clear(&dir_metadata.offsets, 0);
2997+
trace2_region_leave("merge", "process_entries cleanup", opt->repo);
29732998
}
29742999

29753000
/*** Function Grouping: functions related to merge_switch_to_result() ***/
@@ -3128,19 +3153,23 @@ void merge_switch_to_result(struct merge_options *opt,
31283153
if (result->clean >= 0 && update_worktree_and_index) {
31293154
struct merge_options_internal *opti = result->priv;
31303155

3156+
trace2_region_enter("merge", "checkout", opt->repo);
31313157
if (checkout(opt, head, result->tree)) {
31323158
/* failure to function */
31333159
result->clean = -1;
31343160
return;
31353161
}
3162+
trace2_region_leave("merge", "checkout", opt->repo);
31363163

3164+
trace2_region_enter("merge", "record_conflicted", opt->repo);
31373165
if (record_conflicted_index_entries(opt, opt->repo->index,
31383166
&opti->paths,
31393167
&opti->conflicted)) {
31403168
/* failure to function */
31413169
result->clean = -1;
31423170
return;
31433171
}
3172+
trace2_region_leave("merge", "record_conflicted", opt->repo);
31443173
}
31453174

31463175
if (display_update_msgs) {
@@ -3150,6 +3179,8 @@ void merge_switch_to_result(struct merge_options *opt,
31503179
struct string_list olist = STRING_LIST_INIT_NODUP;
31513180
int i;
31523181

3182+
trace2_region_enter("merge", "display messages", opt->repo);
3183+
31533184
/* Hack to pre-allocate olist to the desired size */
31543185
ALLOC_GROW(olist.items, strmap_get_size(&opti->output),
31553186
olist.alloc);
@@ -3171,6 +3202,8 @@ void merge_switch_to_result(struct merge_options *opt,
31713202
/* Also include needed rename limit adjustment now */
31723203
diff_warn_rename_limit("merge.renamelimit",
31733204
opti->renames.needed_limit, 0);
3205+
3206+
trace2_region_leave("merge", "display messages", opt->repo);
31743207
}
31753208

31763209
merge_finalize(opt, result);
@@ -3212,6 +3245,7 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
32123245
int i;
32133246

32143247
/* Sanity checks on opt */
3248+
trace2_region_enter("merge", "sanity checks", opt->repo);
32153249
assert(opt->repo);
32163250

32173251
assert(opt->branch1 && opt->branch2);
@@ -3250,11 +3284,13 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
32503284
assert(!opt->priv->toplevel_dir ||
32513285
0 == strlen(opt->priv->toplevel_dir));
32523286
}
3287+
trace2_region_leave("merge", "sanity checks", opt->repo);
32533288

32543289
/* Default to histogram diff. Actually, just hardcode it...for now. */
32553290
opt->xdl_opts = DIFF_WITH_ALG(opt, HISTOGRAM_DIFF);
32563291

32573292
/* Initialization of opt->priv, our internal merge data */
3293+
trace2_region_enter("merge", "allocate/init", opt->repo);
32583294
if (opt->priv) {
32593295
clear_or_reinit_internal_opts(opt->priv, 1);
32603296
trace2_region_leave("merge", "allocate/init", opt->repo);
@@ -3292,6 +3328,8 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
32923328
* subset of the overall paths that have special output.
32933329
*/
32943330
strmap_init(&opt->priv->output);
3331+
3332+
trace2_region_leave("merge", "allocate/init", opt->repo);
32953333
}
32963334

32973335
/*** Function Grouping: merge_incore_*() and their internal variants ***/
@@ -3307,6 +3345,7 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
33073345
{
33083346
struct object_id working_tree_oid;
33093347

3348+
trace2_region_enter("merge", "collect_merge_info", opt->repo);
33103349
if (collect_merge_info(opt, merge_base, side1, side2) != 0) {
33113350
/*
33123351
* TRANSLATORS: The %s arguments are: 1) tree hash of a merge
@@ -3319,10 +3358,16 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
33193358
result->clean = -1;
33203359
return;
33213360
}
3361+
trace2_region_leave("merge", "collect_merge_info", opt->repo);
33223362

3363+
trace2_region_enter("merge", "renames", opt->repo);
33233364
result->clean = detect_and_process_renames(opt, merge_base,
33243365
side1, side2);
3366+
trace2_region_leave("merge", "renames", opt->repo);
3367+
3368+
trace2_region_enter("merge", "process_entries", opt->repo);
33253369
process_entries(opt, &working_tree_oid);
3370+
trace2_region_leave("merge", "process_entries", opt->repo);
33263371

33273372
/* Set return values */
33283373
result->tree = parse_tree_indirect(&working_tree_oid);
@@ -3423,9 +3468,15 @@ void merge_incore_nonrecursive(struct merge_options *opt,
34233468
struct tree *side2,
34243469
struct merge_result *result)
34253470
{
3471+
trace2_region_enter("merge", "incore_nonrecursive", opt->repo);
3472+
3473+
trace2_region_enter("merge", "merge_start", opt->repo);
34263474
assert(opt->ancestor != NULL);
34273475
merge_start(opt, result);
3476+
trace2_region_leave("merge", "merge_start", opt->repo);
3477+
34283478
merge_ort_nonrecursive_internal(opt, merge_base, side1, side2, result);
3479+
trace2_region_leave("merge", "incore_nonrecursive", opt->repo);
34293480
}
34303481

34313482
void merge_incore_recursive(struct merge_options *opt,
@@ -3434,9 +3485,15 @@ void merge_incore_recursive(struct merge_options *opt,
34343485
struct commit *side2,
34353486
struct merge_result *result)
34363487
{
3488+
trace2_region_enter("merge", "incore_recursive", opt->repo);
3489+
34373490
/* We set the ancestor label based on the merge_bases */
34383491
assert(opt->ancestor == NULL);
34393492

3493+
trace2_region_enter("merge", "merge_start", opt->repo);
34403494
merge_start(opt, result);
3495+
trace2_region_leave("merge", "merge_start", opt->repo);
3496+
34413497
merge_ort_internal(opt, merge_bases, side1, side2, result);
3498+
trace2_region_leave("merge", "incore_recursive", opt->repo);
34423499
}

0 commit comments

Comments
 (0)