Skip to content

fixup! object-file.c: do fsync() and close() before post-write die() #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

chiyutianyi
Copy link

I think it is something lost in the previous discussion [1].

  1. https://lore.kernel.org/git/[email protected]/

gitster and others added 30 commits April 3, 2022 15:03
* ds/partial-bundle-more:
  pack-objects: lazily set up "struct rev_info", don't leak
  bundle: output hash information in 'verify'
  bundle: move capabilities to end of 'verify'
  pack-objects: parse --filter directly into revs.filter
  pack-objects: move revs out of get_object_list()
  list-objects-filter: remove CL_ARG__FILTER
Fix a memory leak that's been with us since f950026 (fast-rebase:
write conflict state to working tree, index, and HEAD, 2021-05-20)
changed this code to move these strbuf_release() into an if/else
block.

We'll also add to "reflog_msg" in the "else" arm of the "if" block
being modified here, and we'll append to "branch_msg" in both
cases. But after f950026 only the "if" block would free these two
"struct strbuf".

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Amend a freeing pattern added in 0906ac2 (blame: use changed-path
Bloom filters, 2020-04-16) to use a "goto cleanup", so that we can be
sure that we call cleanup_scoreboard().

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Follow-up on the introduction of string_list_init_nodup() and
string_list_init_dup() in the series merged in bd4232f (Merge
branch 'ab/struct-init', 2021-07-16) and convert code that implicitly
relied on xcalloc() being equivalent to the initializer to use
xmalloc() and string_list_init_{no,}dup() instead.

In the case of get_unmerged() in merge-recursive.c we used the
combination of xcalloc() and assigning "1" to "strdup_strings" to get
what we'd get via string_list_init_dup(), let's use that instead.

Adjacent code in cmd_format_patch() will be changed in a subsequent
commit, since we're changing that let's change the other in-tree
patterns that do the same. Let's also convert a "x == NULL" to "!x"
per our CodingGuidelines, as we need to change the "if" line anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Fix two memory leaks in "struct rev_info" by freeing that memory in
cmd_format_patch(). These two are unusual special-cases in being in
the "struct rev_info", but not being "owned" by the code in
revision.c. I.e. they're members of the struct so that this code in
"builtin/log.c" can conveniently pass information code in
"log-tree.c".

See e.g. the make_cover_letter() caller of log_write_email_headers()
here in "builtin/log.c", and [1] for a demonstration of where the
"extra_headers" and "ref_message_ids" struct members are used.

See 20ff068 (format-patch: resurrect extra headers from config,
2006-06-02) and d1566f7 (git-format-patch: Make the second and
subsequent mails replies to the first, 2006-07-14) for the initial
introduction of "extra_headers" and "ref_message_ids".

We can count on repo_init_revisions() memset()-ing this data to 0
however, so we can count on it being either NULL or something we
allocated. In the case of "extra_headers" let's add a local "char *"
variable to hold it, to avoid the eventual cast from "const char *"
when we free() it.

1. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add and apply coccinelle rules to remove "if (E)" before
"free_commit_list(E)", the function can accept NULL, and further
change cases where "E = NULL" followed to also be unconditionally.

The code changes in this commit were entirely made by the coccinelle
rule being added here, and applied with:

    make contrib/coccinelle/free.cocci.patch
    patch -p1 <contrib/coccinelle/free.cocci.patch

The only manual intervention here is that the the relevant code in
commit.c has been manually re-indented.

Suggested-by: Phillip Wood <[email protected]>
Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
The users of the revision.[ch] API's "struct rev_info" are a major
source of memory leaks in the test suite under SANITIZE=leak, which in
turn adds a lot of noise when trying to mark up tests with
"TEST_PASSES_SANITIZE_LEAK=true".

The users of that API are largely one-shot, e.g. "git rev-list" or
"git log", or the "git checkout" and "git stash" being modified here

For these callers freeing the memory is arguably a waste of time, but
in many cases they've actually been trying to free the memory, and
just doing that in a buggy manner.

Let's provide a release_revisions() function for these users, and
start migrating them over per the plan outlined in [1]. Right now this
only handles the "pending" member of the struct, but more will be
added in subsequent commits.

Even though we only clear the "pending" member now, let's not leave a
trap in code like the pre-image of index_differs_from(), where we'd
start doing the wrong thing as soon as the release_revisions() learned
to clear its "diffopt". I.e. we need to call release_revisions() after
we've inspected any state in "struct rev_info".

This leaves in place e.g. clear_pathspec(&rev.prune_data) in
stash_working_tree() in builtin/stash.c, subsequent commits will teach
release_revisions() to free "prune_data" and other members that in
some cases are individually cleared by users of "struct rev_info" by
reaching into its members. Those subsequent commits will remove the
relevant calls to e.g. clear_pathspec().

We avoid amending code in index_differs_from() in diff-lib.c as well
as wt_status_collect_changes_index(), has_unstaged_changes() and
has_uncommitted_changes() in wt-status.c in a way that assumes that we
are already clearing the "diffopt" member. That will be handled in a
subsequent commit.

1. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
A subsequent commit will add "REV_INFO_INIT" macro adjacent to
repo_init_revisions(), unfortunately between the "struct rev_info"
itself and that function we've added various miscellaneous code
between the two over the years.

Let's move that code either lower in revision.h, giving it API docs
while we're at it, or in cases where it wasn't public API at all move
it into revision.c No lines of code are changed here, only moved
around. The only changes are the addition of new API comments.

The "tree_difference" variable could also be declared like this, which
I think would be a lot clearer, but let's leave that for now to keep
this a move-only change:

	static enum {
		REV_TREE_SAME,
		REV_TREE_NEW, /* Only new files */
		REV_TREE_OLD, /* Only files removed */
		REV_TREE_DIFFERENT, /* Mixed changes */
	} tree_difference = REV_TREE_SAME;

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Use release_revisions() to various users of "struct rev_list" which
need to have their "struct rev_info" zero-initialized before we can
start using it.

For the bundle.c code see the early exit case added in
3bbbe46 (bundle verify: error out if called without an object
database, 2019-05-27).

For the relevant bisect.c code see 45b6370 (bisect: libify
`check_good_are_ancestors_of_bad` and its dependents, 2020-02-17).

For the submodule.c code see the "goto" on "(!left || !right || !sub)"
added in 8e6df65 (submodule: refactor show_submodule_summary with
helper function, 2016-08-31).

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Change the initialization of the "revision" member of "struct
stash_info" to be initialized vi a macro, and more importantly that
that initializing function be tasked to free it, usually via a "goto
cleanup" pattern.

Despite the "revision" name (and the topic of the series containing
this commit) the "stash info" has nothing to do with the "struct
rev_info". I'm making this change because in the subsequent commit
when we do want to free the "struct rev_info" via a "goto cleanup"
pattern we'd otherwise free() uninitialized memory in some cases, as
we only strbuf_init() the string in get_stash_info().

So while it's not the smallest possible change, let's convert all
users of this pattern in the file while we're at it.

A good follow-up to this change would be to change all the "ret = -1;
goto done;" in this file to instead use a "goto cleanup", and
initialize "int ret = -1" at the start of the relevant functions. That
would allow us to drop a lot of needless brace verbosity on two-line
"if" statements, but let's leave that alone for now.

To ensure that there's a 1=1 mapping between owners of the "struct
stash_info" and free_stash_info() change the assert_stash_ref()
function to be a trivial get_stash_info_assert() wrapper. The caller
will call free_stash_info(), and by returning -1 we'll eventually (via
!!ret) exit with status 1 anyway.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add a release_revisions() to various users of "struct rev_info" which
requires a minor refactoring to a "goto cleanup" pattern to use that
function.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
In the case of cmd_main() in http-push.c we need to move the
deceleration of the "struct rev-list" into the loop over the
"remote_refs" when adding a release_revisions().

We'd previously set up the "revs" for each remote, but would
potentially leak memory on each one.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
In preparation for having the "log" family of functions make wider use
of release_revisions() let's have them call it just before
exiting. This changes the "log", "whatchanged", "show",
"format-patch", etc. commands, all of which live in this file.

The release_revisions() API still only frees the "pending" member, but
will learn to release more members of "struct rev_info" in subsequent
commits.

In the case of "format-patch" revert the addition of UNLEAK() in
dee839a (format-patch: mark rev_info with UNLEAK, 2021-12-16),
which will cause several tests that previously passed under
"TEST_PASSES_SANITIZE_LEAK=true" to start failing.

In subsequent commits we'll now be able to use those tests to check
whether that part of the API is really leaking memory, and will fix
all of those memory leaks. Removing the UNLEAK() allows us to make
incremental progress in that direction. See [1] for further details
about this approach.

Note that the release_revisions() will not be sufficient to deal with
the code in cmd_show() added in 5d7eeee (git-show: grok blobs,
trees and tags, too, 2006-12-14) which clobbers the "pending" array in
the case of "OBJ_COMMIT". That will need to be dealt with by some
future follow-up work.

1. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Use a release_revisions() with those "struct rev_list" users which
already "UNLEAK" the struct. It may seem odd to simultaneously attempt
to free() memory, but also to explicitly ignore whether we have memory
leaks in the same.

As explained in preceding commits this is being done to use the
built-in commands as a guinea pig for whether the release_revisions()
function works as expected, we'd like to test e.g. whether we segfault
as we change it. In subsequent commits we'll then remove these
UNLEAK() as the function is made to free the memory that caused us to
add them in the first place.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Use release_revisions() for users of "struct rev_list" that reach into
the "struct rev_info" and clear the "prune_data" already.

In a subsequent commit we'll teach release_revisions() to clear this
itself, but in the meantime let's invoke release_revisions() here to
clear anything else we may have missed, and for reasons of having
consistent boilerplate.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Extend the the release_revisions() function so that it frees the
"commits" in the "struct rev_info".

We don't expect to use this "struct rev_info" again, so there's no
reason to NULL out revs->commits, as e.g. simplify_merges() and
create_boundary_commit_list() do.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Extend the the release_revisions() function so that it frees the
"mailmap" in the "struct rev_info".

The log family of functions now calls the clear_mailmap() function
added in fa8afd18e5a (revisions API: provide and use a
release_revisions(), 2021-09-19), allowing us to whitelist some tests
with "TEST_PASSES_SANITIZE_LEAK=true".

Unfortunately having a pointer to a mailmap in "struct rev_info"
instead of an embedded member that we "own" get a bit messy, as can be
seen in the change to builtin/commit.c.

When we free() this data we won't be able to tell apart a pointer to a
"mailmap" on the heap from one on the stack. As seen in
ea57bc0 (log: add --use-mailmap option, 2013-01-05) the "log"
family allocates it on the heap, but in the find_author_by_nickname()
code added in ea16794 (commit: search author pattern against
mailmap, 2013-08-23) we allocated it on the stack instead.

Ideally we'd simply change that member to a "struct string_list
mailmap" and never free() the "mailmap" itself, but that would be a
much larger change to the revisions API.

We have code that needs to hand an existing "mailmap" to a "struct
rev_info", while we could change all of that, let's not go there
now.

The complexity isn't in the ownership of the "mailmap" per-se, but
that various things assume a "rev_info.mailmap == NULL" means "doesn't
want mailmap", if we changed that to an init'd "struct string_list
we'd need to carefully refactor things to change those assumptions.

Let's instead always free() it, and simply declare that if you add
such a "mailmap" it must be allocated on the heap. Any modern libc
will correctly panic if we free() a stack variable, so this should be
safe going forward.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Extend the the release_revisions() function so that it frees the
"cmdline" in the "struct rev_info". This in combination with a
preceding change to free "commits" and "mailmap" means that we can
whitelist another test under "TEST_PASSES_SANITIZE_LEAK=true".

There was a proposal in [1] to do away with xstrdup()-ing this
add_rev_cmdline(), perhaps that would be worthwhile, but for now let's
just free() it.

We could also make that a "char *" in "struct rev_cmdline_entry"
itself, but since we own it let's expose it as a constant to outside
callers. I proposed that in [2] but have since changed my mind. See
14d30cd (ref-filter: fix memory leak in `free_array_item()`,
2019-07-10), c514c62 (checkout: fix leak of non-existent branch
names, 2020-08-14) and other log history hits for "free((char *)" for
prior art.

This includes the tests we had false-positive passes on before my
6798b08 (perl Git.pm: don't ignore signalled failure in
_cmd_close(), 2022-02-01), now they pass for real.

Since there are 66 tests matching t/t[0-9]*git-svn*.sh it's easier to
list those that don't pass than to touch most of those 66. So let's
introduce a "TEST_FAILS_SANITIZE_LEAK=true", which if set in the tests
won't cause lib-git-svn.sh to set "TEST_PASSES_SANITIZE_LEAK=true.

This change also marks all the tests that we removed
"TEST_FAILS_SANITIZE_LEAK=true" from in an earlier commit due to
removing the UNLEAK() from cmd_format_patch(), we can now assert that
its API use doesn't leak any "struct rev_info" memory.

This change also made commit "t5503-tagfollow.sh" pass on current
master, but that would regress when combined with
ps/fetch-atomic-fixup's de004e8 (t5503: simplify setup of test
which exercises failure of backfill, 2022-03-03) (through no fault of
that topic, that change started using "git clone" in the test, which
has an outstanding leak). Let's leave that test out for now to avoid
in-flight semantic conflicts.

1. https://lore.kernel.org/git/YUj%[email protected]/
2. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Extend the the release_revisions() function so that it frees the
"filter" in the "struct rev_info". This in combination with a
preceding change to free "cmdline" means that we can mark another set
of tests as passing under "TEST_PASSES_SANITIZE_LEAK=true".

The "filter" member was added recently in ffaa137 (revision: put
object filter into struct rev_info, 2022-03-09), and this fixes leaks
intruded in the subsequent leak 7940941 (pack-objects: use
rev.filter when possible, 2022-03-09) and 105c6f1 (bundle: parse
filter capability, 2022-03-09).

The "builtin/pack-objects.c" leak in 7940941 was effectively with
us already, but the variable was referred to by a "static" file-scoped
variable. The "bundle.c " leak in 105c6f1 was newly introduced
with the new "filter" feature for bundles.

The "t5600-clone-fail-cleanup.sh" change here to add
"TEST_PASSES_SANITIZE_LEAK=true" is one of the cases where
run-command.c in not carrying the abort() exit code upwards would have
had that test passing before, but now it *actually* passes[1]. We
should fix the lack of 1=1 mapping of SANITIZE=leak testing to actual
leaks some other time, but it's an existing edge case, let's just mark
the really-passing test as passing for now.

1. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Extend the the release_revisions() function so that it frees the
"grep_filter" in the "struct rev_info".This allows us to mark a test
as passing under "TEST_PASSES_SANITIZE_LEAK=true".

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Extend the the release_revisions() function so that it frees the
"prune_data" in the "struct rev_info". This means that any code that
calls "release_revisions()" already can get rid of adjacent calls to
clear_pathspec().

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Clear the "boundary_commits" object_array in release_revisions(). This
makes a few more tests pass under SANITIZE=leak, including
"t/t4126-apply-empty.sh" which started failed as an UNLEAK() in
cmd_format_patch() was removed in a preceding commit.

This also re-marks the various tests relying on "git format-patch" as
passing under "SANITIZE=leak", in the preceding "revisions API users:
use release_revisions() in builtin/log.c" commit those were marked as
failing as we removed the UNLEAK(rev) from cmd_format_patch() in
"builtin/log.c".

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add a missing reflog_walk_info_release() to "reflog-walk.c" and use it
in release_revisions().

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Call diff_free() on the "pruning" member of "struct rev_info".  Doing
so makes several tests pass under SANITIZE=leak.

This was also the last missing piece that allows us to remove the
UNLEAK() in "cmd_diff" and "cmd_diff_index", which allows us to use
those commands as a canary for general leaks in the revisions API. See
[1] for further rationale, and 886e108 (builtin/: add UNLEAKs,
2017-10-01) for the commit that added the UNLEAK() there.

1. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Extend the the release_revisions() function so that it frees the
"date_mode" in the "struct ref_info".

This uses the date_mode_release() function added in 974c919 (date
API: add and use a date_mode_release(), 2022-02-16). As that commit
notes "t7004-tag.sh" tests for the leaks that are being fixed
here. That test now fails "only" 44 tests, instead of the 46 it failed
before this change.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Refactor the existing reset_topo_walk() into a thin wrapper for a
release_revisions_topo_walk_info() + resetting the member to "NULL",
and call release_revisions_topo_walk_info() from release_revisions().

This fixes memory leaks that have been with us ever since
"topo_walk_info" was added to revision.[ch] in
f0d9cc4 (revision.c: begin refactoring --topo-order logic,
2018-11-01).

Due to various other leaks this makes no tests pass in their entirety,
but e.g. before this running this on git.git:

    ./git -P log --pretty=tformat:"%P   %H | %s" --parents --full-history --topo-order -3 -- README.md

Would report under SANITIZE=leak:

    SUMMARY: LeakSanitizer: 531064 byte(s) leaked in 6 allocation(s).

Now we'll free all of that memory.

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Add a TODO comment indicating that we should release "diffopt" in
release_revisions(). In a preceding commit we started releasing the
"pruning" member of the same type, but handling "diffopt" will require
us to untangle the "no_free" conditions I added in e900d49 (diff:
add an API for deferred freeing, 2021-02-11).

Let's leave a TODO comment to that effect, and so that we don't forget
refactor code that was changed to use release_revisions() in earlier
commits to stop using the "diffopt" member after a call to
release_revisions(). This works currently, but would become a logic
error as soon as we started freeing "diffopt". Doing that change now
doesn't harm anything, and future-proofs us against a later change to
release_revisions().

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
In b92cb86 (travis-ci: check that all build artifacts are
.gitignore-d, 2017-12-31), a function was introduced with a code style
that is different from the surrounding code: it added the opening curly
brace on its own line, when all the existing functions in the same file
cuddle that brace on the same line as the function name.

Let's make the code style consistent again.

Signed-off-by: Johannes Schindelin <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
The code writing JUnit XML is interspersed directly with all the code in
`t/test-lib.sh`, and it is therefore not only ill-separated, but
introducing yet another output format would make the situation even
worse.

Let's introduce an abstraction layer by hiding the JUnit XML code behind
four new functions that are supposed to be called before and after each
test and test case.

This is not just an academic exercise, refactoring for refactoring's
sake. We _actually_ want to introduce such a new output format, to
make it substantially easier to diagnose test failures in our GitHub
workflow, therefore we do need this refactoring.

This commit is best viewed with `git show --color-moved
--color-moved-ws=allow-indentation-change <commit>`.

Signed-off-by: Johannes Schindelin <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
avar added a commit that referenced this pull request Jan 9, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Jan 10, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Jan 12, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Jan 12, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Jan 13, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar pushed a commit that referenced this pull request Jan 18, 2023
There is an out-of-bounds read possible when parsing gitattributes that
have an attribute that is 2^31+1 bytes long. This is caused due to an
integer overflow when we assign the result of strlen(3P) to an `int`,
where we use the wrapped-around value in a subsequent call to
memcpy(3P). The following code reproduces the issue:

    blob=$(perl -e 'print "a" x 2147483649 . " attr"' | git hash-object -w --stdin)
    git update-index --add --cacheinfo 100644,$blob,.gitattributes
    git check-attr --all file

    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==8451==ERROR: AddressSanitizer: SEGV on unknown address 0x7f93efa00800 (pc 0x7f94f1f8f082 bp 0x7ffddb59b3a0 sp 0x7ffddb59ab28 T0)
    ==8451==The signal is caused by a READ memory access.
        #0 0x7f94f1f8f082  (/usr/lib/libc.so.6+0x176082)
        #1 0x7f94f2047d9c in __interceptor_strspn /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:752
        #2 0x560e190f7f26 in parse_attr_line attr.c:375
        #3 0x560e190f9663 in handle_attr_line attr.c:660
        #4 0x560e190f9ddd in read_attr_from_index attr.c:769
        #5 0x560e190f9f14 in read_attr attr.c:797
        git#6 0x560e190fa24e in bootstrap_attr_stack attr.c:867
        #7 0x560e190fa4a5 in prepare_attr_stack attr.c:902
        git#8 0x560e190fb5dc in collect_some_attrs attr.c:1097
        git#9 0x560e190fb93f in git_all_attrs attr.c:1128
        git#10 0x560e18e6136e in check_attr builtin/check-attr.c:67
        git#11 0x560e18e61c12 in cmd_check_attr builtin/check-attr.c:183
        git#12 0x560e18e15993 in run_builtin git.c:466
        git#13 0x560e18e16397 in handle_builtin git.c:721
        git#14 0x560e18e16b2b in run_argv git.c:788
        git#15 0x560e18e17991 in cmd_main git.c:926
        git#16 0x560e190ae2bd in main common-main.c:57
        git#17 0x7f94f1e3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#18 0x7f94f1e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#19 0x560e18e110e4 in _start ../sysdeps/x86_64/start.S:115

    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/usr/lib/libc.so.6+0x176082)
    ==8451==ABORTING

Fix this bug by converting the variable to a `size_t` instead.

Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
avar pushed a commit that referenced this pull request Jan 18, 2023
It is possible to trigger an integer overflow when parsing attribute
names when there are more than 2^31 of them for a single pattern. This
can either lead to us dying due to trying to request too many bytes:

     blob=$(perl -e 'print "f" . " a=" x 2147483649' | git hash-object -w --stdin)
     git update-index --add --cacheinfo 100644,$blob,.gitattributes
     git attr-check --all file

    =================================================================
    ==1022==ERROR: AddressSanitizer: requested allocation size 0xfffffff800000032 (0xfffffff800001038 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)
        #0 0x7fd3efabf411 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
        #1 0x5563a0a1e3d3 in xcalloc wrapper.c:150
        #2 0x5563a058d005 in parse_attr_line attr.c:384
        #3 0x5563a058e661 in handle_attr_line attr.c:660
        #4 0x5563a058eddb in read_attr_from_index attr.c:769
        #5 0x5563a058ef12 in read_attr attr.c:797
        git#6 0x5563a058f24c in bootstrap_attr_stack attr.c:867
        #7 0x5563a058f4a3 in prepare_attr_stack attr.c:902
        git#8 0x5563a05905da in collect_some_attrs attr.c:1097
        git#9 0x5563a059093d in git_all_attrs attr.c:1128
        git#10 0x5563a02f636e in check_attr builtin/check-attr.c:67
        git#11 0x5563a02f6c12 in cmd_check_attr builtin/check-attr.c:183
        git#12 0x5563a02aa993 in run_builtin git.c:466
        git#13 0x5563a02ab397 in handle_builtin git.c:721
        git#14 0x5563a02abb2b in run_argv git.c:788
        git#15 0x5563a02ac991 in cmd_main git.c:926
        git#16 0x5563a05432bd in main common-main.c:57
        git#17 0x7fd3ef82228f  (/usr/lib/libc.so.6+0x2328f)

    ==1022==HINT: if you don't care about these errors you may set allocator_may_return_null=1
    SUMMARY: AddressSanitizer: allocation-size-too-big /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 in __interceptor_calloc
    ==1022==ABORTING

Or, much worse, it can lead to an out-of-bounds write because we
underallocate and then memcpy(3P) into an array:

    perl -e '
        print "A " . "\rh="x2000000000;
        print "\rh="x2000000000;
        print "\rh="x294967294 . "\n"
    ' >.gitattributes
    git add .gitattributes
    git commit -am "evil attributes"

    $ git clone --quiet /path/to/repo
    =================================================================
    ==15062==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000002550 at pc 0x5555559884d5 bp 0x7fffffffbc60 sp 0x7fffffffbc58
    WRITE of size 8 at 0x602000002550 thread T0
        #0 0x5555559884d4 in parse_attr_line attr.c:393
        #1 0x5555559884d4 in handle_attr_line attr.c:660
        #2 0x555555988902 in read_attr_from_index attr.c:784
        #3 0x555555988902 in read_attr_from_index attr.c:747
        #4 0x555555988a1d in read_attr attr.c:800
        #5 0x555555989b0c in bootstrap_attr_stack attr.c:882
        git#6 0x555555989b0c in prepare_attr_stack attr.c:917
        #7 0x555555989b0c in collect_some_attrs attr.c:1112
        git#8 0x55555598b141 in git_check_attr attr.c:1126
        git#9 0x555555a13004 in convert_attrs convert.c:1311
        git#10 0x555555a95e04 in checkout_entry_ca entry.c:553
        git#11 0x555555d58bf6 in checkout_entry entry.h:42
        git#12 0x555555d58bf6 in check_updates unpack-trees.c:480
        git#13 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
        git#14 0x555555785ab7 in checkout builtin/clone.c:724
        git#15 0x555555785ab7 in cmd_clone builtin/clone.c:1384
        git#16 0x55555572443c in run_builtin git.c:466
        git#17 0x55555572443c in handle_builtin git.c:721
        git#18 0x555555727872 in run_argv git.c:788
        git#19 0x555555727872 in cmd_main git.c:926
        git#20 0x555555721fa0 in main common-main.c:57
        git#21 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308
        git#22 0x555555723f39 in _start (git+0x1cff39)

    0x602000002552 is located 0 bytes to the right of 2-byte region [0x602000002550,0x602000002552) allocated by thread T0 here:
        #0 0x7ffff768c037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
        #1 0x555555d7fff7 in xcalloc wrapper.c:150
        #2 0x55555598815f in parse_attr_line attr.c:384
        #3 0x55555598815f in handle_attr_line attr.c:660
        #4 0x555555988902 in read_attr_from_index attr.c:784
        #5 0x555555988902 in read_attr_from_index attr.c:747
        git#6 0x555555988a1d in read_attr attr.c:800
        #7 0x555555989b0c in bootstrap_attr_stack attr.c:882
        git#8 0x555555989b0c in prepare_attr_stack attr.c:917
        git#9 0x555555989b0c in collect_some_attrs attr.c:1112
        git#10 0x55555598b141 in git_check_attr attr.c:1126
        git#11 0x555555a13004 in convert_attrs convert.c:1311
        git#12 0x555555a95e04 in checkout_entry_ca entry.c:553
        git#13 0x555555d58bf6 in checkout_entry entry.h:42
        git#14 0x555555d58bf6 in check_updates unpack-trees.c:480
        git#15 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
        git#16 0x555555785ab7 in checkout builtin/clone.c:724
        git#17 0x555555785ab7 in cmd_clone builtin/clone.c:1384
        git#18 0x55555572443c in run_builtin git.c:466
        git#19 0x55555572443c in handle_builtin git.c:721
        git#20 0x555555727872 in run_argv git.c:788
        git#21 0x555555727872 in cmd_main git.c:926
        git#22 0x555555721fa0 in main common-main.c:57
        git#23 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308

    SUMMARY: AddressSanitizer: heap-buffer-overflow attr.c:393 in parse_attr_line
    Shadow bytes around the buggy address:
      0x0c047fff8450: fa fa 00 02 fa fa 00 07 fa fa fd fd fa fa 00 00
      0x0c047fff8460: fa fa 02 fa fa fa fd fd fa fa 00 06 fa fa 05 fa
      0x0c047fff8470: fa fa fd fd fa fa 00 02 fa fa 06 fa fa fa 05 fa
      0x0c047fff8480: fa fa 07 fa fa fa fd fd fa fa 00 01 fa fa 00 02
      0x0c047fff8490: fa fa 00 03 fa fa 00 fa fa fa 00 01 fa fa 00 03
    =>0x0c047fff84a0: fa fa 00 01 fa fa 00 02 fa fa[02]fa fa fa fa fa
      0x0c047fff84b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff84f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
      Shadow gap:              cc
    ==15062==ABORTING

Fix this bug by using `size_t` instead to count the number of attributes
so that this value cannot reasonably overflow without running out of
memory before already.

Reported-by: Markus Vervier <[email protected]>
Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
avar pushed a commit that referenced this pull request Jan 18, 2023
When using a padding specifier in the pretty format passed to git-log(1)
we need to calculate the string length in several places. These string
lengths are stored in `int`s though, which means that these can easily
overflow when the input lengths exceeds 2GB. This can ultimately lead to
an out-of-bounds write when these are used in a call to memcpy(3P):

        ==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588
    WRITE of size 1 at 0x7f1ec62f97fe thread T0
        #0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762
        #2 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        #3 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        #5 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        git#6 0x5628e95a44c8 in show_log log-tree.c:781
        #7 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        git#8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        git#9 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        git#10 0x5628e922f1a2 in cmd_log builtin/log.c:883
        git#11 0x5628e9106993 in run_builtin git.c:466
        git#12 0x5628e9107397 in handle_builtin git.c:721
        git#13 0x5628e9107b07 in run_argv git.c:788
        git#14 0x5628e91088a7 in cmd_main git.c:923
        git#15 0x5628e939d682 in main common-main.c:57
        git#16 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839)
    allocated by thread T0 here:
        #0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5628e98774d4 in xrealloc wrapper.c:136
        #2 0x5628e97cb01c in strbuf_grow strbuf.c:99
        #3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327
        #4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761
        #5 0x5628e96aa7f4 in format_commit_item pretty.c:1801
        git#6 0x5628e97cdb24 in strbuf_expand strbuf.c:429
        #7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
        git#8 0x5628e96acd0f in pretty_print_commit pretty.c:2161
        git#9 0x5628e95a44c8 in show_log log-tree.c:781
        git#10 0x5628e95a76ba in log_tree_commit log-tree.c:1117
        git#11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
        git#12 0x5628e922c35b in cmd_log_walk builtin/log.c:549
        git#13 0x5628e922f1a2 in cmd_log builtin/log.c:883
        git#14 0x5628e9106993 in run_builtin git.c:466
        git#15 0x5628e9107397 in handle_builtin git.c:721
        git#16 0x5628e9107b07 in run_argv git.c:788
        git#17 0x5628e91088a7 in cmd_main git.c:923
        git#18 0x5628e939d682 in main common-main.c:57
        git#19 0x7f2127c3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    =>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]
      0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==8340==ABORTING

The pretty format can also be used in `git archive` operations via the
`export-subst` attribute. So this is what in our opinion makes this a
critical issue in the context of Git forges which allow to download an
archive of user supplied Git repositories.

Fix this vulnerability by using `size_t` instead of `int` to track the
string lengths. Add tests which detect this vulnerability when Git is
compiled with the address sanitizer.

Reported-by: Joern Schneeweisz <[email protected]>
Original-patch-by: Joern Schneeweisz <[email protected]>
Modified-by: Taylor  Blau <[email protected]>
Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
avar pushed a commit that referenced this pull request Jan 18, 2023
With the `%>>(<N>)` pretty formatter, you can ask git-log(1) et al to
steal spaces. To do so we need to look ahead of the next token to see
whether there are spaces there. This loop takes into account ANSI
sequences that end with an `m`, and if it finds any it will skip them
until it finds the first space. While doing so it does not take into
account the buffer's limits though and easily does an out-of-bounds
read.

Add a test that hits this behaviour. While we don't have an easy way to
verify this, the test causes the following failure when run with
`SANITIZE=address`:

    ==37941==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000000baf at pc 0x55ba6f88e0d0 bp 0x7ffc84c50d20 sp 0x7ffc84c50d10
    READ of size 1 at 0x603000000baf thread T0
        #0 0x55ba6f88e0cf in format_and_pad_commit pretty.c:1712
        #1 0x55ba6f88e7b4 in format_commit_item pretty.c:1801
        #2 0x55ba6f9b1ae4 in strbuf_expand strbuf.c:429
        #3 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        #4 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #5 0x55ba6f7884c8 in show_log log-tree.c:781
        git#6 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        #7 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        git#8 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        git#9 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        git#10 0x55ba6f2ea993 in run_builtin git.c:466
        git#11 0x55ba6f2eb397 in handle_builtin git.c:721
        git#12 0x55ba6f2ebb07 in run_argv git.c:788
        git#13 0x55ba6f2ec8a7 in cmd_main git.c:923
        git#14 0x55ba6f581682 in main common-main.c:57
        git#15 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#16 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#17 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000000baf is located 1 bytes to the left of 24-byte region [0x603000000bb0,0x603000000bc8)
    allocated by thread T0 here:
        #0 0x7f2d08ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x55ba6fa5b494 in xrealloc wrapper.c:136
        #2 0x55ba6f9aefdc in strbuf_grow strbuf.c:99
        #3 0x55ba6f9b0a06 in strbuf_add strbuf.c:298
        #4 0x55ba6f9b1a25 in strbuf_expand strbuf.c:418
        #5 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
        git#6 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
        #7 0x55ba6f7884c8 in show_log log-tree.c:781
        git#8 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
        git#9 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
        git#10 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
        git#11 0x55ba6f4131a2 in cmd_log builtin/log.c:883
        git#12 0x55ba6f2ea993 in run_builtin git.c:466
        git#13 0x55ba6f2eb397 in handle_builtin git.c:721
        git#14 0x55ba6f2ebb07 in run_argv git.c:788
        git#15 0x55ba6f2ec8a7 in cmd_main git.c:923
        git#16 0x55ba6f581682 in main common-main.c:57
        git#17 0x7f2d08c3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#18 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#19 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow pretty.c:1712 in format_and_pad_commit
    Shadow bytes around the buggy address:
      0x0c067fff8120: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
      0x0c067fff8130: fd fd fa fa fd fd fd fd fa fa fd fd fd fa fa fa
      0x0c067fff8140: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff8150: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd
      0x0c067fff8160: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
    =>0x0c067fff8170: fd fd fd fa fa[fa]00 00 00 fa fa fa 00 00 00 fa
      0x0c067fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb

Luckily enough, this would only cause us to copy the out-of-bounds data
into the formatted commit in case we really had an ANSI sequence
preceding our buffer. So this bug likely has no security consequences.

Fix it regardless by not traversing past the buffer's start.

Reported-by: Patrick Steinhardt <[email protected]>
Reported-by: Eric Sesterhenn <[email protected]>
Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
avar pushed a commit that referenced this pull request Jan 18, 2023
An out-of-bounds read can be triggered when parsing an incomplete
padding format string passed via `--pretty=format` or in Git archives
when files are marked with the `export-subst` gitattribute.

This bug exists since we have introduced support for truncating output
via the `trunc` keyword a7f01c6 (pretty: support truncating in %>, %<
and %><, 2013-04-19). Before this commit, we used to find the end of the
formatting string by using strchr(3P). This function returns a `NULL`
pointer in case the character in question wasn't found. The subsequent
check whether any character was found thus simply checked the returned
pointer. After the commit we switched to strcspn(3P) though, which only
returns the offset to the first found character or to the trailing NUL
byte. As the end pointer is now computed by adding the offset to the
start pointer it won't be `NULL` anymore, and as a consequence the check
doesn't do anything anymore.

The out-of-bounds data that is being read can in fact end up in the
formatted string. As a consequence, it is possible to leak memory
contents either by calling git-log(1) or via git-archive(1) when any of
the archived files is marked with the `export-subst` gitattribute.

    ==10888==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000398 at pc 0x7f0356047cb2 bp 0x7fff3ffb95d0 sp 0x7fff3ffb8d78
    READ of size 1 at 0x602000000398 thread T0
        #0 0x7f0356047cb1 in __interceptor_strchrnul /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725
        #1 0x563b7cec9a43 in strbuf_expand strbuf.c:417
        #2 0x563b7cda7060 in repo_format_commit_message pretty.c:1869
        #3 0x563b7cda8d0f in pretty_print_commit pretty.c:2161
        #4 0x563b7cca04c8 in show_log log-tree.c:781
        #5 0x563b7cca36ba in log_tree_commit log-tree.c:1117
        git#6 0x563b7c927ed5 in cmd_log_walk_no_free builtin/log.c:508
        #7 0x563b7c92835b in cmd_log_walk builtin/log.c:549
        git#8 0x563b7c92b1a2 in cmd_log builtin/log.c:883
        git#9 0x563b7c802993 in run_builtin git.c:466
        git#10 0x563b7c803397 in handle_builtin git.c:721
        git#11 0x563b7c803b07 in run_argv git.c:788
        git#12 0x563b7c8048a7 in cmd_main git.c:923
        git#13 0x563b7ca99682 in main common-main.c:57
        git#14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    0x602000000398 is located 0 bytes to the right of 8-byte region [0x602000000390,0x602000000398)
    allocated by thread T0 here:
        #0 0x7f0356072faa in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:439
        #1 0x563b7cf7317c in xstrdup wrapper.c:39
        #2 0x563b7cd9a06a in save_user_format pretty.c:40
        #3 0x563b7cd9b3e5 in get_commit_format pretty.c:173
        #4 0x563b7ce54ea0 in handle_revision_opt revision.c:2456
        #5 0x563b7ce597c9 in setup_revisions revision.c:2850
        git#6 0x563b7c9269e0 in cmd_log_init_finish builtin/log.c:269
        #7 0x563b7c927362 in cmd_log_init builtin/log.c:348
        git#8 0x563b7c92b193 in cmd_log builtin/log.c:882
        git#9 0x563b7c802993 in run_builtin git.c:466
        git#10 0x563b7c803397 in handle_builtin git.c:721
        git#11 0x563b7c803b07 in run_argv git.c:788
        git#12 0x563b7c8048a7 in cmd_main git.c:923
        git#13 0x563b7ca99682 in main common-main.c:57
        git#14 0x7f0355e3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#15 0x7f0355e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#16 0x563b7c7fe0e4 in _start ../sysdeps/x86_64/start.S:115

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:725 in __interceptor_strchrnul
    Shadow bytes around the buggy address:
      0x0c047fff8020: fa fa fd fd fa fa 00 06 fa fa 05 fa fa fa fd fd
      0x0c047fff8030: fa fa 00 02 fa fa 06 fa fa fa 05 fa fa fa fd fd
      0x0c047fff8040: fa fa 00 07 fa fa 03 fa fa fa fd fd fa fa 00 00
      0x0c047fff8050: fa fa 00 01 fa fa fd fd fa fa 00 00 fa fa 00 01
      0x0c047fff8060: fa fa 00 06 fa fa 00 06 fa fa 05 fa fa fa 05 fa
    =>0x0c047fff8070: fa fa 00[fa]fa fa fd fa fa fa fd fd fa fa fd fd
      0x0c047fff8080: fa fa fd fd fa fa 00 00 fa fa 00 fa fa fa fd fa
      0x0c047fff8090: fa fa fd fd fa fa 00 00 fa fa fa fa fa fa fa fa
      0x0c047fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c047fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==10888==ABORTING

Fix this bug by checking whether `end` points at the trailing NUL byte.
Add a test which catches this out-of-bounds read and which demonstrates
that we used to write out-of-bounds data into the formatted message.

Reported-by: Markus Vervier <[email protected]>
Original-patch-by: Markus Vervier <[email protected]>
Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
avar pushed a commit that referenced this pull request Jan 18, 2023
The return type of both `utf8_strwidth()` and `utf8_strnwidth()` is
`int`, but we operate on string lengths which are typically of type
`size_t`. This means that when the string is longer than `INT_MAX`, we
will overflow and thus return a negative result.

This can lead to an out-of-bounds write with `--pretty=format:%<1)%B`
and a commit message that is 2^31+1 bytes long:

    =================================================================
    ==26009==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001168 at pc 0x7f95c4e5f427 bp 0x7ffd8541c900 sp 0x7ffd8541c0a8
    WRITE of size 2147483649 at 0x603000001168 thread T0
        #0 0x7f95c4e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5612bbb1068c in format_and_pad_commit pretty.c:1763
        #2 0x5612bbb1087a in format_commit_item pretty.c:1801
        #3 0x5612bbc33bab in strbuf_expand strbuf.c:429
        #4 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #5 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        git#6 0x5612bba0a4d5 in show_log log-tree.c:781
        #7 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        git#8 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        git#9 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        git#10 0x5612bb6951a2 in cmd_log builtin/log.c:883
        git#11 0x5612bb56c993 in run_builtin git.c:466
        git#12 0x5612bb56d397 in handle_builtin git.c:721
        git#13 0x5612bb56db07 in run_argv git.c:788
        git#14 0x5612bb56e8a7 in cmd_main git.c:923
        git#15 0x5612bb803682 in main common-main.c:57
        git#16 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)
        git#17 0x7f95c4c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        git#18 0x5612bb5680e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000001168 is located 0 bytes to the right of 24-byte region [0x603000001150,0x603000001168)
    allocated by thread T0 here:
        #0 0x7f95c4ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5612bbcdd556 in xrealloc wrapper.c:136
        #2 0x5612bbc310a3 in strbuf_grow strbuf.c:99
        #3 0x5612bbc32acd in strbuf_add strbuf.c:298
        #4 0x5612bbc33aec in strbuf_expand strbuf.c:418
        #5 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        git#6 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #7 0x5612bba0a4d5 in show_log log-tree.c:781
        git#8 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        git#9 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        git#10 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        git#11 0x5612bb6951a2 in cmd_log builtin/log.c:883
        git#12 0x5612bb56c993 in run_builtin git.c:466
        git#13 0x5612bb56d397 in handle_builtin git.c:721
        git#14 0x5612bb56db07 in run_argv git.c:788
        git#15 0x5612bb56e8a7 in cmd_main git.c:923
        git#16 0x5612bb803682 in main common-main.c:57
        git#17 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0c067fff81d0: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff81e0: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
      0x0c067fff81f0: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
      0x0c067fff8200: fd fd fd fa fa fa fd fd fd fd fa fa 00 00 00 fa
      0x0c067fff8210: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
    =>0x0c067fff8220: fd fa fa fa fd fd fd fa fa fa 00 00 00[fa]fa fa
      0x0c067fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==26009==ABORTING

Now the proper fix for this would be to convert both functions to return
an `size_t` instead of an `int`. But given that this commit may be part
of a security release, let's instead do the minimal viable fix and die
in case we see an overflow.

Add a test that would have previously caused us to crash.

Signed-off-by: Patrick Steinhardt <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
avar added a commit that referenced this pull request Jan 23, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 1, 2023
For those tests that have 4 occurances of 'test_i18ngrep', replace it
with 'grep'. Splitting this incremental conversion by number of
occurances in the file is arbitrary, but help so keep the commit size
down.

The 'test_i18ngrep' wrapper has been deprecated since
d162b25 (tests: remove support for GIT_TEST_GETTEXT_POISON,
2021-01-20).

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
avar added a commit that referenced this pull request Feb 2, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 2, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 3, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 3, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 6, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 6, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 6, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 7, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 7, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 8, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 8, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Feb 10, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Mar 6, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Mar 7, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request Mar 7, 2023
* avar/nuke-test_i18ngrep:
  tests with git#10 test_i18ngrep: replace it with 'grep'
  tests with git#9 test_i18ngrep: replace it with 'grep'
  tests with git#8 test_i18ngrep: replace it with 'grep'
  tests with #7 test_i18ngrep: replace it with 'grep'
  tests with git#6 test_i18ngrep: replace it with 'grep'
  tests with #5 test_i18ngrep: replace it with 'grep'
  tests with #4 test_i18ngrep: replace it with 'grep'
  tests with #3 test_i18ngrep: replace it with 'grep'
  tests with #2 test_i18ngrep: replace it with 'grep'
  tests with #1 test_i18ngrep: replace it with 'grep'
avar added a commit that referenced this pull request May 1, 2023
Adjust the code added in 1ff21c0 (config: store "git -c" variables
using more robust format, 2021-01-12) to avoid allocating a "key"
until after we've done the sanity checks on it.

There reason for allocating it this in the function is because the
"env_name" pointer was about to be incremented, let's instead note our
position at that point. The key length is the number of characters
before the first "=".

As a result of this early allocation we'd have a memory leak as
reported by valgrind, just not the "still reachable" kind we usually
care about[1] with SANITIZE=leak:

	$ valgrind --leak-check=full --show-leak-kinds=all ./git --config-env=foo.flag= config --bool foo.flag
        [...]
	==6540== 13 bytes in 1 blocks are still reachable in loss record 3 of 17
	==6540==    at 0x48437B4: malloc (vg_replace_malloc.c:381)
	==6540==    by 0x40278B: do_xmalloc (wrapper.c:51)
	==6540==    by 0x402884: do_xmallocz (wrapper.c:85)
	==6540==    by 0x4028C0: xmallocz (wrapper.c:93)
	==6540==    by 0x4028FD: xmemdupz (wrapper.c:109)
	==6540==    by 0x402962: xstrndup (wrapper.c:115)
	==6540==    by 0x342F53: strip_path_suffix (path.c:1300)
	==6540==    by 0x2B4C89: system_prefix (exec-cmd.c:50)
	==6540==    by 0x2B5057: system_path (exec-cmd.c:268)
	==6540==    by 0x2C5E17: git_setup_gettext (gettext.c:109)
	==6540==    by 0x21DC57: main (common-main.c:44)

Since the "key" was reachable when the "die()" ran the semantics of
SANITIZE=leak wouldn't show it, but "clang" at higher optimization
levels would optimize this to the point of thinking the variable
wasn't reachable, e.g. with Debian clang 14.0.6-2 with CFLAGS="-O3
-g":

	$ ./git --config-env=foo.flag= config --bool foo.flag
	fatal: missing environment variable name for configuration 'foo.flag'

	=================================================================
	==18018==ERROR: LeakSanitizer: detected memory leaks

	Direct leak of 9 byte(s) in 1 object(s) allocated from:
	    #0 0x55e40aad7cd2 in __interceptor_malloc (/home/avar/g/git/git+0x78cd2) (BuildId: b89fa8f797ccb02ef1148beed300071a5f9b9ab1)
	    #1 0x55e40ad41071 in do_xmalloc /home/avar/g/git/wrapper.c:51:8
	    #2 0x55e40ad41071 in do_xmallocz /home/avar/g/git/wrapper.c:85:8
	    #3 0x55e40ad41071 in xmallocz /home/avar/g/git/wrapper.c:93:9
	    #4 0x55e40ad41071 in xmemdupz /home/avar/g/git/wrapper.c:109:16
	    #5 0x55e40abec960 in git_config_push_env /home/avar/g/git/config.c:521:8
	    git#6 0x55e40aadb8b9 in handle_options /home/avar/g/git/git.c:268:4
	    #7 0x55e40aada68d in cmd_main /home/avar/g/git/git.c:896:2
	    git#8 0x55e40abae219 in main /home/avar/g/git/common-main.c:57:11
	    git#9 0x7fbb5287e209 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
	    git#10 0x7fbb5287e2bb in __libc_start_main csu/../csu/libc-start.c:389:3
	    git#11 0x55e40aaac130 in _start (/home/avar/g/git/git+0x4d130) (BuildId: b89fa8f797ccb02ef1148beed300071a5f9b9ab1)

	SUMMARY: LeakSanitizer: 9 byte(s) leaked in 1 allocation(s).
	Aborted

Whatever clang has to say about it it makes sense to save ourselves
the xmemdupz() if we can help it, but it's worth noting why this came
up.

The actual meaningful fix here is that we don't need to do this
allocation at all. The only reason it's needed is because there hasn't
been a variant of "sq_quote_buf()" in quote.c that takes a "len"
parameter.

In previous commits we added one, and will move to using it in the
subsequent commit, but let's first make this smaller change.

1. https://lore.kernel.org/git/[email protected]/

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
avar added a commit that referenced this pull request May 1, 2023
For those tests that have 4 occurances of 'test_i18ngrep', replace it
with 'grep'. Splitting this incremental conversion by number of
occurances in the file is arbitrary, but help so keep the commit size
down.

The 'test_i18ngrep' wrapper has been deprecated since
d162b25 (tests: remove support for GIT_TEST_GETTEXT_POISON,
2021-01-20).

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants