Fix generation_number_overflow 32-bit tag creation #1698

EliahKagan · 2024-11-23T10:11:37Z

Overview

In the "Coupling to gitoxide's own repo" section of #1687, I said:

Fortunately, for this 32-bit Debian container on CI, only one case of this had to be addressed, largely because GIX_TEST_IGNORE_ARCHIVES is not being used.

But the situation in containers is actually better than that. Setting GIX_TEST_IGNORE_ARCHIVES=1 on a 32-bit Debian 12 system--both in the container on CI or locally outside the container--only causes one failure. This failure is not related to coupling to gitoxide's repo, and it is a 32-bit specific failure.

This pull request describes and fixes that failure, but I don't know if this fix is optimal. Even if this fix is good, I suspect further changes are warranted.

The current test failure output

The failing test is gix-commitgraph::commitgraph access::generation_numbers_overflow_is_handled_in_chained_graph:

        FAIL [   0.036s] gix-commitgraph::commitgraph access::generation_numbers_overflow_is_handled_in_chained_graph

--- STDOUT:              gix-commitgraph::commitgraph access::generation_numbers_overflow_is_handled_in_chained_graph ---

running 1 test
test access::generation_numbers_overflow_is_handled_in_chained_graph ... FAILED

failures:

failures:
    access::generation_numbers_overflow_is_handled_in_chained_graph

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 7 filtered out; finished in 0.03s


--- STDERR:              gix-commitgraph::commitgraph access::generation_numbers_overflow_is_handled_in_chained_graph ---
failed to extract 'tests/fixtures/generated-archives/generation_number_overflow.tar': Ignoring archive at 'tests/fixtures/generated-archives/generation_number_overflow.tar' as GIX_TEST_IGNORE_ARCHIVES is set.
thread 'access::generation_numbers_overflow_is_handled_in_chained_graph' panicked at tests/tools/src/lib.rs:567:17:
fixture script of cd "tests/fixtures/generated-do-not-edit/generation_number_overflow/2968487077-unix" && env -u GIT_ALTERNATE_OBJECT_DIRECTORIES -u GIT_ASKPASS -u GIT_COMMON_DIR -u GIT_DIR -u GIT_INDEX_FILE -u GIT_OBJECT_DIRECTORY -u GIT_WORK_TREE -u SSH_ASKPASS GIT_AUTHOR_DATE="2000-01-01 00:00:00 +0000" GIT_AUTHOR_EMAIL="[email protected]" GIT_AUTHOR_NAME="author" GIT_COMMITTER_DATE="2000-01-02 00:00:00 +0000" GIT_COMMITTER_EMAIL="[email protected]" GIT_COMMITTER_NAME="committer" GIT_CONFIG_COUNT="4" GIT_CONFIG_GLOBAL="/dev/null" GIT_CONFIG_KEY_0="commit.gpgsign" GIT_CONFIG_KEY_1="tag.gpgsign" GIT_CONFIG_KEY_2="init.defaultBranch" GIT_CONFIG_KEY_3="protocol.file.allow" GIT_CONFIG_NOSYSTEM="1" GIT_CONFIG_VALUE_0="false" GIT_CONFIG_VALUE_1="false" GIT_CONFIG_VALUE_2="main" GIT_CONFIG_VALUE_3="always" GIT_TERMINAL_PROMPT="false" MSYS=" winsymlinks:nativestrict" "/__w/gitoxide/gitoxide/gix-commitgraph/tests/fixtures/generation_number_overflow.sh" failed: stdout: Initialized empty Git repository in /__w/gitoxide/gitoxide/gix-commitgraph/tests/fixtures/generated-do-not-edit/generation_number_overflow/2968487077-unix/.git/
[main (root-commit) 631f540] future-1
 Author: author <[email protected]>
 1 file changed, 1 insertion(+)
 create mode 100644 future-1.t

stderr: fatal: Timestamp too large for this system: 4147483646

note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

Analysis of the current failure

The test calls graph_and_expected:

gitoxide/gix-commitgraph/tests/access/mod.rs

Line 58 in 438ee47

    
           let (cg, mut refs) = graph_and_expected("generation_number_overflow.sh", &names);

graph_and_expected calls graph_and_expected_named, which calls scripted_fixture_read_only to run the generation_number_overflow.sh fixture. It wasn't immediately obvious, from the output shown above, what the failure is. But temporarily adding set +x revealed it:

+ commit future-1 '@4147483646 +0000'
+ local message=future-1
+ local 'date=@4147483646 +0000'
+ local file=future-1.t
+ echo future-1
+ git add -- future-1.t
+ '[' -n '@4147483646 +0000' ']'
+ export 'GIT_COMMITTER_DATE=@4147483646 +0000'
+ GIT_COMMITTER_DATE='@4147483646 +0000'
+ git commit -m future-1
+ git tag future-1
fatal: Timestamp too large for this system: 4147483646

git commit accepts the timestamp 4147483646 as the committer date. But then, when git tag is run to tag that commit, it rejects the timestamp!

Deeper analysis

The situation becomes even more interesting when we examine how git interprets the timestamp on the commit. Here's a simplified experiment:

$ git init repo
Initialized empty Git repository in /home/ek/src/repo/.git/
$ cd repo
$ touch a
$ git add a
$ GIT_COMMITTER_DATE='@4147483646 +0000' git commit -m 'Initial commit'
[main (root-commit) 99b1de4] Initial commit
    1 file changed, 0 insertions(+), 0 deletions(-)
    create mode 100644 a
$ git log --pretty=fuller
commit 99b1de4e0610ff715e784bd4089dc315e8071332 (HEAD -> main)
Author:     Eliah Kagan <[email protected]>
AuthorDate: Sat Nov 23 01:05:12 2024 -0500
Commit:     Eliah Kagan <[email protected]>
CommitDate: Thu Jan 1 00:00:00 1970 +0000

    Initial commit
$ git tag t
fatal: Timestamp too large for this system: 4147483646
$ git tag -a t
fatal: Timestamp too large for this system: 4147483646

The timestamp was accepted, but the reported commit date in git log is:

Thu Jan 1 00:00:00 1970 +0000

This is, as one would imagine, specific to 32-bit systems. (I also do not know for certain that it happens on all 32-bit systems, but my guess is that it happens with all 32-bit builds of Git on at least some prominent 32-bit platforms, including 32-bit x86 where I have observed it.) On 64-bit systems, we get a future date and there are no errors.

Why it passes without `GIX_TEST_IGNORE_ARCHIVES`: 32-bit `git` uses the tag if present

A repository with such a tag can be cloned on a 32-bit system. The commit has the overflowed date in 1970. But the tag is present. It is a packed tag, in that scenario. But git on a 32-bit system has the ability to recognize it as a loose tag just as well, which happens if we bring the repository over from another system, as happens with a generated archive.

It also works if we create the tag by manually creating a file in .git/refs/tags. This allows a repository that, as far as I can tell, is equivalent to what we get when we unpack a generated archive, to be created on a 32-bit system where git tag would produce an error. I wonder if there's a Git feature that just lets us tell it to go ahead and create the tag, but I looked for such a feature and did not find it. I also wonder (as detailed below) if this general approach--of trying to create the tag on 32-bit systems at all--is good...

Changes in this pull request

...but that's what I've done here. This PR:

Sets GIX_TEST_IGNORE_ARCHIVES=1 in the test-32bit CI job. Ideally we would test both with and without it, either in separate jobs or with a partial clean, covering just generated archives, in between. I'm not sure the added work on CI of running the tests another time on the same 32-bit platform is justified, though. For now it only runs with that set. But I've verified locally that, after all changes here, all tests pass both with and without GIX_TEST_IGNORE_ARCHIVES on a 32-bit Debian 12 system.
Modifies the fixture script so that, if it cannot create the tag with git tag, then it checks if git is a 32-bit build and, if so, writes a loose tag file explicitly. The logic for checking if Git is 32-bit and attempting the fallback tactic of creating the loose tag file explicitly is in its own function.
Refactors the fixture script so that it is clearer what is going on. The fixture script had a commit function that performs multiple operations including committing and creating a tag. This renames it to tagged_commit and makes a few other changes.
Regenerates the committed archive.

Further analysis / alternatives and future directions

It may be that more substantial changes should be made. Some of the operations in the fixture script had been inspired or derived from one of the tests in the Git test suite:

gitoxide/gix-commitgraph/tests/fixtures/generation_number_overflow.sh

Line 32 in 438ee47

# adapted from git/t/t5318 'lower layers have overflow chunk'

That test in Git is t5328-commit-graph-64bit-time.sh. The test description is "commit graph with 64-bit timestamps" and it only runs when prerequisites are satisfied related to the availability of 64-bit timestamps.

I don't think this necessarily means that gitoxide's test suite should not run tests on a repository with such timestamps. The test does pass on 32-bit Debian 12 when it uses the repository from the committed archive, as well as when it generates the repository using the modified fixture script introduced in this PR.

The other issue--I'm not sure if it's related--is that it looks like the test that uses this fixture is believed already, and presumably not specifically on 32-bit systems, not to be fully effective. The assertion in the test is:

gitoxide/gix-commitgraph/tests/access/mod.rs

Lines 64 to 68 in 438ee47

    
           assert_eq!( 
        
               cg.commit_by_id(r.id).expect("present").generation(), 
        
               expected, 
        
               "actually, this test seems to have valid generation numbers from the get-go. How to repro the actual issue?" 
        
           );

So maybe there is another change that should be made to the fixture script.

This temporarily instruments a fixture script to reveal how the `access::generation_numbers_overflow_is_handled_in_chained_graph` test fails when `GIX_TEST_IGNORE_ARCHIVES=1` is set: the fixture script `generation_number_overflow.sh` runs, and it sets the timestamp `@4147483646` on a commit, which `git` permits. But then `git tag` rejects this timestamp as being out of range. On stderr: + commit future-1 '@4147483646 +0000' + local message=future-1 + local 'date=@4147483646 +0000' + local file=future-1.t + echo future-1 + git add -- future-1.t + '[' -n '@4147483646 +0000' ']' + export 'GIT_COMMITTER_DATE=@4147483646 +0000' + GIT_COMMITTER_DATE='@4147483646 +0000' + git commit -m future-1 + git tag future-1 fatal: Timestamp too large for this system: 4147483646 The nature of the problem is *partially* illuminated by the following experiment on a 32-bit Debian 12 system: $ git init repo Initialized empty Git repository in /home/ek/src/repo/.git/ $ cd repo $ touch a $ git add a $ GIT_COMMITTER_DATE='@4147483646 +0000' git commit -m 'Initial commit' [main (root-commit) 99b1de4] Initial commit 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 a $ git log --pretty=fuller commit 99b1de4e0610ff715e784bd4089dc315e8071332 (HEAD -> main) Author: Eliah Kagan <[email protected]> AuthorDate: Sat Nov 23 01:05:12 2024 -0500 Commit: Eliah Kagan <[email protected]> CommitDate: Thu Jan 1 00:00:00 1970 +0000 Initial commit $ git tag t fatal: Timestamp too large for this system: 4147483646 $ git tag -a t fatal: Timestamp too large for this system: 4147483646 The timestamp is accepted when committing, and written as actor metadata for the committer. But is is not (or at least not always) interpreted as intended: as `git log` shows it, it has wrapped around to 1970. Even though the commit was created successfully, attempting to make a tag that points to it fails. This happens both for a lightweight tag (as in the fixture script) and an annotated tag, so making it a tag object does not seem to be a workaround.

This is a workaround for the problem where `git tag` rejects the timestamp `4147483646` from `generation_number_overflow.sh` as out of range, even though it has already been accepted to create a commit and `git tag` is obtaining it from the commit metadata. The test already passed if run without `GIX_TEST_IGNORE_ARCHIVES`, even though the commit and tag metadata in the committed archive repo has this timestamp interpreted as from 1970 rather than in the far future. The change here just allows the fixture script to create the repository on a 32-bit system where `git` would refuse to make the tag, by manually creating a loose tag. (Creating a packed tag would also work, but would not improve the ambiguity.)

Byron

Thanks again for your help!

It's good to see that the fixture scripts are this portable, and to have a test run now that prevents that from changing.

I wonder if there's a Git feature that just lets us tell it to go ahead and create the tag, but I looked for such a feature and did not find it. I also wonder (as detailed below) if this general approach--of trying to create the tag on 32-bit systems at all--is good...

I think this should be related to size_t used for time_t so Git probably isn't capable of holding this number. I wonder if it has to be a problem for gitoxide thought, we could just declare the respective types as u64 and be good - for gix-date I am pretty sure that's done as well. This is probably no more than a note.

That test in Git is t5328-commit-graph-64bit-time.sh. The test description is "commit graph with 64-bit timestamps" and it only runs when prerequisites are satisfied related to the availability of 64-bit timestamps.

I see, that probably explains why 32bit systems are not able to read 64 bit timestamps: by design. I still wonder if a 32bit built could be made to handle this fixture, but maybe there isn't any problem and it's really just about Git not being able to produce the fixture on 32 bit systems.
I think that's a convoluted way of wondering if a 32bit build can read commit-graph caches that are created on 64 bit systems. Probably that's not really an issue even if the answer is negative as one can always recreate these

EliahKagan · 2024-11-23T21:43:38Z

I think this should be related to size_t used for time_t so Git probably isn't capable of holding this number.

Some platforms have time_t as the same size as size_t, but not all. For example, in sufficiently new versions of Debian, all 32-bit architectures other than 32-bit x86 have 64-bit time_t. The armhf architecture is a fairly prominent example of a 32-bit architecture that has 64-bit time_t on recent GNU/Linux systems. This required an ABI change to do, but was still done, due to the continued significance of armhf and the need to solve the Year 2038 problem. (Since 32-bit x86 is often used for binary compatibility with existing executables, it is the exception, at least so far.)

I wonder if it has to be a problem for gitoxide thought, we could just declare the respective types as u64 and be good - for gix-date I am pretty sure that's done as well. This is probably no more than a note.

If gitoxide is using a type like usize or isize anywhere for timestamps--especially if it is not conditional on the target being known to require 32-bit time_t--then I think that should be changed to u64 or i64 if at all possible. More operating system versions for 32-bit architectures will tend to have 64-bit time_t as 2038 approaches and passes, even when an ABI break is needed to do it. (I think it is likely that some 32-bit architectures will continue to be used for significant purposes, including in new devices. However, even if that is not the case, I think this would be a correctness issue when targeting a 32-bit ABI where time_t is 64-bit.)

But I acknowledge that "if at all possible" is an important qualification. Timestamps that are stored in indexes or caches, if git represents them differently on different platforms, will have to be readable and, if created, will have to be created in a way that git on that platform can understand them, at least in data that are not available elsewhere, or in caches that cannot be recreated inexpensively enough.

It occurs to me that the failures on s390x mentioned in #1687 might be relevant to this and might be illuminating. I have not yet investigated them. Their significance there was merely that none of them involved any data structure size assertions--that is, I mentioned s390x there merely as evidence that our asserted 64-bit data structure sizes don't seem to vary across platforms.

But, while not related to anything being done in #1687, on s390x there are a few tests that fail, and the failures do seem related to some data structures having unexpected representations. s390x is 64-bit, but bigendian. The failures even when reusing generated archives are:

gix::gix remote::fetch::blocking_and_async_io::fetch_pack
gix::gix remote::fetch::blocking_and_async_io::fetch_pack_without_local_destination

When ignoring generated archives with GIX_TEST_IGNORE_ARCHIVES=1 to re-run all fixture scripts, those failures occur, as well as three others. One of those three others is probably not relevant to s390x because it usually occurs on other platforms and brought about by changes in git 2.47.0; this is due to #1622. But the other two failures are--like the two above--s390x-specific:

gix-pack-tests::pack pack::multi_index::access::general
gix-pack-tests::pack pack::multi_index::verify::integrity

At this time, I haven't opened an issue about this or looked into it in detail. But full test output on s390x, showing those failures, for both runs, can be seen in this gist.

I see, that probably explains why 32bit systems are not able to read 64 bit timestamps: by design. I still wonder if a 32bit built could be made to handle this fixture, but maybe there isn't any problem and it's really just about Git not being able to produce the fixture on 32 bit systems.

Even if git is built with a 32-bit time_t and cannot represent 64-bit timestamps, that still does not explain the curious behavior observed here, all of which is observed even when no git commit-graph command is involved:

The commit is given the timestamp that overflows if interpreted as 32-bit time_t, and git creates that commit on the 32-bit system without objection.
When git log is run on the 32-bit system to view the time, it is shown as having rolled over to the Unix epoch in 1970.
Attempting to create a lightweight tag fails on the grounds that the timestamp of the commit git just created is out of range. (The same thing happens when creating an annotated tag, but this fixture doesn't exercise that.) Creating a tag that refers to a commit does not inherently involve processing a timestamp: the tag holds the object ID of the commit, not other information about the commit. Somehow, more stringent validation happens on the commit's timestamp when a tag referring to the commit is created, than when the commit itself is created.
There is no loss of information in the commit's own timestamp. The timestamp in the repository has not rolled over, even though at this point all operations have been performed with a native build of git on a 32-bit platform where time_t is 32-bit, and even though this git shows the date as having rolled over. The rollover in git log output apparently happens when formatting the timestamp as text, and should possibly be considered a bug in git, depending on the intended semantics of that formatting operation. I can push the repository to a remote, then clone it on a 64-bit system, and it has the originally intended non-rolled-over future timestamp.

It is because of (4) that the change in this PR does not (as far as I know) add any new incorrect behavior. If not for (4), we would not want to allow a non-.gitignored archive to be generated when the fixture script uses a git command built against a libc implementation with a 32-bit time_t, since such an archive would have the wrong timestamp and other systems would be able to observe the difference in the actor metadata in the archived fixture's commit history.

To elaborate, here's an experiment that verifies this, involving two machines: Zim, a 32-bit x86 system; and catenary, a 64-bit x86 system. Starting in a newly created repository on the 32-bit system:

ek@Zim:~/repos/tag32 (main #)$ echo 1 >a
ek@Zim:~/repos/tag32 (main #%)$ git add a
ek@Zim:~/repos/tag32 (main +)$ GIT_COMMITTER_DATE='@0 +0000' git commit -m 'First commit'
[main (root-commit) 916fb33] First commit
 1 file changed, 1 insertion(+)
 create mode 100644 a
ek@Zim:~/repos/tag32 (main)$ echo 2 >a
ek@Zim:~/repos/tag32 (main *)$ git add a
ek@Zim:~/repos/tag32 (main +)$ GIT_COMMITTER_DATE='@4147483646 +0000' git commit -m 'Second commit'
[main 5baa3b8] Second commit
 1 file changed, 1 insertion(+), 1 deletion(-)
ek@Zim:~/repos/tag32 (main)$ git log --pretty=fuller
commit 5baa3b82fccabc8f071e877977d3f57ce5eee897 (HEAD -> main)
Author:     Eliah Kagan <[email protected]>
AuthorDate: Sat Nov 23 14:34:04 2024 -0500
Commit:     Eliah Kagan <[email protected]>
CommitDate: Thu Jan 1 00:00:00 1970 +0000

    Second commit

commit 916fb33758c8cf3da0b156363624a85c9b7cb5eb
Author:     Eliah Kagan <[email protected]>
AuthorDate: Sat Nov 23 14:33:42 2024 -0500
Commit:     Eliah Kagan <[email protected]>
CommitDate: Thu Jan 1 00:00:00 1970 +0000

    First commit

I then pushed to a remote and cloned the repository on the 64-bit system, on which:

ek in 🌐 catenary in tag32 on  main
❯ git log --pretty=fuller
commit 5baa3b82fccabc8f071e877977d3f57ce5eee897 (HEAD -> main, origin/main, origin/HEAD)
Author:     Eliah Kagan <[email protected]>
AuthorDate: Sat Nov 23 14:34:04 2024 -0500
Commit:     Eliah Kagan <[email protected]>
CommitDate: Mon Jun 6 06:47:26 2101 +0000

    Second commit

commit 916fb33758c8cf3da0b156363624a85c9b7cb5eb
Author:     Eliah Kagan <[email protected]>
AuthorDate: Sat Nov 23 14:33:42 2024 -0500
Commit:     Eliah Kagan <[email protected]>
CommitDate: Thu Jan 1 00:00:00 1970 +0000

    First commit

There, the second commit has the originally intended 2101 date, even though it was shown as having the 1970 epoch date on the machine where it was created originally. Furthermore, the first commit has the intended 1970 epoch date on both, so the 32-bit system on which the repository is created really is creating the commits with different timestamps, even though that 32-bit system has 32-bit time_t.

(As noted above, some 32-bit systems have 64-bit time_t. But the test system here is known to have 32-bit time_t, both because it is a Debian i386 system, and because I have verified that sizeof(time_t) is 4 on this system, with a small C test program.)

That the commit timestamp has not really overflowed is, I want to say, no surprise--git objects are meant to be shared across systems of different architectures, after all, and git repositories are intended to be able to hold 64-bit timestamps. But I do find it surprising that the subsequent tag creation operation fails on the 32-bit system.

I think that's a convoluted way of wondering if a 32bit build can read commit-graph caches that are created on 64 bit systems. Probably that's not really an issue even if the answer is negative as one can always recreate these

Since 32-bit git accepts and preserves 64-bit timestamps in some cases, including a case where it writes the timestamp into a Git object (git commit), while rejecting them in other cases, including a case where it does not write the timestamp and where (as far as I know) the timestamp is not part of the semantics of the operation at all (git tag, where the timestamp at issue is the already-present one in the commit object being tagged), it seems to me we cannot conclude from any of what has been seen here what 32-bit git is able to do with commit-graph caches.

Although git commit-graph is involved in the actual fixture script, the simplified experiments -- both the experiment described in the PR description (see also 595929b), and the experiment shown earlier in this comment (where the 64-bit timestamp in the commit is shown to be preserved) -- do not use it.

Byron · 2024-11-24T07:09:46Z

Thanks a lot for sharing!

At this time, I haven't opened an issue about this or looked into it in detail. But full test output on s390x, showing those failures, for both runs, can be seen in this gist.

This is very interesting. What stuck out to the difference in compressed size, which was 49 but should be 46. I have no idea how that comes to be though, but also didn't look into it.

4. The rollover in git log output apparently happens when formatting the timestamp as text, and should possibly be considered a bug in git, depending on the intended semantics of that formatting operation. I can push the repository to a remote, then clone it on a 64-bit system, and it has the originally intended non-rolled-over future timestamp.

I assume this was done because git cat-file commit <commit> didn't work due to validation as well?

It is because of (4) that the change in this PR does not (as far as I know) add any new incorrect behavior. If not for (4), we would not want to allow a non-.gitignored archive to be generated when the fixture script uses a git command built against a libc implementation with a 32-bit time_t, since such an archive would have the wrong timestamp and other systems would be able to observe the difference in the actor metadata in the archived fixture's commit history.

Indeed!

Thanks again for sharing these details, they are illuminating.

EliahKagan · 2024-11-24T09:13:22Z

This is very interesting. What stuck out to the difference in compressed size, which was 49 but should be 46. I have no idea how that comes to be though, but also didn't look into it.

Ordinarily, I would guess that this is due to different data structure layouts across architectures, which due to alignment restrictions can result in data structures being of different sizes. The reason I had tested with s390x when researching #1687 was the idea that maybe some 64-bit size assertions with == could fail on s390x. None tested did.

That's a very weak guess. If I figure out more about that, I'll let you know (probably in an issue, unless I also figure out how to fix it).

The rollover in git log output apparently happens when formatting the timestamp as text, and should possibly be considered a bug in git, depending on the intended semantics of that formatting operation. I can push the repository to a remote, then clone it on a 64-bit system, and it has the originally intended non-rolled-over future timestamp.

I assume this was done because git cat-file commit <commit> didn't work due to validation as well?

git-cat-file shows the commit with no problems. git-fsck will report an error, though.

ek@Zim:~/repos/tag32 (main =)$ git log --pretty=fuller
commit 5baa3b82fccabc8f071e877977d3f57ce5eee897 (HEAD -> main, origin/main)
Author:     Eliah Kagan <[email protected]>
AuthorDate: Sat Nov 23 14:34:04 2024 -0500
Commit:     Eliah Kagan <[email protected]>
CommitDate: Thu Jan 1 00:00:00 1970 +0000

    Second commit

commit 916fb33758c8cf3da0b156363624a85c9b7cb5eb
Author:     Eliah Kagan <[email protected]>
AuthorDate: Sat Nov 23 14:33:42 2024 -0500
Commit:     Eliah Kagan <[email protected]>
CommitDate: Thu Jan 1 00:00:00 1970 +0000

    First commit
ek@Zim:~/repos/tag32 (main =)$ git tag t
fatal: Timestamp too large for this system: 4147483646
ek@Zim:~/repos/tag32 (main =)[128]$ git cat-file commit 5baa3b82fccabc8f071e877977d3f57ce5eee897
tree 02439fb50f442f187d6db9a7b47287b9e3f5d49c
parent 916fb33758c8cf3da0b156363624a85c9b7cb5eb
author Eliah Kagan <[email protected]> 1732390444 -0500
committer Eliah Kagan <[email protected]> 4147483646 +0000

Second commit
ek@Zim:~/repos/tag32 (main =)$ git cat-file commit 916fb33758c8cf3da0b156363624a85c9b7cb5eb
tree 2ce1eef76631e82282e0f7f0cf9e6f3e9a4a0b1e
author Eliah Kagan <[email protected]> 1732390422 -0500
committer Eliah Kagan <[email protected]> 0 +0000

First commit
ek@Zim:~/repos/tag32 (main =)$ git fsck
error in commit 5baa3b82fccabc8f071e877977d3f57ce5eee897: badDateOverflow: invalid author/committer line - date causes integer overflow
Checking object directories: 100% (256/256), done.

I don't think tag creation via git tag runs most git fsck checks though, so I don't think the git fsck output, by itself, explains the situation.

The successful invocations of git cat-file commit shown above pass OIDs, but HEAD and HEAD^ produce the same output. Likewise, git tag t, git tag t HEAD, and git tag t 5baa3b82fccabc8f071e877977d3f57ce5eee897 all fail with the same message. So the way these commands are being told what commit to use does not seem important to the problem.

EliahKagan · 2024-11-25T01:06:16Z

gix-commitgraph/tests/fixtures/generation_number_overflow.sh

-tick
-function commit() {
-  local message=${1:?first argument is the commit message}
+function force_tag() {


Maybe this should be named something else. Calling it force_tag makes it sound like it does something similar to git tag -f, when actually:

force_tag tries (though not very hard) to avoid resetting an existing tag.

git tag -f will still refuse to create a tag in the scenarios where this is intended to be called (or I would've just used git tag -f).

(I had meant to bring this up before the PR was merged, but forgot.)

This further customizes the `cross` Dockerfile so it patches the `/android-runner` script in the image so, when the script runs, it checks if something like `NO_PRELOAD_CXX=1` is set (specifically, if `NO_PRELOAD_CXX` exists with a value other than the empty string or `0`) and, if so, refrains from setting `LD_PRELOAD` to the path of an Android C++ standard library implementation. (This also adds `NO_PRELOAD_CXX` to the list of variables `cross` allows to pass from host to container environment, so it can be set easily.) The problem this solves is warnings, which appear to be errors based on their wording but which do not inherently cause failure, that a library cannot be loaded into executables that are for a different architecture. `ld.so` issues these when running x86-64 executables in the `cross` container for `armv7-linux-androideabi`. Currently that includes shells (`sh`, `bash`) and `git`. ERROR: ld.so: object '/android-ndk/sysroot/usr/lib/arm-linux-androideabi/libc++_shared.so' from LD_PRELOAD cannot be preloaded (wrong ELF class: ELFCLASS32): ignored. While the warnings do not inherently cause failure, removing them is not just to decrease noise in the output of test runs. Because the messages are sent to stderr, they are picked up in code that expects stderr to be of a particular form, including test cases that include assertions about what is written to standard error. In addition to not representing actual bugs, the resulting test failures are confusing because they consist of a message that a library cannot be preloaded for a platform-related reason ("wrong ELF class"), followed by the failure of a test that exercises code that runs an external command (or possibly, in some cases, tests that capture stderr in a fixture script to set up a test). That creates the appearance that the external command could not be run, when actually the problem is that the external command ran, often with no problems at all, but its captured stderr contained one or (if it ran its own subprocesses) more extra lines describing themselves as errors. In e94526e and 094023a (both in GitoxideLabs#1687), I had claimed in comments that the reason to test `gix-hashtable` rather than `gix` in the `armv7-linux-androideabi` job (since removed) was that, in `armv7-linux-androideabi`, some `gix` tests needed to be able to run `git`, and needed ARMv7 `git`. This is technically true, but misleading (and while the logs have expired so I cannot check what I saw at that time, I suspect I was myself misled). They have relied on ARMv7 `git` on such a platform in the sense that the x86-64 `git` is completely usable but causes "wrong ELF class" warnings, which appear to be errors, to be written to stderr, breaking tests or test fixtures that examine stderr (as well as any test that attempts to read a `once_cell::sync::Lazy` that a previous test run in the same process tried to initialize by running code that panicked due to such a failure -- when control exits a `Lazy` initializer due to unwinding *from a panic*, the cell enters a "poisoned" state and subsequent uses of it fail). If an ARMv7 `git` were to be used, then the shared library -- even if `git` didn't use the library -- would presumably be preloaded, which would avoid the warning. But the problem was not inherent to the architecture or build of binares like `git`, instead arising from the incompatible `LD_PRELOAD` value. (Also, as far as I know, we have so far never used an ARMv7 build of `git` in any `cross` test... though it is used in the current `test-32bit` arm32v7 job.) Avoiding setting `LD_PRELOAD`, as done here, fixes that. However, this is not an ideal solution, because the library may be needed if an external command is built for the Android target architecture and is written in C++. It might be ideal to suppress the warnings while still attempting to load the library, but it's not clear if a reasonably simple way to do so is available: - `LD_PRELOAD` paths support `$LIB`, but I am not sure it can be used here, because the library name, and not just the directory prefix, differs. (Also, when not attempting to link to the C++ stdlib for Android, nothing would *ordinarily* be added to `LD_PRELOAD` at all.) - glibc issue https://sourceware.org/bugzilla/show_bug.cgi?id=20172 proposes a new environment variable `LD_PRELOAD_IGNORE_ERRORS`, but no patch with that feature has been merged into any version of glibc at this time (so certainly not the fairly old glibc in the `cross` image). The other reason this is not an ideal solution is that, at least if testing with `cross` is to become fully useful, some external commands in the `cross` container *should* be built for the target architecture, because in practice those are the builds (or in some cases, ports) that gitoxide would have to be compatible with. That includes `git` and, if feasible, `sh`, and maybe some others such as `gpg` and (if sufficiently tested to matter) `ssh`. So while the existing workaround makes it so ARMv7 `git` is not necessary to keep the tests from failing when they should pass, it remains potentially necessary to keep the tests from passing when they should fail. (This also applies to other architectures tested with `cross`, particularly s390x, where failure occurring in native runs, mentioned in GitoxideLabs#1687 and discused in GitoxideLabs#1698 comments, have not yet been reproducible when testing with `cross`. The reason is *possibly* some difference between the behavior of x86-64 and s390x `git`.) However, even if that can be achieved, it will likely also be an incomplete solution on its own, because as long as we may ever capture stderr from *some* external command where the warning is issued, we have to suppress or tolerate the warning in some way.

This avoids installing `software-properties-common` in the container, instead manually setting up the PPA again. Accordingly, it is not necessary to remove any packages to avoid high bloat in the image, so that is removed. Other cleanup is still done. This also removes the old `git` and packages that were marked as automatically installed that were only needed as its dependencies, and shows details about the `git` that is installed. For s390x, this works to cause the test suite to use a s390x build of `git`, at a reasonably new version as provided by the PPA. That should happen for some other targets, but not all. This therefore shows the `git` version as well as examining the executable to report its architecture while the container is being built. Note that this does not yet appear sufficient to produce the s390x test failures mentioned in GitoxideLabs#1687 and discused in GitoxideLabs#1698 comments. (That might relate to how this is still just testing `max-pure`.)

This avoids installing `software-properties-common` in the container, instead manually setting up the PPA again. Accordingly, it is not necessary to uninstall any packages to avoid high bloat in the image, so that is removed. Other cleanup is still done. This also removes the old `git` and packages that were marked as automatically installed that were only needed as its dependencies, and shows details about the `git` that is installed. For s390x, this works to cause the test suite to use a s390x build of `git`, at a reasonably new version as provided by the PPA. That should happen for some other targets, but not all. This therefore shows the `git` version as well as examining the executable to report its architecture while the container is being built. Note that this does not yet appear sufficient to produce the s390x test failures mentioned in GitoxideLabs#1687 and discused in GitoxideLabs#1698 comments. (That might relate to how this is still just testing `max-pure`.)

This modifies the `cross` dockerfile and associated `justfile` recipes to install dependencies needed for testing beyond max-pure. This works when installing packages for the cross target is already working, which is the same scenario where the upgraded `git` (from the git-core PPA) is installed for the `cross` target. But unlike `git`, when testing max-pure it is not necessary to install these libraries. Yet the current approach does install them... but native builds of them. This is inefficient in the Android test where there may be no immediately forthcoming way to install appropriate `cross`-target builds of the libraries. This extra work should probably be eliminated. But the best way to do that will depend on whether the two cases of cross targets (those we can install packages for and those we cannot) remain only two cases, and whether a single `RUN` script continues to be used, versus splitting it (or creating multiple `cross` dockerfiles). This also gives the `cross-test` recipe a `*cargo-test-args` parameter, and moves `--no-default-features --features max-pure` out of `cross-test` and into the arguments `cross-test-android` passes to `cross-test` (causing `*cargo-test-args` to receive it). The reason for doing this rather than having a parameter that is put in place of `max-pure`, for which `cross-test-s390x` could pass `max`, is that it is not obvious that `--no-default-features` is free of undesired effects when `--workspace` is used: it seems like it may suppress some features of some tested `gix-*` crate projects that do not individually recognize `max`. (This differs from the situation with the journey tests, which test the `ein` and `gix` executables from by the `gitoxide` crate, which recognizes `max-pure`.) The intuitive effect of "testing max" is currently to run the entire test suite with all default features enabled, so this strives to do that. This still does not appear sufficient to produce the s390x test failures mentioned in GitoxideLabs#1687 and discused in GitoxideLabs#1698 comments, even though this is now running s390x tests with the same full default features as were used when those failures were observed.