Commit ca327cf

dscho authored and derrickstolee committed
Merge core VFS features
These were done in private, before microsoft/git. Signed-off-by: Derrick Stolee <[email protected]>
2 parents fb59c1d + a203316 commit ca327cf

27 files changed: +859 -15 lines changed

Documentation/config/core.txt

Lines changed: 42 additions & 0 deletions
@@ -650,6 +650,48 @@ core.multiPackIndex::
 	single index. See linkgit:git-multi-pack-index[1] for more
 	information. Defaults to true.
 
+core.gvfs::
+	Enable the features needed for GVFS. This value can be set to true
+	to indicate all features should be turned on or the bit values listed
+	below can be used to turn on specific features.
++
+--
+	GVFS_SKIP_SHA_ON_INDEX::
+		Bit value 1
+		Disables the calculation of the sha when writing the index
+	GVFS_MISSING_OK::
+		Bit value 4
+		Normally git write-tree ensures that the objects referenced by the
+		directory exist in the object database. This option disables this check.
+	GVFS_NO_DELETE_OUTSIDE_SPARSECHECKOUT::
+		Bit value 8
+		When marking entries to remove from the index and the working
+		directory this option will take into account what the
+		skip-worktree bit was set to so that if the entry has the
+		skip-worktree bit set it will not be removed from the working
+		directory. This will allow virtualized working directories to
+		detect the change to HEAD and use the new commit tree to show
+		the files that are in the working directory.
+	GVFS_FETCH_SKIP_REACHABILITY_AND_UPLOADPACK::
+		Bit value 16
+		While performing a fetch with a virtual file system we know
+		that there will be missing objects and we don't want to download
+		them just because of the reachability of the commits. We also
+		don't want to download a pack file with commits, trees, and blobs
+		since these will be downloaded on demand. This flag will skip the
+		checks on the reachability of objects during a fetch as well as
+		the upload pack so that extraneous objects don't get downloaded.
+	GVFS_BLOCK_FILTERS_AND_EOL_CONVERSIONS::
+		Bit value 64
+		With a virtual file system we only know the file size before any
+		CRLF or smudge/clean filters processing is done on the client.
+		To prevent file corruption due to truncation or expansion with
+		garbage at the end, these filters must not run when the file
+		is first accessed and brought down to the client. Git.exe can't
+		currently tell the first access vs subsequent accesses so this
+		flag just blocks them from occurring at all.
+--
+
 core.sparseCheckout::
 	Enable "sparse checkout" feature. See linkgit:git-sparse-checkout[1]
 	for more information.
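
The bit values documented for core.gvfs combine into a single integer. A minimal sketch of how such a value could be parsed and tested follows. It assumes that "true" enables every bit and that any other value is a decimal number. The helpers `example_gvfs_parse` and `example_gvfs_is_set` are hypothetical names for illustration; they are not the gvfs.h functions this commit calls, which are not shown on this page.

```c
#include <stdlib.h>
#include <string.h>

/* Bit values as listed in the core.gvfs documentation. */
#define GVFS_SKIP_SHA_ON_INDEX                      (1 << 0) /* 1 */
#define GVFS_MISSING_OK                             (1 << 2) /* 4 */
#define GVFS_NO_DELETE_OUTSIDE_SPARSECHECKOUT       (1 << 3) /* 8 */
#define GVFS_FETCH_SKIP_REACHABILITY_AND_UPLOADPACK (1 << 4) /* 16 */
#define GVFS_BLOCK_FILTERS_AND_EOL_CONVERSIONS      (1 << 6) /* 64 */

/*
 * Hypothetical helper: turn a core.gvfs config string into a bit mask.
 * Assumption: "true" means all features, anything else is a number.
 */
static int example_gvfs_parse(const char *value)
{
	if (value && !strcmp(value, "true"))
		return ~0;
	return value ? (int)strtol(value, NULL, 10) : 0;
}

/* Hypothetical helper: is every bit of mask set in config? */
static int example_gvfs_is_set(int config, int mask)
{
	return (config & mask) == mask;
}
```

Under this reading, a value of 20 (16 + 4) would turn on GVFS_FETCH_SKIP_REACHABILITY_AND_UPLOADPACK and GVFS_MISSING_OK while leaving the other features off.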
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
+Read Object Process
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The read-object process enables Git to read all missing blobs with a
+single process invocation for the entire life of a single Git command.
+This is achieved by using a packet format (pkt-line, see technical/
+protocol-common.txt) based protocol over standard input and standard
+output as follows. All packets, except for the "*CONTENT" packets and
+the "0000" flush packet, are considered text and therefore are
+terminated by a LF.
+
+Git starts the process when it encounters the first missing object that
+needs to be retrieved. After the process is started, Git sends a welcome
+message ("git-read-object-client"), a list of supported protocol version
+numbers, and a flush packet. Git expects to read a welcome response
+message ("git-read-object-server"), exactly one protocol version number
+from the previously sent list, and a flush packet. All further
+communication will be based on the selected version.
+
+The remaining protocol description below documents "version=1". Please
+note that "version=42" in the example below does not exist and is only
+there to illustrate how the protocol would look with more than one
+version.
+
+After the version negotiation Git sends a list of all capabilities that
+it supports and a flush packet. Git expects to read a list of desired
+capabilities, which must be a subset of the supported capabilities list,
+and a flush packet as response:
+------------------------
+packet: git> git-read-object-client
+packet: git> version=1
+packet: git> version=42
+packet: git> 0000
+packet: git< git-read-object-server
+packet: git< version=1
+packet: git< 0000
+packet: git> capability=get
+packet: git> capability=have
+packet: git> capability=put
+packet: git> capability=not-yet-invented
+packet: git> 0000
+packet: git< capability=get
+packet: git< 0000
+------------------------
+The only supported capability in version 1 is "get".
+
+Afterwards Git sends a list of "key=value" pairs terminated with a flush
+packet. The list will contain at least the command (based on the
+supported capabilities) and the sha1 of the object to retrieve. Please
+note that the process must not send any response before it has received
+the final flush packet.
+
+When the process receives the "get" command, it should make the requested
+object available in the git object store and then return success. Git will
+then check the object store again and this time find it and proceed.
+------------------------
+packet: git> command=get
+packet: git> sha1=0a214a649e1b3d5011e14a3dc227753f2bd2be05
+packet: git> 0000
+------------------------
+
+The process is expected to respond with a list of "key=value" pairs
+terminated with a flush packet. If the process does not experience
+problems then the list must contain a "success" status.
+------------------------
+packet: git< status=success
+packet: git< 0000
+------------------------
+
+In case the process cannot or does not want to process the content, it
+is expected to respond with an "error" status.
+------------------------
+packet: git< status=error
+packet: git< 0000
+------------------------
+
+In case the process cannot or does not want to process the content as
+well as any future content for the lifetime of the Git process, then it
+is expected to respond with an "abort" status at any point in the
+protocol.
+------------------------
+packet: git< status=abort
+packet: git< 0000
+------------------------
+
+Git neither stops nor restarts the process in case the "error"/"abort"
+status is set.
+
+If the process dies during the communication or does not adhere to the
+protocol then Git will stop the process and restart it with the next
+object that needs to be processed.
+
+After the read-object process has processed an object it is expected to
+wait for the next "key=value" list containing a command. Git will close
+the command pipe on exit. The process is expected to detect EOF and exit
+gracefully on its own. Git will wait until the process has stopped.
+
+A long running read-object process demo implementation can be found in
+`contrib/long-running-read-object/example.pl` located in the Git core
+repository. If you develop your own long running process then the
+`GIT_TRACE_PACKET` environment variable can be very helpful for
+debugging (see linkgit:git[1]).
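
The framing used by the packets in this protocol description can be sketched in C. Each text packet is a four-digit hexadecimal length (payload plus the four header bytes), then the payload, then a LF; "0000" is the flush packet. The function names `example_pkt_txt` and `example_pkt_len` are hypothetical and are not taken from Git's own pkt-line code, so treat this as an illustration rather than the implementation Git uses.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format one text packet into out: 4 hex digits, the text, and a LF.
 * Returns the number of bytes written, or -1 if it does not fit
 * in the buffer or in what 4 hex digits can express.
 */
static int example_pkt_txt(char *out, size_t cap, const char *text)
{
	size_t len = strlen(text) + 1; /* +1 for the trailing LF */

	if (len + 4 > 0xffff || len + 5 > cap) /* +5: header and NUL */
		return -1;
	return snprintf(out, cap, "%04x%s\n", (unsigned)(len + 4), text);
}

/*
 * Parse a 4-byte hex header. Returns the payload length, 0 for the
 * flush packet "0000", or -1 for a malformed header. Sizes 1..4 are
 * rejected, mirroring the checks in the Perl demo.
 */
static int example_pkt_len(const char *hdr)
{
	unsigned total = 0;
	int i;

	for (i = 0; i < 4; i++) {
		char c = hdr[i];
		unsigned digit;

		if (c >= '0' && c <= '9')
			digit = (unsigned)(c - '0');
		else if (c >= 'a' && c <= 'f')
			digit = (unsigned)(c - 'a') + 10;
		else if (c >= 'A' && c <= 'F')
			digit = (unsigned)(c - 'A') + 10;
		else
			return -1;
		total = total * 16 + digit;
	}
	if (total == 0)
		return 0;
	if (total <= 4)
		return -1;
	return (int)(total - 4);
}
```

For example, "command=get" is 11 bytes; with the LF and the 4-byte header the total is 16, so the packet on the wire reads `0010command=get` followed by a LF.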

GIT-VERSION-GEN

Lines changed: 7 additions & 2 deletions
@@ -1,7 +1,7 @@
 #!/bin/sh
 
 GVF=GIT-VERSION-FILE
-DEF_VER=v2.33.0
+DEF_VER=v2.33.0.vfs.0.0
 
 LF='
 '
@@ -12,10 +12,15 @@ if test -f version
 then
 	VN=$(cat version) || VN="$DEF_VER"
 elif test -d ${GIT_DIR:-.git} -o -f .git &&
-	VN=$(git describe --match "v[0-9]*" HEAD 2>/dev/null) &&
+	VN=$(git describe --match "v[0-9]*vfs*" HEAD 2>/dev/null) &&
 	case "$VN" in
 	*$LF*) (exit 1) ;;
 	v[0-9]*)
+		if test "${VN%%.vfs.*}" != "${DEF_VER%%.vfs.*}"
+		then
+			echo "Found version $VN, which is not based on $DEF_VER" >&2
+			exit 1
+		fi
 		git update-index -q --refresh
 		test -z "$(git diff-index --name-only HEAD --)" ||
 		VN="$VN-dirty" ;;
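
The new check relies on the shell expansion `${VN%%.vfs.*}`, which drops everything from the first ".vfs." onward and leaves the string untouched when ".vfs." does not occur. The same comparison is sketched below in C purely for illustration; `example_base_len` and `example_same_base` are hypothetical names and are not part of this commit.

```c
#include <string.h>

/*
 * Length of the version up to, but not including, the first ".vfs.",
 * like ${VN%%.vfs.*}. The whole string counts if ".vfs." is absent.
 */
static size_t example_base_len(const char *version)
{
	const char *p = strstr(version, ".vfs.");

	return p ? (size_t)(p - version) : strlen(version);
}

/* Returns 1 when both versions share the same upstream base. */
static int example_same_base(const char *vn, const char *def_ver)
{
	size_t a = example_base_len(vn);
	size_t b = example_base_len(def_ver);

	return a == b && !strncmp(vn, def_ver, a);
}
```

With DEF_VER set to v2.33.0.vfs.0.0, a described version such as v2.33.0.vfs.0.1-3-gabcdef passes the check, while v2.32.0.vfs.0.0 makes the script print the error and exit.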

cache-tree.c

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "gvfs.h"
 #include "lockfile.h"
 #include "tree.h"
 #include "tree-walk.h"
@@ -251,7 +252,8 @@ static int update_one(struct cache_tree *it,
 		      int flags)
 {
 	struct strbuf buffer;
-	int missing_ok = flags & WRITE_TREE_MISSING_OK;
+	int missing_ok = gvfs_config_is_set(GVFS_MISSING_OK) ?
+					WRITE_TREE_MISSING_OK : (flags & WRITE_TREE_MISSING_OK);
 	int dryrun = flags & WRITE_TREE_DRY_RUN;
 	int repair = flags & WRITE_TREE_REPAIR;
 	int to_invalidate = 0;
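
The effect of the changed initializer is that the GVFS setting overrides the caller's flags: when GVFS_MISSING_OK is configured, missing objects are tolerated even if the caller did not ask for it. A small standalone sketch of that decision follows. `EXAMPLE_WRITE_TREE_MISSING_OK` uses a placeholder value, since the real constant's value is not shown in this diff, and `example_missing_ok` is a hypothetical name.

```c
/* Placeholder flag bit; the real WRITE_TREE_MISSING_OK value is an assumption. */
#define EXAMPLE_WRITE_TREE_MISSING_OK 1

/*
 * gvfs_missing_ok_set stands in for gvfs_config_is_set(GVFS_MISSING_OK).
 * Returns nonzero when missing objects should be tolerated.
 */
static int example_missing_ok(int gvfs_missing_ok_set, int flags)
{
	return gvfs_missing_ok_set ? EXAMPLE_WRITE_TREE_MISSING_OK
				   : (flags & EXAMPLE_WRITE_TREE_MISSING_OK);
}
```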

cache.h

Lines changed: 3 additions & 0 deletions
@@ -987,6 +987,7 @@ extern char *git_replace_ref_base;
 
 extern int fsync_object_files;
 extern int core_preload_index;
+extern int core_gvfs;
 extern int precomposed_unicode;
 extern int protect_hfs;
 extern int protect_ntfs;
@@ -1014,6 +1015,8 @@ int use_optional_locks(void);
 extern char comment_line_char;
 extern int auto_comment_line_char;
 
+extern int core_virtualize_objects;
+
 enum log_refs_config {
 	LOG_REFS_UNSET = -1,
 	LOG_REFS_NONE = 0,

config.c

Lines changed: 11 additions & 0 deletions
@@ -6,6 +6,7 @@
  *
  */
 #include "cache.h"
+#include "gvfs.h"
 #include "branch.h"
 #include "config.h"
 #include "environment.h"
@@ -1528,6 +1529,11 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.gvfs")) {
+		gvfs_load_config_value(value);
+		return 0;
+	}
+
 	if (!strcmp(var, "core.sparsecheckout")) {
 		core_apply_sparse_checkout = git_config_bool(var, value);
 		return 0;
@@ -1558,6 +1564,11 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.virtualizeobjects")) {
+		core_virtualize_objects = git_config_bool(var, value);
+		return 0;
+	}
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return platform_core_config(var, value, cb);
 }

connected.c

Lines changed: 21 additions & 0 deletions
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "gvfs.h"
 #include "object-store.h"
 #include "run-command.h"
 #include "sigchain.h"
@@ -30,6 +31,26 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
 	struct transport *transport;
 	size_t base_len;
 
+	/*
+	 * When running a virtual file system, there will be objects that
+	 * are missing locally and we don't want to download a bunch of
+	 * commits, trees, and blobs just to make sure everything is
+	 * reachable locally, so this option will skip reachability
+	 * checks below that use rev-list. This will stop the check
+	 * before uploadpack runs to determine if there is anything to
+	 * fetch. Returning zero for the first check will also prevent the
+	 * uploadpack from happening. It will also skip the check after
+	 * the fetch is finished to make sure all the objects were
+	 * downloaded in the pack file. This will allow the fetch to
+	 * run and get all the latest tip commit ids for all the branches
+	 * in the fetch but not pull down commits, trees, or blobs via
+	 * upload pack.
+	 */
+	if (gvfs_config_is_set(GVFS_FETCH_SKIP_REACHABILITY_AND_UPLOADPACK))
+		return 0;
+	if (core_virtualize_objects)
+		return 0;
+
 	if (!opt)
 		opt = &defaults;
 	transport = opt->transport;
Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
+#!/usr/bin/perl
+#
+# Example implementation for the Git read-object protocol version 1
+# See Documentation/technical/read-object-protocol.txt
+#
+# Allows you to test the ability for blobs to be pulled from a host git repo
+# "on demand." Called when git needs a blob it couldn't find locally due to
+# a lazy clone that only cloned the commits and trees.
+#
+# A lazy clone can be simulated via the following commands from the host repo
+# you wish to create a lazy clone of:
+#
+# cd /host_repo
+# git rev-parse HEAD
+# git init /guest_repo
+# git cat-file --batch-check --batch-all-objects | grep -v 'blob' |
+# cut -d' ' -f1 | git pack-objects /guest_repo/.git/objects/pack/noblobs
+# cd /guest_repo
+# git config core.virtualizeobjects true
+# git reset --hard <sha from rev-parse call above>
+#
+# Please note, this sample is a minimal skeleton. No proper error handling
+# was implemented.
+#
+
+use strict;
+use warnings;
+
+#
+# Point $DIR to the folder where your host git repo is located so we can pull
+# missing objects from it
+#
+my $DIR = "/host_repo/.git/";
+
+sub packet_bin_read {
+	my $buffer;
+	my $bytes_read = read STDIN, $buffer, 4;
+	if ( $bytes_read == 0 ) {
+
+		# EOF - Git stopped talking to us!
+		exit();
+	}
+	elsif ( $bytes_read != 4 ) {
+		die "invalid packet: '$buffer'";
+	}
+	my $pkt_size = hex($buffer);
+	if ( $pkt_size == 0 ) {
+		return ( 1, "" );
+	}
+	elsif ( $pkt_size > 4 ) {
+		my $content_size = $pkt_size - 4;
+		$bytes_read = read STDIN, $buffer, $content_size;
+		if ( $bytes_read != $content_size ) {
+			die "invalid packet ($content_size bytes expected; $bytes_read bytes read)";
+		}
+		return ( 0, $buffer );
+	}
+	else {
+		die "invalid packet size: $pkt_size";
+	}
+}
+
+sub packet_txt_read {
+	my ( $res, $buf ) = packet_bin_read();
+	unless ( $buf =~ s/\n$// ) {
+		die "A non-binary line MUST be terminated by an LF.";
+	}
+	return ( $res, $buf );
+}
+
+sub packet_bin_write {
+	my $buf = shift;
+	print STDOUT sprintf( "%04x", length($buf) + 4 );
+	print STDOUT $buf;
+	STDOUT->flush();
+}
+
+sub packet_txt_write {
+	packet_bin_write( $_[0] . "\n" );
+}
+
+sub packet_flush {
+	print STDOUT sprintf( "%04x", 0 );
+	STDOUT->flush();
+}
+
+( packet_txt_read() eq ( 0, "git-read-object-client" ) ) || die "bad initialize";
+( packet_txt_read() eq ( 0, "version=1" ) ) || die "bad version";
+( packet_bin_read() eq ( 1, "" ) ) || die "bad version end";
+
+packet_txt_write("git-read-object-server");
+packet_txt_write("version=1");
+packet_flush();
+
+( packet_txt_read() eq ( 0, "capability=get" ) ) || die "bad capability";
+( packet_bin_read() eq ( 1, "" ) ) || die "bad capability end";
+
+packet_txt_write("capability=get");
+packet_flush();
+
+while (1) {
+	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+
+	if ( $command eq "get" ) {
+		my ($sha1) = packet_txt_read() =~ /^sha1=([0-9a-f]{40})$/;
+		packet_bin_read();
+
+		system ('git --git-dir="' . $DIR . '" cat-file blob ' . $sha1 . ' | git -c core.virtualizeobjects=false hash-object -w --stdin >/dev/null 2>&1');
+		packet_txt_write(($?) ? "status=error" : "status=success");
+		packet_flush();
+	} else {
+		die "bad command '$command'";
+	}
+}
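
The Perl demo pastes the requested object id into a shell command, so the strict `sha1=` pattern it matches is what keeps arbitrary text out of that command line. A C helper process would want an equivalent check before acting on a "get" request. The sketch below mirrors the demo's regular expression (exactly 40 lowercase hex digits); `example_parse_sha1` is a hypothetical name and nothing here comes from Git's own sources.

```c
#include <string.h>

/*
 * Accept only "sha1=" followed by exactly 40 lowercase hex digits.
 * On success, copy the digits (NUL-terminated) into out and return 1;
 * otherwise leave out untouched and return 0.
 */
static int example_parse_sha1(const char *line, char out[41])
{
	size_t i;

	if (strncmp(line, "sha1=", 5))
		return 0;
	line += 5;
	if (strlen(line) != 40)
		return 0;
	for (i = 0; i < 40; i++) {
		char c = line[i];

		if (!((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')))
			return 0;
	}
	memcpy(out, line, 41); /* 40 digits plus the terminating NUL */
	return 1;
}
```

A request line that fails this check is a natural case for answering with the "error" status from the protocol description instead of running any command.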
