Commit a62a8ef

stefanhaRH authored and Miklos Szeredi committed
virtio-fs: add virtiofs filesystem
Add a basic file system module for virtio-fs. This does not yet contain
shared data support between host and guest or metadata coherency speedups.
However it is already significantly faster than virtio-9p.

Design Overview
===============
With the goal of designing something with better performance and local file
system semantics, a bunch of ideas were proposed.

- Use fuse protocol (instead of 9p) for communication between guest and host.
  Guest kernel will be fuse client and a fuse server will run on host to
  serve the requests.

- For data access inside guest, mmap portion of file in QEMU address space
  and guest accesses this memory using dax. That way guest page cache is
  bypassed and there is only one copy of data (on host). This will also
  enable mmap(MAP_SHARED) between guests.

- For metadata coherency, there is a shared memory region which contains
  version number associated with metadata and any guest changing metadata
  updates version number and other guests refresh metadata on next access.
  This is yet to be implemented.

How virtio-fs differs from existing approaches
==============================================
The unique idea behind virtio-fs is to take advantage of the co-location of
the virtual machine and hypervisor to avoid communication (vmexits).

DAX allows file contents to be accessed without communication with the
hypervisor. The shared memory region for metadata avoids communication in
the common case where metadata is unchanged.

By replacing expensive communication with cheaper shared memory accesses, we
expect to achieve better performance than approaches based on network file
system protocols. In addition, this also makes it easier to achieve local
file system semantics (coherency).

These techniques are not applicable to network file system protocols since
the communications channel is bypassed by taking advantage of shared memory
on a local machine. This is why we decided to build virtio-fs rather than
focus on 9P or NFS.

Caching Modes
=============
Like virtio-9p, different caching modes are supported which determine the
coherency level as well. The “cache=FOO” and “writeback” options control the
level of coherence between the guest and host filesystems.

- cache=none
  metadata, data and pathname lookup are not cached in guest. They are always
  fetched from host and any changes are immediately pushed to host.

- cache=always
  metadata, data and pathname lookup are cached in guest and never expire.

- cache=auto
  metadata and pathname lookup cache expires after a configured amount of
  time (default is 1 second). Data is cached while the file is open
  (close to open consistency).

- writeback/no_writeback
  These options control the writeback strategy. If writeback is disabled,
  then normal writes will immediately be synchronized with the host fs. If
  writeback is enabled, then writes may be cached in the guest until the file
  is closed or an fsync(2) performed. This option has no effect on mmap-ed
  writes or writes going through the DAX mechanism.

Signed-off-by: Stefan Hajnoczi <[email protected]>
Signed-off-by: Vivek Goyal <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
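The cache=none, cache=always and cache=auto modes described in the commit message differ mainly in how long a cached metadata or lookup entry may be reused. The following is a minimal userspace sketch of that decision, not code from this commit: the enum, the macro and the function name are hypothetical, and only the 1 second default for cache=auto is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not identifiers used by virtio-fs. */
enum cache_mode { CACHE_NONE, CACHE_AUTO, CACHE_ALWAYS };

/* Default expiry for cache=auto, as stated in the commit message. */
#define CACHE_AUTO_DEFAULT_TIMEOUT_SEC 1.0

/*
 * Decide whether a cached metadata or pathname lookup entry may be reused.
 * age_sec is the time elapsed since the entry was fetched from the host.
 */
static bool cached_entry_usable(enum cache_mode mode, double age_sec,
                                double auto_timeout_sec)
{
    assert(age_sec >= 0.0);

    switch (mode) {
    case CACHE_NONE:
        /* Never cached: always go back to the host. */
        return false;
    case CACHE_ALWAYS:
        /* Cached entries never expire. */
        return true;
    case CACHE_AUTO:
        /* Cached entries expire after the configured timeout. */
        return age_sec < auto_timeout_sec;
    }
    return false;
}
```

Under these assumptions, `cached_entry_usable(CACHE_AUTO, 0.5, CACHE_AUTO_DEFAULT_TIMEOUT_SEC)` is true, while the same call with an age of 2 seconds is false. The sketch deliberately leaves out data caching and the writeback option.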
1 parent 2d1d25d commit a62a8ef

7 files changed: 1240 additions, 0 deletions


fs/fuse/Kconfig

Lines changed: 11 additions & 0 deletions
@@ -27,3 +27,14 @@ config CUSE
 
 	  If you want to develop or use a userspace character device
 	  based on CUSE, answer Y or M.
+
+config VIRTIO_FS
+	tristate "Virtio Filesystem"
+	depends on FUSE_FS
+	select VIRTIO
+	help
+	  The Virtio Filesystem allows guests to mount file systems from the
+	  host.
+
+	  If you want to share files between guests or with the host, answer Y
+	  or M.

fs/fuse/Makefile

Lines changed: 1 addition & 0 deletions
@@ -5,5 +5,6 @@
 
 obj-$(CONFIG_FUSE_FS) += fuse.o
 obj-$(CONFIG_CUSE) += cuse.o
+obj-$(CONFIG_VIRTIO_FS) += virtio_fs.o
 
 fuse-objs := dev.o dir.o file.o inode.o control.o xattr.o acl.o readdir.o

fs/fuse/fuse_i.h

Lines changed: 9 additions & 0 deletions
@@ -353,6 +353,10 @@ struct fuse_req {
 	/** Used to wake up the task waiting for completion of request*/
 	wait_queue_head_t waitq;
 
+#if IS_ENABLED(CONFIG_VIRTIO_FS)
+	/** virtio-fs's physically contiguous buffer for in and out args */
+	void *argbuf;
+#endif
 };
 
 struct fuse_iqueue;
@@ -383,6 +387,11 @@ struct fuse_iqueue_ops {
 	 */
 	void (*wake_pending_and_unlock)(struct fuse_iqueue *fiq)
 		__releases(fiq->lock);
+
+	/**
+	 * Clean up when fuse_iqueue is destroyed
+	 */
+	void (*release)(struct fuse_iqueue *fiq);
 };
 
 /** /dev/fuse input queue operations */
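The new argbuf field is described only as a contiguous buffer holding both the input and output arguments of a request. As a rough userspace illustration of that idea, the sketch below sizes one allocation for all arguments and copies the inputs to its front. Everything here is an assumption made for illustration: struct arg, args_len() and argbuf_pack() are hypothetical names, and calloc() gives virtually contiguous memory only, whereas the kernel field is documented as physically contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified argument descriptor; not the kernel's layout. */
struct arg {
    size_t size;
    const void *value;   /* NULL for output arguments that are not filled yet */
};

/* Sum of the sizes of n arguments. */
static size_t args_len(const struct arg *args, size_t n)
{
    size_t total = 0;

    for (size_t i = 0; i < n; i++)
        total += args[i].size;
    return total;
}

/*
 * Allocate one zeroed buffer large enough for all input and output
 * arguments and copy the inputs to its front, back to back.
 * On success, *out_offset is where the output region begins.
 * Returns NULL on allocation failure; the caller frees with free().
 */
static void *argbuf_pack(const struct arg *in, size_t n_in,
                         const struct arg *out, size_t n_out,
                         size_t *out_offset)
{
    size_t in_len = args_len(in, n_in);
    size_t total = in_len + args_len(out, n_out);
    unsigned char *buf;
    size_t offset = 0;

    assert(out_offset != NULL);

    /* Request at least one byte so an empty request still gets a buffer. */
    buf = calloc(1, total ? total : 1);
    if (!buf)
        return NULL;

    for (size_t i = 0; i < n_in; i++) {
        if (in[i].size)
            memcpy(buf + offset, in[i].value, in[i].size);
        offset += in[i].size;
    }

    *out_offset = in_len;
    return buf;
}
```

A caller would hand the front of the buffer to the device as readable data and the region starting at `*out_offset` as writable space for the reply, then copy the reply back into the individual output arguments.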

fs/fuse/inode.c

Lines changed: 4 additions & 0 deletions
@@ -630,6 +630,10 @@ EXPORT_SYMBOL_GPL(fuse_conn_init);
 void fuse_conn_put(struct fuse_conn *fc)
 {
 	if (refcount_dec_and_test(&fc->count)) {
+		struct fuse_iqueue *fiq = &fc->iq;
+
+		if (fiq->ops->release)
+			fiq->ops->release(fiq);
 		put_pid_ns(fc->pid_ns);
 		put_user_ns(fc->user_ns);
 		fc->release(fc);
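The hunk above calls an optional release hook from the ops table when the last reference to the connection is dropped. The pattern can be tried outside the kernel with the simplified model below. It is a sketch under stated assumptions: struct iqueue, struct conn, conn_put() and counting_release() are hypothetical stand-ins, and a plain int replaces the kernel's atomic refcount_t, so the model is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>

struct iqueue;

/* Simplified stand-in for an ops table with an optional release hook. */
struct iqueue_ops {
    void (*release)(struct iqueue *fiq);   /* may be NULL */
};

struct iqueue {
    const struct iqueue_ops *ops;
    void *priv;                            /* transport-private data */
};

/* Simplified stand-in for a connection; count is NOT atomic here. */
struct conn {
    int count;
    struct iqueue iq;
};

/* Example transport hook: counts how many times it was invoked. */
static void counting_release(struct iqueue *fiq)
{
    int *calls = fiq->priv;

    (*calls)++;
}

/*
 * Drop one reference. When the last one goes away, give the transport a
 * chance to clean up, but only if it registered a release hook.
 * Returns 1 if this call dropped the last reference, 0 otherwise.
 */
static int conn_put(struct conn *fc)
{
    assert(fc != NULL && fc->count > 0);

    if (--fc->count == 0) {
        struct iqueue *fiq = &fc->iq;

        if (fiq->ops->release)
            fiq->ops->release(fiq);
        return 1;
    }
    return 0;
}
```

Keeping the hook optional means a transport that has nothing to clean up can leave the pointer NULL and needs no changes, while one that owns extra state can register a callback such as `counting_release`.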
