Introducing a high-level FS abstraction #472

Merged: phip1611 merged 1 commit into rust-osdev:main from high-level-fs-abstraction on Apr 2, 2023

@phip1611 phip1611 commented Jul 28, 2022

Hi! This originated in a personal learning project and I'd like to know what others think about it. Would be cool if we can bring this upstream. It's about creating a high level FS abstraction over the existing FS protocol. It aims to be as close to top-level functions provided by std::fs as possible.

Motivation

Usually, UEFI-application hackers want to do something like this:

  • read configuration file from /foo/bar/config
  • write to /foo/bar/log.txt
  • read /foo/bar/kernel.elf

Code example / API demonstration

    let mut fs = FileSystem::new(
        "boot",
        system_table.boot_services().get_image_file_system(handle).unwrap()
    );

    fs.write("foobar.bin", [1,2,3]).unwrap();
    fs.read("bar").unwrap_err();
    fs.copy("foobar.bin", "bar").unwrap();
    fs.read("bar").unwrap();
    fs.rename("bar", "barfoo").unwrap();
    assert_eq!(fs.read("barfoo").unwrap().as_slice(), &[1, 2, 3]);

    log::info!("fs integration test worked!");
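
The semantics the demonstration above relies on can be tried out without a UEFI environment. The following is a minimal sketch, assuming a hypothetical in-memory stand-in (`MemFs` and `MemFsError` are illustrative names, not types from this PR) that mimics only the `write`/`read`/`copy`/`rename` behavior shown in the example:

```rust
use std::collections::BTreeMap;

/// Hypothetical error type for this sketch; the PR's `FileSystemError` differs.
#[derive(Debug, PartialEq)]
pub enum MemFsError {
    NotFound(String),
}

/// Hypothetical in-memory stand-in for the proposed `FileSystem`.
#[derive(Default)]
pub struct MemFs {
    files: BTreeMap<String, Vec<u8>>,
}

impl MemFs {
    /// Creates or overwrites a file with the given bytes.
    pub fn write(&mut self, path: &str, data: impl AsRef<[u8]>) -> Result<(), MemFsError> {
        self.files.insert(path.to_string(), data.as_ref().to_vec());
        Ok(())
    }

    /// Returns an owned copy of the file content, or `NotFound`.
    pub fn read(&self, path: &str) -> Result<Vec<u8>, MemFsError> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| MemFsError::NotFound(path.to_string()))
    }

    /// Copies `src` to `dest`; `src` stays in place.
    pub fn copy(&mut self, src: &str, dest: &str) -> Result<(), MemFsError> {
        let data = self.read(src)?;
        self.write(dest, data)
    }

    /// Moves `src` to `dest`; `src` no longer exists afterwards.
    pub fn rename(&mut self, src: &str, dest: &str) -> Result<(), MemFsError> {
        let data = self
            .files
            .remove(src)
            .ok_or_else(|| MemFsError::NotFound(src.to_string()))?;
        self.files.insert(dest.to_string(), data);
        Ok(())
    }
}
```

Like the proposed API, the stand-in hands back owned data and exposes no file handles, so there is nothing for the caller to close.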

From the module description (initial version; a few changes in the meantime):

//! A high-level file system API for UEFI applications. It lets you conveniently perform the
//! following important operations:
//! - read file to bytes
//! - write bytes to file
//! - create files
//! - iterate directories and walk the file system tree
//!
//! Unlike UNIX-based file systems, there is no virtual file system. Thus, users perform actions on
//! dedicated volumes: For example, the boot volume, such as a CD-rom, USB-stick, or a storage
//! device. There are no symlinks or hard-links. Just plain paths, directories, and regular files.
//! The way to access a volume is to open a [FileSystem].
//!
//! All paths are absolute and follow the FAT-like file system conventions for paths. Thus, there
//! is no current working directory and path components like "." and ".." are not supported. In
//! other words, the current working directory is always "/", i.e., the root, of the opened volume.
//! This may change in the future but is currently sufficient.
//!
//! There are no `File` and `Path` abstractions that get publicly exported. Paths are provided as
//! `&str` and validated internally. There are no `File` objects that are exposed to users.
//!
//! The difference to using the [SimpleFileSystemProtocol] directly is that the abstractions in this
//! module take care of automatic release of resources and support you by returning data owned on
//! the heap (such as file info). There is no synchronization, as it is atypical for users to
//! bootstrap Application Processors (AP) during the UEFI stage, i.e., before hand-off to a kernel.

Open Questions

  • Do we want this at all in uefi-rs?
  • Integration tests (I have integration tests in my private project so I know it is working)
  • how to integrate it into the existing code base
  • verify: apparently there is already a "get boxed file info"-implementation. But I do not know if it respects the dynamically sized C16-string length

Steps to Undraft

  • polish documentation a little
  • fix integration test

@nicholasbishop
Member

I haven't had a chance to look at the details of this yet, but I'm definitely on board with the idea of a high-level FS API! The motivation you described sounds very reasonable to me. I'll give some more detailed feedback once I get some time to peruse :)

Collaborator

@GabrielMajeri GabrielMajeri left a comment

Thanks for contributing this code! I've left some comments, based on an initial review of the PR.

use uefi::table::boot::ScopedProtocol;
use uefi::{CString16};

// #[derive(Debug, Clone, Ord, PartialOrd, Eq, PartialEq, Hash)]
Collaborator

Any reason for keeping this comment around?

src/fs/mod.rs Outdated
//! - create files
//! - iterate directories and walk the file system tree
//!
//! Unlike UNIX-based file systems, there is no virtual file system. Thus, users perform actions on
Collaborator

Instead of just saying that this is not like UNIX, we could also affirmatively indicate that it's a lot like Windows, where you also have distinct "drive letters" for each volume :)

src/fs/mod.rs Outdated
Comment on lines 23 to 24
//! the heap (such as file info). There is no synchronization, as it is atypical for users to
//! bootstrap Application Processors (AP) during the UEFI stage, i.e., before hand-off to a kernel.
Collaborator

Even though it's unlikely that a UEFI app will start the other processors during the boot stage, synchronization is still a problem because of interrupts. You could have code which modifies an internal data structure, gets interrupted by an event, and then that event handler tries to access the same data structure (which is currently in an invalid, in-progress state!).

Fortunately, event handlers aren't exactly the usual place to do FS operations...

Member Author

Hm. I don't know how interrupts are handled in UEFI or where they are used. If an interrupt handler wants to access the FileSystem abstraction while the lock is contended, then... what should it do? Force-unlock the FileSystem? @GabrielMajeri

Member Author

@phip1611 phip1611 Mar 12, 2023

I think it is the user's/developer's responsibility to choose a locking policy. A user may or may not wrap the FileSystem implementation in a SpinLock.

The documentation says that there is no out-of-the-box synchronization.
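
One possible user-side locking policy for the reentrancy problem discussed above is to let event handlers only ever *try* to take the lock and back off when it is held. The following is a rough sketch under stated assumptions: `FakeFs` and `SharedFs` are hypothetical stand-ins, and `std::sync::Mutex` is used purely so the example runs on a host; in a `no_std` UEFI application a spinlock type would take its place.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the PR's `FileSystem`; it only records lines.
#[derive(Default)]
pub struct FakeFs {
    pub log: Vec<String>,
}

/// A lock wrapper that the user (not the library) chooses to add.
#[derive(Default)]
pub struct SharedFs {
    inner: Mutex<FakeFs>,
}

impl SharedFs {
    /// Normal code path: waits until the lock is free, then runs `f`.
    pub fn with_lock<R>(&self, f: impl FnOnce(&mut FakeFs) -> R) -> R {
        let mut guard = self.inner.lock().unwrap();
        f(&mut *guard)
    }

    /// Normal code path: appends a line, waiting for the lock if needed.
    pub fn append(&self, line: &str) {
        self.with_lock(|fs| fs.log.push(line.to_string()));
    }

    /// Event-handler path: never waits. Returns `false` if the lock is
    /// currently held, so the handler can skip or defer the operation
    /// instead of touching a half-updated structure.
    pub fn try_append(&self, line: &str) -> bool {
        match self.inner.try_lock() {
            Ok(mut fs) => {
                fs.log.push(line.to_string());
                true
            }
            Err(_) => false,
        }
    }
}
```

The design choice here is that the handler loses the race rather than forcing an unlock: forcing the lock open would expose exactly the in-progress state the lock is meant to protect.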

/// Removes an empty directory.
pub fn remove_dir(&mut self, path: &str) -> FileSystemResult<()> {
let _path = Path::new(path)?;
todo!()
Collaborator

I've noticed these todo! calls. Are you planning to implement them before finalizing the PR, or leave them for future implementation post-merge? (if it's the first case, then I'd advise marking the PR as a draft so we don't accidentally merge it before it's ready 😅)

Member Author

They will be implemented, for sure! I just wanted to clarify if we want this thing at all.

Member

Having recently done something with the "raw" filesystem API I can confirm that a simpler fs interface would be nice :)

@phip1611 phip1611 changed the title Introducing a high-level FS abstraction Draft: Introducing a high-level FS abstraction Aug 2, 2022
@phip1611
Member Author

phip1611 commented Jan 2, 2023

@nicholasbishop @GabrielMajeri I finally had time to work on this. It is still not ready to merge but much closer now. I added it as a new workspace member, uefi-fs. Let me know what you think.

API examples from the new integration test:

    let mut fs = FileSystem::new(sfs);

    fs.create_dir("test_file_system_abs").unwrap();

    // slash is transparently transformed to backslash
    fs.write("test_file_system_abs/foo", "hello").unwrap();
    // absolute or relative paths are supported; ./ is ignored
    fs.copy("\\test_file_system_abs/foo", "\\test_file_system_abs/./bar").unwrap();
    let read = fs.read("\\test_file_system_abs\\bar").unwrap();
    let read = String::from_utf8(read).unwrap();

    fs.rename("test_file_system_abs\\bar", "test_file_system_abs\\barfoo").unwrap();

    let entries = fs.read_dir("test_file_system_abs").unwrap()
        .map(|e| e.unwrap().file_name().to_string())
        .collect::<Vec<_>>();
    assert_eq!(&[".", "..", "foo", "barfoo"], entries.as_slice());

@nicholasbishop
Member

Looks nice!

A few thoughts and questions:

  • Will a PathBuf API be provided as well? Wouldn't necessarily need to be in the initial PR.
  • I think maybe instead of a new crate, we could put this as a module in the main uefi crate, e.g. uefi::fs. My reasoning is basically that unless the new code runs against one of a few specific cases I'll list below, it's easier on the end user to keep it in the main crate. This is relevant to those other tickets about features vs crates as well, and I intend to write up some more thoughts eventually. But briefly, the reasons I can think of to put it in a separate crate are:
    • If the new code increases the compile-time of the crate significantly then putting it in a separate crate makes sense. I don't think that's an issue here; compilation-time increases are often related to pulling in more dependencies, but the only new dep here is derive_more, which pulls in quote/syn/proc_macro2, all of which we already depend on because of uefi-macros. And the new code is not so large that it should meaningfully affect compilation times.
    • If the new code needs to be separated by a feature flag. Putting it in a separate crate can be a reasonable alternative to crate features, but in this case I think we can either gate the whole thing behind alloc, or maybe do it more fine-grained and just gate particular functions and types behind alloc. But it's not like, depending on new unstable features or anything.
    • If the new code would negatively affect the API of the uefi crate. I don't think there's any issue with that here, it could all be neatly contained in uefi::fs.
  • Just a suggestion for examples and tests with paths containing backslashes, you can use r"a\b\c" as a slightly cleaner alternative to "a\\b\\c".
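
The raw-string suggestion in the last bullet can be checked in isolation. This small sketch (the function names are illustrative only) shows that both spellings produce the same string:

```rust
/// Escaped form: every backslash has to be doubled.
pub fn escaped() -> &'static str {
    "a\\b\\c"
}

/// Raw form: backslashes are taken literally, so no doubling is needed.
pub fn raw() -> &'static str {
    r"a\b\c"
}
```

Both functions return the five characters `a`, `\`, `b`, `\`, `c`.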

mod path;
mod uefi_types;

pub use file_system::{FileSystem, FileSystemError, FileSystemResult};
Member

I noticed from looking at the cargo doc output, probably want a pub use for Path here.

@phip1611 phip1611 marked this pull request as draft March 12, 2023 12:41
@phip1611 phip1611 changed the title Draft: Introducing a high-level FS abstraction Introducing a high-level FS abstraction Mar 12, 2023
@phip1611 phip1611 force-pushed the high-level-fs-abstraction branch from 16e3360 to 62ee08a Compare March 12, 2023 12:51
@phip1611
Member Author

phip1611 commented Mar 14, 2023

  • Will a PathBuf API be provided as well? Wouldn't necessarily need to be in the initial PR.

So far, the API a user sees is just a &str. Internally, it is transformed to a normalized CString16. I do not have a Path/PathBuf implementation yet (that is meant to be public). I think in the long term, each method should take something like fn foo<P: Into<&CStr16>>(&self, path: P), but I'm not sure about the interface. I think it is most important that &str is accepted as a path.

Internally, the separator is \\. However, users can just use a regular slash. The slash is transparently replaced by the internal path normalization process.
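
The normalization described here can be illustrated roughly as follows. This is a sketch only, not the PR's `NormalizedPath`: the `normalize` helper is a hypothetical name, it works on `String` rather than `CString16`, and it assumes that every path is resolved against the volume root and that `.` components are dropped, as the integration-test comments indicate.

```rust
/// Hypothetical helper: turns a user-supplied path into an absolute,
/// backslash-separated path. Accepts `/` and `\` as separators and ignores
/// empty components and `.` components.
pub fn normalize(path: &str) -> String {
    let components: Vec<&str> = path
        .split(|c| c == '/' || c == '\\')
        .filter(|c| !c.is_empty() && *c != ".")
        .collect();

    // Every normalized path starts at the root of the volume.
    let mut out = String::from("\\");
    out.push_str(&components.join("\\"));
    out
}
```

With this helper, `normalize("test_file_system_abs/foo")` and `normalize(r"\test_file_system_abs\.\foo")` both yield `\test_file_system_abs\foo`.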

  • I think maybe instead of a new crate, we could put this as a module in the main uefi crate, e.g. uefi::fs. My reasoning is basically that unless the new code runs against one of a few specific cases I'll list below, it's easier on the end user to keep it in the main crate. This is relevant to those other tickets about features vs crates as well, and I intend to write up some more thoughts eventually. But briefly, the reasons I can think of to put it in a separate crate are:

    • If the new code increases the compile-time of the crate significantly then putting it in a separate crate makes sense. I don't think that's an issue here; compilation-time increases are often related to pulling in more dependencies, but the only new dep here is derive_more, which pulls in quote/syn/proc_macro2, all of which we already depend on because of uefi-macros. And the new code is not so large that it should meaningfully affect compilation times.
    • If the new code needs to be separated by a feature flag. Putting it in a separate crate can be a reasonable alternative to crate features, but in this case I think we can either gate the whole thing behind alloc, or maybe do it more fine-grained and just gate particular functions and types behind alloc. But it's not like, depending on new unstable features or anything.
    • If the new code would negatively affect the API of the uefi crate. I don't think there's any issue with that here, it could all be neatly contained in uefi::fs.

I thought about it: it neither massively increases compilation times nor requires a feature flag. It will just be part of the normal API.

  • Just a suggestion for examples and tests with paths containing backslashes, you can use r"a\b\c" as a slightly cleaner alternative to "a\\b\\c".

Ah, I didn't know that, thanks.

PS: This feature heavily uses (small and medium) allocations. In the long term, we might get some feedback in case our allocator does something wrong or so.

@phip1611
Member Author

Ping @nicholasbishop ? :)

pub struct FileSystem<'boot_services> {
/// Underlying UEFI protocol.
proto: ScopedProtocol<'boot_services, SimpleFileSystemProtocol>,
}
Member

Since FileSystem now just contains the open SimpleFileSystemProtocol, would it make sense to drop the new type and instead add the FileSystem methods directly to SimpleFileSystemProtocol?

Member Author

I'd prefer to keep things separate. The low-level API and the high-level API should not exist on the same type, especially as mixing them together might break stuff.

Member

Can you say more about what might break when using both? If there are things the end user must avoid doing then we should add that info to the docs.

Member Author

@phip1611 phip1611 Mar 28, 2023

I thought about it again and there are probably no safety issues, just convenience issues and a weird mixture of low-level and high-level APIs. However, I mixed up the SimpleFileSystem protocol with the FileProtocol. As the SimpleFileSystem protocol is so simple (it only has open_volume), integrating the new FileSystem abstraction into the SimpleFileSystem protocol abstraction is a possible design choice. However, the following properties do not feel right:

  • mix of low-level API with high-level API
  • I need additional helpers for the FileSystem (Path, FileSystemError, ...) and it feels wrong to integrate them into uefi::proto::media. They are types of a different level.

Instead, I'd allow constructing a FileSystem from a SimpleFileSystem (via .into() or .to_file_system()) for greater developer convenience.

What do you say?
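
The conversion idea from this comment could look roughly like the sketch below. All types are hypothetical stand-ins (the real `FileSystem` wraps a `ScopedProtocol`, and the `volume_label` field exists only to make the example observable). The convenience method is named `into_file_system` here because it consumes `self`; the `.to_file_system()` spelling from the comment would work the same way.

```rust
/// Hypothetical stand-in for the low-level protocol abstraction.
pub struct SimpleFileSystem {
    pub volume_label: String,
}

/// Hypothetical stand-in for the high-level wrapper.
pub struct FileSystem {
    proto: SimpleFileSystem,
}

impl FileSystem {
    /// Explicit constructor, as in `FileSystem::new(sfs)`.
    pub fn new(proto: SimpleFileSystem) -> Self {
        Self { proto }
    }

    /// Illustrative accessor so the sketch has something to observe.
    pub fn volume_label(&self) -> &str {
        &self.proto.volume_label
    }
}

/// Enables `let fs: FileSystem = sfs.into();`.
impl From<SimpleFileSystem> for FileSystem {
    fn from(proto: SimpleFileSystem) -> Self {
        Self::new(proto)
    }
}

impl SimpleFileSystem {
    /// Convenience method on the low-level type; consumes it.
    pub fn into_file_system(self) -> FileSystem {
        FileSystem::from(self)
    }
}
```

This keeps the high-level helpers in their own module while still giving users a one-call path from the protocol to the wrapper.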

Member

That makes sense, I especially agree that it would feel a little weird to have things like Path mixed into uefi::proto::media. It's a little unfortunate that the names are kinda reversed; SimpleFileSystem is really more complex to use than FileSystem. But that's probably fine, I don't have any better suggestion :) Anyway, +1 to the separate module.

@phip1611 phip1611 force-pushed the high-level-fs-abstraction branch from b34efe5 to 6b1bc4c Compare March 28, 2023 06:09
/// High-level file-system abstraction for UEFI volumes with an API that is
/// close to `std::fs`. It acts as convenient accessor around the
/// [`SimpleFileSystemProtocol`].
pub struct FileSystem<'boot_services> {
Member Author

This lifetime is technically wrong. It only lives "as long as it lives". It should be the same as that of the underlying protocol, hence 'a.

eprintln!("HELLO");
eprintln!("HELLO");
eprintln!("HELLO");
fs::copy(&test_disk, "/home/pschuster/Desktop/qemu.img").unwrap();
Member Author

TODO: I used this for testing; remove.

@phip1611 phip1611 force-pushed the high-level-fs-abstraction branch from 6b1bc4c to 4bc5821 Compare March 31, 2023 16:01
@phip1611 phip1611 marked this pull request as ready for review March 31, 2023 16:01
@phip1611
Member Author

@nicholasbishop I think this is ready for review and merging, when we keep the following things in mind:

  • So far, the public API change and exports are very small and unlikely to ever change. I'd like to keep it that way.
  • Users only use &str to pass in paths at the moment. This will eventually be replaced by something like Into<Path>, where Path is part of the public API of the fs module. This will be the subject of follow-up discussions (hopefully in a follow-up PR).
  • I'm very unhappy with my internal implementation of NormalizedPath and path handling in general. However, as this is a purely internal API, I think I can fix this in a follow-up MR and finally merge this feature.
  • I'm very happy with the state of the (public) documentation right now and how this simplifies things

Open Questions/Blockers:

  • In my NormalizedPath abstraction I kept the way open for merging a path with a present working directory. The idea is that users specify FileSystem::set_pwd("/foo/kernel"). But I think this is over-engineered and too much feature creep, right?
  • I need to double-check that the integration test is sufficient.
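
To make the present-working-directory idea from the first open question concrete, here is a minimal sketch. The `resolve` function is a hypothetical name, not the PR's `NormalizedPath`; it assumes that a path starting with a separator is absolute and ignores the working directory, while any other path is appended to it.

```rust
/// Returns true for both accepted separator characters.
fn is_sep(c: char) -> bool {
    c == '/' || c == '\\'
}

/// Hypothetical helper: resolves `path` against the working directory `pwd`
/// and returns an absolute, backslash-separated path.
pub fn resolve(pwd: &str, path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();

    // Relative paths are interpreted below the working directory.
    if !path.starts_with(is_sep) {
        parts.extend(pwd.split(is_sep).filter(|p| !p.is_empty()));
    }
    parts.extend(path.split(is_sep).filter(|p| !p.is_empty()));

    format!("\\{}", parts.join("\\"))
}
```

For example, `resolve("/foo/kernel", "config")` gives `\foo\kernel\config`, whereas `resolve("/foo/kernel", "/boot/log.txt")` gives `\boot\log.txt`. Passing the root as `pwd` reduces this to plain normalization.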

@phip1611 phip1611 force-pushed the high-level-fs-abstraction branch from 4bc5821 to 1f47d4d Compare March 31, 2023 16:10
@@ -17,6 +17,7 @@ use core::mem::{self, MaybeUninit};
use core::ops::{Deref, DerefMut};
use core::ptr::NonNull;
use core::{ptr, slice};
use uefi::fs::FileSystem;
Member

#[cfg(feature = "alloc")]

/// carefully!
pub fn remove_dir_all(&mut self, _path: impl AsRef<Path>) -> FileSystemResult<()> {
todo!()
}
Member

In the interest of getting it merged, maybe just drop this method for now?

@nicholasbishop
Member

I'm in favor of merging this once the tests are passing and the one todo!() is fixed. We can handle cleanups/additional tests in follow up PRs.

@phip1611
Member Author

phip1611 commented Apr 1, 2023

I'm in favor of merging this once the tests are passing and the one todo!() is fixed. We can handle cleanups/additional tests in follow up PRs.

I've created a tracking issue: #747

I'm in favor of merging this once the tests are passing and the one todo!() is fixed. We can handle cleanups/additional tests in follow up PRs.

I want to refactor all my existing abstractions, namely Path, Components, and NormalizedPath, in a follow-up PR. Currently, the code is not very nice.

@phip1611 phip1611 force-pushed the high-level-fs-abstraction branch 3 times, most recently from ce533e7 to 81299de Compare April 1, 2023 11:53
pub fn create_dir_all(&mut self, path: impl AsRef<Path>) -> FileSystemResult<()> {
let path = path.as_ref();

let normalized_path = NormalizedPath::new("\\", path)?;
Member Author

This is ugly but will be replaced once there is a new and better Path/PathBuf abstraction.

@@ -0,0 +1,239 @@
//! Module for path normalization. See [`NormalizedPath`].
Member Author

This whole module will be refactored in a follow-up PR.

@@ -0,0 +1,154 @@
//! Module for handling file-system paths in [`super::FileSystem`].
Member Author

This whole module will be refactored in a follow-up PR.

@phip1611 phip1611 force-pushed the high-level-fs-abstraction branch 2 times, most recently from 8a54c28 to 78a32a0 Compare April 1, 2023 12:50
@phip1611 phip1611 force-pushed the high-level-fs-abstraction branch from 78a32a0 to 509dc71 Compare April 2, 2023 13:23
@phip1611 phip1611 merged commit 16aaa09 into rust-osdev:main Apr 2, 2023
@phip1611 phip1611 deleted the high-level-fs-abstraction branch April 2, 2023 13:29
@phip1611 phip1611 linked an issue Apr 2, 2023 that may be closed by this pull request
5 tasks
@phip1611 phip1611 mentioned this pull request May 6, 2023
2 tasks
nicholasbishop added a commit to nicholasbishop/uefi-rs that referenced this pull request Jul 3, 2023
This reverts a small change from: rust-osdev#472

If using the library without the `alloc` feature enabled, `FileSystem` isn't
available, but you might still want access to the image's file system via the
underlying protocol.

The high-level API is still easily accessible via `FileSystem::new`.
nicholasbishop added a commit that referenced this pull request Jul 3, 2023
phip1611 pushed a commit to phip1611/uefi-rs that referenced this pull request Nov 14, 2023
Development

Successfully merging this pull request may close these issues.

Tracking Issue: high-level FS abstraction
3 participants