Skip to content

uefi: Add safe protocol wrapper for EFI_EXT_SCSI_PASS_THRU_PROTOCOL #1589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 10, 2025

Conversation

seijikun
Copy link
Contributor

@seijikun seijikun commented Mar 24, 2025

Update Latest API discussions


Added a safe wrapper for the extended SCSI Pass Thru protocol.
I did my best to design the API in a way that avoids accidentally using it incorrectly, yet allowing it to operate efficiently.
The ScsiRequestBuilder API is designed in a way that should easily make it possible to use it for both EFI_EXT_SCSI_PASS_THRU_PROTOCOL and a possible future safe protocol wrapper of EFI_SCSI_IO_PROTOCOL.

Exemplary usage to probe all devices potentially connected to every SCSI channel in the system:

Easy variant (io/cmd buffer allocations per request):

// query handles with support for the protocol (one handle per SCSI controller)
let scsi_ctrl_handles = uefi::boot::find_handles::<ExtScsiPassThru>().unwrap();
// Iterate over scsi controllers with passthru support:
for handle in scsi_ctrl_handles {
    // open protocol for controller
    let scsi_pt = uefi::boot::open_protocol_exclusive::<ExtScsiPassThru>(handle).unwrap();
    // iterate (potential!) devices on the controller
    for device in scsi_pt.iter_devices() {
        // this is not an actual (guaranteed-to-exist) device, but a !!potential!! device.
        // We have to probe it to find out if there is something connected to this chan/target/lun

        // construct SCSI INQUIRY (0x12) request
        let request = ScsiRequestBuilder::read(scsi_pt.io_align())
            .with_timeout(Duration::from_millis(500))
            .with_command_data(&[0x12, 0x00, 0x00, 0x00, 0xFF, 0x00]).unwrap()
            .with_read_buffer(255).unwrap()
            .build();
        // send device through controller to potential device
        if let Ok(response) = device.execute_command(request) {
            println!(
                "SUCCESS HostAdapterStatus: {:?}, TargetStatus: {:?}\r",
                response.host_adapter_status(),
                response.target_status()
            );
            let inquiry_response = response.read_buffer().unwrap();
            println!("ResponseBfr: {:?}\r", inquiry_response);
        } else {
            println!("ERROR - probably not a device\r");
        }
    }
}

Buffer-reuse API variant:

[...]
    // open protocol for controller
    let scsi_pt = uefi::boot::open_protocol_exclusive::<ExtScsiPassThru>(handle).unwrap();
    // allocate buffers and reuse amongst drives on this SCSI controller
    // It's important this is not shared across SCSI controllers !! Alignment differs
    let mut cmd_bfr = scsi_pt.alloc_io_buffer(6).unwrap();
    cmd_bfr.copy_from(&[0x12, 0x00, 0x00, 0x00, 0xFF, 0x00]);
    let mut read_bfr = scsi_pt.alloc_io_buffer(255).unwrap();
    // iterate (potential!) devices on the controller
    for device in scsi_pt.iter_devices() {
        // this is not an actual devices, but a !!potential!! device.
        // We have to probe it to find out if there is something connected to this chan/target/lun

        // construct SCSI INQUIRY (0x12) request
        let request = ScsiRequestBuilder::read(scsi_pt.io_align())
            .with_timeout(Duration::from_millis(500))
            .use_command_buffer(&mut cmd_bfr).unwrap()
            .use_read_buffer(&mut read_bfr).unwrap()
            .build();
[...]

Checklist

  • Sensible git history (for example, squash "typo" or "fix" commits). See the Rewriting History guide for help.
  • Update the changelog (if necessary)

@seijikun seijikun force-pushed the mr-extscsipt branch 7 times, most recently from 6aff09b to 3787821 Compare March 24, 2025 13:33
@seijikun
Copy link
Contributor Author

seijikun commented Mar 24, 2025

image

@seijikun seijikun force-pushed the mr-extscsipt branch 3 times, most recently from 2406c3a to 359311e Compare March 24, 2025 14:44
Copy link
Member

@phip1611 phip1611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution! I left a few remarks. Further, can you please add an integration test for the new functionality?

@seijikun seijikun force-pushed the mr-extscsipt branch 4 times, most recently from 9518d32 to ff4607d Compare March 24, 2025 18:28
@seijikun
Copy link
Contributor Author

seijikun commented Mar 24, 2025

@phip1611 @nicholasbishop
I hope I addressed all opened review comments now.
CI on x86_64 is green now, including the AlignedBuffer unit-test and the integration-test.
For the integration test, I changed the second (FAT32 formatted) disk to SCSI.

Since that change, the qemu process of the aarch64 integration test runner is hard-crashing in the Disk I/O 2 test..
Do you have an idea of what this might be?

EDIT: Was able to fix it by leaving the second disk to the way it was before and instead adding a third disk, then located on a SCSI Controller.

@seijikun seijikun force-pushed the mr-extscsipt branch 2 times, most recently from 5b1f56e to bca67e0 Compare March 24, 2025 19:10
@seijikun seijikun requested a review from phip1611 March 24, 2025 19:16
Copy link
Member

@phip1611 phip1611 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good PR with a comprehensive improvement! Thanks! LGTM

As this PR is a little bigger than others, I'd like to ask @nicholasbishop to give a second approval

@nicholasbishop
Copy link
Member

Yes, as long as it doesn't cause other issues I think that would be a good refactor. Fewer lifetimes, and avoiding internal pointers and Phantom markers feels worth it to me, to make it less likely that the code is accidentally unsound.

@seijikun
Copy link
Contributor Author

seijikun commented Mar 27, 2025

grr
I just did the refactoring, and that architecture produces the same lifetime problems.
I can not send commands within an iterator, because the device iterator borrows the protocol.

let mut scsi_pt = uefi::boot::open_protocol_exclusive::<ExtScsiPassThru>(handle).unwrap();
                // v borrows scsi_pt immutably
for device in scsi_pt.iter_devices() {
    let request = ...;
    let result = scsi_pt.execute_command(&device, request);
                // ^ attempts to borrow scsi_pt mutably
}

@nicholasbishop
Copy link
Member

Ah right. What about collecting the iterator into a Vec first?

@seijikun
Copy link
Contributor Author

seijikun commented Mar 27, 2025

Yes, that works (just tested it). Although it's an additional allocation. On the other hand, we require alloc for this anyway.
Your call.

let mut scsi_pt = uefi::boot::open_protocol_exclusive::<ExtScsiPassThru>(handle).unwrap();
let devices: Vec<_> = scsi_pt.iter_devices().collect();
for device in devices {
    let request = ...;
    let result = scsi_pt.execute_command(&device, request);
    // ...
    // profit
}

One minor caveat: ScsiDevices are then nolonger bound to a certain protocol. So you could accidentally use an ScsiDevice instance on any protocol instance.

let device = scsi_pt.iter_devices().next().unwrap();

and then run:

other_scsi_pt.execute_command(&device, request);

@seijikun
Copy link
Contributor Author

seijikun commented Mar 27, 2025

@nicholasbishop
My personal favorite stays the smart object design I suggested. It has several advantages:

  • By far the best usability (it is intuitive - you don't have to search what to do with the objects returned from the iterator)
  • It avoids an unnecessary allocation (no iter_devices().collect())
  • It actually makes sense, in that:
    • it does not allow you to borrow the protocol instance mutably while you have ScsiDevice instances alive. So you can e.g. not reset the entire channel with ScsiDevice's borrowing the protocol immutably
    • it's not possible to accidentally use a ScsiDevice instance with another ExtScsiPassThru protocol instance

I just pushed a version without internal pointer and without PhantomData usage - thus alleviating some of your concerns with it.

@nicholasbishop
Copy link
Member

FYI, I'll return to looking at this soon -- I've been a way for a few days so lots to catch up on at work :)

@nicholasbishop
Copy link
Member

Mind pulling AlignedBuf into a separate PR? I think I'm sold on the usefulness of that addition to the API, and the need for aligned buffers comes up in lots of places, so I'd like to review that independently.

@phip1611 phip1611 mentioned this pull request Apr 6, 2025
13 tasks
@seijikun seijikun force-pushed the mr-extscsipt branch 4 times, most recently from 5cb00bb to 670362b Compare April 8, 2025 12:04
@seijikun
Copy link
Contributor Author

seijikun commented Apr 9, 2025

I think the API discussion has drifted off a bit and is somewhat widespread throughout various "threads" in this issue.
As an attempt to get this PR back on track, I'll try to quickly summarize the options/variants I see for the device interaction API and lay out their potential problems / advantages:


1) Intelligent Device Objects: Everything const

  • Device Iterator yields intelligent objects that directly provide APIs for interaction with devices on the bus
  • All interaction requires only const references (on protocol and on devices) - including actions like reset_channel() to reset the entire bus

API

ExtScsiPassThru::reset_channel(&self);
// returned ScsiDevice is owned (not a reference), but internally borrows the underlying raw protocol
ExtScsiPassThru::iter_devices(&self) -> Iterator<Item = ScsiDevice>;

ScsiDevice::reset(&self);
ScsiDevice::execute_command(&self, cmd: ScsiCommand);

Usability wise (w.r.t. to flexibility for weird use-cases), probably the best option.
This would allow mixing bus resets with device interactions without having to call the iterator again.
Allowing this is not unsafe, it's just that all subsequent device interaction (if the device vanished from the bus due to the reset) returns errors.
I don't know if there is a use-case where a mixed-in bus reset makes sense.

Usage

let proto: ExtScsiPassThru;
for device in proto.iter_devices() {
    device.execute_command(..);
    proto.reset_channel(); // << allows bus-reset mixed with device interaction
}
// also allows:
let devices = proto.iter_devices().collect();
devices[0].reset();
devices[1].reset();

2) Intelligent Device Objects: Actions mut

  • Protocol-wide actions (bus reset) require a mutable protocol reference.
  • The device iterator only captures the protocol const
  • Device-specific actions (device reset, execute command) require a mutable device reference.

API

ExtScsiPassThru::reset_channel(&mut self);
// returned ScsiDevice is owned (not a reference), but internally borrows the underlying raw protocol
ExtScsiPassThru::iter_devices(&self) -> Iterator<Item = ScsiDevice>;

ScsiDevice::reset(&mut self);
ScsiDevice::execute_command(&mut self, cmd: ScsiCommand);

Usability wise, very close to all-const option.
Only difference is that it forbis bus resets while a ScsiDevice object lives.
Forbidding this makes sense if you assume, that a device doesn't reappear after a bus reset - which is IMHO a valid possibility.

Usage

let proto: ExtScsiPassThru;
for mut device in proto.iter_devices() {
    device.execute_command(..);
}
// also allows:
let devices = proto.iter_devices().collect();
devices[0].reset();
devices[1].reset();

3) Device Handle: Actions at protocol mut

  • All interaction (either bus-wide, or device specific) happen through methods in ExtScsiPassThru that require a mutable reference.
  • Device iterator yields dumb handles that have no functionality themselves. They get passed to the device-specific methods in ExtScsiPassThru for device interaction.

API

ExtScsiPassThru::reset_channel(&mut self);
// Returned ScsiDevice objects are only handles passed back to ExtScsiPassThru methods
// The objects don't keep a reference to the protocol
ExtScsiPassThru::iter_devices(&self) -> Iterator<Item = ScsiDevice>;

ExtScsiPassThru::reset_device(&mut self, dev: ScsiDevice);
ExtScsiPassThru::execute_command(&mut self, dev: ScsiDevice, cmd: ScsiCommand);

Usability-wise, this variant is worse than the other two, because:

  • The API discoverability is lower, because the ScsiDevice objects are only handles
  • The iterator borrows the Protocol, so you can't easily interact with devices while iterating, you always need to allocate

However: It allows resetting the bus while ScsiDevice instances live, just like the all-const variant.
And: It allows taking the ScsiDevice handle from one ExtScsiPassThru instance and pass it to another, which is at least as "wrong" as what can happen with a bus reset.

Usage

let proto: ExtScsiPassThru;
let proto2: ExtScsiPassThru;
for device in proto.iter_devices() {
    // not possible! Iterator borrows protocol immutably, interaction requires mut
    device.execute_command(device, ..);
}
// you have to allocate:
let devices = proto.iter_devices().collect();
for device in devices {
    proto.execute_command(device, ..);
    proto.reset_channel(); // << allows bus-reset mixed with device interaction
}
// also allows:
device.execute_command(device[0], ..);
device.execute_command(device[1], ..);

// miss-use handle at other instance:
proto2.execute_command(device[0], ..);

IMHO, 2) Intelligent Device Objects: Actions mut is the best option. That's the one that is implemented at the moment.
Does not make weird use-cases like the mixed-in bus-reset outright impossible, just makes it less convenient.

I don't want to create any pressure, but I'd be happy if we could decide on one variant of these soon (or a concrete counterproposal I can implement).

@phip1611
Copy link
Member

@seijikun Thanks for the write-up. That is definitely very helpful.

I'm also in favor of 2) Intelligent Device Objects: Actions mut.

In case the test of time shows that this was a bad design, we can just make a breaking chance (we are still not v1.0.0).

One more question: Do you see a possibility to abstract over all passthru storage device protocols to not reimplement the same API multiple times?

@seijikun
Copy link
Contributor Author

One more question: Do you see a possibility to abstract over all passthru storage device protocols to not reimplement the same API multiple times?

I think the largest duplication effort is in the RequestBuilders, and that one can't be abstracted away because of how different these protocols are.
Especially NVMe is also very different in semantics. While one NvmePassThru instance represents one NVMe drive, one instance of ExtScsiPassThru and AtaPassThru represents a controller with potentially connected drives, or splitters and even more drives. (NVMe namespaces are like partitions of the same physical drive)

So if at all, I think we could only put SCSI and ATA in the same abstraction - but even they differ quite a lot in their implementation. Compare e.g. the iterator implementation, which is much more contrived with ATA, because of the splitter semantics.
I doubt we could save in the department of raw code lines, sadly.

@phip1611
Copy link
Member

So if at all, I think we could only put SCSI and ATA in the same abstraction - but even they differ quite a lot in their implementation.

No problem, that's fine. We do not need abstractions at any cost

@nicholasbishop
Copy link
Member

Thanks for the detailed writeup! And sorry for the delay in getting you my review, been quite busy recently :)

I agree with your analysis, let's stick with option 2 (and as Philipp said, if we need to change it in the future we can).

@nicholasbishop nicholasbishop added this pull request to the merge queue Apr 10, 2025
Merged via the queue into rust-osdev:main with commit 3acb50c Apr 10, 2025
16 checks passed
@nicholasbishop
Copy link
Member

Merged, thanks for your patience!

A couple suggestions to reduce delays for the future:

  1. As you noted, discussion of the design ended up playing out across a number of PRs. Since the end design for nvme/scsi/ata passthrough would probably end up similarly, it'd probably have been a little smoother to just put up one of these PRs at a time; get it reviewed and merged before putting up the next. (This is just a suggestion for future work, no need to close any of your current PRs).
  2. I'm a big fan of small PRs that build towards an end goal. It's much easier to review a 100 line change than a 1000 line change, and the review can be more thorough with a smaller PR. I recognize in this case you're adding a whole new API and tests, so it can be hard to find a good place to split. In that case, splitting into small commits within the PR can be helpful. https://google.github.io/eng-practices/ has some nice guidance for code authors and reviewers that I like.

Hope these are helpful, and thanks again for all your work on these protocols.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants