[RFC] Introduction to MSPI (Multi-bit SPI) driver API #70723
Comments
Hi @swift-tk! We appreciate you submitting your first issue for our open-source project. 🌟 Even though I'm a bot, I can assure you that the whole community is genuinely grateful for your time and effort. 🤖💙 |
Arch WG:
|
I could not attend the meeting (it's really not the right time for me), but has anyone noted that this is again a proposal for dual/quad/octal modes in addition to other things? So already, is there somebody really interested to get such modes as a generic SPI feature exposed? (See PR #39991). We could avoid having a new API altogether. |
@tbursztyka - I also like the idea of supporting dummy clocks before a data phase. Specifically, because some peripherals (e.g. fpgas) use it for handshaking, and I think that is missing from the current api |
@tbursztyka As far as I can tell, it is hard to overhaul the SPI API, and that is why everybody just implements their own HAL abstraction controller and has their own device drivers call the controller. By this I mean the qspi and ospi from ST, the qspi from Nordic, and the flexspi from NXP. I’ve looked into each of these controller implementations, and I think they have a lot in common, which is what I built on in my new API proposal. I would also like to point out that there are multiple other things missing from the SPI API.
There is no multiple device support and callback management as the generic SPI would not need such things.
I forgot to mention that the SPI does not support DQS configuration which is required for high speed devices. |
@cfriedt Isn't this just a certain number of 0s "being sent" at certain places, or is there more to this? @swift-tk Thanks for the details. When it comes to dedicated qspi devices (such as Nordic's, etc.), this was the main reason why, at the time, we did not integrate dual/quad/octal into the generic SPI API: these controllers offer hardware optimizations the generic API would not be able to take full advantage of, like XIP or JEDEC, etc. (By the way, have a quick look at #20564; as you can see, it's not a recent issue.) Do I understand your proposal correctly, in that you would like to replace all dedicated qspi, ospi, ... device drivers currently found in flash with ones implementing the MSPI API? And then there would (ideally) be a unique flash_mspi.c driver of some sort? My own concern has always been this one: are there use cases where a generic SPI controller would be driving both single mode devices and dual/quad/octal based devices (most likely a flash memory device)? Does that type of controller even exist? Anyway, this is how I see things at the moment, sticking with the flash use case: (I hope my crappy ASCII art shows up the same way everywhere)
Btw, doing that you could then - and only then as there is no other way round (unless MSPI API would have to be exposed by ALL existing SPI drivers...) - remove spi_nor.c also. Since below the MSPI API, it would use the generic SPI API when relevant (through a dedicated driver, mspi_generic_spi.c for instance). I hope I am clear enough. |
Certainly you could construct the buffer such that the dummy cycles sit between addr and data. However, I can see two problems. The first is that it is ambiguous: in the controller layer, there is no way to know whether the 0s are actually data or dummies (or what the cmd and addr lengths are), unless the hardware is configured to be used as a generic SPI, in which case it does not care. The second is that it is not efficient enough for advanced SPI hardware (qspi, etc.), because the transfer is often initiated as many times as there are buffers. I believe this is part of what you mentioned about hardware optimization? (The advanced hardware could just initiate one transfer for everything.)
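To make the ambiguity concrete, here is a minimal sketch in C. The names demo_xfer_packet and demo_flatten_1line are hypothetical and are not part of the proposed MSPI API; the sketch assumes a single data line (8 clock cycles per byte). It only illustrates that once the phases are flattened into one byte buffer, nothing in the buffer tells the controller which bytes were command, address, or dummy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet layout (not the RFC's struct): every phase is explicit. */
struct demo_xfer_packet {
	uint32_t cmd;          /* command opcode */
	uint8_t  cmd_len;      /* command length in bytes (0..4) */
	uint32_t addr;         /* target address */
	uint8_t  addr_len;     /* address length in bytes (0..4) */
	uint16_t dummy_cycles; /* clock cycles between address and data */
};

/*
 * Flatten command, address and dummy phases into one byte buffer, the way a
 * generic single-line SPI transfer would carry them (8 clock cycles per byte).
 * Returns the number of bytes written, or 0 if the packet cannot be encoded.
 */
size_t demo_flatten_1line(const struct demo_xfer_packet *p,
			  uint8_t *out, size_t out_size)
{
	size_t n = 0;
	size_t dummy_bytes;

	assert(p != NULL && out != NULL);

	if (p->cmd_len > 4 || p->addr_len > 4 || (p->dummy_cycles % 8) != 0) {
		return 0;
	}
	dummy_bytes = p->dummy_cycles / 8;
	if ((size_t)p->cmd_len + p->addr_len + dummy_bytes > out_size) {
		return 0;
	}

	/* Command and address go out most significant byte first. */
	for (uint8_t i = p->cmd_len; i > 0; i--) {
		out[n++] = (uint8_t)(p->cmd >> (8 * (i - 1)));
	}
	for (uint8_t i = p->addr_len; i > 0; i--) {
		out[n++] = (uint8_t)(p->addr >> (8 * (i - 1)));
	}
	/* Dummy cycles become zero bytes, indistinguishable from data zeros. */
	for (size_t i = 0; i < dummy_bytes; i++) {
		out[n++] = 0x00;
	}
	return n;
}
```

For a packet with a 1-byte command, a 3-byte address and 8 dummy cycles, the result is five bytes, and the trailing 0x00 could just as well be a data byte. A controller handed the explicit struct can instead program its command, address and dummy registers and start one transfer.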
Yes, your understanding is correct, the device drivers(flash_mspi_device_name.c) that uses the proposed MSPI API could eventually replace all platform specific device drivers. But controller layer would have to be rewritten to implement the proposed API.
I haven't seen anything exactly that in Zephyr repo, but I do know that SPI API is used to initialize the flash device in the espi example. However, I don't know enough about Intel eSPI to tell what is going on next. This #51402 talked about a device right? not a controller? If there were such a controller hardware, the proposed API could still cover that by setting With that being said, I think calling the SPI API from MSPI API seems to be an overkill. It would only spend more processing time and not nearly efficient. But I could definitely see that the SPI and MSPI would co-exist for quite a while. And then Zephyr may slowly deprecate the SPI API. |
One of the issues with e.g. the lattice ice40 series was the need to invert /CS polarity mid-sequence as a kind of handshake. It's a definite quirk. Currently it's bit-banged with gpio because the old Zephyr SPI API does not have a way to make that happen synchronously within timing requirements. It could maybe be done as a bit of a board hack / platform driver, but doing it with the SPI API would be better. |
A few comments here...
I'm pushing to move i2c/spi to use a generalized command queue + completion queue API (rtio) like io_uring or iocp as it...
I'm wondering if this would best be served by using a common API for all of this, as it's already being used in an updated sensors API and already has tests verifying transactional behavior, chaining, and cancellation. Cheers! |
@teburd See this paragraph.
The mspi_register_callback just registers the callback and callback context. The controller driver should save it in accordance with the event type. I have an xfer complete event type defined, and the controller may use that to notify async completion. The demo usage of the async call can be found here https://github.com/AmbiqMicro/ambiqzephyr/blob/RFC-MSPI/samples/drivers/memc/src/main.c I did not use a software queue in the async implementation as our controller has a hardware command queue. Do you have code I could look at? It could certainly be a supplement for those that don’t have a hardware command queue. I think it may be better to take a similar form to spi_context.h so that the interface API doesn’t force people to use a software queue when not needed. |
Let me reiterate that having callbacks be the primary event notifier here will not work with user space, and register_callback can't be a syscall. I don't think that's being fully appreciated or understood. This makes me believe that register_callback is stateful and connected to the next transfer?

mspi_register_callback(controller, dev_id, MSPI_BUS_XFER_COMPLETE, (mspi_callback_handler_t)async_cb, &cb_ctx1);
mspi_transceive_async(controller, dev_id, &pack1);
mspi_register_callback(controller, dev_id, MSPI_BUS_XFER_COMPLETE, (mspi_callback_handler_t)async_cb, &cb_ctx2);
mspi_transceive_async(controller, dev_id, &pack2);

Is that right? That's a bit suspicious. Is it expected that the user of an mspi device maintains an external lock on it? And you'd like those two async transfers to occur back to back, presumably, without something else taking control of the bus controller?

How rtio looks using spi might be...

struct rtio_sqe *sqe;
struct rtio_cqe *cqe;
struct rtio_iodev *spi_device = /* MACRO here... */;
sqe = rtio_sqe_acquire(ctx);
rtio_prep_write(sqe, spi_device, buf, buf_len);
sqe->flags |= RTIO_TRANSACTION;
sqe = rtio_sqe_acquire(ctx);
rtio_prep_read(sqe, spi_device, buf, buf_len);
rtio_submit(ctx, 2); /* submit to the spi controller a write and read, block waiting on 2 completions, if 0 is given the requests are started but the call should not block */
cqe = rtio_consume_cqe(ctx); /* could block here instead if you want */
/* cqe->result has an int result code */
if (cqe->result != 0) {
LOG_ERR("write+read to spi device failed");
}

Ideally every rtio sqe would turn into a hardware command/DMA transfer. For lpspi this isn't quite true, as the mcux sdk doesn't provide direct low level access to the hardware command queue today. The iodev can contain configuration specifics; in this case it's the chip select, struct device *, clock configuration, and polarity settings typical of spi. |
That would definitely be an improvement for the SPI API, go ahead make a dedicated issue, I'll look into this. |
I was bringing this up only for the case where a generic SPI controller would be in use (so no hardware features exposed to simplify that). Obviously this does not fit with dedicated controllers. That is why I agreed on a configuration exposed via DTS (not for the SPI API, so yes, MSPI would do). Then, below, it's just a matter of a driver using it appropriately: a dedicated controller would use the hw optimizations, while a generic SPI controller would mimic them by inserting a buffer of 0s of the right length at the proper places.
Ok, but why say that then:
So far, MSPI seems fully designed around a very minimal subset of use cases related to d/q/o, XIP and JEDEC, targeting flash memories. It brings a lot of features and configuration bits that would not fit 99% of other use cases (thus a lot of overhead and useless ROM/RAM usage; seriously, the proposed structures are super heavy). It's confusing. What's the actual focus here? Finally fixing d/q/o, XIP and JEDEC support in SPI, or replacing the SPI API? |
I did not reply on that, but looking at previous statement, now is the time:
That would mean implementing the proposed API in all existing SPI drivers (knowing that the proposed API brings nothing to single line SPI mode, the 99.999% usage atm). Is that what is wanted? |
Not sure this is something desirable. To me, these two APIs address different use cases and controllers. We should keep them exclusive. |
@teburd I get your point, the syscall will be removed from
Not necessarily. The example is a bit biased by the Ambiq HAL. The user could just register the callback and the context once if there is no need to change them throughout the program, but I guess that is generally not the case, so I showed changing the context per async call. BTW, one async call doesn't necessarily mean preparing the hardware to start only one transfer. There are use cases where each transfer requires a callback. That's why I have
Whether it needs an external lock is entirely up to the controller implementation. In my case (Ambiq MSPI), it is not needed as there are two internal locks: one for the transfer context and one for controller access.
It would occur back to back as it writes to hardware command queue and does not wait for the previous to complete. The controller driver would have to be written in a way that blocks others (other device / next transfer) from accessing the controller. See here example implementation of the controller driver. https://github.com/AmbiqMicro/ambiqzephyr/blob/RFC-MSPI/drivers/mspi/mspi_ambiq_ap3.c I like the RTIO idea, it can certainly be supplemented to MSPI API the way you did for SPI. But the
I'd like to keep this RFC as simple as possible, as the primary focus is to provide a generic API for advanced SPI devices, which the generic SPI fails to achieve. |
I would say neither; the primary focus is to provide a generic API for advanced SPI devices, which the generic SPI fails to achieve. My opinion is that the generic SPI can no longer serve for interfacing with advanced SPI devices in terms of overall performance. By the way, advanced SPI usage is not that small, and believe me, I did not just come up with this without actually trying to drive those advanced SPI devices with the generic SPI. I just said the proposed MSPI API is capable of doing what SPI does; whether it can replace SPI is an entirely different matter. As you said, it has more configuration than generic SPI devices need, so let people use the generic SPI for simple SPI devices, but leave the complex and heavy-lifting interfacing to a dedicated API. There is no reason to force people to use the generic SPI for complex devices. Also, the industry has evolved: some common SPI controllers in SoCs have been upgraded to have cmd and data phases. This means that an RX operation no longer needs to start a TX first and then switch to RX, because it can be done in hardware. But the generic SPI treats everything as data, which is not ideal. You would also need software CE/CS control for the generic SPI, while the upgraded controllers do not. I don't see a way around having more complex data structures as SPI controllers and devices grow more complex. Personally, one of the problems I find with the SPI API is that it is not really expandable. |
@erwango I agree with you, we don't need to implement the MSPI API for all legacy SPI device drivers. |
@tbursztyka Sorry I had to keep editing my responses.
Just having the configurations in DTS is not enough. Those settings may change at run-time depending on the command, mode of operation( |
How is exclusive usage of the device done then between multiple calling contexts? E.g. multiple threads running concurrently or preempting each other, or perhaps even an ISR wanting to start an async transfer that is caught later. E.g. the flow goes... thread A: register_callback It's not very clear to me, at least, how all this is dealt with. It's even less clear how this might work out where you could start an async transfer from an ISR, which turns out to be really useful for things like sensors at least. |
Once again, this is entirely up to the controller driver implementation on how to tackle threading. What I took is a similar approach that is in spi_context.h
In this situation, for my case, I have a local struct This is very similar to that of spi_context_lock. The lock is not going to get released until all transfers get queued in the hardware queue. So the register_callback from A would not affect what is in struct If the hardware command queue does not take a callback and callback context, or there is no hardware command queue at all, implementing an RTIO software queue can be helpful.
No, register_callback from B may have changed the local storage but not going to affect A because callback and callback context are in Now if we are only looking at the controller driver implementation, there is a chance for the callback and callback context local storage to get tampered. Execution flow below: However, this would not happen if the device driver is implemented correctly as there is always a third lock.
So thread B is not going to get device control, and so cannot call mspi_register_callback, before thread A releases it. |
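The argument above, that a later register_callback cannot disturb a transfer already queued, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Ambiq driver: all names (demo_controller, demo_xfer, demo_queue_xfer, and so on) are hypothetical, and the locks described in the comment are omitted so that only the snapshot step is shown.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*demo_cb_t)(void *ctx);

/* Hypothetical controller state: the most recently registered callback. */
struct demo_controller {
	demo_cb_t cb;
	void *cb_ctx;
};

/* Hypothetical per-transfer context, as it would sit in a command queue. */
struct demo_xfer {
	demo_cb_t cb;
	void *cb_ctx;
};

void demo_register_callback(struct demo_controller *c, demo_cb_t cb, void *ctx)
{
	assert(c != NULL);
	c->cb = cb;
	c->cb_ctx = ctx;
}

/* Queue a transfer: snapshot the callback so that later registrations
 * cannot change what this transfer reports to. */
void demo_queue_xfer(const struct demo_controller *c, struct demo_xfer *x)
{
	assert(c != NULL && x != NULL);
	x->cb = c->cb;
	x->cb_ctx = c->cb_ctx;
}

/* Completion path: use only the snapshot stored with the transfer. */
void demo_complete(const struct demo_xfer *x)
{
	assert(x != NULL);
	if (x->cb != NULL) {
		x->cb(x->cb_ctx);
	}
}

/* Example callback: counts completions in the int that ctx points to. */
void demo_count_cb(void *ctx)
{
	*(int *)ctx += 1;
}
```

If thread A registers a callback and queues a transfer, and thread B then registers a different context and queues its own, completing A's transfer still reaches A's context. In a real driver the register and queue steps would run under the device and controller locks discussed above.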
What are the "advanced SPI devices" out there? Besides memory devices I mean. Are there any other types? This was the reason why, in 2019, it was decided not to adopt d/q/o and other features into SPI: it was much simpler to just interface with the flash API (or other memory APIs) directly. I disagree with the "generic SPI fails to achieve" statement: as long as these advanced features are not supported by the API, obviously it will fail to use them. Then comes another question: what blocks the possibility of improving the existing SPI API to enable those features, versus creating an entirely new API? |
It's a new API, so nobody's ack is necessary here afaik. |
By "ACK" I meant whether you are OK with adding a new API instead of extending the current one. |
Yes as described by @nashif, it's better to have something out of the existing SPI API. |
Excellent, thanks @tbursztyka. @swift-tk feel free to proceed with the RFC. |
The original context in which I was thinking about this issue (see #51402) was conceptually using a bt81x connected to a Nordic nRF52840. Using QSPI on the bt81x requires instructing the bt81x to switch from x1 mode to x2 or x4 mode by writing to a "REG_SPI_WIDTH" register, which defaults to x1. So you have to write in x1 mode in order to then switch to x2/x4 mode. To me, a new but functionally compatible MSPI bus is fine, as long as you can identify what bus is attached in order to switch the driver mode. In relation to qspi and multiple devices: as far as I am aware, multiple devices on SPI is relatively standard. Take, for example, the Nordic nRF52840 with a bt81x under Matter usage: when the internal flash storage on the nRF52840 is insufficient, external flash is required, and if you want a qspi display in combination with this, you need to connect both to the same QSPI bus controller. The nRF52840 QSPI module only has one CS, so you can't connect multiple devices simultaneously internally. It looks like you can reconfigure the pin multiplexing to connect both (deactivate the module, change the pin, reactivate the module). Secondly, the bus controller driver would have to know which mode each device is currently operating in, as talking 4x to a 1x device and vice versa is unlikely to work, and devices may have individual mode switching requirements. I'm not sure which of the other SPI "modes" extend to QSPI, or how. |
In your case, the device switching will have to be managed by software. The controller driver does not need to know the device operating mode as long as the device drivers record their current operating mode and simply reconfigure the controller during device switching. But for the convenience of debugging, the controller driver may record the device specific settings ( |
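A rough sketch of that software-managed switching, assuming each device driver keeps its own current mode and hands it to the controller before every access. The names (demo_bus, demo_device, demo_acquire) are hypothetical, and the actual hardware reprogramming is reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

enum demo_io_mode { DEMO_IO_SINGLE, DEMO_IO_DUAL, DEMO_IO_QUAD };

/* Hypothetical device driver state: the driver remembers its own mode. */
struct demo_device {
	int id;
	enum demo_io_mode mode;
};

/* Hypothetical controller state: whatever was configured last. */
struct demo_bus {
	int active_id;                 /* -1 when nothing is configured yet */
	enum demo_io_mode active_mode;
	unsigned int reconfig_count;   /* times the hardware was reprogrammed */
};

/* Called by a device driver before each access. The controller is
 * reprogrammed only when the device or its mode differs from the last one. */
void demo_acquire(struct demo_bus *bus, const struct demo_device *dev)
{
	assert(bus != NULL && dev != NULL);
	if (bus->active_id != dev->id || bus->active_mode != dev->mode) {
		bus->active_id = dev->id;
		bus->active_mode = dev->mode;
		bus->reconfig_count++;
	}
}
```

In the bt81x scenario, the display driver would start with DEMO_IO_SINGLE, write the width register, update its recorded mode to DEMO_IO_QUAD, and the next demo_acquire call would reprogram the controller accordingly.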
@swift-tk I think a |
Hi @erwango |
If the intent is for this API to be used by Zephyr Sensors moving forward, then it likely needs to implement the rtio API. Are there sensors in mind that will use this? |
I have created a draft PR here #71769 for the RFC. @carlescufi Does it need to be discussed again in AG meetings? Since I had to split it into more than one PR for these items.
|
Tbh, I'm not clear on the concept of channel in this API. What is a channel in the context of this API, and how would a client know on which channel it should query status? Anyway, let me check the draft PR :) |
That's just a placeholder for now; I saw some vendors use it. From what I understand, it is the controller instance number (for the actual hardware). |
From me, for the moment, I don't have any sensors in mind. Maybe @RichardSWheatley has some plans. |
No plans currently for me either. |
On STM32 at least, the targeted controllers are not intended to be used with sensors but are focused on external memories. |
Perfect, I wanted to ensure we didn't do lots of great new work only to turn around and note that it doesn't fit with the newer sensor API flows we are pushing forward |
Limitation:
- Zephyr doesn't support DDR mode currently and there is RFC discussion about spi enhancement(including ddr mode) on upstream. zephyrproject-rtos#70723 [RFC] Introduction to MSPI (Multi-bit SPI) driver API
Test
- Flash access: [Pass] READ JEDEC ID
west build -p always -b it82xx2_evb samples/hello_world/
***** delaying boot 1ms (per build configuration) *****
*** Booting Zephyr OS build v3.6.0-2887-g0048de286c9c (delayed boot 1ms) ***
[00:00:00.004,882] <err> flash_test: frequency= 1500000
[00:00:00.004,913] <err> flash_test: Chip select 0
[00:00:00.005,523] <err> flash_test: JEDEC ID: c2 23 15 |.#.
[00:00:03.006,134] <err> flash_test: Chip select 1
[00:00:03.006,774] <err> flash_test: JEDEC ID: c2 25 37 |.%7
- [Fail] Tool: spi_shell
- SPI frequency
[Ongoing] sspi_clk/1, /2, /4, /6, /8, /10, /12
[Pass] sspi_clk/16, /14
- SPI line mode
- SPI mode
[Pass] single mode: pass
[Ongoing] dual/quad mode
- Chip selection
[Pass] CE0/CE1
Signed-off-by: Ren Chen <[email protected]>
@teburd I've been looking at RTIO and rtio_sqe_prep_* function usages, I have a few questions. If my understanding is correct, then it is not truly async and would not be compatible with a controller implementation that uses hardware command queue for async transfer calls but does not wait for the transfer to complete or handles the complete signaling. To be compatible with what I'm proposing here, the |
Closing, as #71769 is merged. |
Introduction
Rather than attempting to overhaul the existing SPI interface, adding a new MSPI API may be a better option for both new and existing users.
Problem description
The existing Zephyr SPI has many limitations including but not limited to
Proposed change
The MSPI interface should contain a controller driver that is SoC platform specific and implements the following APIs.
The device driver should then reference these APIs so that a unified device driver can be achieved.
Note: mspi_register_callback is to be removed from the list of syscalls.
Detailed RFC
Methodology
To better serve the modern-day memory devices, I have divided configurations into three categories:
The common configurations are also defined in mspi-controller.yaml as a standard and should be referenced by all MSPI controller drivers’ bindings. For example, the ambiq,mspi-controller.yaml includes mspi-controller.yaml and has additional properties exclusive to Ambiq MSPI hardware/hal.
The common settings are also defined in mspi-device.yaml as a standard and should be referenced by all MSPI device drivers’ bindings. For example, the ambiq,mspi-device.yaml includes mspi-device.yaml and has additional timing properties only for Ambiq MSPI.
API Detail
Now let's dive deeper and check out what each API should do.
This routine provides a generic interface to override MSPI controller capabilities. In the controller driver, one may implement this API to initialize or re-initialize their controller hardware. Additional SoC platform specific settings that are not in struct mspi_cfg may be added to one's own binding(xxx,mspi-controller.yaml) so that one may derive the settings from DTS and configure it in this API. In general, these settings should not change during run-time.
This routine provides a generic interface to override MSPI controller device specific settings. With struct mspi_dev_id defined as the device index and CE GPIO from device tree, the API supports multiple devices on the same controller instance. It is up to the controller driver implementation whether to support device switching either by software or by hardware. The implementation may also support individual parameter configurations specified by enum mspi_dev_cfg_mask.
The settings within struct mspi_dev_cfg don't typically change once the mode of operation is determined after the device initialization.
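As an illustration of the per-parameter configuration that enum mspi_dev_cfg_mask is meant to enable, here is a minimal sketch of a mask-driven partial update. The field names, mask bits and function name are hypothetical stand-ins and are not the definitions in the RFC's mspi.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask bits, one per configurable field. */
enum demo_dev_cfg_mask {
	DEMO_CFG_NONE     = 0,
	DEMO_CFG_FREQ     = 1 << 0,
	DEMO_CFG_IO_LINES = 1 << 1,
	DEMO_CFG_RX_DUMMY = 1 << 2,
	DEMO_CFG_ALL      = 0x7,
};

/* Hypothetical subset of device settings. */
struct demo_dev_cfg {
	uint32_t freq_hz;
	uint8_t  io_lines; /* 1, 2, 4 or 8 */
	uint16_t rx_dummy; /* dummy cycles before read data */
};

/*
 * Apply only the fields selected by mask from cfg to the active settings.
 * Returns 0 on success, -1 if mask contains unknown bits.
 */
int demo_dev_config(struct demo_dev_cfg *active, uint32_t mask,
		    const struct demo_dev_cfg *cfg)
{
	assert(active != NULL && cfg != NULL);

	if ((mask & ~(uint32_t)DEMO_CFG_ALL) != 0) {
		return -1;
	}
	if (mask & DEMO_CFG_FREQ) {
		active->freq_hz = cfg->freq_hz;
	}
	if (mask & DEMO_CFG_IO_LINES) {
		active->io_lines = cfg->io_lines;
	}
	if (mask & DEMO_CFG_RX_DUMMY) {
		active->rx_dummy = cfg->rx_dummy;
	}
	return 0;
}
```

A device driver could use this shape to bring a device up on one line at a low clock, then switch only the line count and dummy cycles once the device has been told to change mode, leaving the other settings untouched.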
An example of the DTS.
This routine provides a generic interface to check whether the hardware is busy. This is useful in the multiple slave devices scheme.
This routine provides a generic interface to register different types of bus events.
The dev_id is provided so that the controller can identify its device and determine whether the access is allowed in a multiple device scheme.
The enum mspi_bus_event is a preliminary list of bus events. There are XIP events that can be added. I encourage the community to come up with more events that they would use.
This routine provides a generic interface to transfer a request synchronously/asynchronously.
The dev_id is provided so that the controller can identify its device and determine whether the access is allowed in a multiple device scheme.
The req is of type mspi_xfer_packet which allows for dynamically changing the transfer related settings once the mode of operation is determined and configured by mspi_dev_config.
The API supports bulk transfers with different starting addresses and sizes with struct mspi_buf. However, it is up to the controller implementation whether to support scatter IO and callback management. The controller can determine which user callback to trigger based on enum mspi_bus_event_cb_mask upon completion of each async/sync transfer, if the callback has been registered using mspi_register_callback. It can also trigger no callback at all with MSPI_BUS_NO_CB, even if callbacks are already registered.
In the case where a controller supports a hardware command queue, the user can take full advantage of it in terms of performance if scatter IO and callback management are supported.
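The callback selection described above could look roughly like the following. This is a sketch with hypothetical names (demo_cb_table, demo_notify, DEMO_NO_CB standing in for the role of MSPI_BUS_NO_CB); it assumes one registered callback per event type and a per-request mask with one bit per event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_bus_event {
	DEMO_EVT_XFER_COMPLETE,
	DEMO_EVT_ERROR,
	DEMO_EVT_COUNT,
};

/* Per-request mask: one bit per event, 0 meaning "no callback". */
enum demo_cb_mask {
	DEMO_NO_CB            = 0,
	DEMO_XFER_COMPLETE_CB = 1 << DEMO_EVT_XFER_COMPLETE,
	DEMO_ERROR_CB         = 1 << DEMO_EVT_ERROR,
};

typedef void (*demo_evt_cb_t)(void *ctx);

/* Hypothetical storage for registered callbacks, indexed by event type. */
struct demo_cb_table {
	demo_evt_cb_t cb[DEMO_EVT_COUNT];
	void *ctx[DEMO_EVT_COUNT];
};

/*
 * Called by the controller when evt occurs for a request. The callback fires
 * only if the request asked for it and one is registered for that event.
 * Returns true if a callback was invoked.
 */
bool demo_notify(const struct demo_cb_table *t, enum demo_bus_event evt,
		 uint32_t req_cb_mask)
{
	assert(t != NULL);

	if ((int)evt < 0 || (int)evt >= DEMO_EVT_COUNT) {
		return false;
	}
	if ((req_cb_mask & (1u << evt)) == 0 || t->cb[evt] == NULL) {
		return false;
	}
	t->cb[evt](t->ctx[evt]);
	return true;
}

/* Example callback: counts invocations in the int that ctx points to. */
void demo_mark_cb(void *ctx)
{
	*(int *)ctx += 1;
}
```

With this shape, a request carrying DEMO_NO_CB completes silently even though a completion callback is registered, which is the behaviour the text attributes to MSPI_BUS_NO_CB.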
This routine provides a generic interface to configure the XIP and scrambling feature. Typically, the cfg parameter includes an enable and the range of address to take effect. I also wouldn't expect these settings to change often.
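For a sense of what "an enable and the range of address to take effect" might look like, here is a small sketch. The struct layout and names (demo_xip_cfg, demo_xip_covers) are assumptions for illustration only and are not the RFC's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical XIP configuration: an enable flag plus an address window. */
struct demo_xip_cfg {
	bool     enable;
	uint32_t address_offset; /* first address of the window */
	uint32_t size;           /* window length in bytes */
};

/*
 * Report whether an access of len bytes at addr lies entirely inside the
 * enabled window. Written to avoid 32-bit overflow in the range check.
 */
bool demo_xip_covers(const struct demo_xip_cfg *cfg, uint32_t addr, uint32_t len)
{
	assert(cfg != NULL);

	if (!cfg->enable || len == 0 || len > cfg->size) {
		return false;
	}
	if (addr < cfg->address_offset) {
		return false;
	}
	return (addr - cfg->address_offset) <= (cfg->size - len);
}
```

A scrambling configuration could reuse the same enable-plus-range shape, with the controller applying scrambling only to accesses that fall inside the window.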
This routine provides a generic interface to configure timing parameters that are SoC platform specific.
If it is used, there should be one's own definition for param_mask and cfg type in one's own *.h file.
Dependencies
To compile, one needs to check out “apollo3p-dev-mspi” at https://github.com/AmbiqMicro/ambiqhal_ambiq.git and "RFC-MSPI" at https://github.com/AmbiqMicro/ambiqzephyr.git
The branch is based on a PR that has yet to be merged to Zephyr main: #67815.

Please look at these commits for the example implementations.
The API prototype is at https://github.com/AmbiqMicro/ambiqzephyr/blob/RFC-MSPI/include/zephyr/drivers/mspi.h
Example code
Example implementation of the MSPI API can be found in this path zephyr\drivers\mspi\mspi_ambiq_ap3.c
Example usage of the MSPI API can be found in the following files.
zephyr\drivers\memc\memc_mspi_aps6404l.c PSRAM APS6404L device driver
zephyr\drivers\flash\flash_mspi_atxp032.c NOR FLASH ATXP032 device driver
zephyr\samples\drivers\memc\src\main.c demo the usage of mspi_transceive_async