[RFC] Proposing a new DMA Memory Management (DMM) subsystem #57220


Open · carlocaione opened this issue Apr 25, 2023 · 8 comments

carlocaione (Collaborator) commented Apr 25, 2023

This is an extension of the issue already introduced in #36471

An implementation of the RFC can be found in #57602

The problem

In memory bus architectures, the CPU running the software is one of the masters accessing the bus, while DMA engines are other masters accessing the same bus (or a different but connected one).

In basic MCUs the bus architecture is simple: the CPU and DMA engines can access the whole memory range, no caches are present, and memory addressing is consistent across all bus masters.

The memory access in these devices can be summarized with the following rules:

  • CPU and DMA have access to the whole memory space
  • the memory address used by the CPU to point to any memory location A is the same as the memory address used by DMA to point to A
  • any data stored in memory by DMA is immediately visible to the CPU
  • any data stored in memory by the CPU is immediately visible to DMA

In systems with a more complex memory architecture, the interaction between CPU and DMA, or between multiple CPUs, is more involved:

  • DMA can be restricted to accessing only part of the system memory (or only part of the memory can be accessed efficiently by DMA)
  • the CPU may need to translate its buffer address (virtual or physical) to an address (or a set of addresses, if the buffer is not contiguous in the bus memory space) usable by DMA before passing it to a DMA engine (to take into account cache line alignment, memory region allocation, etc.)
  • the CPU cache can contain stale data, and the CPU may need to invalidate the cache to read data that DMA or another CPU updated in memory
  • data intended by a CPU to be stored in memory can get stuck in the CPU's write-back cache and remain invisible to DMA or other CPUs, so the CPU has to flush its caches before informing DMA or other CPUs that the data is available in memory

All of these challenges must be addressed by the software running on a system with a complex memory architecture.
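To make the cache-related points concrete, here is a minimal sketch of the manual bookkeeping a driver currently has to do on a platform with a write-back data cache, using Zephyr's `sys_cache_*` API; the function and buffer names are illustrative and not taken from any existing driver:

```c
#include <zephyr/kernel.h>
#include <zephyr/cache.h>

/* Before starting a memory-to-peripheral DMA transfer: data written by the
 * CPU may still sit in the write-back cache, so flush it to make it visible
 * to the DMA engine.
 */
static void start_tx(uint8_t *tx_buf, size_t len)
{
	sys_cache_data_flush_range(tx_buf, len);

	/* ... program and start the DMA transfer from tx_buf ... */
}

/* After a peripheral-to-memory DMA transfer completes: the CPU cache may
 * hold stale lines covering the buffer, so invalidate them before the CPU
 * reads the data the DMA engine wrote.
 */
static void rx_done(uint8_t *rx_buf, size_t len)
{
	sys_cache_data_invd_range(rx_buf, len);

	/* ... hand rx_buf to the upper layer ... */
}
```

Today every driver has to open-code this kind of sequence (plus bounce buffering and address translation) itself.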

What Zephyr is doing about this

Zephyr has no solution yet for this kind of complex platform. There are several scattered attempts to overcome these limitations.

Proposal

I'm proposing to add a new subsystem called DMM (DMA Memory Management) with the following responsibilities (a rough sketch of a possible API follows the list):

  • allocating and freeing "bounce buffers" when the buffer provided by a driver user cannot be used by the DMA engine (memory accessible by DMA, aligned to DMA requirements, and, if cacheable, aligned and padded to cache lines)
  • copying data to and from the bounce buffers
  • translating CPU buffer addresses (virtual or physical) to DMA-usable addresses
  • cache management when buffers are allocated in cacheable memory
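Below is one possible shape such an API could take. This is a sketch only: none of these dmm_* names exist in Zephyr today, and the exact naming and semantics are precisely what this RFC is meant to discuss.

```c
#include <stddef.h>
#include <zephyr/device.h>

/* Prepare a user buffer for an outgoing (memory-to-peripheral) transfer:
 * allocate a bounce buffer if the original one is not usable by the DMA of
 * `dev`, copy the data into it, perform any needed cache maintenance, and
 * return an address the DMA engine can use.
 */
int dmm_buffer_out_prepare(const struct device *dev, const void *user_buf,
			   size_t len, void **dma_addr);

/* Release whatever dmm_buffer_out_prepare() allocated. */
int dmm_buffer_out_release(const struct device *dev, void *dma_addr);

/* Prepare a user buffer for an incoming (peripheral-to-memory) transfer. */
int dmm_buffer_in_prepare(const struct device *dev, void *user_buf,
			  size_t len, void **dma_addr);

/* Copy data back from the bounce buffer (if one was used), invalidate
 * caches as needed, and free the bounce buffer.
 */
int dmm_buffer_in_release(const struct device *dev, void *user_buf,
			  size_t len, void *dma_addr);
```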

Why this RFC?

Because before starting to write the code I want to gather opinions on this and discuss whether introducing a new subsystem makes sense versus expanding the current APIs (DMA or MM).

@carlocaione carlocaione added the RFC Request For Comments: want input from the community label Apr 25, 2023
@carlocaione carlocaione changed the title [RFC] Proposing a new DMA Memory Management subsystem [RFC] Proposing a new DMA Memory Management (DMM) subsystem Apr 25, 2023
@carlocaione carlocaione self-assigned this Apr 25, 2023

hubertmis (Member) commented Apr 25, 2023

Thanks for raising this. It is an important missing feature in Zephyr.

I have a comment regarding expanding the current DMA API with the discussed features:

It looks like the current DMA driver API describes standalone DMA engines capable of copying data around the bus system, based on instructions provided in job descriptors. A standalone DMA like this is just one type of DMA found in MCUs. Another example is a DMA built into a peripheral such as UART, I2C, or even PWM. Any DMA (standalone, embedded in a peripheral, or any other) needs proper memory management if the memory system is complex.

That's why I think it is better to keep the current DMA API as it is, responsible for handling standalone DMAs. The proposed subsystem should be separate from drivers, but there must be a clear dependency path. The proposed subsystem should be usable by any driver of any device with DMA capability (UART, I2C, PWM, a standalone DMA engine, or any other). Implementations of device drivers of any type should be able to depend on the new subsystem.

To sum up, I think we need two separate modules:

  1. DMA - being a driver for standalone DMA engines
  2. DMM - memory management for any driver of a device with DMA capability

The DMA implementation can depend on DMM, not the other way around.
Other device drivers can also depend on DMM; the sketch below illustrates this layering.
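For illustration, a sketch of that dependency direction, reusing the hypothetical dmm_* calls from the proposal above: a UART driver whose peripheral embeds its own DMA uses DMM directly, without going through the standalone DMA driver API. All names except the Zephyr device model types are placeholders.

```c
#include <stdint.h>
#include <stddef.h>
#include <zephyr/device.h>

static int uart_sketch_tx(const struct device *uart_dev, const uint8_t *buf,
			  size_t len)
{
	void *dma_addr;
	int ret;

	/* DMM decides whether buf can be used as-is or needs a bounce
	 * buffer, and handles cache maintenance and address translation
	 * for this particular device (hypothetical API).
	 */
	ret = dmm_buffer_out_prepare(uart_dev, buf, len, &dma_addr);
	if (ret < 0) {
		return ret;
	}

	/* Program the peripheral's embedded DMA with the DMA-visible
	 * address, e.g. write dma_addr and len to the TX pointer/count
	 * registers, then start the transfer (register access omitted).
	 */

	return 0;
}
```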

teburd (Collaborator) commented Apr 25, 2023

I wonder if this could be better changed to an aligned/regioned/cached buffer API of sorts? DMA isn't the only bus master in some parts that might have requirements like this. Multicore parts with asymmetric cores might also have similar shared-buffer requirements, e.g. NXP's imxrt685.

Do you have a particular DMA that you are running into trouble with otherwise with the existing API?

@henrikbrixandersen henrikbrixandersen added the area: DMA Direct Memory Access label Apr 30, 2023
carlocaione (Collaborator, author) commented May 5, 2023

> I wonder if this could be better changed to an aligned/regioned/cached buffer API of sorts?

Yes, the DMA case was just the most common one, but this is indeed a buffer management API of sorts. We can change the name if that is confusing.

> Do you have a particular DMA that you are running into trouble with otherwise with the existing API?

This is secret ;) Secret aside, as @hubertmis beautifully explained in #57220 (comment), this concerns not only DMA engines but also generic DMA-enabled peripherals.

teburd (Collaborator) commented May 5, 2023

I don't disagree with any of that, but I'd point out that this isn't solely a DMA concern, though the DMM naming implies it.

The problem is shared memory among bus masters and meeting the requirements needed for doing so. So arc_buf, shm_buf, mmbuf, whatever you want to call it... it's a region shared among bus masters: CPU cores, DMAs, or, as you say yourself, peripherals that can also act as bus masters reading and writing directly to memory.
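As an example of the ad-hoc approach such a shared-buffer API would generalize, a buffer can today be statically pinned into a non-cacheable region with Zephyr's __nocache attribute (available when CONFIG_NOCACHE_MEMORY is enabled), so every bus master sees a coherent view. The name, size and alignment below are arbitrary.

```c
#include <zephyr/kernel.h>
#include <zephyr/linker/section_tags.h>

/* Statically reserved, non-cacheable buffer shared between the CPU and
 * other bus masters (DMA engines, DMA-enabled peripherals, other cores).
 * This avoids cache maintenance entirely, at the cost of permanently
 * reserving the memory whether or not it is in use.
 */
static uint8_t shared_buf[256] __nocache __aligned(32);
```

A DMM/shared-buffer subsystem could replace this kind of permanent static reservation with on-demand allocation from a suitable region.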

hubertmis (Member) commented:

> I wonder if this could be better changed to an aligned/regioned/cached buffer API of sorts

I think an aligned/regioned/cached buffer API would be a complementary solution, but it would not solve all the issues pointed out by @carlocaione. I think memory management for DMA-capable device drivers is something we should define first, and later we could allow optimizations by providing a buffer allocator with alignment/padding/cache-awareness features.

> Multicore parts with asymmetric cores might also have similar shared buffer requirements

That's a good point. I think the DMM module proposed by @carlocaione can be extended to be usable not only by DMA-capable device drivers, but also by IPC subsystems. Or a common part can be extracted to be used by both.

hakehuang (Collaborator) commented:

I think the two assumptions below are not correct:

  • CPU and DMA have access to the whole memory space
    This can be controlled by a domain.
  • memory address used by CPU to point to any memory location A is equal to the memory address used by DMA to point to the memory location A
    The DMA address is usually a system address, and the CPU address is a core address; these differ in a hybrid-core system (see the sketch below).
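To illustrate the second point, here is a sketch of the fixed-offset translation such a system needs; the base addresses and the helper are made up for the example, not taken from any real SoC or existing Zephyr API.

```c
#include <stdint.h>

/* Hypothetical hybrid-core SoC: the same RAM is visible at different base
 * addresses on the CPU's local bus and on the system bus used by DMA.
 */
#define CPU_LOCAL_RAM_BASE 0x20000000UL /* address the core uses (made up) */
#define SYSTEM_RAM_BASE    0x24000000UL /* address bus masters use (made up) */

/* Translate a CPU (core-local) address into the system address that has to
 * be programmed into a DMA engine.
 */
static inline uintptr_t cpu_addr_to_dma_addr(const void *cpu_addr)
{
	return (uintptr_t)cpu_addr - CPU_LOCAL_RAM_BASE + SYSTEM_RAM_BASE;
}
```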

fbenamrouche commented:

Anything new about this proposal?
I would like to avoid having the memcpy done by the SPI driver for every transfer, and reserving a static DMA-capable buffer is a waste of space since the bounce buffer in the SPI driver will be unused (and there is no API to get the bounce buffer from the SPI driver so that we can write directly into it).
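For illustration, the zero-copy flow asked for here could look roughly like the sketch below, expressed with hypothetical dmm_alloc()/dmm_free() calls (not an existing Zephyr API): the application obtains a DMA-capable buffer up front and fills it in place, so the SPI driver never needs an internal memcpy.

```c
#include <errno.h>
#include <zephyr/drivers/spi.h>

int send_payload(const struct device *spi_dev, const struct spi_config *cfg,
		 size_t len)
{
	/* Hypothetical: returns memory that is DMA-capable for spi_dev
	 * (correct region, alignment and cache handling done by the
	 * subsystem), so no bounce buffer is needed inside the driver.
	 */
	uint8_t *tx = dmm_alloc(spi_dev, len);

	if (tx == NULL) {
		return -ENOMEM;
	}

	/* The application fills the buffer in place; no extra copy into a
	 * driver-internal bounce buffer is needed.
	 */
	for (size_t i = 0; i < len; i++) {
		tx[i] = (uint8_t)i; /* example payload */
	}

	struct spi_buf buf = { .buf = tx, .len = len };
	struct spi_buf_set tx_set = { .buffers = &buf, .count = 1 };
	int ret = spi_write(spi_dev, cfg, &tx_set);

	dmm_free(spi_dev, tx); /* hypothetical counterpart of dmm_alloc() */

	return ret;
}
```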
