Skip to content

Pull upstream dma backport into rpi 4.6.y #1522

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jun 10, 2016

Conversation

clivem
Copy link

@clivem clivem commented Jun 9, 2016

This is an 18 patch series, consisting of 16 patches from @msperl (the upstream dma backport) and 2 "fix" patches from @HiassofT

msperl and others added 18 commits June 9, 2016 13:36
The original patch contained 3 dma channels that were masked out.

These - as far as research and discussions show - are a
artefacts remaining from the downstream legacy dma-api.

Right now down-stream still includes a legacy api used only
in a single (downstream only) driver (bcm2708_fb) that requires
2D DMA for speedup (DMA-channel 0).
Formerly the sd-card support driver also was using this legacy
api (DMA-channel 2), but since has been moved over to use
dmaengine directly.

The DMA-channel 3 is already masked out in the devicetree in
the default property "brcm,dma-channel-mask = <0x7f35>;"

So we can remove the whole masking of DMA channels.

Signed-off-by: Martin Sperl <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Add additional defines describing the DMA registers
as well as adding some more documentation to those registers.

Signed-off-by: Martin Sperl <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
…_desc

In preparation to consolidating code we move the cyclic member
into the bcm_2835_desc structure.

Signed-off-by: Martin Sperl <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
…method

In preparation of adding slave_sg functionality this patch moves the
generation/allocation of bcm2835_desc and the building of
the corresponding DMA-control-block chain from bcm2835_dma_prep_dma_cyclic
into the newly created method bcm2835_dma_create_cb_chain.

Signed-off-by: Martin Sperl <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
The bcm2835 dma system has 2 basic types of dma-channels:
* "normal" channels
* "light" channels

Lite channels are limited in several aspects:
* internal data-structure is 128 bit (not 256)
* does not support BCM2835_DMA_TDMODE (2D)
* DMA length register is limited to 16 bit.
  so 0-65535 (not 0-65536 as mentioned in the official datasheet)
* BCM2835_DMA_S/D_IGNORE are not supported

The detection of the type of mode is implemented by looking at
the LITE bit in the DEBUG register for each channel.
This allows automatic detection.

Based on this the maximum block size is set to (64K - 4) or to 1G
and this limit is honored during generation of control block
chains. The effect is that when a LITE channel is used more
control blocks are used to do the same transfer (compared
to a normal channel).

As there are several sources/target DREQS that are 32 bit wide
we need to have the transfer to be a multiple of 4 as this would
break the transfer otherwise.

This is why the limit of (64K - 4) was chosen over the
alternative of (64K - 4K).

Signed-off-by: Martin Sperl <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Add slave_sg support to bcm2835-dma using shared allocation
code for bcm2835_desc and DMA-control blocks already used by
dma_cyclic.

Note that bcm2835_dma_callback had to get modified to support
both modes of operation (cyclic and non-cyclic).

Tested using:
* Hifiberry I2S card (using cyclic DMA)
* fb_st7735r SPI-framebuffer (using slave_sg DMA via spi-bcm2835)
playing BigBuckBunny for audio and video.

Signed-off-by: Martin Sperl <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Also added check for an error condition in bcm2835_dma_create_cb_chain
that showed up during development of this patch.

Tested using dmatest for all enabled channels.

Signed-off-by: Martin Sperl <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Use platform_get_irq_byname to allow for correct mapping of
interrupts to dma channels.

The currently implemented device tree is unfortunately
implemented with the wrong assumption, that each dma-channel
has its own dma channel, but dma-irq 11 is handling
dma-channel 11-14 and dma-irq 12 is actually a "catch all"
interrupt.

So here we use the byname variant and require that interrupts
are explicitly named via the interrupts-name property in the
device tree.

The use of shared interrupts is also implemented.

As a side-effect this means we can now use dma channels 12, 13 and 14
in a correct manner - also testing shows that onl using
channels 11 to 14 for spi and i2s works perfectly (when playing
some video)

Signed-off-by: Martin Sperl <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Vinod Koul <[email protected]>
Load driver early since at least bcm2708_fb doesn't support deferred
probing and even if it did, we don't want the video driver deferred.

Support the legacy DMA API which is needed by bcm2708_fb
(but only using the dedicated dma channel 0).

Signed-off-by: Noralf Trønnes <[email protected]>
Signed-off-by: Martin Sperl <[email protected]>
Dma channel0 is used by the legacy api - to avoid confilcts this
needs to get masked out in the device-tree instead of hardcoding it
in the driver.

Signed-off-by: Martin Sperl <[email protected]>
Add interrupt-names properties to dt and apply the correct
mapping between irq and dma channels.

Signed-off-by: Martin Sperl <[email protected]>
Enable the use of dma-channel 0 when using the vc4-kms-v3d overlay.

Signed-off-by: Martin Sperl <[email protected]>
The code responsible for splitting periods into chunks that
can be handled by the DMA controller missed to update total_len,
the number of bytes processed in the current period, when there
are more chunks to follow.

Therefore total_len was stuck at 0 and the code didn't work at all.
This resulted in a wrong control block layout and audio issues because
the cyclic DMA callback wasn't executing on period boundaries.

Fix this by adding the missing total_len update.

Signed-off-by: Matthias Reichl <[email protected]>
Signed-off-by: Martin Sperl <[email protected]>
Tested-by: Clive Messer <[email protected]>
The current cyclic DMA period splitting implementation can generate
very small chunks at the end of each period. For example a 65536 byte
period will be split into a 65532 byte chunk and a 4 byte chunk on
the "lite" DMA channels.

This increases pressure on the RAM controller as the DMA controller
needs to fetch two control blocks from RAM in quick succession and
could potentially cause latency issues if the RAM is tied up by other
devices.

We can easily avoid these situations by distributing the remaining
length evenly between the last-but-one and the last chunk, making
sure that split chunks will be at least half the maximum length the
DMA controller can handle.

This patch checks if the last chunk would be less than half of
the maximum DMA length and if yes distributes the max len+4...max_len*1.5
bytes evenly between the last 2 chunks. This results in chunk sizes
between max_len/2 and max_len*0.75 bytes.

Signed-off-by: Matthias Reichl <[email protected]>
Signed-off-by: Martin Sperl <[email protected]>
Tested-by: Clive Messer <[email protected]>
@popcornmix
Copy link
Collaborator

Thanks. @msperl @HiassofT confirmation that this works for you would be welcome.

@msperl
Copy link
Contributor

msperl commented Jun 9, 2016

@popcornmix : I can not test this right now, as I do only have a rpi with me as access point in the hotel, but no audio/SPI hw attached to that host, so i can not test it...

But as I assume these were just cherry-picked from my branch, it should not be an issue.
For the last 2 patches I can only comment on code and can say that it looks fine.

My guess the first of those patches will go into upstream without questions, but there is a higher chance that discussions around the second commit could be more intensive, so this one may require some rewrite (but that is guesswork).

@clivem
Copy link
Author

clivem commented Jun 9, 2016

.... this one may require some rewrite ....

And if that is the case, we just revert the downstream patch and replace it with the upstream accepted version, right? ;)

@popcornmix
Copy link
Collaborator

And if that is the case, we just revert the downstream patch and replace it with the upstream accepted version, right? ;)

Yes. Not a big problem. Just something to keep an eye on.

Kernel builds and works with this PR. Let's see if I can test an I2S card...

@clivem
Copy link
Author

clivem commented Jun 9, 2016

And I wonder if it is worth giving a heads-up to one of those Sonic Pi users...... I'm not familiar with this Millhouse OS, and have no idea if it even includes Sonic Pi?????? But if it does and this PR is included in the nightly build tonight, it might be worth asking @rbnpi to try the nightly and see if it resolves his stuttering issues with SonicPi and IQAudIO DAC.

@pelwell
Copy link
Contributor

pelwell commented Jun 9, 2016

Or perhaps someone like @samaaron?

@popcornmix
Copy link
Collaborator

The Milhouse build is LibreELEC (Kodi media centre).
http://forum.kodi.tv/showthread.php?tid=269814

I've got an IQAudio DAC+ here and it seems fine with kodi using this 4.6 kernel with this PR.
Unless anyone spots an issue, I'm happy to merge this PR this evening so we can get some testing in tonight's Milhouse build.

@rbnpi
Copy link

rbnpi commented Jun 9, 2016

Have you tried it with jackd and Sonic Pi? That’s where the issue lies. It worked fine with previous kernel, but not with current release version.

On 9 Jun 2016, at 16:07, popcornmix [email protected] wrote:

The Milhouse build is LibreELEC (Kodi media centre).
http://forum.kodi.tv/showthread.php?tid=269814

I've got an IQAudio DAC+ here and it seems fine with kodi using this 4.6 kernel with this PR.
Unless anyone spots an issue, I'm happy to merge this PR this evening so we can get some testing in tonight's Milhouse build.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

@pelwell
Copy link
Contributor

pelwell commented Jun 9, 2016

We were hoping to crowdsource some of the testing...

@samaaron
Copy link

samaaron commented Jun 9, 2016

Very happy to test this when I return from Estonia.

@popcornmix
Copy link
Collaborator

I've got sonic pi running with IQ Audio DAC+ and I'm not hearing any dropouts or glitches when playing some of the live loop demos. Not sure if I need any specific settings to provoke the issue, but it looks good to me.

@clivem
Copy link
Author

clivem commented Jun 9, 2016

I'm not hearing any dropouts or glitches when playing some of the live loop demos

can you cat hw_params for the card and tell me the buffer size and period size being used?

@popcornmix
Copy link
Collaborator

$ cat /proc/asound/card0/pcm0p/sub0/hw_params 
access: MMAP_INTERLEAVED
format: S32_LE
subformat: STD
channels: 2
rate: 44100 (44100/1)
period_size: 2048
buffer_size: 6144

@clivem
Copy link
Author

clivem commented Jun 9, 2016

Hmmm. That wasn't what I expected to see. Was expecting to see large numbers for buffer/period, which would have given a very plausible reason for the cause of the stuttering.....

@popcornmix
Copy link
Collaborator

I've just reverted this PR and tried the same Sonic Pi usage and it sounds much worse.
Not sure exactly how it is fixed, but this PR does fix something!

@popcornmix
Copy link
Collaborator

popcornmix commented Jun 9, 2016

No change in hw_params

$ cat /proc/asound/card0/pcm0p/sub0/hw_params 
access: MMAP_INTERLEAVED
format: S32_LE
subformat: STD
channels: 2
rate: 44100 (44100/1)
period_size: 2048
buffer_size: 6144

@HiassofT
Copy link
Contributor

I did a few quick tests (playback+recording) on my RPi3 and everything worked fine!

@popcornmix
Copy link
Collaborator

@HiassofT thanks.
I forgot to merge last night. Should happen at 8:45pm tonight (I've set a reminder) assuming no problems are discovered.

@popcornmix popcornmix merged commit 47b63bd into raspberrypi:rpi-4.6.y Jun 10, 2016
@popcornmix
Copy link
Collaborator

@clivem feel free to submit PR to 4.4 tree - we'll merge in a day or two if no one complains about the 4.6 kernel.

@clivem
Copy link
Author

clivem commented Jun 10, 2016

@popcornmix OK. Will do.

@msperl
Copy link
Contributor

msperl commented Jun 11, 2016

Note that there is one patch that Eric submitted that may be relevant as well - not sure what the current status is - it is called:
dmaengine: bcm2835: Fix polling for completion of DMA with interrupts masked.

@clivem
Copy link
Author

clivem commented Jun 11, 2016

I'll post a PR to add that Eric "dma completion" patch to the rpi-4.6.y branch later this evening.

EDIT: No need. @popcornmix has already cherry picked it into rpi-4.6.y. See #1524 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants