Skip to content

ADC DMA fails on FRDM K64F #56070

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
teburd opened this issue Mar 21, 2023 · 23 comments
Closed

ADC DMA fails on FRDM K64F #56070

teburd opened this issue Mar 21, 2023 · 23 comments
Assignees
Labels
area: ADC Analog-to-Digital Converter (ADC) area: DMA Direct Memory Access bug The issue is a bug, or the PR is fixing a bug platform: NXP NXP priority: medium Medium impact/importance bug

Comments

@teburd
Copy link
Collaborator

teburd commented Mar 21, 2023

Describe the bug
The adc_dma test, buildable using

west build -p always -b frdm_k64f -T tests/drivers/adc/adc_dma/drivers.adc-dma

fails since commit
6e21ebf

It looks like perhaps the NXP specific changes made in this commit broke things.

Test output if helpful

*** Booting Zephyr OS build zephyr-v3.3.0-1492-ga57810dc3b1e ***
Running TESTSUITE adc_dma
===================================================================
START - test_adc_asynchronous_call

    Assertion failed at WEST_TOPDIR/zephyr/tests/drivers/adc/adc_dma/src/test_adc.c:308: test_task_asynchronous_call: (ret not equal to 0)
k_poll failed with error -11
 FAIL - test_adc_asynchronous_call in 1.015 seconds
@teburd teburd added bug The issue is a bug, or the PR is fixing a bug area: ADC Analog-to-Digital Converter (ADC) area: DMA Direct Memory Access platform: NXP NXP labels Mar 21, 2023
@teburd
Copy link
Collaborator Author

teburd commented Mar 21, 2023

@heinwessels please take a look, commit points to a change you made to the test

@nashif nashif added the priority: medium Medium impact/importance bug label Mar 21, 2023
@heinwessels
Copy link
Contributor

heinwessels commented Mar 22, 2023

The failure shown in the description is caused by me changing the interval_us from HW_TRIGGER_INTERVAL to 0. I know this is fine the STM32 so I incorrectly assumed it should work for the NXP as well. I asked teburd to see if that fixes it, and it's definitely a problem. However, the test fails at a new location afterwards.

*** Booting Zephyr OS build zephyr-v3.3.0-1492-ga57810dc3b1e ***
Running TESTSUITE adc_dma
===================================================================
START - test_adc_asynchronous_call
Samples read: 0x0372 0x0000 0x00f4 0x0000 0x037a 0x0000 
    Assertion failed at WEST_TOPDIR/zephyr/tests/drivers/adc/adc_dma/src/test_adc.c:176: check_samples: (INVALID_ADC_VALUE not equal to sample_value)
[5] should be empty
 FAIL - test_adc_asynchronous_call in 0.021 seconds
===================================================================
START - test_adc_invalid_request
E: Invalid resolution
E: Invalid resolution
Samples read: 0x0034 0x0000 
    Assertion failed at WEST_TOPDIR/zephyr/tests/drivers/adc/adc_dma/src/test_adc.c:176: check_samples: (INVALID_ADC_VALUE not equal to sample_value)
[1] should be empty
 FAIL - test_adc_invalid_request in 0.022 seconds
===================================================================
START - test_adc_repeated_samplings
repeated_samplings_callback: done 1
Samples read: 0x0201 0x0000 
    Assertion failed at WEST_TOPDIR/zephyr/tests/drivers/adc/adc_dma/src/test_adc.c:176: check_samples: (INVALID_ADC_VALUE not equal to sample_value)
[1] should be empty
ASSERTION FAIL [0] @ WEST_TOPDIR/zephyr/kernel/sched.c:1750
        aborting essential thread 0x200007e0
E: r0/a1:  0x00000004  r1/a2:  0x000006d6  r2/a3:  0x00000080
E: r3/a4:  0x00002de5 r12/ip:  0x00000000 r14/lr:  0x0000663f
E:  xpsr:  0x41000011
E: Faulting instruction address (r15/pc): 0x00007c20
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Fault during interrupt handling

E: Current thread: 0x200007e0 (main)
E: Halting system

It's likely the crash is called by a test failing, but the cause of the test failing concerns me. INVALID_ADC_VALUE not equal to sample_value is a check I added, copied directly from adc_api, where it verfies the buffer is filled correctly. I'm wondering if this could point to a bug in the driver?

@hakehuang
Copy link
Collaborator

It's likely the crash is called by a test failing, but the cause of the test failing concrens me. INVALID_ADC_VALUE not equal to sample_value is a check I added, copied directly from adc_api, where it verfies the buffer is filled correctly. I'm wondering if this could point to a bug in the driver?

Thanks @heinwessels for qiuck analysis, I am checking the driver.

@heinwessels
Copy link
Contributor

heinwessels commented Mar 22, 2023

Having a look at adc_mcux_adc16.c it seems like the dma_block.block_size is set bigger than it should be, which would explain the (INVALID_ADC_VALUE not equal to sample_value) error.

The potential problematic code is here:

data->adc_dma_config.dma_block.block_size = ctx->sequence.buffer_size;

When starting the DMA the bytes-to-transfer (called block_size) is set to sequence->buffer_size. And for the test, the buffer_size is set to the actual size of the buffer, which is much larger than it needs to be for this specific test, in this case 24.

IMO buffer_size is the size of the buffer, and should not be used. (only to check that it's indeed big enough). Instead the block_size should be set to something like sizeof(sample).

In the STM32 driver we set it explictly to the number of channels samples multiplied by the size of each sample, and ignore the actual size of the buffer.

blk_cfg->block_size = channel_count * sizeof(int16_t);

@hakehuang do you think this might be it?

@hakehuang
Copy link
Collaborator

hakehuang commented Mar 22, 2023

@heinwessels , I will craete a PR for this fix, btw will you fix the interval_us issue, or you want to fix in as well?

@heinwessels
Copy link
Contributor

@heinwessels , I will craete a PR for this fix, btw will you fix the interval_us issue, or you want to fix in as well?

I'll create a fix for this one. It doesn't seem to be blocking the CI, so I think it's fine if it's two seperate PRs.

Thanks!

@heinwessels
Copy link
Contributor

@hakehuang I'm not sure how to fix the tests in a generic way, because I'm not sure of what the NXP drivers need. Could you possibly tell me why? For example, before my changes, the interval_us was set to 3 different values in three different tests.

For test_task_asynchronous_call it's set to HW_TRIGGER_INTERVAL = 2 .

/* Start consecutive samplings as fast as possible. */
.interval_us = HW_TRIGGER_INTERVAL,

For test_task_with_interval it's set to 500.
.interval_us = 500UL, /*make this double to sample time*/

For test_task_repeated_samplings it's set to 10000.
/* Start consecutive samplings as fast as possible. */
/* but the interval shall be larger than the HW trigger*/
.interval_us = SAMPLE_INTERVAL_US,

For me this all seems like it should all be replaced with a single value, because that's the minimum value that seems to work, and most tests can use that. So I can replace the test_task_asynchronous_call and test_task_repeated_samplings with this value.

#define MINIMUM_SAMPLING_TIME = 2

The only test that might need a different timing is test_task_with_interval, but the actual timing isn't tested at all, so technically the interval_us could be set to 2 as well. However, it seems to me that this test is a special case and needs an additional

#define INTERVAL_SAMPLING_TIME = 500

// in the test_task_with_interval test

options.interval_us = INTERVAL_SAMPLING_TIME;
zassert_true(milliseconds_spent > ( options.extra_samplings * INTERVAL_SAMPLING_TIME ), "...");

What do you think? Can I fix the tests this way?

@hakehuang
Copy link
Collaborator

hakehuang commented Mar 22, 2023

What do you think? Can I fix the tests this way?

the sample interval depends on the hardware mode, the minial interval is 2us for nxp paltforms(consider the longsample mode), and our adc is triggerred by a counter timer form HW, the counter is set to 2us as well. As to the 500US and 1000US, it is because I want to make sure the sample interval is double of the sample interval time, such as I can get exact a new data back.

@hakehuang
Copy link
Collaborator

hakehuang commented Mar 22, 2023

@heinwessels , I will craete a PR for this fix, btw will you fix the interval_us issue, or you want to fix in as well?

I'll create a fix for this one. It doesn't seem to be blocking the CI, so I think it's fine if it's two seperate PRs.

Thanks!

For the changes to

blk_cfg->block_size = channel_count * sizeof(int16_t); 

How can the driver know the expect result to return? in your test it is 1 plus 4 extra, but buffer given 24. and I did not see an option pass to driver on the exact number of results expected. say if someone wants 2 records in adc read, how could they do this? @heinwessels

@heinwessels
Copy link
Contributor

heinwessels commented Mar 22, 2023

@hakehuang

the minial interval is 2us for nxp paltforms(consider the longsample mode),

Of course. This would be a #define that's set to 2 for NXP, but defaults to zero.

As to the 500US and 1000US, it is because I want to make sure the sample interval is double of the sample interval time, such as I can get exact a new data back.

As far as I can see, it doesn't matter. If this interval_us is only used as a delay between the end of the previous sample, and the start of the next sample, as forced by the adc_context.h. It's not up to the driver implementation. Basically once a sample is completed, the driver calls adc_context_on_sampling_done. Then adc_context decides if it will delay or not, before triggering the next sample.

This means it should still wait for your hardware timer.

The important part is this function:

static inline void adc_context_start_read(struct adc_context *ctx,
const struct adc_sequence *sequence)
{
ctx->sequence = *sequence;
ctx->status = 0;
if (sequence->options) {
ctx->options = *sequence->options;
ctx->sequence.options = &ctx->options;
ctx->sampling_index = 0U;
if (ctx->options.interval_us != 0U) {
atomic_set(&ctx->sampling_requested, 0);
adc_context_enable_timer(ctx);
return;
}
}
adc_context_start_sampling(ctx);
}

I will implement it this way, and then maybe you can check if the NXP tests still pass?

How can the driver know the expect result to return? in your test it is 1 plus 4 extra, but buffer given 24. and I did not see an option pass to driver on the exact number of results expected. say if someone wants 2 records in adc read, how could they do this?

The extra_samples is handled by adc_context where it repeats a read for every extra_sample. This means your driver doesn't have to account for it. adc_context should automatically increment the buffer pointer for example. And since your driver doesn't support sampling multiple channels at once then channel_count will always be 1, and thus your DMA will only have to transfer the data for one single read.

@hakehuang
Copy link
Collaborator

hakehuang commented Mar 22, 2023

The extra_samples is handled by adc_context where it repeats a read for every extra_sample. This means your driver doesn't have to account for it. adc_context should automatically increment the buffer pointer for example. And since your driver doesn't support sampling multiple channels at once then channel_count will always be 1, and thus your DMA will only have to transfer the data for one single read.

@heinwessels , the NXP DMA can support multiple samples and increate the buffer by HW, so it will not need the extra software involving. I think maybe you can just skip the extra check so only do below

		if (i < expected_count) {
			zassert_not_equal(INVALID_ADC_VALUE, sample_value,
				"[%u] should be filled", i);
		} 

and remove the

	else {
			zassert_equal(INVALID_ADC_VALUE, sample_value,
				"[%u] should be empty", i);
		}

@heinwessels
Copy link
Contributor

heinwessels commented Mar 22, 2023

the NXP DMA can support multiple samples and increate the buffer by HW

In that case it should still set the block_size correctly, meaning sizeof(int16_t) * (1 + options.extra_samples).

The test is correct. If the user wants a read with 4 extra samples and supplies a buffer, then only 2 * 5 = 10 bytes should be written by the ADC driver. Even if the buffer is bigger than that. Currently your driver is writing 24 bytes.

@hakehuang
Copy link
Collaborator

I will implement it this way, and then maybe you can check if the NXP tests still pass?

sure, please add me to your PR review list. Thanks

@hakehuang
Copy link
Collaborator

hakehuang commented Mar 22, 2023

In that case it should still set the block_size correctly, meaning sizeof(int16_t) * (1 + options.extra_samples).

@heinwessels , I check the adc.h and find they define so


	/**
	 * Number of extra samplings to perform (the total number of samplings
	 * is 1 + extra_samplings).
	 */
	uint16_t extra_samplings;

our driver does not porcess the extra_samplings at all, I will add a PR to fix it, Thanks a lot for your checking

@hakehuang
Copy link
Collaborator

hakehuang commented Mar 22, 2023

@heinwessels , I find another issue here, see below

 ( (*** Booting Zephyr OS build zephyr-v3.3.0-1516-g1dce3c3ee2e9 ***
Running TESTSUITE adc_dma
===================================================================
START - test_adc_asynchronous_call
Samples read: 0x0371 0x003c 0x0371 0x003c 0x0371 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000
0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000
0xffff8000 0xffff8000 0xffff8000
 PASS - test_adc_asynchronous_call in 0.024 seconds
===================================================================
START - test_adc_invalid_request
E: Invalid resolution
E: Invalid resolution
Samples read: 0x032d 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000
0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000 0xffff8000
0xffff8000 0xffff8000 0xffff8000
 PASS - test_adc_invalid_request in 0.029 seconds
===================================================================
START - test_adc_repeated_samplings
repeated_samplings_callback: done 1
Samples read: 0x0371 0x003c
    Assertion failed at WEST_TOPDIR/zephyr/tests/drivers/adc/adc_dma/src/test_adc.c:176: check_samples: (INVALID_ADC_VALUE not equal to sample_value)
[1] should be empty
ASSERTION FAIL [0] @ WEST_TOPDIR/zephyr/kernel/sched.c:1750
        aborting essential thread 0x20000740
E: r0/a1:  0x00000004  r1/a2:  0x000006d6  r2/a3:  0x000000a2
E: r3/a4:  0x00002de5 r12/ip:  0x00000000 r14/lr:  0x0000663f
E:  xpsr:  0x41000011
E: Faulting instruction address (r15/pc): 0x00007c20
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Fault during interrupt handling

E: Current thread: 0x20000740 (idle)
E: Halting system

the issue is that is I correct the adc_mcux_adc16.c with 2 transfer and set the block_size to sizeof(int16_t) * (1 + options.extra_samples), and the test_adc_repeated_samplings still fails as it expect to get only one result, but as our driver do 1 + extra_samples in batches, the adc_context_on_sampling_done does not work here, how to handle this?

and I try to change the block_size to 2 so that it is handled by the adc_context_on_sampling_done , but the issue is below function

static void adc_context_update_buffer_pointer(struct adc_context *ctx,
					      bool repeat_sampling)
{
	struct mcux_adc16_data *data =
		CONTAINER_OF(ctx, struct mcux_adc16_data, ctx);

	if (repeat_sampling) {
		data->buffer = data->repeat_buffer;
	}
}

and

static void adc_context_update_buffer_pointer(struct adc_context *ctx,
					      bool repeat_sampling)
{
	struct adc_stm32_data *data =
		CONTAINER_OF(ctx, struct adc_stm32_data, ctx);

	if (repeat_sampling) {
		data->buffer = data->repeat_buffer;
	}
}

and as the repeat_sampling is false, so we do not move the data point at all.
and I find in st driver when dma is done below code is moved by the channel count, so st driver looks ok in this case.

ata->buffer += data->channel_count;

@heinwessels
Copy link
Contributor

@hakehuang I still think your NXP driver should have block_size = sizeof(int16_t). Can you try it with that? There's no mention of extra_samples in your driver, so how would the DMA know to repeat the sampling?

How the NXP driver should work:

  1. User calls read with extra_samples = 2. There are three channels selected.
  2. Driver configures DMA to read the three channels once. This means DMA length should be set to 3 * sizeof(int16_t) or POPCOUNT(data->channels) * sizeof(int16_t).
  3. All three channels are read, which calls adc_dma_callback, which in turn calls adc_context's adc_context_on_sampling_done.
  4. Now adc_context decides what to do next. If there are extra_samples specified it will trigger the read again, meaning start again at step 2. adc_context will also call the callback, which might return ADC_ACTION_REPEAT, which will force it to sample again anyway.

@heinwessels
Copy link
Contributor

and I find in st driver when dma is done below code is moved by the channel count, so st driver looks ok in this case.

Ah yeah, it might be that you have to add that. It's that exact line that caused me to improve the unit tests to check the buffer which is now failing for NXP, because I also forgot it. You can see it here in #52965 (comment)

@hakehuang
Copy link
Collaborator

Ah yeah, it might be that you have to add that. It's that exact line that caused me to improve the unit tests to check the buffer which is now failing for NXP, because I also forgot it. You can see it here in #52965 (comment)

@heinwessels , ok, thanks. after those changes it can work now, but I have a customer feedback here.
they want the dma to do multiply samples, not only one sample a time it is critial for their user case, any idea to provide such support as well, what I can think is to add anther parameter to adc_sequence_options like

	/**
	 * Number of extra samplings to perform (the total number of samplings
	 * is 1 + extra_samplings).
	 */
	uint16_t extra_samplings;
	/**
	 * Number of samplings to perform per request
	 */
	uint32_t samples_pre_req;
};

@heinwessels
Copy link
Contributor

heinwessels commented Mar 22, 2023

@hakehuang great news! So that means my PR works, and I will remove the draft label.

they want the dma to do multiply samples, not only one sample a time it is critial for their user case, any idea to provide such support as well

And my customer also maybe wants such a feature. I've opened a RFC to expand the ADC API to support this. If you could comment there that you need it it might get more interaction and decided sooner. You can find it here: #55783

@hakehuang
Copy link
Collaborator

  • User calls read with extra_samples = 2. There are three channels selected.
  • Driver configures DMA to read the three channels once. This means DMA length should be set to 3 * sizeof(int16_t) or POPCOUNT(data->channels) * sizeof(int16_t).
  • All three channels are read, which calls adc_dma_callback, which in turn calls adc_context's adc_context_on_sampling_done.
  • Now adc_context decides what to do next. If there are extra_samples specified it will trigger the read again, meaning start again at step 2. adc_context will also call the callback, which might return ADC_ACTION_REPEAT, which will force it to sample again anyway.

nxp adc channel dma mode does not handle the multi-channel well, I will add the missing parts as well, Thanks to point this out.

heinwessels added a commit to fancom/zephyr that referenced this issue Mar 22, 2023
In 6e21ebf the sampling interval was changed, which caused the
issue zephyrproject-rtos#56070. This is fixed by allowing different boards to
specify a sampling period.

Also changed the test_task_with_interval to verify that
the supplied interval was used.

Signed-off-by: Hein Wessels <[email protected]>
fabiobaltieri pushed a commit that referenced this issue Mar 23, 2023
In 6e21ebf the sampling interval was changed, which caused the
issue #56070. This is fixed by allowing different boards to
specify a sampling period.

Also changed the test_task_with_interval to verify that
the supplied interval was used.

Signed-off-by: Hein Wessels <[email protected]>
carlescufi pushed a commit that referenced this issue Apr 13, 2023
add support for async call and repeat sample test

1. change the DMA req to 2 byte each
2. increase the buffer pre-dma
3. add protection on invalid buffer

depends on: PR #56104
fixing: issue #56070

Signed-off-by: Hake Huang <[email protected]>
@github-actions
Copy link

This issue has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this issue will automatically be closed in 14 days. Note, that you can always re-open a closed issue at any time.

@github-actions github-actions bot added the Stale label May 22, 2023
@dleach02 dleach02 removed the Stale label May 23, 2023
@decsny
Copy link
Member

decsny commented May 30, 2023

@hakehuang is this issue resolved?

@hakehuang
Copy link
Collaborator

@hakehuang is this issue resolved?

yes it is resolved, and we see another failure #58467

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: ADC Analog-to-Digital Converter (ADC) area: DMA Direct Memory Access bug The issue is a bug, or the PR is fixing a bug platform: NXP NXP priority: medium Medium impact/importance bug
Projects
None yet
Development

No branches or pull requests

6 participants