Skip to content

Improve loading speed by 50% #66

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

DavidEGrayson
Copy link
Contributor

@DavidEGrayson DavidEGrayson commented Feb 8, 2023

The W25Q16JVUXIQ flash chip on the Pico has multiple erase commands that erase different sizes of memory, as documented in the datasheet:

image

The current implementation of picotool only uses the 4 KB erase command, which is the least efficient one in terms of bytes per second.

This pull request changes picotool to pick erase sizes based on how much flash is left to load. The test results below show that loading a 596 KiB UF2 currently takes 4.8 seconds, and with my changes it will only take 3.2 seconds, which means the speed has increased by 50%. This test was on Windows but I also saw similar results on Linux.

Status quo (commit 03f2812):

$ time ./picotool load --verify rp2-pico-20220618-v1.19.1.uf2
Loading into Flash: [==============================]  100%
Verifying Flash:    [==============================]  100%
  OK

real    0m4.807s
user    0m0.015s
sys     0m0.000s

With this PR (commit cdf6228):

$ time ./picotool load --verify rp2-pico-20220618-v1.19.1.uf2
Loading:   [==============================]  100%
Verifying: [==============================]  100%

real    0m3.212s
user    0m0.000s
sys     0m0.000s

There is room for further improvement if anyone is interested, because the way the code picks erase sizes is very simple right now. When it needs to erase 12 KB, it will use three 4 KB commands instead of using a single 32 KB erase command which would be faster. Also, it would do the erasing using three separate USB commands when one would work.

I haven't looked into how standardized the larger erase commands are and I don't know what set of flash chips picotool is aiming to support, but I thought I'd make the pull request anyway to start the conversation.

The first commit of this pull request contains the speed improvement, and could be merged in or cherry-picked on its own.

The pull request also contains a second commit that further improves things but doesn't have much effect on speed. The second commit changes the order that things are done so that all verifications are done after all writes. This is important because the erasing/writing commands might conceivably malfunction and write to parts of the chip they were not supposed to touch, invalidating an earlier verification: the later we can delay verification, the more meaningful it is. Secondly, the current code fills the screen with numerous progress bars and "OK" messages when you give it a sparse UF2 file that writes to many different regions of flash: it prints "Loading", "Verifying", and "OK" for every region. Moving all the verifications to the end fixes that.

If anyone is interested, I have a branch with all of my improvements here: https://github.com/DavidEGrayson/picotool/tree/dev/david/master

Loading a 600 KB UF2 file onto a Pico from Windows now
takes 3.2 s instead of 4.8 s.
- All verification should be done after all writing just in case the
  writing commands are misconfigured and have unintended side effects.
- We should avoid printing numerous progress bars when loading a
  sparse file that writes to many separate areas of flash.
Comment on lines +1854 to +1862
uint8_t block_erase_cmd = 0x20;
if (mem_range.to - base >= 0x10000) {
batch_size = 0x10000;
block_erase_cmd = 0xD8;
}
else if (mem_range.to - base >= 0x8000) {
batch_size = 0x8000;
block_erase_cmd = 0x52;
}
Copy link
Contributor

@lurch lurch Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably makes sense to add #define values for these different erase cmds? 🤷‍♂️
(but like you, I dunno how widely supported these commands are amongst other flash chips, and that'll be the kicker here!)

@@ -39,6 +39,7 @@ int picoboot_enter_cmd_xip(libusb_device_handle *usb_device);
int picoboot_exit_xip(libusb_device_handle *usb_device);
int picoboot_reboot(libusb_device_handle *usb_device, uint32_t pc, uint32_t sp, uint32_t delay_ms);
int picoboot_exec(libusb_device_handle *usb_device, uint32_t addr);
int picoboot_flash_range_erase(libusb_device_handle *, uint32_t, uint32_t, uint32_t, uint8_t);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the missing parameter-names here?

Copy link
Contributor Author

@DavidEGrayson DavidEGrayson Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to keep the lines to a reasonable length so you can have a text editor and terminal on the same screen. It would have been OK to have the names and insert a line break instead.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the header-file often used as a quick-reference list of the functions and their parameters? That's why I'd argue for including the parameter-names, but ultimately that's Graham's decision.

@lurch
Copy link
Contributor

lurch commented Feb 8, 2023

When it needs to erase 12 KB, it will use three 4 KB commands instead of using a single 32 KB erase command which would be faster.

But surely you can't guarantee that there's not data in that other 20KB that you don't want to be erased? 🤔

@DavidEGrayson
Copy link
Contributor Author

This PR only erases what it needs to, but for future work maybe it could be argued that erasing a 32 KB or 64 KB chunk isn't all that different from erasing 4 KB. I'm actually accustomed to programming smaller chips like AVRs where we just erase the whole chip by default.

@kilograham
Copy link
Contributor

thanks; yes we will need to think about how to tell picotool what erase commands it can use based on the flash chip (or whether there is a standard progression)

@lurch
Copy link
Contributor

lurch commented Feb 8, 2023

for future work maybe it could be argued that erasing a 32 KB or 64 KB chunk isn't all that different from erasing 4 KB.

Picotool allows you to write arbitrary chunks of data at arbitrary addresses, which might be useful if you have a multi-stage bootloader or something, or if you have a "data blob" that you want to update independently of the executable code. 🤷‍♂️

@DavidEGrayson
Copy link
Contributor Author

Just a reminder: I think think PR represents a very good improvement to the performance of picotool and should be merged in, even if there are some code style issues that someone wants to change later.

@lurch
Copy link
Contributor

lurch commented Jul 22, 2023

There are many 3rd-party boards all using the RP2040 https://github.com/raspberrypi/pico-sdk/tree/develop/src/boards/include/boards

Do you know if this PR works correctly with all the various Flash chips that they use?

@kilograham
Copy link
Contributor

Just a reminder: I think think PR represents a very good improvement to the performance of picotool and should be merged in, even if there are some code style issues that someone wants to change later.

some version of it will be at some point ;-) just not super imminently

@DavidEGrayson
Copy link
Contributor Author

Hmmm, I'm trying to update this pull request to work with the picotool version 2 and the RP2350, but it seems like picoboot_exec always just returns an error code. I'll have to do more tests to figure out what's going on.

@kilograham
Copy link
Contributor

picoboot EXEC does not exist on RP2350 for security reasons.

@DavidEGrayson
Copy link
Contributor Author

DavidEGrayson commented Sep 1, 2024

Oh, thanks for pointing that out. When I look at _do_flash_erase_range in the RP2350 bootrom, I see it just erases one 4096-byte sector at a time, unfortunately. There is another function in the same bootrom named s_varm_api_flash_range_erase which is capable of faster erasing but I guess there's no way to call it from USB.

I'm closing this pull request since there won't be a way to make it with the RP2350.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants