-
Notifications
You must be signed in to change notification settings - Fork 133
Improve loading speed by 50% #66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Loading a 600 KB UF2 file onto a Pico from Windows now takes 3.2 s instead of 4.8 s.
- All verification should be done after all writing just in case the writing commands are misconfigured and have unintended side effects. - We should avoid printing numerous progress bars when loading a sparse file that writes to many separate areas of flash.
uint8_t block_erase_cmd = 0x20; | ||
if (mem_range.to - base >= 0x10000) { | ||
batch_size = 0x10000; | ||
block_erase_cmd = 0xD8; | ||
} | ||
else if (mem_range.to - base >= 0x8000) { | ||
batch_size = 0x8000; | ||
block_erase_cmd = 0x52; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably makes sense to add #define
values for these different erase cmds? 🤷♂️
(but like you, I dunno how widely supported these commands are amongst other flash chips, and that'll be the kicker here!)
@@ -39,6 +39,7 @@ int picoboot_enter_cmd_xip(libusb_device_handle *usb_device); | |||
int picoboot_exit_xip(libusb_device_handle *usb_device); | |||
int picoboot_reboot(libusb_device_handle *usb_device, uint32_t pc, uint32_t sp, uint32_t delay_ms); | |||
int picoboot_exec(libusb_device_handle *usb_device, uint32_t addr); | |||
int picoboot_flash_range_erase(libusb_device_handle *, uint32_t, uint32_t, uint32_t, uint8_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the missing parameter-names here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just wanted to keep the lines to a reasonable length so you can have a text editor and terminal on the same screen. It would have been OK to have the names and insert a line break instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the header-file often used as a quick-reference list of the functions and their parameters? That's why I'd argue for including the parameter-names, but ultimately that's Graham's decision.
But surely you can't guarantee that there's not data in that other 20KB that you don't want to be erased? 🤔 |
This PR only erases what it needs to, but for future work maybe it could be argued that erasing a 32 KB or 64 KB chunk isn't all that different from erasing 4 KB. I'm actually accustomed to programming smaller chips like AVRs where we just erase the whole chip by default. |
thanks; yes we will need to think about how to tell picotool what erase commands it can use based on the flash chip (or whether there is a standard progression) |
Picotool allows you to write arbitrary chunks of data at arbitrary addresses, which might be useful if you have a multi-stage bootloader or something, or if you have a "data blob" that you want to update independently of the executable code. 🤷♂️ |
Just a reminder: I think think PR represents a very good improvement to the performance of picotool and should be merged in, even if there are some code style issues that someone wants to change later. |
There are many 3rd-party boards all using the RP2040 https://github.com/raspberrypi/pico-sdk/tree/develop/src/boards/include/boards Do you know if this PR works correctly with all the various Flash chips that they use? |
some version of it will be at some point ;-) just not super imminently |
Hmmm, I'm trying to update this pull request to work with the picotool version 2 and the RP2350, but it seems like |
picoboot EXEC does not exist on RP2350 for security reasons. |
Oh, thanks for pointing that out. When I look at I'm closing this pull request since there won't be a way to make it with the RP2350. |
The W25Q16JVUXIQ flash chip on the Pico has multiple erase commands that erase different sizes of memory, as documented in the datasheet:
The current implementation of picotool only uses the 4 KB erase command, which is the least efficient one in terms of bytes per second.
This pull request changes picotool to pick erase sizes based on how much flash is left to load. The test results below show that loading a 596 KiB UF2 currently takes 4.8 seconds, and with my changes it will only take 3.2 seconds, which means the speed has increased by 50%. This test was on Windows but I also saw similar results on Linux.
Status quo (commit 03f2812):
With this PR (commit cdf6228):
There is room for further improvement if anyone is interested, because the way the code picks erase sizes is very simple right now. When it needs to erase 12 KB, it will use three 4 KB commands instead of using a single 32 KB erase command which would be faster. Also, it would do the erasing using three separate USB commands when one would work.
I haven't looked into how standardized the larger erase commands are and I don't know what set of flash chips picotool is aiming to support, but I thought I'd make the pull request anyway to start the conversation.
The first commit of this pull request contains the speed improvement, and could be merged in or cherry-picked on its own.
The pull request also contains a second commit that further improves things but doesn't have much effect on speed. The second commit changes the order that things are done so that all verifications are done after all writes. This is important because the erasing/writing commands might conceivably malfunction and write to parts of the chip they were not supposed to touch, invalidating an earlier verification: the later we can delay verification, the more meaningful it is. Secondly, the current code fills the screen with numerous progress bars and "OK" messages when you give it a sparse UF2 file that writes to many different regions of flash: it prints "Loading", "Verifying", and "OK" for every region. Moving all the verifications to the end fixes that.
If anyone is interested, I have a branch with all of my improvements here: https://github.com/DavidEGrayson/picotool/tree/dev/david/master