Use multiple processes per DataSource in Manager #311
Comments
I'll work on this now...
With 4 processes per data source, we're getting 18 batches of satellite data in 131 seconds, which is about 7.3 seconds per batch.
With 8 processes, it's quite juddery and does about one satellite batch every 9 seconds.
With 1 process it's also 8 seconds per batch!
8 seconds per batch isn't terrible: that's about 2.3 days to create 25,000 batches. But, yeah, we probably want to make sure
And about 8 seconds per batch with 2 processes per DataSource! OK, I think the conclusion is clear: using multiple processes per DataSource doesn't speed up satellite batch creation, probably because dask is already using multiple processes for us. So I'll close the associated PR. This isn't such bad news, because concurrency definitely adds complexity!
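For anyone wanting to check the arithmetic behind these timings, here is a minimal sketch. The numbers are the ones quoted in the comments above; the function names are made up for illustration and are not part of the project.

```python
SECONDS_PER_DAY = 86_400


def seconds_per_batch(n_batches: int, elapsed_seconds: float) -> float:
    """Average wall-clock time taken to create one batch."""
    return elapsed_seconds / n_batches


def days_to_create(n_batches: int, secs_per_batch: float) -> float:
    """Total time, in days, to create n_batches at a steady rate."""
    return n_batches * secs_per_batch / SECONDS_PER_DAY


# Processes per DataSource -> seconds per satellite batch, as reported above.
timings = {
    1: 8.0,
    2: 8.0,
    4: seconds_per_batch(18, 131),  # 18 batches in 131 seconds
    8: 9.0,
}

for n_processes, secs in timings.items():
    days = days_to_create(25_000, secs)
    print(f"{n_processes} process(es): {secs:.1f} s/batch, "
          f"{days:.1f} days for 25,000 batches")
```

The spread between the best and worst setting is under two seconds per batch, which is consistent with the conclusion that extra processes don't help satellite.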
Some DataSources would benefit a lot from having multiple Processes per DataSource.
Maybe make `n_processes` configurable.
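A rough sketch of what a configurable `n_processes` per DataSource could look like, using only the standard library. Everything here other than the name `n_processes` is an assumption for illustration: `DataSourceConfig`, `split_batches`, `make_batches` and `run` are hypothetical and are not the Manager's actual API.

```python
import multiprocessing as mp
from dataclasses import dataclass


@dataclass
class DataSourceConfig:
    """Hypothetical per-DataSource settings (not the project's real schema)."""

    name: str
    n_processes: int = 1  # default keeps the existing single-process behaviour

    def __post_init__(self):
        if self.n_processes < 1:
            raise ValueError(f"{self.name}: n_processes must be >= 1")


def split_batches(n_batches: int, n_processes: int) -> list[int]:
    """Share n_batches between workers as evenly as possible."""
    base, extra = divmod(n_batches, n_processes)
    return [base + (1 if i < extra else 0) for i in range(n_processes)]


def make_batches(source_name, worker_id, n_batches, out_queue):
    """Stand-in worker: a real one would load data and write batches to disk."""
    for batch_idx in range(n_batches):
        out_queue.put((source_name, worker_id, batch_idx))


def run(configs, n_batches_per_source):
    """Start n_processes workers for each DataSource and collect their output."""
    queue = mp.Queue()
    processes = []
    for cfg in configs:
        shares = split_batches(n_batches_per_source, cfg.n_processes)
        for worker_id, share in enumerate(shares):
            proc = mp.Process(
                target=make_batches, args=(cfg.name, worker_id, share, queue)
            )
            proc.start()
            processes.append(proc)
    # Drain the queue before joining so workers never block on a full pipe.
    total = len(configs) * n_batches_per_source
    results = [queue.get() for _ in range(total)]
    for proc in processes:
        proc.join()
    return results
```

With this shape, satellite could stay at `n_processes=1` while another DataSource is given more workers, for example `run([DataSourceConfig("satellite"), DataSourceConfig("other", n_processes=4)], 18)` called from under an `if __name__ == "__main__":` guard.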