performance: add pre-fetch for block reader #9308

Closed
BohuTANG opened this issue Dec 20, 2022 · 3 comments · Fixed by #9335
Labels
C-performance Category: Performance

Comments

@BohuTANG
Member

BohuTANG commented Dec 20, 2022

Summary

For sync read:
https://github.com/datafuselabs/databend/blob/523e2190c275d481c707f8ee35972fc7290cba38/src/query/storages/fuse/fuse/src/io/read/block_reader.rs#L380-L390

https://github.com/datafuselabs/databend/blob/523e2190c275d481c707f8ee35972fc7290cba38/src/query/storages/fuse/fuse/src/io/read/block_reader.rs#L389-L389

With many indices this may be wasteful, because their offsets may be adjacent (note: not necessarily contiguous, there may be small gaps, e.g. <1KB). We can convert the small fragmented reads into one large read by merging them all into one:

let (merge_read_offset, merge_read_length) = ...;
let result = Self::sync_read_column(op.object(&location), merge_read_offset, merge_read_length);
// Strip out each column's (offset, length) data from result
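
A minimal sketch of the merge step described above, under stated assumptions: `merge_ranges`, `slice_column`, and the `max_gap` parameter are hypothetical names for illustration and are not part of `block_reader.rs`. It coalesces column byte ranges whose gap is at most `max_gap` bytes, then slices each column back out of the merged buffer.

```rust
use std::convert::TryFrom;
use std::ops::Range;

/// Hypothetical helper: merge byte ranges whose gap is at most `max_gap` bytes.
/// Returns the merged ranges, sorted by start offset.
fn merge_ranges(mut ranges: Vec<Range<u64>>, max_gap: u64) -> Vec<Range<u64>> {
    ranges.sort_by_key(|r| r.start);
    let mut merged: Vec<Range<u64>> = Vec::with_capacity(ranges.len());
    for r in ranges {
        match merged.last_mut() {
            // Close enough to the previous range: extend it to cover this one.
            Some(last) if r.start <= last.end.saturating_add(max_gap) => {
                last.end = last.end.max(r.end);
            }
            // Otherwise start a new merged range.
            _ => merged.push(r),
        }
    }
    merged
}

/// Hypothetical helper: slice one column's bytes back out of a merged read.
/// `merged_offset` is the file offset at which the `merged` buffer starts.
/// Returns `None` if the column does not lie inside the buffer.
fn slice_column<'a>(
    merged: &'a [u8],
    merged_offset: u64,
    column: &Range<u64>,
) -> Option<&'a [u8]> {
    let start = usize::try_from(column.start.checked_sub(merged_offset)?).ok()?;
    let end = usize::try_from(column.end.checked_sub(merged_offset)?).ok()?;
    merged.get(start..end)
}
```

The gap threshold is a trade-off: a larger `max_gap` means fewer requests but more bytes read and then discarded, so the right value likely depends on the storage backend.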
@BohuTANG
Member Author

cc @sundy-li @RinChanNOWWW

BohuTANG added the C-performance Category: Performance label Dec 20, 2022
@sundy-li
Member

sync_read_column will respect the fs cache if the reads are adjacent.

@BohuTANG
Member Author

For async reads from a remote object store, like S3, the pre-fetch will be helpful.

The query:

SELECT * FROM hits  WHERE URL LIKE '%google%' ORDER BY EventTime LIMIT 10;

I print all read costs for each column of a partition:
[image: read cost of each column in the partition]

Reading 30086 bytes took 114 ms, almost the same as reading 890095 bytes, which took 111 ms. The read is bound by network latency, not by I/O throughput.
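
To make the latency-bound point concrete, here is a small sketch that computes the effective throughput of the two reads quoted above. `effective_throughput` is a hypothetical helper for illustration only, and the inputs are the two measurements from this comment.

```rust
/// Hypothetical helper: effective throughput of a single read, in bytes per second.
fn effective_throughput(bytes: u64, elapsed_ms: u64) -> f64 {
    bytes as f64 / (elapsed_ms as f64 / 1000.0)
}

/// Ratio between the effective throughput of the large read and the small read,
/// using the two measurements quoted in the comment above.
fn throughput_ratio() -> f64 {
    let small = effective_throughput(30_086, 114);
    let large = effective_throughput(890_095, 111);
    large / small
}
```

The small read works out to about 0.26 MB/s and the large one to about 8 MB/s, roughly a 30x difference for the same wall-clock time. That is consistent with a fixed per-request latency dominating the cost, which is why merging several small reads into one request should help.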

FYI @Xuanwo
