feat: block stream write #17744

Merged: 8 commits into databendlabs:main on Apr 17, 2025

Conversation

zhyass
Member

@zhyass zhyass commented Apr 9, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR introduces block stream writing support in Databend, which enables writing data in a streaming manner during insert operations.

Key highlights:

  • Adds a new processor, TransformBlockWriter, which streams and flushes data blocks based on uncompressed and compressed size thresholds.
  • Supports streaming block writing for:
    • INSERT statements
    • COPY INTO operations
  • Currently, only the Parquet format is supported.
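
As a rough illustration of threshold-based flushing, here is a minimal sketch in Rust. It is not the code from this PR: the type names (`FlushThresholds`, `Chunk`, `StreamWriter`), the rule that a flush happens when either threshold is reached, and the use of precomputed sizes in place of real Parquet encoding are all assumptions made for the example.

```rust
/// Hypothetical flush limits; the names and semantics are illustrative only.
#[derive(Clone, Copy)]
pub struct FlushThresholds {
    pub max_uncompressed_bytes: usize,
    pub max_compressed_bytes: usize,
}

/// Toy stand-in for an incoming data block: only its sizes are tracked.
pub struct Chunk {
    pub uncompressed_bytes: usize,
    pub compressed_bytes: usize,
}

/// Accumulates chunks and emits a block once a threshold is reached.
pub struct StreamWriter {
    thresholds: FlushThresholds,
    pending_uncompressed: usize,
    pending_compressed: usize,
    pending_chunks: usize,
    /// (uncompressed, compressed) sizes of every block flushed so far.
    pub flushed: Vec<(usize, usize)>,
}

impl StreamWriter {
    pub fn new(thresholds: FlushThresholds) -> Self {
        StreamWriter {
            thresholds,
            pending_uncompressed: 0,
            pending_compressed: 0,
            pending_chunks: 0,
            flushed: Vec::new(),
        }
    }

    /// Adds a chunk to the pending block and flushes if either threshold
    /// (assumed "either", not "both") is reached. Returns true on flush.
    pub fn write(&mut self, chunk: &Chunk) -> bool {
        self.pending_uncompressed += chunk.uncompressed_bytes;
        self.pending_compressed += chunk.compressed_bytes;
        self.pending_chunks += 1;

        let should_flush = self.pending_uncompressed >= self.thresholds.max_uncompressed_bytes
            || self.pending_compressed >= self.thresholds.max_compressed_bytes;
        if should_flush {
            self.flush();
        }
        should_flush
    }

    /// Emits the pending block, if any. Call once more at end of stream
    /// so that a final partial block is not lost.
    pub fn flush(&mut self) {
        if self.pending_chunks == 0 {
            return;
        }
        self.flushed
            .push((self.pending_uncompressed, self.pending_compressed));
        self.pending_uncompressed = 0;
        self.pending_compressed = 0;
        self.pending_chunks = 0;
    }
}
```

In this sketch the writer holds at most one pending block in memory at a time, which is the kind of bound on memory use that streaming writes are meant to provide.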

To enable block stream write:

set enable_block_stream_write = 1;

This improvement lays the foundation for better memory control and performance in large-scale data writing scenarios.

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Apr 9, 2025
@zhyass zhyass marked this pull request as draft April 9, 2025 17:39
@zhyass zhyass force-pushed the feat_fix branch 3 times, most recently from a0e48a0 to b6b2b41 Compare April 11, 2025 16:47
@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Apr 14, 2025
@zhyass zhyass marked this pull request as ready for review April 15, 2025 09:05
@zhyass zhyass added ci-benchmark Benchmark: run all test and removed ci-cloud Build docker image for cloud test labels Apr 15, 2025
Contributor

Docker Image for PR

  • tag: pr-17744-87af3a3-1744714248

note: this image tag is only available for internal use.

@databendlabs databendlabs deleted a comment from github-actions bot Apr 15, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Apr 15, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Apr 15, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Apr 15, 2025
@zhyass zhyass removed the ci-benchmark Benchmark: run all test label Apr 15, 2025
@zhyass zhyass added the ci-benchmark Benchmark: run all test label Apr 15, 2025
@databendlabs databendlabs deleted a comment from github-actions bot Apr 15, 2025
Contributor

Docker Image for PR

  • tag: pr-17744-c0fc4ee-1744725941

note: this image tag is only available for internal use.

@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-benchmark Benchmark: run all test labels Apr 15, 2025
Contributor

Docker Image for PR

  • tag: pr-17744-4308f77-1744747347

note: this image tag is only available for internal use.

@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Apr 16, 2025
Contributor

Docker Image for PR

  • tag: pr-17744-36da122-1744772602

note: this image tag is only available for internal use.

Member

@SkyFan2002 SkyFan2002 left a comment

LGTM

@zhyass zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Apr 16, 2025
Contributor

Docker Image for PR

  • tag: pr-17744-274c6e0-1744788273

note: this image tag is only available for internal use.

@BohuTANG
Member

Conflicts: src/query/storages/fuse/src/io/write/block_writer.rs

@BohuTANG BohuTANG merged commit dd2447c into databendlabs:main Apr 17, 2025
144 of 148 checks passed
Labels
ci-cloud Build docker image for cloud test pr-feature this PR introduces a new feature to the codebase
4 participants