
Add pushChunks #176

Open · wants to merge 6 commits into base: main
Conversation

@vaidikcode (Contributor) commented Mar 1, 2025

Signed-off-by: vaidikcode [email protected]

Description

  • Added pushChunks

Testing done

Submitter checklist

  • I have read and understood the CONTRIBUTING guide
  • I have run mvn license:update-file-header, mvn spotless:apply, pre-commit run -a, mvn clean install before opening the PR

Signed-off-by: vaidikcode <[email protected]>
@vaidikcode (Contributor, Author) commented Mar 1, 2025

To fix the tests, one way is to add an alias method:

public Layer pushBlobStream(ContainerRef containerRef, InputStream input, long size) {
    return pushChunks(containerRef, input, size);
}

or update the tests. @jonesbusy, wdyt?

@jonesbusy (Collaborator)

Thanks for your PR. I will take a look.

Please update the tests. It's fine to break the existing API for now (renaming pushBlobStream to pushChunks).

// Phase 1: Initialize upload session
URI uri = URI.create("%s://%s".formatted(getScheme(), containerRef.getBlobsUploadPath()));
OrasHttpClient.ResponseWrapper<String> response =
client.post(uri, new byte[0], Map.of("Content-Length", "0"));
Collaborator: Please use constants from Const for all headers.

throw new OrasException("Failed to initiate blob upload: " + response.statusCode());
}

String location = response.headers().get(Const.LOCATION_HEADER.toLowerCase());
Collaborator: Not sure I understand this part. Why do we need to update the location?

@vaidikcode (Contributor, Author) replied Mar 1, 2025:

According to the spec section on "Pushing a blob in chunks":

Each successful chunk upload MUST have a 202 Accepted response code, and MUST have the following headers:

Location: <location>

Range: 0-<end-of-range>

Each consecutive chunk upload SHOULD use the <location> provided in the response to the previous chunk upload.

The code correctly implements this part of the specification: after each chunk upload (PATCH request), the registry may update the Location header to point to a different URL for the next chunk.
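As a sketch of the behavior described above (hypothetical helper, not the actual oras-java API; assumes the HTTP client exposes lower-cased header names):

```java
import java.util.Map;

// Illustrative sketch: after each chunk PATCH, the registry MAY return a
// new Location header that the next chunk upload SHOULD use. Names here
// are hypothetical, not the PR's actual code.
public class ChunkLocation {

    // Returns the upload URL for the next chunk: the Location header from
    // the last response if present, otherwise the current location.
    public static String nextLocation(String current, Map<String, String> headers) {
        String location = headers.get("location"); // assumes lower-cased header keys
        return (location != null && !location.isEmpty()) ? location : current;
    }
}
```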

endRange = startRange + bytesRead - 1;

Map<String, String> headers = Map.of(
"Content-Type",
Collaborator: Constants here.

}

// Updating range for next chunk
String rangeHeader = response.headers().get("range");
Collaborator: Constant.
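For illustration, the "0-<end-of-range>" value returned by the registry can be parsed to derive where the next chunk starts (hypothetical helper, not the PR's actual implementation):

```java
// Illustrative sketch: parse the Range response value ("0-<end-of-range>")
// and derive the start offset of the next chunk. Hypothetical helper.
public class ChunkRange {

    // "0-1023" -> 1024: the byte offset where the next chunk begins.
    public static long nextStart(String rangeValue) {
        int dash = rangeValue.lastIndexOf('-');
        return Long.parseLong(rangeValue.substring(dash + 1)) + 1;
    }
}
```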


// Phase 3: Complete upload
Map<String, String> headers = Map.of(
"Content-Type", "application/octet-stream",
Collaborator: Constant.

// We initialize the MessageDigest first
MessageDigest digest;
try {
digest = MessageDigest.getInstance(
Collaborator: Logic should not be here. Use

public static boolean isSupported(String prefix)

or implement the logic in a private method.

private static final int CHUNK_SIZE = 5 * 1024 * 1024;

/**
* The digest calculation limit
Collaborator: Why 16? Is it from the spec?

@vaidikcode (Contributor, Author): No, it's not from the specification; it's just a general practical value. We can change it as per our requirements, though. What should it be set to?

/**
* The chunk size for uploading blobs
*/
private static final int CHUNK_SIZE = 5 * 1024 * 1024;
Collaborator: Why 5? Is it from the spec? If yes, it would be nice to link it in the Javadoc, or explain this value.

@vaidikcode (Contributor, Author): Same as the digest calculation limit.

Collaborator: I think this must be honored:

If the registry has a minimum chunk size, the POST response SHOULD include the following header, where <size> is the size in bytes (see the blob PATCH definition for usage)

https://github.com/opencontainers/distribution-spec/blob/main/spec.md

Also, I think by default we should use the monolithic two-step push unless the file is large (to limit the number of HTTP requests when chunking is not needed).
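A minimal sketch of honoring that minimum, assuming the registry advertises it via the distribution spec's OCI-Chunk-Min-Length header (hypothetical helper; the 5 MiB default mirrors the PR's CHUNK_SIZE constant):

```java
// Illustrative sketch: choose an effective chunk size, raising the client
// default to the registry's advertised minimum when one is present and
// larger. Hypothetical helper, not the PR's actual code.
public class ChunkSizing {

    static final long DEFAULT_CHUNK_SIZE = 5L * 1024 * 1024;

    // minLengthHeader is the raw OCI-Chunk-Min-Length value, or null if absent.
    public static long effectiveChunkSize(String minLengthHeader) {
        if (minLengthHeader == null || minLengthHeader.isEmpty()) {
            return DEFAULT_CHUNK_SIZE;
        }
        return Math.max(DEFAULT_CHUNK_SIZE, Long.parseLong(minLengthHeader));
    }
}
```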

// Upload first chunk
if (totalBytesRead > 0) {
Map<String, String> headers = Map.of(
"Content-Type",
Collaborator: Constants.

@jonesbusy (Collaborator)

See

https://github.com/oras-project/oras-py/blob/1790ad7df3c83884f24aabd2ab4b2d97e4c2f3b0/oras/provider.py#L283-L293

Perhaps we should keep using pushBlob internally and just provide pushChunks as a public method. If users want to push large blobs, they can use it.

@vaidikcode (Contributor, Author) commented Mar 3, 2025

I agree. One way could be to:

  • Introduce a threshold constant.
  • Modify the pushLayers function used in pushArtifacts to select between pushBlob and pushChunks based on the file size.
  • Implement the pushChunks function to check the response header for the minimum chunk size.

Or just implement pushChunks and, if users want it, they can use it (as you said).

@jonesbusy
wdyt?
Thanks!
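The threshold idea above could look like this (hypothetical helper; the 16 MiB threshold is a placeholder, not a value from the PR or the spec):

```java
// Illustrative sketch: pick the monolithic pushBlob path for small layers
// and the chunked path for large ones. Threshold value is a placeholder.
public class PushStrategy {

    static final long CHUNKED_THRESHOLD = 16L * 1024 * 1024;

    // True when a blob of the given size should go through pushChunks.
    public static boolean useChunked(long sizeBytes) {
        return sizeBytes > CHUNKED_THRESHOLD;
    }
}
```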

@jonesbusy (Collaborator), replying to the suggestion above:

"Or just implement pushChunks and, if users want it, they can use it." Yes, correct. Very large artifacts can be handled by pushing blobs/layers via chunks and then creating a manifest.

@vaidikcode vaidikcode marked this pull request as ready for review March 13, 2025 09:21
@vaidikcode vaidikcode marked this pull request as draft March 13, 2025 11:43
@vaidikcode vaidikcode marked this pull request as ready for review March 13, 2025 11:47
* @param algorithm The supported algorithm
* @return The algorithm string for MessageDigest
*/
private String getMessageDigestAlgorithm(SupportedAlgorithm algorithm) {
Collaborator: Such a util already exists.

}

// Helper method to convert bytes to hex
private static String bytesToHex(byte[] bytes) {
Collaborator: Pretty sure this already exists. If not, it must be moved to DigestUtils.

@vaidikcode (Contributor, Author) replied Mar 14, 2025:

I can't simply move bytesToHex to DigestUtils and use it because the DigestUtils class appears to be designed as a proper utility class with static methods, yet its package-private access limits its usability.

What was the intended scope for DigestUtils? If it is meant to be a shared utility, making it public would be a cleaner solution than duplicating the functionality in the Registry. What do you think? Is there any other way to achieve the same behavior as bytesToHex in the createDigestString method? @jonesbusy
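For reference, a generic bytesToHex helper like the one under discussion is a few lines of standard Java (a sketch of the common pattern, not the project's actual DigestUtils code):

```java
// Generic sketch of a byte-array-to-hex helper; equivalent logic usually
// lives in a digest utility class.
public class HexUtil {

    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Converts a byte array to its lowercase hexadecimal representation.
    public static String bytesToHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }
}
```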

Collaborator: I suggest checking the SupportedAlgorithm class; I think your answer is there.

Since this PR is starting to get quite large, I think it's worth splitting it into several smaller PRs that can support this one. That would make it easier to review, and smaller parts of the code could be merged faster.

@vaidikcode vaidikcode mentioned this pull request Mar 20, 2025