Skip to content

[CBOR] - Implement RFC compliant BigInteger bytes encoding & decoding #578

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: 2.x
Choose a base branch
from

Conversation

iifawzi
Copy link

@iifawzi iifawzi commented Apr 26, 2025

Hello, I discovered this issue #431 while working on the async parser for CBOR.
This PR adds feature flags to control compliant encoding & decoding behavior as described in the RFC https://datatracker.ietf.org/doc/html/rfc8949#section-3.4.3

  • For encoding: Changed calculation to -1 - n for negative values
  • For decoding:
    • Applied -1 - n formula
    • Updated to treat bytes as signed using new BigInteger(1, _binaryValue)

Changes are controlled by feature flags:

  • CBORGenerator.Feature.CORRECT_CBOR_NEGATIVE_BIGINT_ENCODING
  • CBORParser.Feature.CORRECT_CBOR_NEGATIVE_BIGINT_DECODING

Both flags default to false for backward compatibility.

Since changes are feature-flag controlled, I think they should be safe for 2.19.1 and 2.20 releases (I'm not sure about versioning after the release) @cowtowncoder I'll adjust base branch/comments based on your feedback.

@cowtowncoder
Copy link
Member

First of all: thank you for working on this!

Second of all: Rats! With SemVer, we can't really merge that in 2.19 in a patch as that changes API.
So needs to go in 2.20 -- so 2.x branch is correct.

@iifawzi
Copy link
Author

iifawzi commented Apr 27, 2025

First of all: thank you for working on this!

Second of all: Rats! With SemVer, we can't really merge that in 2.19 in a patch as that changes API. So needs to go in 2.20 -- so 2.x branch is correct.

Thank you. so I understand even if it's managed by feature flags, we consider it an API change.

I will update the comments, and will also update the tests to a more realistic case, as I realized even if it's possible to have BigInteger(-1), it's not really a big integer, and might be confusing given checking any online encoder of CBOR won't use big integer tag for -1 by default. I will include another test with an actual big integer for clarity.

edit: marking it as a draft temporarily until I do more verifications and add more tests.

Signed-off-by: Fawzi Essam <[email protected]>
@iifawzi iifawzi marked this pull request as draft April 27, 2025 12:04
@iifawzi iifawzi marked this pull request as ready for review April 27, 2025 14:34
@iifawzi
Copy link
Author

iifawzi commented Apr 27, 2025

The only point to comment on is that we're not following the preferred serialization as described https://www.rfc-editor.org/rfc/rfc8949.html#name-bignums, so encoding -340282366920938463463374607431768211456 using jackson results in:

byte[] expectedBytes = {
                (byte) 0xC3,
                (byte) 0x51, // 17 bytes - leading zero, 16 for the number
                (byte) 0x00, // LEADING Zero
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF,
                (byte) 0xFF
        };

which's still considered fine encoding as per the RFC as long as we're able to decode it Decoders that understand these tags MUST be able to decode bignums that do have leading zeroes.

Tests added in the mapper to verify we're able to decode both with/without leading zeros to the same correct value, while for backward compatibility, it remains the same, it returns different incorrect values with/without leading zeros.

EDIT: we're fine with not following the preferred way anyway, it wasn't mentioned in the initial RFC (https://www.rfc-editor.org/rfc/rfc7049#section-2.4.2), only mentioned point is that decoder should be able to decode with/without leading zeros, which's what has been achieved through this PR.

Signed-off-by: Fawzi Essam <[email protected]>
@cowtowncoder
Copy link
Member

Thank you. so I understand even if it's managed by feature flags, we consider it an API change.

More specifically: Addition of said Feature flags is an API change (functionality addition). Something to do in a minor release, but not in patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants