ASCII digits only for versions #652

Merged 7 commits on Feb 15, 2025
@@ -235,8 +235,8 @@ public boolean next() {
index++;
break;
} else {
- int digit = Character.digit(c, 10);
- if (digit >= 0) {
+ if (c >= '0' && c <= '9') { // only ASCII digits
+ int digit = c - '0';
if (state == -1) {
end = index;
terminatedByNumber = true;
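
The hunk above swaps `Character.digit(c, 10)` for an explicit range check. The following standalone sketch illustrates why the two differ: `Character.digit` accepts any Unicode decimal digit, while the range check accepts only `0` through `9`. The class and helper names are hypothetical and are not part of the patched code.

```java
public class AsciiDigitDemo {
    // Hypothetical helper mirroring the range check used in the patch:
    // only the characters '0'..'9' count as digits.
    public static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        char asciiSeven = '7';
        char arabicEight = '\u0668'; // ARABIC-INDIC DIGIT EIGHT

        // Character.digit understands Unicode decimal digits, not just ASCII.
        System.out.println(Character.digit(asciiSeven, 10));  // 7
        System.out.println(Character.digit(arabicEight, 10)); // 8

        // The range check rejects the non-ASCII digit.
        System.out.println(isAsciiDigit(asciiSeven));  // true
        System.out.println(isAsciiDigit(arabicEight)); // false
    }
}
```

With the old check, a character such as `\u0668` would have been parsed as part of a numeric segment; with the new check it falls through to the non-digit branch.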
@@ -18,7 +18,7 @@
* under the License.
*/
/**
- * Ready-to-use version scheme for parsing/comparing versions and utility classes.
+ * Version scheme for parsing/comparing versions and utility classes.
* <p>
* Contains the "generic" scheme {@link org.eclipse.aether.util.version.GenericVersionScheme}
* that serves the purpose of "factory" (and/or parser) for all corresponding elements (all those are package private).
@@ -28,12 +28,13 @@
* <p>
* Below is the <em>Generic Version Spec</em> described:
* <p>
- * Version string is parsed into version according to these rules below:
+ * Version string is parsed into version according to these rules:
* <ul>
* <li>The version string is parsed into segments, from left to right.</li>
- * <li>Segments are explicitly delimited by single {@code "." (dot)}, {@code "-" (hyphen)} or {@code "_" (underscore)} character.</li>
- * <li>Segments are implicitly delimited by transition between digits and non-digits.</li>
- * <li>Segments are classified as numeric, string, qualifiers (special case of string) and min/max.</li>
+ * <li>Segments are explicitly delimited by a single {@code "." (dot)}, {@code "-" (hyphen)}, or {@code "_" (underscore)} character.</li>
+ * <li>Segments are implicitly delimited by a transition between ASCII digits and non-digits.</li>
+ * <li>Segments are classified as numeric, string, qualifiers (special case of string), and min/max.</li>
+ * <li>Numeric segments are composed of the ASCII digits 0-9. Non-ASCII digits are treated as letters.</li>
* <li>Numeric segments are sorted numerically, ascending.</li>
* <li>Non-numeric segments may be qualifiers (predefined) or strings (non-empty letter sequence). All of them are interpreted as being case-insensitive in terms of the ROOT locale.</li>
* <li>Qualifier segments (strings listed below) and their sort order (ascending) are:
@@ -48,7 +49,7 @@
* </ul>
* </li>
* <li>String segments are sorted lexicographically and case-insensitively per ROOT locale, ascending.</li>
- * <li>There are two special segments, {@code "min"} and {@code "max"}, they represent absolute minimum and absolute maximum in comparisons. They can be used only as trailing segment.</li>
+ * <li>There are two special segments, {@code "min"} and {@code "max"} that represent absolute minimum and absolute maximum in comparisons. They can be used only as the trailing segment.</li>
* <li>As last step, trailing "zero segments" are trimmed. Similarly, "zero segments" positioned before numeric and non-numeric transitions (either explicitly or implicitly delimited) are trimmed.</li>
* <li>When trimming, "zero segments" are qualifiers {@code "ga"}, {@code "final"}, {@code "release"} only if being last (right-most) segment, empty string and "0" always.</li>
* <li>In comparison of same kind segments, the given type of segment determines comparison rules.</li>
@@ -57,11 +58,11 @@
* <li>It is common that a version identifier starts with numeric segment (consider this "best practice").</li>
* </ul>
* <p>
- * Note: this version spec does not document (nor cover) many corner cases, that we believe are "atypical" or not
+ * Note: this version spec does not document (or cover) many corner cases that we believe are "atypical" or not
* used commonly. None of these are enforced, but in future implementations they probably will be. Some known examples are:
* <ul>
- * <li>Using "min" or "max" special segments as non-trailing segment. This yields in "undefined" behaviour and should be avoided.</li>
- * <li>Having non-number as first segment of version. Versions are expected (but not enforced) to start with numbers.</li>
+ * <li>Using "min" or "max" special segments as a non-trailing segment. This yields in "undefined" behaviour and should be avoided.</li>
+ * <li>Having a non-number as the first segment of a version. Versions are expected (but not enforced) to start with numbers.</li>
* </ul>
*
*/
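
The segmentation rules in the Javadoc above (explicit delimiters plus implicit breaks at ASCII-digit/non-digit transitions) can be illustrated with a simplified splitter. This is a sketch under stated assumptions, not the actual `GenericVersion` tokenizer: the class and method names are hypothetical, and it ignores qualifiers, `min`/`max`, and zero-segment trimming.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentSplitSketch {
    // Only '0'..'9' count as digits; any other character is a non-digit.
    static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Splits on '.', '-', '_' and on every ASCII-digit/non-digit transition.
    public static List<String> split(String version) {
        List<String> segments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < version.length(); i++) {
            char c = version.charAt(i);
            if (c == '.' || c == '-' || c == '_') {
                // Explicit delimiter: close the current segment (possibly empty).
                segments.add(current.toString());
                current.setLength(0);
                continue;
            }
            if (current.length() > 0
                    && isAsciiDigit(c) != isAsciiDigit(current.charAt(current.length() - 1))) {
                // Implicit delimiter: the character class changed.
                segments.add(current.toString());
                current.setLength(0);
            }
            current.append(c);
        }
        segments.add(current.toString());
        return segments;
    }

    public static void main(String[] args) {
        System.out.println(split("1alpha10"));   // [1, alpha, 10]
        System.out.println(split("1.alpha.10")); // [1, alpha, 10]
        System.out.println(split("1-alpha-10")); // [1, alpha, 10]
    }
}
```

Because a non-ASCII digit such as `\u0668` is a non-digit under this rule, `split("a\u0668")` yields a single string segment rather than a string segment followed by a numeric one.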
@@ -282,7 +282,7 @@ void testUnlimitedNumberOfDigitsInNumericComponent() {
}

@Test
- void testTransitionFromDigitToLetterAndViceVersaIsEqualivantToDelimiter() {
+ void testTransitionFromDigitToLetterAndViceVersaIsEquivalentToDelimiter() {
assertOrder(X_EQ_Y, "1alpha10", "1.alpha.10");
assertOrder(X_EQ_Y, "1alpha10", "1-alpha-10");

@@ -495,8 +495,32 @@ void testMaximumSegment() {
assertOrder(X_LT_Y, "1.max", "2.min");
}

+ @Test
+ void testCompareLettersToNumbers() {
+ assertOrder(X_GT_Y, "1.7", "J");
+ }
+
+ @Test
+ void testCompareDigitToLetter() {
+ assertOrder(X_GT_Y, "7", "J");
+ assertOrder(X_GT_Y, "7", "c");
+ }
+
+ @Test
+ void testNonAsciiDigits() { // These should not be treated as digits.
+ String arabicEight = "\u0668";
+ assertOrder(X_GT_Y, "1", arabicEight);
+ assertOrder(X_GT_Y, "9", arabicEight);
+ }
+
+ @Test
+ void testLexicographicOrder() {
+ assertOrder(X_GT_Y, "zebra", "aardvark");
+ assertOrder(X_GT_Y, "ζέβρα", "zebra");
+ }
+
/**
- * UT for <a href="https://issues.apache.org/jira/browse/MRESOLVER-314">MRESOLVER-314</a>.
+ * Test for <a href="https://issues.apache.org/jira/browse/MRESOLVER-314">MRESOLVER-314</a>.
*
* Generates random UUID string based versions and tries to sort them. While this test is not as reliable
* as {@link #testCompareUuidVersionStringStream()}, it covers broader range and in case it fails it records
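
The expectations in `testLexicographicOrder` follow from the spec's rule that string segments sort lexicographically and case-insensitively per the ROOT locale. A minimal sketch of that rule is shown below; the class and method names are hypothetical, and lower-casing with `Locale.ROOT` followed by `String.compareTo` is an assumption about how the rule could be realized, not a copy of the library's comparator.

```java
import java.util.Locale;

public class StringSegmentOrderSketch {
    // Case-insensitive lexicographic comparison of two string segments.
    // Assumption: lower-case both sides with Locale.ROOT, then compare by UTF-16 code units.
    public static int compareStringSegments(String a, String b) {
        return a.toLowerCase(Locale.ROOT).compareTo(b.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(compareStringSegments("zebra", "aardvark") > 0); // true
        // Greek zeta (U+03B6) sorts after Latin 'z' (U+007A).
        System.out.println(compareStringSegments("ζέβρα", "zebra") > 0);    // true
        System.out.println(compareStringSegments("Zebra", "zebra") == 0);   // true
    }
}
```

This sketch covers only string-to-string comparison; the ordering of numeric segments against string segments asserted in `testCompareDigitToLetter` is a separate rule of the scheme.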