Double.Parse gets the wrong answer for inputs with as few as 1 significant digits. #4406

Closed
gafter opened this issue Jul 30, 2015 · 17 comments
Labels
area-System.Runtime enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors
Comments

@gafter
Member

gafter commented Jul 30, 2015

There are a number of issues with Double.Parse. Ideally we would like it to follow the IEEE spec section 5.12.2 (for inputs of 20 or fewer digits, round to nearest representable double, but when two distinct representable numbers are precisely equidistant, round to even), give results that are the same when run on x86 vs x64 (we currently differ on the string "0.6822871999174", for example, only returning the correct result on x86), and be monotonic. It is currently none of those.

Here is a short snippet of C# code that illustrates all three issues. When run on x64, it demonstrates that Double.Parse is not monotonic.

            Console.WriteLine("{0:G17}  {0:R}", Double.Parse("0.6822871999174000000"));
            Console.WriteLine("{0:G17}  {0:R}", Double.Parse("0.6822871999174000001"));

The strings "1e126" and "5e-107" appear to parse into different numbers from double.Parse depending on whether you're running on a 32-bit or 64-bit runtime. The former gets the right result from double.Parse only on 64-bit systems; the latter gets the right result from double.Parse only on 32-bit systems.

See also dotnet/roslyn#4221, which demonstrates that this bug is inherited into the compile-time behavior of the C# compiler. Because the VS IDE runs in x86 and the batch compiler runs in x64, the design-time and run-time experiences are inconsistent.
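
The monotonicity claim can be checked without going through double.ToString at all. A minimal sketch, assuming only that the two inputs are non-negative (for non-negative doubles the raw IEEE 754 bit patterns sort in the same order as the values); the class and method names are made up for illustration, and the printed bits depend on the runtime you run it on:

```csharp
using System;
using System.Globalization;

public static class MonotonicityCheck
{
    // Raw bits of the parsed value; the invariant culture keeps '.' as the decimal point.
    public static long Bits(string text) =>
        BitConverter.DoubleToInt64Bits(double.Parse(text, CultureInfo.InvariantCulture));

    // For non-negative inputs where 'smaller' <= 'larger' as decimals,
    // a monotonic parser must never return larger bits for 'smaller'.
    public static bool IsMonotonic(string smaller, string larger) =>
        Bits(smaller) <= Bits(larger);

    public static void Main()
    {
        string a = "0.6822871999174000000";
        string b = "0.6822871999174000001";
        Console.WriteLine("{0} -> 0x{1:X16}", a, Bits(a));
        Console.WriteLine("{0} -> 0x{1:X16}", b, Bits(b));
        Console.WriteLine("monotonic: {0}", IsMonotonic(a, b));
    }
}
```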

@gafter
Member Author

gafter commented Aug 28, 2015

Here are some other input strings from the test program in the comments of dotnet/roslyn#4221 that double.Parse gets wrong:

0.6822871999174
2.2250738585072012e-308
1.3694713649464322631e-11
3.571e+266
3.08984926168550152811e-32
999e-026
7861e-034
75569e-254
928609e-261
5232604057e-298
27235667517e-109
653532977297e-123
3142213164987e-294
46202199371337e-072
231010996856685e-073
20505426358836677347e-221
836168422905420598437e-234
623e+100
3571e+263
81661e+153
245540327e+122
6138508175e+120
83356057653e+193
619534293513e+124
2335141086879e+218
899810892172646163e+283
7120190517612959703e+120
308984926168550152811e-052
6372891218502368041059e+064

Note that we get the wrong answer for some cases of numbers with just three digits.

gafter changed the title from "Double.Parse does not follow IEEE spec" to "Double.Parse gets the wrong answer for inputs with as few as 3 significant digits." on Aug 28, 2015
@leppie
Contributor

leppie commented Aug 28, 2015

On 32-bit, 5e125 is also parsed incorrectly, but due to incorrect printing, it still prints 5e125 instead of 5.0000000000000004e125.

Bit patterns:
Actual: 101101000000111101000101110110011000100000101001010000001000000
Expected: 101101000000111101000101110110011000100000101001010000000111111

This behaves correctly on 64-bit though.
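
Bit patterns in this form can be produced with BitConverter.DoubleToInt64Bits and Convert.ToString(value, 2). This is a sketch of one way to do it, not necessarily how the patterns above were generated; the class name is hypothetical. Convert.ToString drops leading zeros, which is why positive doubles show 63 or fewer digits:

```csharp
using System;
using System.Globalization;

public static class BitPattern
{
    // Raw IEEE 754 bits of a double as binary text, leading zeros omitted.
    public static string ToBinary(double value) =>
        Convert.ToString(BitConverter.DoubleToInt64Bits(value), 2);

    public static void Main()
    {
        // Inputs mentioned in this thread; the output depends on the runtime's parser.
        foreach (string text in new[] { "5e125", "1e126", "5e-107" })
        {
            double parsed = double.Parse(text, CultureInfo.InvariantCulture);
            Console.WriteLine("{0,-8} {1}", text, ToBinary(parsed));
        }
    }
}
```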

@leppie
Contributor

leppie commented Aug 28, 2015

I have written some tests (https://gist.github.com/leppie/4239b6fda4c999c4cd77) for double.Parse and my results do not match the ones above.

Here are the failures I get (running on 32-bit):

Input:    1.3694713649464322631e-11
Output:   1.3694713649464322e-11
Expected: 11110110101110000111010111000000111011101101010111010010011101
Actual:   11110110101110000111010111000000111011101101010111010010011100

Input:    7.8459735791271921e+65
Output:   7.845973579127193e65
Expected: 100110110011101110011010000000010001001110000010011000101001110
Actual:   100110110011101110011010000000010001001110000010011000101001111

Input:    3.08984926168550152811e-32
Output:   3.089849261685501e-32
Expected: 11100101100100000011011110010010000110011101100110010100111011
Actual:   11100101100100000011011110010010000110011101100110010100111010

Input:    5e+125
Output:   5.0000000000000004e125
Expected: 101101000000111101000101110110011000100000101001010000000111111
Actual:   101101000000111101000101110110011000100000101001010000001000000

Input:    75569e-254
Output:   7.556900000000001e-250
Expected: 110000110101101001000110001011011001000111000110101010110011
Actual:   110000110101101001000110001011011001000111000110101010110100

Input:    84863171e+114
Output:   8.486317100000001e121
Expected: 101100101000000011011101001100011110101111011001000111100110111
Actual:   101100101000000011011101001100011110101111011001000111100111000

Input:    653777767e+273
Output:   6.537777670000001e281
Expected: 111101001110010000000100010001111110010101100111010100010000001
Actual:   111101001110010000000100010001111110010101100111010100010000010

Input:    5232604057e-298
Output:   5.2326040570000005e-289
Expected: 10000010100011001011011100010010110110000100100010100100000
Actual:   10000010100011001011011100010010110110000100100010100100001

Input:    3142213164987e-294
Output:   3.1422131649870002e-282
Expected: 10101111101001101000000100111011111101111001010001001101111
Actual:   10101111101001101000000100111011111101111001010001001110000

Input:    46202199371337e-072
Output:   4.6202199371337004e-59
Expected: 11001111010010100011111001111011011111101111010011010000011111
Actual:   11001111010010100011111001111011011111101111010011010000100000

Input:    231010996856685e-073
Output:   2.3101099685668502e-59
Expected: 11001111000010100011111001111011011111101111010011010000011111
Actual:   11001111000010100011111001111011011111101111010011010000100000

Input:    78459735791271921e+049
Output:   7.845973579127193e65
Expected: 100110110011101110011010000000010001001110000010011000101001110
Actual:   100110110011101110011010000000010001001110000010011000101001111

Input:    272104041512242479e+200
Output:   2.721040415122425e217
Expected: 110110100010011101110111011010010111111000001011111000010000111
Actual:   110110100010011101110111011010010111111000001011111000010001000

Input:    623e+100
Output:   6.2299999999999995e102
Expected: 101010101000110010000001010011000101111001110101000001111011111
Actual:   101010101000110010000001010011000101111001110101000001111011110

Input:    245540327e+122
Output:   2.4554032699999998e130
Expected: 101101100000001101101100010001100011110000110001100010111001011
Actual:   101101100000001101101100010001100011110000110001100010111001010

Input:    6138508175e+120
Output:   6.138508174999999e129
Expected: 101101011100001101101100010001100011110000110001100010111001011
Actual:   101101011100001101101100010001100011110000110001100010111001010

Input:    619534293513e+124
Output:   6.195342935129999e135
Expected: 101110000100001000011000010000000110000001111111110000011110001
Actual:   101110000100001000011000010000000110000001111111110000011110000

Input:    609610927149051e-255
Output:   6.096109271490509e-241
Expected: 111000010000010000100111001110110001100010010001100010110001
Actual:   111000010000010000100111001110110001100010010001100010110000

Input:    94080055902682397e-242
Output:   9.408005590268239e-226
Expected: 1000100110110010010011000000111100011100111100110011011001010
Actual:   1000100110110010010011000000111100011100111100110011011001001

Input:    899810892172646163e+283
Output:   8.998108921726461e300
Expected: 111111001101010110111110101000111111010000001010101111000000011
Actual:   111111001101010110111110101000111111010000001010101111000000010

Input:    7120190517612959703e+120
Output:   7.120190517612959e138
Expected: 101110011000011001000100000110111001101010110001001100111111101
Actual:   101110011000011001000100000110111001101010110001001100111111100

Input:    25188282901709339043e-252
Output:   2.5188282901709337e-233
Expected: 111110100100000001011001101011110011110110110010101010000100
Actual:   111110100100000001011001101011110011110110110010101010000011

Input:    308984926168550152811e-052
Output:   3.089849261685501e-32
Expected: 11100101100100000011011110010010000110011101100110010100111011
Actual:   11100101100100000011011110010010000110011101100110010100111010

Input:    6372891218502368041059e+064
Output:   6.372891218502367e85
Expected: 101000111000000011001110000010001111101101110110011100011111110
Actual:   101000111000000011001110000010001111101101110110011100011111101

Note: IronScheme does not use double.ToString for printing, so Output will differ.

I have cherry-picked some by hand to check on http://www.exploringbinary.com/floating-point-converter/ and they all match.

All of the results differ by only one ULP (the Expected and Actual bit patterns are adjacent when read as integers). There is no pattern that I can see though. ECMA-334 says round-to-nearest mode should be used for real literals, and I expect double.Parse to behave exactly the same.

Update: Updated results as double.Parse was not used for all paths in IronScheme (notably for possible exact numbers, e.g. 234e234), resulting in fewer failures than previously listed. It would seem the common issue is when exponents are involved for the 32-bit case.
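
The size of each miss can be measured directly from the Expected/Actual strings by reading them as 64-bit integers. A small sketch (the class and method names are invented for illustration), valid only when both patterns are finite and have the same sign, as they do in the list above:

```csharp
using System;

public static class UlpDistance
{
    // Number of representable doubles between two same-sign bit patterns,
    // each given as binary text with leading zeros omitted.
    public static long Between(string expectedBits, string actualBits) =>
        Math.Abs(Convert.ToInt64(expectedBits, 2) - Convert.ToInt64(actualBits, 2));

    public static void Main()
    {
        // The 5e+125 pair from the 32-bit failures listed above.
        string expected = "101101000000111101000101110110011000100000101001010000000111111";
        string actual   = "101101000000111101000101110110011000100000101001010000001000000";
        Console.WriteLine(Between(expected, actual));
    }
}
```

Note that the two patterns in this pair differ in seven bit positions even though the values are immediate neighbours, so comparing as integers is more informative than comparing bit by bit.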

@jskeet

jskeet commented Aug 28, 2015

One thing which may be handy here is to use my DoubleConverter.cs which allows the exact value of a double to be displayed. That way we don't need to worry about both parsing and "general" formatting.

Use with:

Console.WriteLine(DoubleConverter.ToExactString(foo));

For example, to show Neal's first example:

using System;

public class Test
{
    public static void Main()
    {
        Parse("0.6822871999174000000");
        Parse("0.6822871999174000001");
    }

    private static void Parse(string text)
    {
        double parsed = double.Parse(text);
        Console.WriteLine("{0} => {1}", text, DoubleConverter.ToExactString(parsed));
    }
}

Output:

0.6822871999174000000 => 0.682287199917400055682037418591789901256561279296875
0.6822871999174000001 => 0.68228719991739994465973495607613585889339447021484375
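
For readers without DoubleConverter.cs at hand, the idea of printing a double's exact decimal value can be sketched with System.Numerics.BigInteger. This is not the DoubleConverter source; the class name is hypothetical and the sketch handles only non-negative finite values. It relies on the identity m / 2^s = m * 5^s / 10^s, so the exact digits are those of m * 5^s with the decimal point moved s places:

```csharp
using System;
using System.Numerics;

public static class ExactDecimal
{
    // Exact decimal expansion of a non-negative finite double (assumption: no sign, NaN or infinity).
    public static string ToExactString(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        int exponent = (int)((bits >> 52) & 0x7FF);
        long mantissa = bits & 0xFFFFFFFFFFFFFL;

        // Subnormals use exponent field 0 with no implicit leading 1 bit.
        if (exponent == 0) exponent = 1;
        else mantissa |= 1L << 52;
        exponent -= 1075;                       // value == mantissa * 2^exponent

        if (mantissa == 0) return "0";
        BigInteger m = mantissa;
        if (exponent >= 0) return (m << exponent).ToString();

        int scale = -exponent;                  // value == m * 5^scale / 10^scale
        string digits = (m * BigInteger.Pow(5, scale)).ToString().PadLeft(scale + 1, '0');
        return digits.Insert(digits.Length - scale, ".").TrimEnd('0').TrimEnd('.');
    }

    public static void Main()
    {
        Console.WriteLine(ExactDecimal.ToExactString(0.1));
    }
}
```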

@leppie
Contributor

leppie commented Aug 28, 2015

@jskeet: It was used in the Roslyn issue (dotnet/roslyn#4221), but this is really a three-pronged issue, given:

  • the compiler has a bug
  • double.Parse has a bug
  • double.ToString has a bug (5.0000000000000004e125 prints 5e125, 32-bit only)

Testing against broken expectations will not get us anywhere :(

Also, 0.6822871999174000000 and 0.6822871999174000001 are the same number (same bit encoding). Why yours print differently is probably due to a double.Parse bug.

@jskeet

jskeet commented Aug 28, 2015

Ha - I didn't know DoubleConverter was really used except by me :)

It sounds like really there are only two root issues:

  • double.Parse (which causes Roslyn to malfunction accordingly)
  • double.ToString

It sounds like my output in the previous comment is due to the double.Parse bug... I was merely trying to make that behaviour clearer by removing double.ToString from the equation. An alternative would be to use BitConverter.DoubleToInt64Bits of course :)

@leppie
Contributor

leppie commented Aug 28, 2015

And running on 64-bit I get these failures:

Input:    2.2250738585072012e-308
Output:   2.225073858507201e-308
Expected: 10000000000000000000000000000000000000000000000000000
Actual:   1111111111111111111111111111111111111111111111111111

Input:    1.3694713649464322631e-11
Output:   1.3694713649464322e-11
Expected: 11110110101110000111010111000000111011101101010111010010011101
Actual:   11110110101110000111010111000000111011101101010111010010011100

Input:    3.571e+266
Output:   3.5709999999999997e266
Expected: 111011101000110001001100100010011000110000111010100000110101010
Actual:   111011101000110001001100100010011000110000111010100000110101001

Input:    3.08984926168550152811e-32
Output:   3.089849261685501e-32
Expected: 11100101100100000011011110010010000110011101100110010100111011
Actual:   11100101100100000011011110010010000110011101100110010100111010

Input:    0.6822871999174000000
Output:   0.6822871999174001
Expected: 11111111100101110101010100101111110111010000111111110100011011
Actual:   11111111100101110101010100101111110111010000111111110100011100

Input:    999e-026
Output:   9.990000000000001e-24
Expected: 11101100101000001001111000001010101111111000011000011010011110
Actual:   11101100101000001001111000001010101111111000011000011010011111

Input:    7861e-034
Output:   7.8610000000000004e-31
Expected: 11100110101111111000110101010001000001010001011110100111011000
Actual:   11100110101111111000110101010001000001010001011110100111011001

Input:    75569e-254
Output:   7.556900000000001e-250
Expected: 110000110101101001000110001011011001000111000110101010110011
Actual:   110000110101101001000110001011011001000111000110101010110100

Input:    928609e-261
Output:   9.286090000000001e-256
Expected: 101011111011111000101101110101100110001000000000101111101111
Actual:   101011111011111000101101110101100110001000000000101111110000

Input:    5232604057e-298
Output:   5.2326040570000005e-289
Expected: 10000010100011001011011100010010110110000100100010100100000
Actual:   10000010100011001011011100010010110110000100100010100100001

Input:    27235667517e-109
Output:   2.7235667517000002e-99
Expected: 10101101110111110101000001100000100100110101100100111110110010
Actual:   10101101110111110101000001100000100100110101100100111110110011

Input:    653532977297e-123
Output:   6.5353297729700005e-112
Expected: 10100011011001001001011010000010101010101111001101110001101000
Actual:   10100011011001001001011010000010101010101111001101110001101001

Input:    3142213164987e-294
Output:   3.1422131649870002e-282
Expected: 10101111101001101000000100111011111101111001010001001101111
Actual:   10101111101001101000000100111011111101111001010001001110000

Input:    46202199371337e-072
Output:   4.6202199371337004e-59
Expected: 11001111010010100011111001111011011111101111010011010000011111
Actual:   11001111010010100011111001111011011111101111010011010000100000

Input:    231010996856685e-073
Output:   2.3101099685668502e-59
Expected: 11001111000010100011111001111011011111101111010011010000011111
Actual:   11001111000010100011111001111011011111101111010011010000100000

Input:    20505426358836677347e-221
Output:   2.050542635883668e-202
Expected: 1011000010000000100101001010101001011011010101010101110111010
Actual:   1011000010000000100101001010101001011011010101010101110111011

Input:    836168422905420598437e-234
Output:   8.361684229054207e-214
Expected: 1001110110010000001000000001110100110001010001010100111001010
Actual:   1001110110010000001000000001110100110001010001010100111001011

Input:    623e+100
Output:   6.2299999999999995e102
Expected: 101010101000110010000001010011000101111001110101000001111011111
Actual:   101010101000110010000001010011000101111001110101000001111011110

Input:    3571e+263
Output:   3.5709999999999997e266
Expected: 111011101000110001001100100010011000110000111010100000110101010
Actual:   111011101000110001001100100010011000110000111010100000110101001

Input:    81661e+153
Output:   8.166099999999999e157
Expected: 110000010110111110010101000111000111101011010000101011110001110
Actual:   110000010110111110010101000111000111101011010000101011110001101

Input:    245540327e+122
Output:   2.4554032699999998e130
Expected: 101101100000001101101100010001100011110000110001100010111001011
Actual:   101101100000001101101100010001100011110000110001100010111001010

Input:    6138508175e+120
Output:   6.138508174999999e129
Expected: 101101011100001101101100010001100011110000110001100010111001011
Actual:   101101011100001101101100010001100011110000110001100010111001010

Input:    83356057653e+193
Output:   8.335605765299999e203
Expected: 110101001000101010001001110011011011010111011100010101000011000
Actual:   110101001000101010001001110011011011010111011100010101000010111

Input:    619534293513e+124
Output:   6.195342935129999e135
Expected: 101110000100001000011000010000000110000001111111110000011110001
Actual:   101110000100001000011000010000000110000001111111110000011110000

Input:    2335141086879e+218
Output:   2.3351410868789998e230
Expected: 110111111000011010000001010000111001001001100101100000111101110
Actual:   110111111000011010000001010000111001001001100101100000111101101

Input:    899810892172646163e+283
Output:   8.998108921726461e300
Expected: 111111001101010110111110101000111111010000001010101111000000011
Actual:   111111001101010110111110101000111111010000001010101111000000010

Input:    7120190517612959703e+120
Output:   7.120190517612959e138
Expected: 101110011000011001000100000110111001101010110001001100111111101
Actual:   101110011000011001000100000110111001101010110001001100111111100

Input:    308984926168550152811e-052
Output:   3.089849261685501e-32
Expected: 11100101100100000011011110010010000110011101100110010100111011
Actual:   11100101100100000011011110010010000110011101100110010100111010

Input:    6372891218502368041059e+064
Output:   6.372891218502367e85
Expected: 101000111000000011001110000010001111101101110110011100011111110
Actual:   101000111000000011001110000010001111101101110110011100011111101

There seem to be quite a number of differences in the failures between 32-bit and 64-bit.

Note: IronScheme does not use double.ToString for printing, so Output will differ.

Update: Updated results as double.Parse was not used for all paths in IronScheme (notably for possible exact numbers, e.g. 234e234), resulting in fewer failures than previously listed.

@leppie
Contributor

leppie commented Aug 28, 2015

Interesting fact: All of these tests roundtrip correctly. At least it is consistent in one way 👍

@gafter
Member Author

gafter commented Aug 28, 2015

If any of you would care to offer a correct C# implementation of floating-point scanning, we'd use it in Roslyn (C# and VB compilers)!
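
As a baseline to test any such implementation against, a slow but exact reference can be sketched with System.Numerics.BigInteger. This is only an illustration under stated assumptions, not the scanner Roslyn or the runtime ended up using: it accepts unsigned input of the form digits[.digits][e[+|-]digits], does no validation, and the class name is made up. It treats the input as an exact fraction num/den, scales it to a 53-bit integer quotient, and rounds to nearest with ties to even:

```csharp
using System;
using System.Globalization;
using System.Numerics;

public static class ReferenceParser
{
    private static readonly BigInteger TwoTo52 = BigInteger.One << 52;
    private static readonly BigInteger TwoTo53 = BigInteger.One << 53;

    public static double Parse(string text)
    {
        int ePos = text.IndexOfAny(new[] { 'e', 'E' });
        string digits = ePos < 0 ? text : text.Substring(0, ePos);
        int exp10 = ePos < 0 ? 0
            : int.Parse(text.Substring(ePos + 1), CultureInfo.InvariantCulture);

        int dot = digits.IndexOf('.');
        if (dot >= 0)
        {
            exp10 -= digits.Length - dot - 1;   // fold fraction digits into the exponent
            digits = digits.Remove(dot, 1);
        }

        // The input is exactly num / den.
        BigInteger num = BigInteger.Parse(digits, CultureInfo.InvariantCulture);
        BigInteger den = BigInteger.One;
        if (num.IsZero) return 0.0;
        if (exp10 >= 0) num *= BigInteger.Pow(10, exp10);
        else den = BigInteger.Pow(10, -exp10);

        // Find k so that q = floor(num / (den * 2^k)) has exactly 53 bits,
        // but never go below k = -1074, where subnormals take over.
        int k = Math.Max(RoughBits(num) - RoughBits(den) - 53, -1074);
        BigInteger q, r, d;
        while (true)
        {
            BigInteger n = k < 0 ? num << -k : num;
            d = k > 0 ? den << k : den;
            q = BigInteger.DivRem(n, d, out r);
            if (q >= TwoTo53) k++;
            else if (q < TwoTo52 && k > -1074) k--;
            else break;
        }

        // Round to nearest; an exact tie goes to the even significand.
        int cmp = (r << 1).CompareTo(d);
        if (cmp > 0 || (cmp == 0 && !q.IsEven)) q++;

        if (k > 971) return double.PositiveInfinity;

        // Adding q (with its implicit bit 52) to the exponent field also
        // propagates a rounding carry into the exponent, up to infinity.
        long bits = ((long)(k + 1074) << 52) + (long)q;
        return BitConverter.Int64BitsToDouble(bits);
    }

    // Coarse size estimate in bits; the loop above corrects any error.
    private static int RoughBits(BigInteger x) => x.ToByteArray().Length * 8;

    public static void Main()
    {
        foreach (string s in new[] { "0.6822871999174", "1e126", "5e-107", "623e+100" })
            Console.WriteLine("{0} -> 0x{1:X16}", s, BitConverter.DoubleToInt64Bits(Parse(s)));
    }
}
```

The cost grows with the size of the decimal exponent, so this is suited to generating expected values for tests rather than to production parsing.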

@leppie
Contributor

leppie commented Sep 2, 2015

Another problem number.

2.470328229206232720882844e-324 => 0 while it should be 5e-324 (only ULP bit is set).

@leppie
Contributor

leppie commented Sep 2, 2015

@gafter: There seems to be a C# port of the Java port of the V8 implementation of David Gay's dtoa.c, which would be correct.

Example: https://github.com/dretax/DX-Fougerite/tree/327727654930dc0a8714df07373eb9b4c5e14805/JintPlugin/Jint/Native/Number/Dtoa

(searching for DiyFp shows a few repos, but I have no idea about the origin)

Appears that is only for printing and not parsing :(

@gafter
Member Author

gafter commented Sep 18, 2015

Wrong answer with as few as 1 input digit

The strings "1e126" and "5e-107" appear to parse into different numbers from double.Parse depending on whether you're running on a 32-bit or 64-bit runtime. The former gets the right result from double.Parse only on 64-bit systems; the latter gets the right result from double.Parse only on 32-bit systems.

/cc @MattGertz @jaredpar

@gafter
Member Author

gafter commented Sep 18, 2015

@terrajobst Would you folks consider a contribution to replace double.Parse, float.Parse and related methods with an IEEE-correct version? We're writing one for Roslyn anyway.

@weshaggard
Member

@gafter it is something we would consider. The biggest issue I see is the compat issue with existing code for the full framework. If we felt the risk was too high, we could consider #ifdef'ing it for CoreCLR.

/cc: @AlexGhiondea @ellismg @joshfree

@gafter
Member Author

gafter commented Sep 20, 2015

@weshaggard Do you expect that we have customers depending on the fact that double.Parse gets different answers on 32-bit and 64-bit systems?
</sarcasm>

gafter changed the title from "Double.Parse gets the wrong answer for inputs with as few as 3 significant digits." to "Double.Parse gets the wrong answer for inputs with as few as 1 significant digits." on Sep 22, 2015
@gafter
Member Author

gafter commented Oct 24, 2016

tannergooding self-assigned this on Sep 11, 2018
msftgits transferred this issue from dotnet/coreclr on Jan 30, 2020
msftgits added this to the Future milestone on Jan 30, 2020
tannergooding removed their assignment on May 26, 2020
ghost locked as resolved and limited conversation to collaborators on Jan 5, 2021