-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Spline fitting failure because of rounding #1311
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
@Antoine-0 Thanks very much for this detailed and complete report. I can reproduce the outcome you report. I'm not opposed to what you propose: some internal checks to the private function _schumaker_qspline that could raise a more informative error. And to adding to the documentation of fit_desoto_sandia (and similar) to more clearly communicate expectations for the input data. The source of the problem here appears to be the input data. The voltages contain negative and duplicate values. These algorithms are untested with negative voltages or currents, and are almost certain to fail with duplicate voltage values. Also, the current values appear to have a precision issue: for example, in the first IV curve, the current corresponding to many low voltages is one of 10.342, 10.327, 10.32, or 10.313. The lack of monotonically non-increasing current is also likely to create issues with the spline fit, which is attempting to return a sequence of quadratic splines that together are concave downward. The |
Exactly, the issue comes from the input data, even after using I will look at what changes in the function behaviour if I round aa and b when they are smaller than 1e-10. But anyway, EDIT : well I've only added |
Thanks again for running this to ground. I agree with you that the code needs to handle floating point round-off error; that's not something I considered when writing the original Matlab function. These two lines:
should assign Do you want to submit a PR? |
Yes, I can submit a PR with the following additional lines : |
You can create a branch in your own fork, and make the pull request from there. Instructions are here. I'm happy to do the PR if you'd prefer. |
Well, I tried to make a PR but it seems that all the checks are failing, I'll look it up again tomorrow |
…1315) * Added rounding of aa and b in _schumaker_qspline.py, see Issue #1311 * adjust EPS constants, remove part of test * lint, whatsnew * add whatnew v0.9.1 to index Co-authored-by: Antoine Ravetta <[email protected]> Co-authored-by: Cliff Hansen <[email protected]>
Hello,
I'm doing an internship about SDM and DDM parameter extraction from experimental IV curves and I've begun to use pvlib (by the way, thank you for the great documentation, it was really helpful to understand the library !).
After some tinkering with my data, I got errors in numpy.linalg but I found out it boils down to the _schumaker_qspline function. #1235
I don't know if this is really a bug, or if it is already known (the only issue that mentioned the qspline #1228 maybe briefly talks about it)
After some analysis, here is what I understand :
Describe the bug
It all starts in the first else of the 5th step of algo 4.1 of the Schumaker publication, (lines 313-314 of pvlib.ivtools.utils.py) :
a = s_i - \delta_i
b = s_{i+1} - \delta_i
the former difference can at some points lead to numerical residues : i.e. b_i = -1e-18 instead of 0.
It can lead to a_i * b_i being negative but really small and thus pass the next if of the algorithm.
But the computation of xi afterwards might lead to rounding of this really small value and in the end we can get xi_i = t_i
This will induce a division by zero in the definition of eta_i (line 354 of utils.py). No problem so far but this np.inf will become np.nan after other calculations. These np.nan will thwart the execution of the coming mean() and cumsum() functions. In the end, it is the least_squares fitting function that throws an error because in cannot converge when one of its arguments is an array of np.nan
To Reproduce
Steps to reproduce the behavior:
So far the only indices where the division by zero occur are within the w indices. In the data provided, I sometimes found other indices where b was really small and a * b negative without leading to a division by zero because rounding did not occur (e.g. xi was 32.0000000000004 and uu 32). I've only checked for the first five curves but it is always the same :
there are some indices where xi == uu. At those indices, w is True and v and z are False. The b values at those indices are really small (max 1e-14) and aa * b at those indices is negative (were it really should be equal to 0)
Expected behavior
IMHO, a test should be added after the computation of aa and b, and if they are close to zero, they should be rounded to ensure proper behaviour of the upcoming tests.
Let me know if you can reproduce this and what you think of it. I think the actual treatment of NaNs is good, replacing mean by nanmean will not do good in the long run.
I realized that smoothing the data could significantly reduce the number of errors but not all of them, so that is why I decided to open this issue.
Versions:
pvlib.__version__
: 0.9.0numpy.__version__
: 1.20.1The text was updated successfully, but these errors were encountered: