Skip to content

Add pvlib.temperature.fuentes #1037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Sep 3, 2020
Merged

Add pvlib.temperature.fuentes #1037

merged 15 commits into from
Sep 3, 2020

Conversation

kandersolar
Copy link
Member

@kandersolar kandersolar commented Aug 29, 2020

  • Partially addresses Add Fuentes temperature model #1032 (this is only the low-level function, doesn't expose it in the high-level APIs)
  • I am familiar with the contributing guidelines
  • Tests added
  • Updates entries to docs/sphinx/source/api.rst for API changes.
  • Adds description and name entries in the appropriate "what's new" file in docs/sphinx/source/whatsnew for all changes. Includes link to the GitHub Issue with :issue:`num` or this Pull Request with :pull:`num`. Includes contributor name and/or GitHub username (link with :ghuser:`user`).
  • New code is fully documented. Includes numpydoc compliant docstrings, examples, and comments where necessary.
  • Pull request is nearly complete and ready for detailed review.
  • Maintainer: Appropriate GitHub Labels and Milestone are assigned to the Pull Request and linked Issue.

This implementation is a port of the original FORTRAN77 code with some minor refactoring. As @wholmgren mentioned in #1032, numba might be a good option for speeding up this non-vectorized code.

If it's okay, I'd prefer to integrate it with PVSystem and ModelChain in a follow-on PR and focus this one on the model implementation. Also, I put the whatsnew entry in v0.8.0 but OK with me to push it to v0.8.1.

@kandersolar kandersolar marked this pull request as ready for review August 30, 2020 23:01
@kandersolar
Copy link
Member Author

Ready for review. Test failures seem unrelated (test_read_midc_raw_data_from_nrel).

@wholmgren
Copy link
Member

Thanks @kanderso-nrel this looks great!

Can you comment on the performance? I'm not so concerned about the absolute performance, just that we have a benchmark for future reference. This is a case where it would be nice to have #530.

What is the motivation for using a DataFrame within the function? My main concern is that if we try to speed this up with numba then we might want to replace the DataFrame with numpy arrays or lists. I suspect the vast majority of the time is spent in the inner loop and that's fine to jit as is. But the outer loop requires the GIL and I worry, perhaps prematurely, that the pandas __getitem__ and especially the __setitem__ will become bottlenecks. In any case, it would probably easier to numba-ify the whole loop instead of just the inner loop.

@kandersolar
Copy link
Member Author

What is the motivation for using a DataFrame within the function?

The only motivation is because I like the df.iterrows() interface. Turns out that df.loc[i, 'colname'] = value is dramatically slower than I would have predicted. I switched to looping through a zip of the individual series and storing results in a numpy array, reducing the runtime from 4-5 seconds to ~1 second for an 8760.

For whether numba is worthwhile, by commenting out the inner loop, runtime goes down to 7ms on my computer, so it is true that 99% of the runtime is now spent in the inner loop. Extracting the body of the inner loop to its own function and decorating it and _fuentes_hconv with @numba.jit brings down runtime to 728ms on the first run and ~72ms for subsequent runs. I'm impressed that even with the overhead of compiling on the first invocation, it still manages to be faster.

If we were to use numba here, would we centralize the "use numba if available method" from spa.py and use it for both spa.py and fuentes?

@wholmgren
Copy link
Member

Yeah, pandas indexing can be much slower than numpy or list operations. You don't get all that fancy pandas behavior for free! Usually doesn't matter in practice but in a loop it can be costly.

I wasn't expecting a numba implementation in this PR but great to hear of the progress!

If we were to use numba here, would we centralize the "use numba if available method" from spa.py and use it for both spa.py and fuentes?

That works for me. Also worth noting that spa numba code is ~5 years old, which is most of numba's lifetime. So it might be worth checking if there's a more modern pattern for conditional numba usage.

@kandersolar
Copy link
Member Author

After taking a closer look, the way numba is used with spa.py and solarposition.py is more complicated than I thought, it's not just swapping in a no-op decorator if numba isn't available. Now that the pure python implementation is not so slow anymore, could we push the numba idea to a different PR? I think it would be worth chewing on it for a bit.

Test failures not related.

@wholmgren
Copy link
Member

Yes, I think that's a good plan.

@cwhanse
Copy link
Member

cwhanse commented Sep 1, 2020

@kanderso-nrel do you have access to the PVWatts code? The reason for adding this model to pvlib-python is to better emulate what PVWatts is doing. It would be good to check that this port from Fuentes's FORTRAN is consistent with the PVWatts code.

@kandersolar
Copy link
Member Author

Here's the implementation of the Fuentes model in the SSC: https://github.com/NREL/ssc/blob/c6e41d7d946723e3cc4111262abcab081c151621/shared/lib_pvwatts.cpp#L173-L290

I haven't gone through it detail but a quick skim suggests it's a straightforward port of the F77 code too. Note that pvwatts.nrel.gov uses PVWatts v5 in the SSC, but I think SAM uses v7. Both v5 and v7 import the above code, so the difference doesn't matter here.

Also in case you missed it, the data for the tests were fetched from pvwatts.nrel.gov, so that's a consistency check. But that doesn't mean there isn't additional value in comparing the code too.

@cwhanse
Copy link
Member

cwhanse commented Sep 1, 2020

interesting - the tilt is also hardwired at 30 degrees. I'm persuaded to keep the defaults to match PVWatts/SAM, but inform the user how to change the defaults for tilt and hydraulic_diameter

@wholmgren wholmgren mentioned this pull request Sep 1, 2020
module. [C]

module_height : float, default 5.0
The height above ground of the module above the ground. The PVWatts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should say at what position on the module the height is determined. I couldn't find that Fuentes specified that point, did you? I'm guessing center of module is appropriate, to represent the average wind speed along the module's face..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most specific description I found was "average array height above ground" in the second paragraph of the introduction. Not sure if that's referring to the change in height across a module or if it's trying to accommodate variation in racking height, but center of module makes sense to me too.

Copy link
Member

@cwhanse cwhanse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can go with v0.8

@wholmgren wholmgren added this to the 0.8.0 milestone Sep 2, 2020
Comment on lines +471 to +478
poa_global : pandas Series
Total incident irradiance [W/m^2]

temp_air : pandas Series
Ambient dry bulb temperature [C]

wind_speed : pandas Series
Wind speed [m/s]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like this function would work just as well with any kind of array input, right? If so, we should make the type requirements here and in the return line array_like and change the last line of the function to only return a series if poa_global is a series.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately the timedelta between measurements is needed for the thermal capacitance/time lag part of the model, so I think it has to either be Series inputs or another parameter would be added for the measurement timestamps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, right. let's leave it alone then. new merge conflict (sorry)

@wholmgren wholmgren merged commit 56e6498 into pvlib:master Sep 3, 2020
@wholmgren
Copy link
Member

thanks @kanderso-nrel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants