Skip to content

Commit 40ba4bd

Browse files
Add gallery example for modeling with interval averages (#1152)
* add example * whatsnew * fix title * make plots more compact * fix unrelated sphinx issue * forgot this wasn't latex * updates from review Co-authored-by: Will Holmgren <[email protected]>
1 parent 709daa6 commit 40ba4bd

File tree

2 files changed

+173
-1
lines changed

2 files changed

+173
-1
lines changed
Lines changed: 170 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,170 @@
1+
"""
2+
Modeling with interval averages
3+
===============================
4+
5+
Transposing interval-averaged irradiance data
6+
"""
7+
8+
# %%
9+
# This example shows how failing to account for the difference between
10+
# instantaneous and interval-averaged time series data can introduce
11+
# error in the modeling process. An instantaneous time series
12+
# represents discrete measurements taken at each timestamp, while
13+
# an interval-averaged time series represents the average value across
14+
# each data interval. For example, the value of an interval-averaged
15+
# hourly time series at 11:00 represents the average value between
16+
# 11:00 (inclusive) and 12:00 (exclusive), assuming the series is left-labeled.
17+
# For a right-labeled time series it would be the average value
18+
# between 10:00 (exclusive) and 11:00 (inclusive). Sometimes timestamps
19+
# are center-labeled, in which case it would be the
20+
# average value between 10:30 and 11:30.
21+
# Interval-averaged time series are common in
22+
# field data, where the datalogger averages high-frequency measurements
23+
# into low-frequency averages for archiving purposes.
24+
#
25+
# It is important to account for this difference when using
26+
# interval-averaged weather data for modeling. This example
27+
# focuses on calculating solar position appropriately for
28+
# irradiance transposition, but this concept is relevant for
29+
# other steps in the modeling process as well.
30+
#
31+
# This example calculates a POA irradiance timeseries at 1-second
32+
# resolution as a "ground truth" value. Then it performs the
33+
# transposition again at lower resolution using interval-averaged
34+
# irradiance components, once using a half-interval shift and
35+
# once just using the unmodified timestamps. The difference
36+
# affects the solar position calculation: for example, assuming
37+
# we have average irradiance for the interval 11:00 to 12:00,
38+
# and it came from a left-labeled time series, naively using
39+
# the unmodified timestamp will calculate solar position for 11:00,
40+
# meaning the calculated solar position is used to represent
41+
# times as far as an hour away. A better option would be to
42+
# calculate the solar position at 11:30 to reduce the maximum
43+
# timing error to only half an hour.
44+
45+
import pvlib
46+
import pandas as pd
47+
import matplotlib.pyplot as plt
48+
49+
# %%
50+
# First, we'll define a helper function that we can re-use several
51+
# times in the following code:
52+
53+
54+
def transpose(irradiance, timeshift):
55+
"""
56+
Transpose irradiance components to plane-of-array, incorporating
57+
a timeshift in the solar position calculation.
58+
59+
Parameters
60+
----------
61+
irradiance: DataFrame
62+
Has columns dni, ghi, dhi
63+
timeshift: float
64+
Number of minutes to shift for solar position calculation
65+
Outputs:
66+
Series of POA irradiance
67+
"""
68+
idx = irradiance.index
69+
# calculate solar position for shifted timestamps:
70+
idx = idx + pd.Timedelta(timeshift, unit='min')
71+
solpos = location.get_solarposition(idx)
72+
# but still report the values with the original timestamps:
73+
solpos.index = irradiance.index
74+
75+
poa_components = pvlib.irradiance.get_total_irradiance(
76+
surface_tilt=20,
77+
surface_azimuth=180,
78+
solar_zenith=solpos['apparent_zenith'],
79+
solar_azimuth=solpos['azimuth'],
80+
dni=irradiance['dni'],
81+
ghi=irradiance['ghi'],
82+
dhi=irradiance['dhi'],
83+
model='isotropic',
84+
)
85+
return poa_components['poa_global']
86+
87+
88+
# %%
89+
# Now, calculate the "ground truth" irradiance data. We'll simulate
90+
# clear-sky irradiance components at 1-second intervals and calculate
91+
# the corresponding POA irradiance. At such a short timescale, the
92+
# difference between instantaneous and interval-averaged irradiance
93+
# is negligible.
94+
95+
# baseline: all calculations done at 1-second scale
96+
location = pvlib.location.Location(40, -80, tz='Etc/GMT+5')
97+
times = pd.date_range('2019-06-01 05:00', '2019-06-01 19:00',
98+
freq='1s', tz='Etc/GMT+5')
99+
solpos = location.get_solarposition(times)
100+
clearsky = location.get_clearsky(times, solar_position=solpos)
101+
poa_1s = transpose(clearsky, timeshift=0) # no shift needed for 1s data
102+
103+
# %%
104+
# Now, we will aggregate the 1-second values into interval averages.
105+
# To see how the averaging interval affects results, we'll loop over
106+
# a few common data intervals and accumulate the results.
107+
108+
fig, ax = plt.subplots(figsize=(5, 3))
109+
110+
results = []
111+
112+
for timescale_minutes in [1, 5, 10, 15, 30, 60]:
113+
114+
timescale_str = f'{timescale_minutes}min'
115+
# get the "true" interval average of poa as the baseline for comparison
116+
poa_avg = poa_1s.resample(timescale_str).mean()
117+
# get interval averages of irradiance components to use for transposition
118+
clearsky_avg = clearsky.resample(timescale_str).mean()
119+
120+
# low-res interval averages of 1-second data, with NO shift
121+
poa_avg_noshift = transpose(clearsky_avg, timeshift=0)
122+
123+
# low-res interval averages of 1-second data, with half-interval shift
124+
poa_avg_halfshift = transpose(clearsky_avg, timeshift=timescale_minutes/2)
125+
126+
df = pd.DataFrame({
127+
'ground truth': poa_avg,
128+
'modeled, half shift': poa_avg_halfshift,
129+
'modeled, no shift': poa_avg_noshift,
130+
})
131+
error = df.subtract(df['ground truth'], axis=0)
132+
# add another trace to the error plot
133+
error['modeled, no shift'].plot(ax=ax, label=timescale_str)
134+
# calculate error statistics and save for later
135+
stats = error.abs().mean() # average absolute error across daylight hours
136+
stats['timescale_minutes'] = timescale_minutes
137+
results.append(stats)
138+
139+
ax.legend(ncol=2)
140+
ax.set_ylabel('Transposition Error [W/m$^2$]')
141+
fig.tight_layout()
142+
143+
df_results = pd.DataFrame(results).set_index('timescale_minutes')
144+
print(df_results)
145+
146+
# %%
147+
# The errors shown above are the average absolute difference in :math:`W/m^2`.
148+
# In this example, using the timestamps unadjusted creates an error that
149+
# increases with increasing interval length, up to a ~40% error
150+
# at hourly resolution. In contrast, incorporating a half-interval shift
151+
# so that solar position is calculated in the middle of the interval
152+
# instead of the edge reduces the error by one or two orders of magnitude:
153+
154+
fig, ax = plt.subplots(figsize=(5, 3))
155+
df_results[['modeled, no shift', 'modeled, half shift']].plot.bar(rot=0, ax=ax)
156+
ax.set_ylabel('Mean Absolute Error [W/m$^2$]')
157+
ax.set_xlabel('Transposition Timescale [minutes]')
158+
fig.tight_layout()
159+
160+
# %%
161+
# We can also plot the underlying time series results of the last
162+
# iteration (hourly in this case). The modeled irradiance using
163+
# no shift is effectively time-lagged compared with ground truth.
164+
# In contrast, the half-shift model is nearly identical to the ground
165+
# truth irradiance.
166+
167+
fig, ax = plt.subplots(figsize=(5, 3))
168+
ax = df.plot(ax=ax, style=[None, ':', None], lw=3)
169+
ax.set_ylabel('Irradiance [W/m$^2$]')
170+
fig.tight_layout()

docs/sphinx/source/whatsnew/v0.9.0.rst

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -128,7 +128,9 @@ Documentation
128128
~~~~~~~~~~~~~
129129

130130
* Update intro tutorial to highlight the use of historical meteorological data
131-
and to make the procedural and object oriented results match exactly.
131+
and to make the procedural and OO results match exactly. (:issue:`1116`, :pull:`1144`)
132+
* Add a gallery example showing how to appropriately use interval-averaged
133+
weather data for modeling. (:pull:`1152`)
132134
* Update documentation links in :py:func:`pvlib.iotools.get_psm3`
133135

134136
Requirements

0 commit comments

Comments
 (0)