Commit 66c8b3f

kandersolar authored and wholmgren committed
refactor get_psm3 code into parse_psm3, read_psm3 (#842)
* Split out PSM3 parsing code into its own function read_psm3; update docstring to include new PSM3 metadata fields
* update tests and code for read_psm3 and get_psm3
  - get_psm3: add leap_day parameter
  - get_psm3: change API endpoint for TMY requests
  - get_psm3: update allowed names list to include 2018 datasets
  - get_psm3: add single-year test
  - get_psm3: change existing test to test TMY data only, remove invalid interval test
  - read_psm3: fix warning of blank columns from excel CSVs
  - read_psm3: change dtype of cloud type and fill flag to int
  - read_psm3: add tests for reading filename and file-like object
* Create whatsnew for 0.7.1, add read_psm3 entry to api.rst
* add read_psm3 to iotools/__init__.py
* Clean up NSRDB PSM3 API urls
* Change read_psm3 to parse_psm3, remove ability to read files, add small example of reading files
* Add read_psm3 back in, refactor psm3 tests to reduce code duplication
* fix typo in parse_psm3 documentation
1 parent a0463d4 commit 66c8b3f

9 files changed: +35306 -52 lines changed

docs/sphinx/source/api.rst

Lines changed: 2 additions & 0 deletions
@@ -369,6 +369,8 @@ relevant to solar energy modeling.
    iotools.read_crn
    iotools.read_solrad
    iotools.get_psm3
+   iotools.read_psm3
+   iotools.parse_psm3
 
 A :py:class:`~pvlib.location.Location` object may be created from metadata
 in some files.

docs/sphinx/source/whatsnew.rst

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ What's New
 
 These are new features and improvements of note in each release.
 
+.. include:: whatsnew/v0.7.1.rst
 .. include:: whatsnew/v0.7.0.rst
 .. include:: whatsnew/v0.6.3.rst
 .. include:: whatsnew/v0.6.2.rst

docs/sphinx/source/whatsnew/v0.7.1.rst

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+.. _whatsnew_0710:
+
+v0.7.1 (MONTH DAY, YEAR)
+------------------------
+
+
+API Changes
+~~~~~~~~~~~
+
+Enhancements
+~~~~~~~~~~~~
+* Added :py:func:`~pvlib.iotools.read_psm3` to read local NSRDB PSM3 files and
+  :py:func:`~pvlib.iotools.parse_psm3` to parse local NSRDB PSM3 file-like
+  objects. (:issue:`841`)
+* Added `leap_day` parameter to `iotools.get_psm3` instead of hardcoding it as
+  False.
+
+
+Bug fixes
+~~~~~~~~~
+* Changed the PSM3 API endpoint for TMY requests in `iotools.get_psm3`.
+
+Testing
+~~~~~~~
+* Added single-year PSM3 API test for `iotools.get_psm3`.
+* Added tests for `iotools.parse_psm3` and `iotools.read_psm3`.
+
+Documentation
+~~~~~~~~~~~~~
+* Updated list of allowed years for `iotools.get_psm3`.
+
+Contributors
+~~~~~~~~~~~~
+* Kevin Anderson (:ghuser:`kanderso-nrel`)
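
The two `get_psm3` entries above (the `leap_day` parameter and the separate TMY endpoint) can be illustrated with a small standalone sketch. The URL constants are copied from the psm3.py diff in this commit; the helper names `select_endpoint` and `leap_day_param` are hypothetical and exist only for this illustration, since the commit performs both steps inline inside `get_psm3`.

```python
# URL constants as they appear in the psm3.py diff of this commit.
NSRDB_API_BASE = "https://developer.nrel.gov"
PSM_URL = NSRDB_API_BASE + "/api/solar/nsrdb_psm3_download.csv"
TMY_URL = NSRDB_API_BASE + "/api/nsrdb_api/solar/nsrdb_psm3_tmy_download.csv"


def select_endpoint(names):
    """Hypothetical helper: choose the download URL for a `names` value.

    Mirrors the substring check in the diff: any value containing
    'tmy', 'tgy' or 'tdy' goes to the TMY endpoint.
    """
    if any(prefix in names for prefix in ('tmy', 'tgy', 'tdy')):
        return TMY_URL
    return PSM_URL


def leap_day_param(leap_day):
    """Hypothetical helper: serialize a bool as 'true' or 'false'."""
    return str(leap_day).lower()


print(select_endpoint('2017'))      # single-year request
print(select_endpoint('tgy-2018'))  # typical-year request
print(leap_day_param(True))
```

Because the check is a substring test, `names` has to be a string here; passing an integer year such as `2017` would raise a `TypeError` in this sketch.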

pvlib/data/test_psm3_2017.csv

Lines changed: 17521 additions & 0 deletions
Large diffs are not rendered by default.
File renamed without changes.

pvlib/data/test_read_psm3.csv

Lines changed: 17523 additions & 0 deletions
Large diffs are not rendered by default.

pvlib/iotools/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -11,3 +11,5 @@
 from pvlib.iotools.crn import read_crn  # noqa: F401
 from pvlib.iotools.solrad import read_solrad  # noqa: F401
 from pvlib.iotools.psm3 import get_psm3  # noqa: F401
+from pvlib.iotools.psm3 import read_psm3  # noqa: F401
+from pvlib.iotools.psm3 import parse_psm3  # noqa: F401
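
The three exported functions share one parser: `get_psm3` wraps the downloaded text in a buffer, `read_psm3` opens a local file, and both hand a file-like object to `parse_psm3`. A minimal sketch of that layering, using only the standard library, might look like the following. The names `parse_records`, `read_records`, `get_records` and the `fetch` callable are hypothetical stand-ins, not pvlib functions, and the two-line sample is made up.

```python
import io
import os
import tempfile


def parse_records(fbuf):
    """Core parser: accepts any file-like object that yields text lines."""
    fields = fbuf.readline().strip().split(',')
    values = fbuf.readline().strip().split(',')
    return dict(zip(fields, values))


def read_records(filename):
    """Thin wrapper for local files, analogous to read_psm3."""
    with open(filename, 'r') as fbuf:
        return parse_records(fbuf)


def get_records(fetch):
    """Thin wrapper for downloaded text; `fetch` stands in for the HTTP call."""
    return parse_records(io.StringIO(fetch()))


# Made-up sample content, not real NSRDB data.
SAMPLE = "Source,Location ID\nNSRDB,12345\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'sample.csv')
    with open(path, 'w') as f:
        f.write(SAMPLE)
    from_file = read_records(path)

from_web = get_records(lambda: SAMPLE)
print(from_file == from_web)
```

Keeping the parsing in one function means the file and download paths cannot drift apart, which is also what lets the commit's tests cover both with the same expectations.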

pvlib/iotools/psm3.py

Lines changed: 159 additions & 42 deletions
@@ -9,7 +9,9 @@
 import pandas as pd
 from json import JSONDecodeError
 
-URL = "http://developer.nrel.gov/api/solar/nsrdb_psm3_download.csv"
+NSRDB_API_BASE = "https://developer.nrel.gov"
+PSM_URL = NSRDB_API_BASE + "/api/solar/nsrdb_psm3_download.csv"
+TMY_URL = NSRDB_API_BASE + "/api/nsrdb_api/solar/nsrdb_psm3_tmy_download.csv"
 
 # 'relative_humidity', 'total_precipitable_water' are not available
 ATTRIBUTES = [
@@ -19,9 +21,11 @@
 
 
 def get_psm3(latitude, longitude, api_key, email, names='tmy', interval=60,
-             full_name=PVLIB_PYTHON, affiliation=PVLIB_PYTHON, timeout=30):
+             leap_day=False, full_name=PVLIB_PYTHON, affiliation=PVLIB_PYTHON,
+             timeout=30):
     """
-    Get PSM3 data
+    Retrieve NSRDB [1]_ PSM3 timeseries weather data from the PSM3 API [2]_
+    [3]_.
 
     Parameters
     ----------
@@ -38,7 +42,11 @@ def get_psm3(latitude, longitude, api_key, email, names='tmy', interval=60,
         PSM3 API parameter specifying year or TMY variant to download, see notes
         below for options
     interval : int, default 60
-        interval size in minutes, can only be either 30 or 60
+        interval size in minutes, can only be either 30 or 60. Only used for
+        single-year requests (i.e., it is ignored for tmy/tgy/tdy requests).
+    leap_day : boolean, default False
+        include leap day in the results. Only used for single-year requests
+        (i.e., it is ignored for tmy/tgy/tdy requests).
     full_name : str, default 'pvlib python'
         optional
     affiliation : str, default 'pvlib python'
@@ -49,7 +57,8 @@ def get_psm3(latitude, longitude, api_key, email, names='tmy', interval=60,
     Returns
     -------
     headers : dict
-        metadata from NREL PSM3 about the record, see notes for fields
+        metadata from NREL PSM3 about the record, see
+        :func:`pvlib.iotools.parse_psm3` for fields
     data : pandas.DataFrame
         timeseries data from NREL PSM3
 
@@ -74,50 +83,25 @@ def get_psm3(latitude, longitude, api_key, email, names='tmy', interval=60,
 
         ['1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005',
          '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013',
-         '2014', '2015', '2016', '2017', 'tmy', 'tmy-2016', 'tmy-2017',
-         'tdy-2017', 'tgy-2017']
-
-    The return is a tuple with two items. The first item is a header with
-    metadata from NREL PSM3 about the record containing the following fields:
-
-    * Source
-    * Location ID
-    * City
-    * State
-    * Country
-    * Latitude
-    * Longitude
-    * Time Zone
-    * Elevation
-    * Local Time Zone
-    * Dew Point Units
-    * DHI Units
-    * DNI Units
-    * GHI Units
-    * Temperature Units
-    * Pressure Units
-    * Wind Direction Units
-    * Wind Speed
-    * Surface Albedo Units
-    * Version
-
-    The second item is a dataframe with the timeseries data downloaded.
+         '2014', '2015', '2016', '2017', '2018', 'tmy', 'tmy-2016', 'tmy-2017',
+         'tdy-2017', 'tgy-2017', 'tmy-2018', 'tdy-2018', 'tgy-2018']
 
     .. warning:: PSM3 is limited to data found in the NSRDB, please consult the
         references below for locations with available data
 
     See Also
     --------
-    pvlib.iotools.read_tmy2, pvlib.iotools.read_tmy3
+    pvlib.iotools.read_psm3, pvlib.iotools.parse_psm3
 
     References
     ----------
 
-    .. [1] `NREL Developer Network - Physical Solar Model (PSM) v3
-       <https://developer.nrel.gov/docs/solar/nsrdb/psm3_data_download/>`_
-    .. [2] `NREL National Solar Radiation Database (NSRDB)
+    .. [1] `NREL National Solar Radiation Database (NSRDB)
        <https://nsrdb.nrel.gov/>`_
-
+    .. [2] `NREL Developer Network - Physical Solar Model (PSM) v3
+       <https://developer.nrel.gov/docs/solar/nsrdb/psm3_data_download/>`_
+    .. [3] `NREL Developer Network - Physical Solar Model (PSM) v3 TMY
+       <https://developer.nrel.gov/docs/solar/nsrdb/psm3_tmy_data_download/>`_
     """
     # The well known text (WKT) representation of geometry notation is strict.
     # A POINT object is a string with longitude first, then the latitude, with
@@ -137,11 +121,15 @@ def get_psm3(latitude, longitude, api_key, email, names='tmy', interval=60,
         'wkt': 'POINT(%s %s)' % (longitude, latitude),
         'names': names,
         'attributes': ','.join(ATTRIBUTES),
-        'leap_day': 'false',
+        'leap_day': str(leap_day).lower(),
         'utc': 'false',
         'interval': interval
     }
     # request CSV download from NREL PSM3
+    if any(prefix in names for prefix in ('tmy', 'tgy', 'tdy')):
+        URL = TMY_URL
+    else:
+        URL = PSM_URL
     response = requests.get(URL, params=params, timeout=timeout)
     if not response.ok:
         # if the API key is rejected, then the response status will be 403
@@ -154,7 +142,97 @@ def get_psm3(latitude, longitude, api_key, email, names='tmy', interval=60,
     # the CSV is in the response content as a UTF-8 bytestring
     # to use pandas we need to create a file buffer from the response
     fbuf = io.StringIO(response.content.decode('utf-8'))
-    # The first 2 lines of the response are headers with metadat
+    return parse_psm3(fbuf)
+
+
+def parse_psm3(fbuf):
+    """
+    Parse an NSRDB [1]_ PSM3 weather file (formatted as SAM CSV [2]_).
+
+    Parameters
+    ----------
+    fbuf: file-like object
+        File-like object containing data to read.
+
+    Returns
+    -------
+    headers : dict
+        metadata from NREL PSM3 about the record, see notes for fields
+    data : pandas.DataFrame
+        timeseries data from NREL PSM3
+
+    Notes
+    -----
+    The return is a tuple with two items. The first item is a header with
+    metadata from NREL PSM3 about the record containing the following fields:
+
+    * Source
+    * Location ID
+    * City
+    * State
+    * Country
+    * Latitude
+    * Longitude
+    * Time Zone
+    * Elevation
+    * Local Time Zone
+    * Clearsky DHI Units
+    * Clearsky DNI Units
+    * Clearsky GHI Units
+    * Dew Point Units
+    * DHI Units
+    * DNI Units
+    * GHI Units
+    * Solar Zenith Angle Units
+    * Temperature Units
+    * Pressure Units
+    * Relative Humidity Units
+    * Precipitable Water Units
+    * Wind Direction Units
+    * Wind Speed
+    * Cloud Type -15
+    * Cloud Type 0
+    * Cloud Type 1
+    * Cloud Type 2
+    * Cloud Type 3
+    * Cloud Type 4
+    * Cloud Type 5
+    * Cloud Type 6
+    * Cloud Type 7
+    * Cloud Type 8
+    * Cloud Type 9
+    * Cloud Type 10
+    * Cloud Type 11
+    * Cloud Type 12
+    * Fill Flag 0
+    * Fill Flag 1
+    * Fill Flag 2
+    * Fill Flag 3
+    * Fill Flag 4
+    * Fill Flag 5
+    * Surface Albedo Units
+    * Version
+
+    The second item is a dataframe with the PSM3 timeseries data.
+
+    Examples
+    --------
+    >>> # Read a local PSM3 file:
+    >>> with open(filename, 'r') as f:  # doctest: +SKIP
+    ...     metadata, df = iotools.parse_psm3(f)  # doctest: +SKIP
+
+    See Also
+    --------
+    pvlib.iotools.read_psm3, pvlib.iotools.get_psm3
+
+    References
+    ----------
+    .. [1] `NREL National Solar Radiation Database (NSRDB)
+       <https://nsrdb.nrel.gov/>`_
+    .. [2] `Standard Time Series Data File Format
+       <https://rredc.nrel.gov/solar/old_data/nsrdb/2005-2012/wfcsv.pdf>`_
+    """
+    # The first 2 lines of the response are headers with metadata
     header_fields = fbuf.readline().split(',')
     header_fields[-1] = header_fields[-1].strip()  # strip trailing newline
     header_values = fbuf.readline().split(',')
@@ -169,15 +247,54 @@ def get_psm3(latitude, longitude, api_key, email, names='tmy', interval=60,
     # get the column names so we can set the dtypes
     columns = fbuf.readline().split(',')
     columns[-1] = columns[-1].strip()  # strip trailing newline
+    # Since the header has so many columns, excel saves blank cols in the
+    # data below the header lines.
+    columns = [col for col in columns if col != '']
     dtypes = dict.fromkeys(columns, float)  # all floats except datevec
     dtypes.update(Year=int, Month=int, Day=int, Hour=int, Minute=int)
+    dtypes['Cloud Type'] = int
+    dtypes['Fill Flag'] = int
     data = pd.read_csv(
-        fbuf, header=None, names=columns, dtype=dtypes,
+        fbuf, header=None, names=columns, usecols=columns, dtype=dtypes,
         delimiter=',', lineterminator='\n')  # skip carriage returns \r
     # the response 1st 5 columns are a date vector, convert to datetime
     dtidx = pd.to_datetime(
         data[['Year', 'Month', 'Day', 'Hour', 'Minute']])
-    # in USA all timezones are intergers
+    # in USA all timezones are integers
     tz = 'Etc/GMT%+d' % -header['Time Zone']
     data.index = pd.DatetimeIndex(dtidx).tz_localize(tz)
+
     return header, data
+
+
+def read_psm3(filename):
+    """
+    Read an NSRDB [1]_ PSM3 weather file (formatted as SAM CSV [2]_).
+
+    Parameters
+    ----------
+    filename: str
+        Filename of a file containing data to read.
+
+    Returns
+    -------
+    headers : dict
+        metadata from NREL PSM3 about the record, see
+        :func:`pvlib.iotools.parse_psm3` for fields
+    data : pandas.DataFrame
+        timeseries data from NREL PSM3
+
+    See Also
+    --------
+    pvlib.iotools.parse_psm3, pvlib.iotools.get_psm3
+
+    References
+    ----------
+    .. [1] `NREL National Solar Radiation Database (NSRDB)
+       <https://nsrdb.nrel.gov/>`_
+    .. [2] `Standard Time Series Data File Format
+       <https://rredc.nrel.gov/solar/old_data/nsrdb/2005-2012/wfcsv.pdf>`_
+    """
+    with open(filename, 'r') as fbuf:
+        content = parse_psm3(fbuf)
+    return content
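
Two details of `parse_psm3` are easy to miss in the diff: the blank-column filter for CSVs re-saved by Excel, and the sign flip when building the `Etc/GMT` timezone name. The following standalone sketch walks through both on a made-up miniature file using only the standard library. The sample values (including the version string) are invented, and the explicit `int()` conversion of the `Time Zone` field is an assumption, because the lines of `parse_psm3` that convert header values are not shown in this diff.

```python
import io

# Made-up miniature of a PSM3-style CSV as Excel might re-save it: the
# metadata lines are wider than the data table, so the column-name line
# and the data rows are padded with empty trailing cells.
SAMPLE = (
    "Source,Location ID,Time Zone,Version\n"
    "NSRDB,12345,-7,3.0.6\n"
    "Year,Month,Day,Hour,Minute,GHI,,\n"
    "2017,1,1,0,30,0,,\n"
)

fbuf = io.StringIO(SAMPLE)

# First two lines: metadata field names and their values.
header_fields = fbuf.readline().strip().split(',')
header_values = fbuf.readline().strip().split(',')
header = dict(zip(header_fields, header_values))
# Assumption: the UTC offset is converted to an integer before use.
header['Time Zone'] = int(header['Time Zone'])

# Third line: column names. Drop the blank padding cells, as the diff does.
columns = fbuf.readline().strip().split(',')
columns = [col for col in columns if col != '']

# "Etc/GMT" zone names use the POSIX sign convention, which is the reverse
# of the usual UTC offset sign, hence the negation.
tz = 'Etc/GMT%+d' % -header['Time Zone']

print(columns)
print(tz)
```

In the commit itself the filtered list is passed to `pd.read_csv` as both `names` and `usecols`, so the padding cells in the data rows are dropped along with the blank column names.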
