Commit 906740e

ENH: add time-window capability to .rolling
xref pandas-dev#13327 CLN: pep for cython, xref pandas-dev#12995
1 parent e8e0aae commit 906740e

10 files changed, +2554 -649 lines changed
ci/lint.sh

+14 -2

@@ -7,7 +7,7 @@ source activate pandas
 RET=0
 
 if [ "$LINT" ]; then
-    echo "Linting"
+    echo "Linting *.py"
     for path in 'core' 'indexes' 'types' 'formats' 'io' 'stats' 'compat' 'sparse' 'tools' 'tseries' 'tests' 'computation' 'util'
     do
         echo "linting -> pandas/$path"
@@ -17,7 +17,19 @@ if [ "$LINT" ]; then
         fi
 
     done
-    echo "Linting DONE"
+    echo "Linting *.py DONE"
+
+    echo "Linting *.pyx"
+    for path in 'window.pyx'
+    do
+        echo "linting -> pandas/$path"
+        flake8 pandas/$path --filename '*.pyx' --select=E501,E302,E203,E226,E111,E114,E221,E303,E128,E231,E126,E128
+        if [ $? -ne "0" ]; then
+            RET=1
+        fi
+
+    done
+    echo "Linting *.pyx DONE"
 
     echo "Check for invalid testing"
     grep -r -E --include '*.py' --exclude nosetester.py --exclude testing.py '(numpy|np)\.testing' pandas

doc/source/computation.rst

+61

@@ -391,6 +391,67 @@ For some windowing functions, additional parameters must be specified:
   such that the weights are normalized with respect to each other. Weights
   of ``[1, 1, 1]`` and ``[2, 2, 2]`` yield the same result.
 
+.. _stats.moments.ts:
+
+Time-aware Rolling
+~~~~~~~~~~~~~~~~~~
+
+.. versionadded:: 0.18.2
+
+New in version 0.18.2 is the ability to pass an offset (or convertible) to a ``.rolling()`` method and have it produce
+variable sized windows based on the passed time window. This is useful for a non-regular time frequency index.
+
+.. ipython:: python
+
+   dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
+                      index=pd.date_range('20130101 09:00:00', periods=5))
+   dft
+
+This is a regular frequency index. Using an integer window parameter works to roll along the window frequency.
+
+.. ipython:: python
+
+   dft.rolling(2).sum()
+   dft.rolling(2, min_periods=1).sum()
+
+Specifying an offset allows a more intuitive specification of the rolling frequency.
+
+.. ipython:: python
+
+   dft.rolling('2s').sum()
+
+Using a non-regular, but still monotonic index, rolling with an integer window does not impart any special calculation.
+
+.. ipython:: python
+
+
+   dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
+                      index=pd.Index([pd.Timestamp('20130101 09:00:00'),
+                                      pd.Timestamp('20130101 09:00:02'),
+                                      pd.Timestamp('20130101 09:00:03'),
+                                      pd.Timestamp('20130101 09:00:05'),
+                                      pd.Timestamp('20130101 09:00:06')],
+                                     name='foo'))
+
+   dft
+   dft.rolling(2).sum()
+
+
+Using the time-specification generates variable windows for this sparse data.
+
+.. ipython:: python
+
+   dft.rolling('2s').sum()
+
+Furthermore, we now allow an optional ``on`` parameter to specify a column (rather than the
+default of the index) in a DataFrame.
+
+.. ipython:: python
+
+   dft = dft.reset_index()
+   dft
+   dft.rolling('2s', on='foo').sum()
+
 Centering Windows
 ~~~~~~~~~~~~~~~~~

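The documentation hunk above contrasts count-based and time-based windows on an irregular index. The following is a minimal, self-contained sketch of that contrast, assuming a pandas version that includes this change (0.18.2 or later). The names `by_count` and `by_time` are illustrative and do not appear in the commit.

```python
import numpy as np
import pandas as pd

# Irregular but monotonic timestamps, mirroring the docs example.
idx = pd.DatetimeIndex(['2013-01-01 09:00:00',
                        '2013-01-01 09:00:02',
                        '2013-01-01 09:00:03',
                        '2013-01-01 09:00:05',
                        '2013-01-01 09:00:06'], name='foo')
dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]}, index=idx)

# Integer window: always the current row plus the previous one,
# regardless of how much time separates them.
by_count = dft.rolling(2).sum()

# Offset window: only rows whose timestamp falls within the 2 seconds
# ending at the current row, so the number of rows per window varies.
by_time = dft.rolling('2s').sum()

# Show both results side by side (column labels are illustrative).
print(pd.concat([by_count['B'], by_time['B']], axis=1,
                keys=['rolling(2)', "rolling('2s')"]))
```

The two results differ partly because of the defaults: with an integer window, ``min_periods`` defaults to the window size, whereas with an offset window it defaults to 1, so a window holding a single valid observation still yields a value.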
doc/source/whatsnew/v0.18.2.txt

+59

@@ -10,6 +10,7 @@ We recommend that all users upgrade to this version.
 Highlights include:
 
 - :func:`merge_asof` for asof-style time-series joining, see :ref:`here <whatsnew_0182.enhancements.asof_merge>`
+- ``.rolling()`` is now time-series aware, see :ref:`here <whatsnew_0182.enhancements.rolling_ts>`
 
 .. contents:: What's new in v0.18.2
    :local:
@@ -111,6 +112,64 @@ that forward filling happens automatically taking the most recent non-NaN value.
 This returns a merged DataFrame with the entries in the same order as the original left
 passed DataFrame (``trades`` in this case), with the fields of the ``quotes`` merged.
 
+.. _whatsnew_0182.enhancements.rolling_ts:
+
+``.rolling()`` is now time-series aware
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``.rolling()`` objects are now time-series aware and can accept a time-series offset (or convertible) for the ``window`` argument (:issue:`13327`, :issue:`12995`)
+See the full documentation :ref:`here <stats.moments.ts>`.
+
+.. ipython:: python
+
+   dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
+                      index=pd.date_range('20130101 09:00:00', periods=5))
+   dft
+
+This is a regular frequency index. Using an integer window parameter works to roll along the window frequency.
+
+.. ipython:: python
+
+   dft.rolling(2).sum()
+   dft.rolling(2, min_periods=1).sum()
+
+Specifying an offset allows a more intuitive specification of the rolling frequency.
+
+.. ipython:: python
+
+   dft.rolling('2s').sum()
+
+Using a non-regular, but still monotonic index, rolling with an integer window does not impart any special calculation.
+
+.. ipython:: python
+
+
+   dft = pd.DataFrame({'B': [0, 1, 2, np.nan, 4]},
+                      index=pd.Index([pd.Timestamp('20130101 09:00:00'),
+                                      pd.Timestamp('20130101 09:00:02'),
+                                      pd.Timestamp('20130101 09:00:03'),
+                                      pd.Timestamp('20130101 09:00:05'),
+                                      pd.Timestamp('20130101 09:00:06')],
+                                     name='foo'))
+
+   dft
+   dft.rolling(2).sum()
+
+Using the time-specification generates variable windows for this sparse data.
+
+.. ipython:: python
+
+   dft.rolling('2s').sum()
+
+Furthermore, we now allow an optional ``on`` parameter to specify a column (rather than the
+default of the index) in a DataFrame.
+
+.. ipython:: python
+
+   dft = dft.reset_index()
+   dft
+   dft.rolling('2s', on='foo').sum()
+
 .. _whatsnew_0182.enhancements.read_csv_dupe_col_names_support:
 
 :func:`read_csv` has improved support for duplicate column names

pandas/core/generic.py

+3 -2

@@ -5311,11 +5311,12 @@ def _add_series_or_dataframe_operations(cls):
 
         @Appender(rwindow.rolling.__doc__)
         def rolling(self, window, min_periods=None, freq=None, center=False,
-                    win_type=None, axis=0):
+                    win_type=None, on=None, axis=0):
             axis = self._get_axis_number(axis)
             return rwindow.rolling(self, window=window,
                                    min_periods=min_periods, freq=freq,
-                                   center=center, win_type=win_type, axis=axis)
+                                   center=center, win_type=win_type,
+                                   on=on, axis=axis)
 
         cls.rolling = rolling

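The hunk above threads the new ``on`` keyword through to the window constructor. As a hedged illustration of what that enables from the caller's side, the sketch below rolls over a datetime column instead of the index and checks that the sums agree with the index-based form. The frame and the names `via_column`, `via_index` and `same` are hypothetical, and a pandas version containing this change is assumed.

```python
import numpy as np
import pandas as pd

# Timestamps live in an ordinary column ('foo') rather than in the index.
df = pd.DataFrame({
    'foo': pd.to_datetime(['2013-01-01 09:00:00',
                           '2013-01-01 09:00:02',
                           '2013-01-01 09:00:03',
                           '2013-01-01 09:00:05',
                           '2013-01-01 09:00:06']),
    'B': [0, 1, 2, np.nan, 4],
})

# Roll a 2-second window keyed on the 'foo' column.
via_column = df.rolling('2s', on='foo').sum()

# The same window expressed by moving 'foo' into the index first.
via_index = df.set_index('foo').rolling('2s').sum()

# Compare the rolled values, treating NaN positions as equal.
same = np.allclose(via_column['B'].to_numpy(),
                   via_index['B'].to_numpy(),
                   equal_nan=True)
print(same)
```

Using ``on`` avoids a ``set_index``/``reset_index`` round trip when the timestamps need to stay available as a regular column.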