Commit 17b6411

🔥 delegate nan behavior to aggregators (#294)
* 🔥 delegate nan behavior to aggregators
* 🙈 formatting + fixing tests
* 💨 formatting
* 🖍️ updating changelog
* ✨ altering tests (to reduce false negatives)
* 💨 adding changelog
1 parent 3ad7dea commit 17b6411

8 files changed

+169
-328
lines changed

Diff for: CHANGELOG.md

+43
@@ -1,3 +1,46 @@
+# `TODO`
+## New Features
+
+## What's Changed
+- Removed the `check_nans` argument of the FigureResampler constructor and its `add_traces` method. This argument was used to check for NaNs in the input data, but this is now handled by the `nan_policy` argument of specific aggregators (see for instance the constructor of the `MinMax` and `MinMaxLTTB` aggregator).
+
+
+# v0.9.2
+### `overview` / `rangeslider` support 🎉
+
+* ➡️ [code example](https://github.com/predict-idlab/plotly-resampler/blob/main/examples/dash_apps/05_cache_overview_subplots.py):
+* 🖍️ [high level docs](https://predict-idlab.github.io/plotly-resampler/v0.9.2/getting_started/#overview)
+* 🔍 [API docs](https://predict-idlab.github.io/plotly-resampler/v0.9.2/api/figure_resampler/figure_resampler/#figure_resampler.figure_resampler.FigureResampler.__init__)
+* make sure to take a look at the doc strings of the `create_overview`, `overview_row_idxs`, and `overview_kwargs` arguments of the `FigureResampler` its constructor.
+![Peek 2023-10-25 01-51](https://github.com/predict-idlab/plotly-resampler/assets/38005924/5b3a40e0-f058-4d7e-8303-47e51896347a)
+
+
+
+### 💨 remove [traceUpdater](https://github.com/predict-idlab/trace-updater) dash component as a dependency.
+> **context**: see #281 #271
+> `traceUpdater` was developed during a period when Dash did not yet contain the [Patch ](https://dash.plotly.com/partial-properties)feature for partial property updates. As such, `traceUpdater` has become somewhat redundant is now effectively replaced with Patch.
+
+🚨 This is a breaking change with previous `Dash` apps!!!
+
+## What's Changed
+* Support nested admonitions by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/245
+* 👷 build: create codeql.yml by @NielsPraet in https://github.com/predict-idlab/plotly-resampler/pull/248
+* :sparkles: first draft of improved xaxis filtering by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/250
+* :arrow_up: update dependencies by @jvdd in https://github.com/predict-idlab/plotly-resampler/pull/260
+* :muscle: update dash-extensions by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/261
+* fix for #263 by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/264
+* Rangeslider support by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/254
+* :pray: fix mkdocs by @jvdd in https://github.com/predict-idlab/plotly-resampler/pull/268
+* ✈️ fix for #270 by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/272
+* :mag: adding init kwargs to show dash - fix for #265 by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/269
+* Refactor/remove trace updater by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/281
+* Bug/pop rangeselector by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/279
+* :sparkles: fix for #275 by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/286
+* Bug/rangeselector by @jonasvdd in https://github.com/predict-idlab/plotly-resampler/pull/287
+
+
+**Full Changelog**: https://github.com/predict-idlab/plotly-resampler/compare/v0.9.1...v0.9.2
+

 # v0.9.1
 ## Major changes:

Diff for: plotly_resampler/aggregation/aggregators.py

+23-4
@@ -17,6 +17,8 @@
     LTTBDownsampler,
     MinMaxDownsampler,
     MinMaxLTTBDownsampler,
+    NaNMinMaxDownsampler,
+    NaNMinMaxLTTBDownsampler,
 )

 from ..aggregation.aggregation_interface import DataAggregator, DataPointSelector
@@ -171,18 +173,25 @@ class MinMaxAggregator(DataPointSelector):

     """

-    def __init__(self, **downsample_kwargs):
+    def __init__(self, nan_policy="omit", **downsample_kwargs):
         """
         Parameters
         ----------
         **downsample_kwargs
             Keyword arguments passed to the :class:`MinMaxDownsampler`.
             - The `parallel` argument is set to False by default.
+        nan_policy: str, optional
+            The policy to handle NaNs. Can be 'omit' or 'keep'. By default, 'omit'.

         """
         # this downsampler supports all dtypes
         super().__init__(**downsample_kwargs)
-        self.downsampler = MinMaxDownsampler()
+        if nan_policy not in ("omit", "keep"):
+            raise ValueError("nan_policy must be either 'omit' or 'keep'")
+        if nan_policy == "omit":
+            self.downsampler = MinMaxDownsampler()
+        else:
+            self.downsampler = NaNMinMaxDownsampler()

     def _arg_downsample(
         self,
@@ -208,21 +217,31 @@ class MinMaxLTTB(DataPointSelector):
     Paper: [https://arxiv.org/pdf/2305.00332.pdf](https://arxiv.org/pdf/2305.00332.pdf)
     """

-    def __init__(self, minmax_ratio: int = 4, **downsample_kwargs):
+    def __init__(
+        self, minmax_ratio: int = 4, nan_policy: str = "omit", **downsample_kwargs
+    ):
         """
         Parameters
         ----------
         minmax_ratio: int, optional
             The ratio between the number of data points in the MinMax-prefetching and
             the number of data points that will be outputted by LTTB. By default, 4.
+        nan_policy: str, optional
+            The policy to handle NaNs. Can be 'omit' or 'keep'. By default, 'omit'.
         **downsample_kwargs
             Keyword arguments passed to the `MinMaxLTTBDownsampler`.
             - The `parallel` argument is set to False by default.
             - The `minmax_ratio` argument is set to 4 by default, which was empirically
               proven to be a good default.

         """
-        self.minmaxlttb = MinMaxLTTBDownsampler()
+        if nan_policy not in ("omit", "keep"):
+            raise ValueError("nan_policy must be either 'omit' or 'keep'")
+        if nan_policy == "omit":
+            self.minmaxlttb = MinMaxLTTBDownsampler()
+        else:
+            self.minmaxlttb = NaNMinMaxLTTBDownsampler()
+
         self.minmax_ratio = minmax_ratio

         super().__init__(
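
The two constructors above only show the dispatch on `nan_policy`; the diff does not spell out what the NaN-aware downsamplers do per bin. The sketch below is a pure-Python illustration of one plausible reading: `'omit'` ignores NaNs when picking the min and max of a bin, while `'keep'` reports the first NaN so the gap survives downsampling. The helper name `bin_argminmax` is hypothetical and is not part of plotly-resampler or tsdownsample, and the `'keep'` semantics are an assumption, not something this commit confirms.

```python
import math


def bin_argminmax(values, nan_policy="omit"):
    """Return (argmin, argmax) for one bin of floats under the given NaN policy.

    Hypothetical helper for illustration; not part of plotly-resampler.
    """
    # Same validation as in the diff above.
    if nan_policy not in ("omit", "keep"):
        raise ValueError("nan_policy must be either 'omit' or 'keep'")

    if nan_policy == "keep":
        # Assumed 'keep' semantics: report the first NaN in the bin,
        # so the gap stays visible after downsampling.
        for i, v in enumerate(values):
            if math.isnan(v):
                return i, i

    # 'omit' semantics (and the NaN-free 'keep' case): ignore NaNs entirely.
    valid = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not valid:
        return None  # the bin is empty or holds only NaNs
    lo = min(valid, key=lambda i: values[i])
    hi = max(valid, key=lambda i: values[i])
    return lo, hi


bin_values = [3.0, float("nan"), 1.0, 5.0]
print(bin_argminmax(bin_values, "omit"))  # (2, 3)
print(bin_argminmax(bin_values, "keep"))  # (1, 1)
```

Validating the policy string up front, as the commit does, means a typo such as `nan_policy="ommit"` fails at construction time rather than silently falling through to the `else` branch.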

Diff for: plotly_resampler/figure_resampler/figure_resampler_interface.py

+7-51
@@ -555,7 +555,6 @@ def _parse_get_trace_props(
         hf_hovertext: Iterable = None,
         hf_marker_size: Iterable = None,
         hf_marker_color: Iterable = None,
-        check_nans: bool = True,
     ) -> _hf_data_container:
         """Parse and capture the possibly high-frequency trace-props in a datacontainer.

@@ -572,11 +571,6 @@ def _parse_get_trace_props(
         hf_hovertext : Iterable, optional
             High-frequency trace "hovertext" data, overrides the current trace its
             hovertext data.
-        check_nans: bool, optional
-            Whether the `hf_y` should be checked for NaNs, by default True.
-            As checking for NaNs is expensive, this can be disabled when the `hf_y` is
-            already known to contain no NaNs (or when the downsampler can handle NaNs,
-            e.g., EveryNthPoint).

         Returns
         -------
@@ -654,7 +648,8 @@ def _parse_get_trace_props(
                 if hf_y.ndim != 0:  # if hf_y is an array
                     hf_x = pd.RangeIndex(0, len(hf_y))  # np.arange(len(hf_y))
                 else:  # if no data as y or hf_y is passed
-                    hf_x = np.asarray(None)
+                    hf_x = np.asarray([])
+                    hf_y = np.asarray([])

             assert hf_y.ndim == np.ndim(hf_x), (
                 "plotly-resampler requires scatter data "
@@ -677,22 +672,6 @@ def _parse_get_trace_props(
             if isinstance(hf_marker_color, (tuple, list, np.ndarray, pd.Series)):
                 hf_marker_color = np.asarray(hf_marker_color)

-            # Remove NaNs for efficiency (storing less meaningless data)
-            # NaNs introduce gaps between enclosing non-NaN data points & might distort
-            # the resampling algorithms
-            if check_nans and pd.isna(hf_y).any():
-                not_nan_mask = ~pd.isna(hf_y)
-                hf_x = hf_x[not_nan_mask]
-                hf_y = hf_y[not_nan_mask]
-                if isinstance(hf_text, np.ndarray):
-                    hf_text = hf_text[not_nan_mask]
-                if isinstance(hf_hovertext, np.ndarray):
-                    hf_hovertext = hf_hovertext[not_nan_mask]
-                if isinstance(hf_marker_size, np.ndarray):
-                    hf_marker_size = hf_marker_size[not_nan_mask]
-                if isinstance(hf_marker_color, np.ndarray):
-                    hf_marker_color = hf_marker_color[not_nan_mask]
-
             # Try to parse the hf_x data if it is of object type or
             if len(hf_x) and (hf_x.dtype.type is np.str_ or hf_x.dtype == "object"):
                 try:
@@ -876,7 +855,6 @@ def add_trace(
         hf_hovertext: Union[str, Iterable] = None,
         hf_marker_size: Union[str, Iterable] = None,
         hf_marker_color: Union[str, Iterable] = None,
-        check_nans: bool = True,
         **trace_kwargs,
     ):
         """Add a trace to the figure.
@@ -932,13 +910,6 @@ def add_trace(
         hf_marker_color: Iterable, optional
             The original high frequency marker color. If set, this has priority over the
             trace its ``marker.color`` argument.
-        check_nans: boolean, optional
-            If set to True, the trace's data will be checked for NaNs - which will be
-            removed. By default True.
-            As this is a costly operation, it is recommended to set this parameter to
-            False if you are sure that your data does not contain NaNs (or when the
-            downsampler can handle NaNs, e.g., EveryNthPoint). This should considerably
-            speed up the graph construction time.
         **trace_kwargs: dict
             Additional trace related keyword arguments.
             e.g.: row=.., col=..., secondary_y=...
@@ -1019,7 +990,6 @@ def add_trace(
             hf_hovertext,
             hf_marker_size,
             hf_marker_color,
-            check_nans,
         )

         # These traces will determine the autoscale its RANGE!
@@ -1078,7 +1048,6 @@ def add_traces(
         downsamplers: None | List[AbstractAggregator] | AbstractAggregator = None,
         gap_handlers: None | List[AbstractGapHandler] | AbstractGapHandler = None,
         limit_to_views: List[bool] | bool = False,
-        check_nans: List[bool] | bool = True,
         **traces_kwargs,
     ):
         """Add traces to the figure.
@@ -1124,14 +1093,6 @@ def add_traces(
             by default False.\n
             Remark that setting this parameter to True ensures that low frequency traces
             are added to the ``hf_data`` property.
-        check_nans : None | List[bool] | bool, optional
-            List of check_nans booleans for the added traces. If set to True, the
-            trace's datapoints will be checked for NaNs. If a single boolean is passed,
-            all to be added traces will use this value, by default True.\n
-            As this is a costly operation, it is recommended to set this parameter to
-            False if the data is known to contain no NaNs (or when the downsampler can
-            handle NaNs, e.g., EveryNthPoint). This will considerably speed up the graph
-            construction time.
         **traces_kwargs: dict
             Additional trace related keyword arguments.
             e.g.: rows=.., cols=..., secondary_ys=...
@@ -1174,16 +1135,11 @@ def add_traces(
             gap_handlers = [gap_handlers] * len(data)
         if isinstance(limit_to_views, bool):
             limit_to_views = [limit_to_views] * len(data)
-        if isinstance(check_nans, bool):
-            check_nans = [check_nans] * len(data)

-        zipped = zip(
-            data, max_n_samples, downsamplers, gap_handlers, limit_to_views, check_nans
-        )
-        for (
-            i,
-            (trace, max_out, downsampler, gap_handler, limit_to_view, check_nan),
-        ) in enumerate(zipped):
+        zipped = zip(data, max_n_samples, downsamplers, gap_handlers, limit_to_views)
+        for (i, (trace, max_out, downsampler, gap_handler, limit_to_view)) in enumerate(
+            zipped
+        ):
             if (
                 trace.type.lower() not in self._high_frequency_traces
                 or self._hf_data.get(trace.uid) is not None
@@ -1194,7 +1150,7 @@ def add_traces(
             if not limit_to_view and (trace.y is None or len(trace.y) <= max_out_s):
                 continue

-            dc = self._parse_get_trace_props(trace, check_nans=check_nan)
+            dc = self._parse_get_trace_props(trace)
             self._hf_data[trace.uid] = self._construct_hf_data_dict(
                 dc,
                 trace=trace,
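
With `check_nans` gone, `_parse_get_trace_props` no longer drops NaN points before they are stored. Callers who relied on the old default and want NaNs removed regardless of the aggregator could filter their data before adding the trace. The sketch below is a pure-Python stand-in for the removed `pd.isna` mask, assuming plain lists as input; the helper name `drop_nan_points` is hypothetical, and unlike `pd.isna` it only recognises float NaN (not `None` or `NaT`).

```python
import math


def drop_nan_points(hf_x, hf_y, *aligned):
    """Drop every position where hf_y is NaN from hf_x, hf_y and aligned sequences.

    Hypothetical helper; mirrors the masking logic this commit removed.
    """
    # True at every position whose y-value should be kept.
    keep = [not (isinstance(v, float) and math.isnan(v)) for v in hf_y]

    def mask(seq):
        return [item for item, k in zip(seq, keep) if k]

    # Apply the same mask to x, y and any per-point extras (text, marker size, ...).
    return (mask(hf_x), mask(hf_y), *(mask(seq) for seq in aligned))


x, y, text = drop_nan_points(
    [0, 1, 2, 3], [1.0, float("nan"), 3.0, 4.0], ["a", "b", "c", "d"]
)
print(x, y, text)  # [0, 2, 3] [1.0, 3.0, 4.0] ['a', 'c', 'd']
```

The filtered sequences can then be passed as `hf_x`, `hf_y`, and `hf_text` to `add_trace`, whose signature keeps those arguments in the diff above.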
