
Subpar scattergl performance with date axis #413


Closed
cpsievert opened this issue Apr 11, 2016 · 9 comments


@cpsievert

It seems as though scattergl with a date axis chokes around 250K points — http://codepen.io/cpsievert/pen/WwMpJW

I found that a bit surprising since ~1M points works great with non-date axes

@etpinard
Contributor

Not surprising on my end.

scattergl uses one of two gl-vis modules depending on the user data:

  • gl-scatter2d is designed for very large datasets (> 1e6 pts) but of limited dimension. For example, marker.color and marker.size arrays aren't allowed.
  • gl-scatter2d-fancy doesn't perform as well for very large datasets, but mimics all the available plotly.js scatter options.

The split between the two gl-vis modules happens here. You'll notice non-linear axis types are considered fancy.

Moreover, the date-to-coordinate conversion routine could be optimized. I suspect we could shave a few milliseconds off this step.
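
To make the renderer split described above concrete, here is a minimal sketch of the kind of check involved. It is not the actual plotly.js source: the function name `pickRenderer` and the simplified trace and axis objects are hypothetical, and only the two conditions mentioned in this comment (array-valued marker.color / marker.size and non-linear axis types) are modeled.

```javascript
// Hypothetical sketch, NOT the actual plotly.js code.
// Returns the name of the gl-vis module a scattergl trace would be routed to,
// using only the two conditions mentioned in this thread.
function pickRenderer(trace, xaxis, yaxis) {
  const marker = trace.marker || {};

  // Per-point style arrays are only supported by the fancy module.
  const hasArrayStyle =
    Array.isArray(marker.color) || Array.isArray(marker.size);

  // Any axis type other than 'linear' (date, log, ...) is treated as fancy.
  const hasNonLinearAxis =
    xaxis.type !== 'linear' || yaxis.type !== 'linear';

  return hasArrayStyle || hasNonLinearAxis
    ? 'gl-scatter2d-fancy'
    : 'gl-scatter2d';
}
```

Under this sketch, switching the x axis type from 'linear' to 'date' is enough to move a trace from the fast path to the fancy one, which would match the slowdown reported here.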

@monfera
Contributor

monfera commented Sep 30, 2016

I'll start on it early next week; an initial hunch is that dates are too expensive in JS, so for performance it's best to convert to some numeric representation (e.g. date.valueOf()) and convert back to a real date object only at the final stage for axis tick determination (and possibly cache it). Will know more next week.
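
A rough sketch of that hunch, assuming the input is an array of Date objects or ISO date strings: convert everything to epoch milliseconds once, and build Date objects only for the few tick positions. The helper names `datesToMillis` and `tickDates` are made up for illustration, and the evenly spaced ticks are a simplification (real date ticks would snap to calendar units).

```javascript
// Convert once, up front: Date objects or date strings -> epoch milliseconds.
// A Float64Array holds millisecond timestamps exactly.
function datesToMillis(dates) {
  const out = new Float64Array(dates.length);
  for (let i = 0; i < dates.length; i++) {
    const d = dates[i];
    out[i] = d instanceof Date ? d.valueOf() : new Date(d).valueOf();
  }
  return out;
}

// Only the handful of tick positions are turned back into Date objects.
// Assumes count >= 2; ticks are evenly spaced in time (a simplification).
function tickDates(minMs, maxMs, count) {
  const ticks = [];
  const step = (maxMs - minMs) / (count - 1);
  for (let i = 0; i < count; i++) {
    ticks.push(new Date(minMs + i * step));
  }
  return ticks;
}
```

One caveat if the numbers go straight to WebGL: present-day epoch milliseconds do not fit in 32-bit float precision, so subtracting an offset (e.g. the first timestamp) before upload is probably needed.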

@monfera
Contributor

monfera commented Oct 4, 2016

Yes, there's a big speed difference and, as mentioned by @etpinard, it comes down to the fact that the much slower gl-scatter2d-fancy renderer is currently used if the axis is of type date. Though date axes are conceptually linear, in plotly they are not of type linear, which here means a linear numeric scale.

There are several options:

  1. Acknowledge that it's slower (listed for completeness' sake, given current demand I think it's not realistic)
  2. Make the (non-fancy) gl-vis/gl-scattergl handle dates, as, at the end of the day, it's also a linear scale except ticks are sampled and rendered differently; the plot itself remains the same. Making this plot work with dates probably needs work in gl-vis. Probably this is the fastest option, but it won't help if users bump into the same limitation for other reasons (e.g. log scale or arrays for marker sizes/colors).
  3. Rewrite gl-scatter2d-fancy so that it's fast. Currently, it renders into a bona fide geometry, mainly to be able to draw different point marker shapes. However, it's possible to turn the (non-fancy) gl-vis/gl-scattergl into something that can render point marker shapes. The drawback is that the fancy version handles other things as well: arrays for marker styles (doable with more WebGL attribute arrays) and log scales (am I missing something else?). So if we do this, it makes sense to cover those too so the fancy version can be dropped. Benefit: one fewer renderer.
  4. A combination of the previous two points: rewrite both renderers, e.g. in regl, retaining the features of both and the speed of the non-fancy version.

@etpinard
Contributor

etpinard commented Oct 4, 2016

I'm leaning towards waiting for the regl rewrite to fix this issue.

@jackparmer thoughts?

@monfera
Contributor

monfera commented Oct 4, 2016

@etpinard regarding the speeding options, we talked about different markers rendered by shaders. Something like this would obsolete the fancy version. I made a regl example here: http://codepen.io/monfera/full/GjOBkJ/

@jackparmer
Contributor

> I'm leaning towards waiting for the regl rewrite to fix this issue.

I'm not excited about waiting 3-6 months to make datetimes work in WebGL. Pared-down trace options for time series, like pointcloud, could make a lot of sense. The use case is loading and looking at ridiculously huge time series data, then zooming into parts that look interesting/odd for investigation. Imagine time series data coming off sensors in cars or spacecraft at subsecond intervals... Huge amounts of timestamped data. There would have to be some high-quality decimation work in JavaScript to make this use case a reality (a decimated view of the time series is rendered at 0% zoom, this gets recalculated on zoom, etc.).
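
As an illustration of the decimation idea, here is a minimal min/max bucketing sketch. It is only one possible approach, not something plotly.js is confirmed to do. The function name `decimateMinMax` and its parameters are hypothetical; it assumes x holds numeric timestamps (e.g. epoch milliseconds) sorted ascending and that xMax > xMin.

```javascript
// Min/max decimation: split the visible x range into equal-width buckets and
// keep only the lowest and highest sample of each, so spikes stay visible.
// Assumes numeric x values sorted ascending and xMax > xMin.
function decimateMinMax(x, y, xMin, xMax, buckets) {
  const width = (xMax - xMin) / buckets;
  const minIdx = new Array(buckets).fill(-1);
  const maxIdx = new Array(buckets).fill(-1);

  for (let i = 0; i < x.length; i++) {
    if (x[i] < xMin || x[i] > xMax) continue; // outside the zoom window
    // Clamp so that x === xMax falls into the last bucket.
    const b = Math.min(buckets - 1, Math.floor((x[i] - xMin) / width));
    if (minIdx[b] === -1 || y[i] < y[minIdx[b]]) minIdx[b] = i;
    if (maxIdx[b] === -1 || y[i] > y[maxIdx[b]]) maxIdx[b] = i;
  }

  const outX = [];
  const outY = [];
  for (let b = 0; b < buckets; b++) {
    if (minIdx[b] === -1) continue; // empty bucket
    // Emit the two extremes in their original order.
    const first = Math.min(minIdx[b], maxIdx[b]);
    const second = Math.max(minIdx[b], maxIdx[b]);
    outX.push(x[first]);
    outY.push(y[first]);
    if (second !== first) {
      outX.push(x[second]);
      outY.push(y[second]);
    }
  }
  return { x: outX, y: outY };
}
```

At 0% zoom this would be called with the full data range and roughly one bucket per horizontal pixel; on zoom it would be called again with the new xMin/xMax, so detail reappears as the window narrows.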

@etpinard
Contributor

etpinard commented Oct 4, 2016

> Make the (non-fancy) gl-vis/gl-scattergl handle dates, as, at the end of the day, it's also a linear scale except ticks are sampled and rendered differently; the plot itself remains the same. Making this plot work with dates probably needs work in gl-vis. Probably this is the fastest option, but it won't help if users bump into the same limitation for other reasons (e.g. log scale or arrays for marker sizes/colors)

is the winner.

@monfera
Contributor

monfera commented Oct 7, 2016

#1021 purports to fix it; I'm trying to think of a test case for this.

@etpinard
Contributor

etpinard commented Nov 3, 2016

done in #1033

@etpinard etpinard closed this as completed Nov 3, 2016

4 participants