Skip to content

Slow trisurf #593

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
choldgraf opened this issue Oct 25, 2016 · 15 comments
Closed

Slow trisurf #593

choldgraf opened this issue Oct 25, 2016 · 15 comments

Comments

@choldgraf
Copy link
Contributor

Two quick things:

  1. Something in the create_trisurf code has changed, and now the code takes an order of magnitude longer to run. (now at about 45 seconds when it used to take 3-4 seconds). I'm not sure if somebody un-did the optimizations that I added a few months back, or if something new has been added that's slow, but just FYI.
  2. There seems to be a bug in the iplot function for offline plotting. E.g.:

image

@choldgraf
Copy link
Contributor Author

it looks like it this is because of the colors.py module, which loops through triplets of colots (in a for loop) and multiplies each one by 255. This is the same issue that I addressed in the previous PR optimizing the trisurf code.

Looping through a list of triplets and individually multiplying each number by a scalar is a really inefficient way of doing this, and makes the code unusable for any decent-sized amount of data.

@jackparmer
Copy link
Contributor

Hey @choldgraf I'm happy to review and merge if you want to make a quick PR. Thanks for the heads-up on this

@choldgraf
Copy link
Contributor Author

I can make a quick PR on this one, but to be honest I'm a little hesitant to do this, since I spent a while improving the map_face2color and trisurf code, and I opened this issue because somebody basically undid all of that...

@jackparmer
Copy link
Contributor

Oh, yikes, sorry about that! Would the PR be essentially the same changes? If so we can look at those commits and try to cherrypick them back. I don't remember why that work was undone, I'll try to be more vigilant this time around!

@choldgraf
Copy link
Contributor Author

No, I think there was some refactoring that happened, but as a part of the refactoring it reverted to the old way of doing the multiplication. IIRC we had a back and forth on another PR and some folks seemed like they were trying to minimize the use of numpy to reduce dependencies. IMO I think Numpy should be treated as a core part of Python in this field, so I don't see it as a dependency, but some folks disagree and I think that's why the current state of code isn't vectorized w/ numpy (and thus takes like 10 times longer)

@jackparmer
Copy link
Contributor

IMO I think Numpy should be treated as a core part of Python in this field

Yeah, I definitely agree on the core use of numpy.

That's why the current state of code isn't vectorized w/ numpy (and thus takes like 10 times longer)

Can we try to bring this fix back?

@choldgraf
Copy link
Contributor Author

maybe I will have some time to work on this over the weekend or next week,
if I do then I could make some additions, though I'd be curious what the
other python devs think since we've had versions of this conversation

already...

On Fri, Oct 28, 2016 at 4:01 PM Jack Parmer [email protected]
wrote:

IMO I think Numpy should be treated as a core part of Python in this field

Yeah, I definitely agree on the core use of numpy.

That's why the current state of code isn't vectorized w/ numpy (and thus
takes like 10 times longer)

Can we try to bring this fix back?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#593 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABwSHQK6DfXaNnC-o7QPocZBK07POfyUks5q4lSlgaJpZM4Kf_n-
.

@theengineear
Copy link
Contributor

@choldgraf ugh. Sorry this happened. I wonder if there is a way we could add some tests to keep this locked down?

It seems like not having numpy dependency has come up enough at this point that I'd be in favor of just adding it to our requirements list.

If you are willing to make the changes to get this code running again, I can make sure to benchmark this in some way to prevent future regressions.

@choldgraf
Copy link
Contributor Author

I can spend some time to re-implement these changes, though it may be a little bit of time as I've got a few other time-sensitive projects going on right now.

IMO, In the longer term, I think it's only worth making these updates if there's a more general shift towards using numpy within plotly. If people are constantly trying to think of how to do vector-style operations without numpy, it's just going to make code unnecessarily complicated and slow.

So I can fix this issue eventually (though note that it's a 2-part issue actually, as the iplot function seems to be broken in offline mode too).

@theengineear
Copy link
Contributor

I think it's only worth making these updates if there's a more general shift towards using numpy within plotly

I'm fine with making this shift. The original reason for not having it was to decrease the barrier to entry for new users. However, if it's at the cost of increasing the barrier to more real-world use, the choice is pretty obvious to just include it.

As far as the iplot function goes, @Kully, can you investigate this for now? It looks like we're either passing in a redundant argument or keyword argument.

@theengineear
Copy link
Contributor

@choldgraf thanks again for all the help here. It's really appreciated. I think I'm going to be switching off of some other Plotly work to focus on this api library in a couple weeks. I'll open up a milestone when that happens and maybe you can chime in on some requests for improvements that you'd like to see!

@Kully
Copy link
Contributor

Kully commented Oct 31, 2016

can you investigate this for now?

Yes, will do.

And to add to the discussion above, I think I'm the one responsible for the reverting to the non-vectorized computations. I think they are still in tools.py but the trisurf function may just be using colors.py right now.

Regardless, I should take this on as well. I don't think it will take too long as I'm quite familiar with the code. Sounds good?

@theengineear
Copy link
Contributor

Sounds good?

Sounds excellent! Thanks @Kully!

@choldgraf if we can get a PR up soon, can we ping you to give it a whirl and see if it meets performance expectations?

@Kully Kully mentioned this issue Nov 1, 2016
@Kully Kully changed the title Slow trisurf + Bug in displaying with iplot Slow trisurf Dec 1, 2016
@Kully
Copy link
Contributor

Kully commented Dec 1, 2016

changed the title to reflect that the iplot issue is resolved in the newest python version (pip)

@jonmmease
Copy link
Contributor

Based on the timeline of things it looks like #570 addressed the performance regressions reported here. Feel free to re-open, or open a new issue, if there are remaining loose ends from this discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants