DOC: Projected time for running entire performance test suite is outdated #36344

Closed
avinashpancham opened this issue Sep 13, 2020 · 5 comments · Fixed by #36836

@avinashpancham
Contributor

avinashpancham commented Sep 13, 2020

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/development/contributing.html#running-the-performance-test-suite

Note: You can check the latest versions of the docs on master here.

Documentation problem

I ran the entire performance test suite for #36161 and it took close to 8 hours. The docs state: "Running the full test suite can take up to one hour and use up to 3GB of RAM". From @jreback I understood that this is because the test suite has increased a lot in size since the last doc update.

Suggested fix for documentation

I would add a disclaimer about hardware and state that running the tests on a laptop can take up to X hours. For X we could take 8, or another number if others have experienced an even longer duration for the full performance test suite. So we end up with something like this:

"Projected duration for running the full test suite may vary depending on your machine, but on a modern laptop it can take up to 8 hours and use up to 3GB of RAM"

@avinashpancham avinashpancham added Docs Needs Triage Issue that has not been reviewed by a pandas team member labels Sep 13, 2020
@jreback
Contributor

jreback commented Sep 13, 2020

Note that generally you should not be running the entire test suite; that's sort of the point. We have pretty targeted subsets.

Also, something is odd about your setup: this shouldn't take nearly that long in any event.

But YMMV, especially if other things are running.

cc @jbrockmendel

@avinashpancham
Contributor Author

I wasn't running anything besides the test suite. I just started it and checked it the next morning.

Is there maybe a parallelization option I'm missing? That is not explained in the docs and might explain the difference.

@jbrockmendel
Member

We should probably

  • use "benchmark suite" instead of "test suite"
  • update the docs to reflect that it can be an all-day process
  • update the docs with suggestions on how to target a subset of benchmarks.

If your PR doesn't touch _libs.tslibs, then you can exclude the tslibs asvs. I'm not sure if the -b argument supports exclusion regexes.
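Since asv's -b flag is matched as a Python-style regular expression, a negative lookahead might provide exclusion (e.g. -b "^(?!tslibs)"), though whether asv accepts this is unverified here. A minimal sketch of the matching logic, using hypothetical benchmark names purely for illustration:

```python
import re

# Hypothetical benchmark names, for illustration only.
names = [
    "tslibs.timestamp.TimestampConstruction.time_fromtimestamp",
    "groupby.GroupByMethods.time_dtype_as_group",
    "frame_methods.Iteration.time_itertuples",
]

# A negative lookahead matches only names that do NOT start with "tslibs".
pattern = re.compile(r"^(?!tslibs)")
selected = [n for n in names if pattern.match(n)]
print(selected)
# The tslibs benchmark is filtered out; the other two remain.
```

If asv's internal matching uses `re.search` anchored the same way, the same pattern passed to -b should have the same effect, but that would need to be confirmed against asv's documentation.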

@dsaxton dsaxton removed the Needs Triage Issue that has not been reviewed by a pandas team member label Sep 13, 2020
@avinashpancham
Contributor Author

avinashpancham commented Sep 20, 2020

  • update the docs with suggestions on how to target a subset of benchmarks.

If your PR doesn't touch _libs.tslibs, then you can exclude the tslibs asvs. I'm not sure if the -b argument supports exclusion regexes.

If we want to update points 1 and 2, I can do that. The 3rd point, targeting a subset of benchmarks, is luckily already specified on the documentation page and illustrated with examples, see below.

You can run specific benchmarks using the -b flag, which takes a regular expression. For example, this will only run benchmarks from the pandas/asv_bench/benchmarks/groupby.py file:

asv continuous -f 1.1 upstream/master HEAD -b ^groupby

If you want to only run a specific group of tests from a file, you can do it using . as a separator. For example:

asv continuous -f 1.1 upstream/master HEAD -b groupby.GroupByMethods
This will only run the GroupByMethods benchmark defined in groupby.py.

@avinashpancham
Contributor Author

take
