gh-118761: Improve import time of dataclasses
#129925
On Trade-offs
Here is the new call trace
And I wrote benchmarks to test calling affected public functions (
I use lazy importing for 4 largest modules (re, copy, inspect, annotationlib), they are also rarely called (1 and 2 times)
I don't fully understand what you mean by "rarely", I think. It looks like a lot of this PR is essentially delaying imports that will be unconditionally used in all code that actually utilizes a dataclass. The timings are based on import time for the module itself, but given that the primary use of this module is as a decorator, it's always going to actually use the module at import time, unlike other modules where you might import it at the top of the file and then only use it inside of an if block.
So I don't think you can consider the import time of dataclasses without considering how it's actually used. Any imports that are unconditionally used when decorating a class aren't beneficial to delay the import of (but will incur the use-time cost of re-running import itself, which isn't major but does exist).
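The point above can be illustrated with a minimal sketch. `toy_dataclass` and `Point` are hypothetical names invented for this example and are not how `dataclasses` is implemented; the sketch only assumes that a decorator uses a deferred import unconditionally.

```python
import sys


def toy_dataclass(cls):
    """Hypothetical decorator; NOT the real dataclasses implementation."""
    # Deferred import: skipped when this module is imported,
    # but executed every time the decorator is applied.
    import inspect

    cls.__signature_text__ = str(inspect.signature(cls.__init__))
    return cls


@toy_dataclass
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


# The decorator ran at class-definition time, so the "lazy" module
# is already loaded by the time the defining module finishes importing.
inspect_loaded = "inspect" in sys.modules
```

Because the decorator runs while the user's module is being imported, the deferred `import inspect` is paid during that import anyway, so the saving shows up only when measuring `import dataclasses` in isolation.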
@eli-schwartz yes, but most of the PRs for this issue are also adding lazy imports.
I'm not sure what point you're trying to make. Most of the PRs for this issue are adding lazy imports. Lazy imports are a useful tool for making python programs faster, in the principle of "only pay for what you use" -- and stdlib modules often don't know what an application will in fact use. The issue here is that as far as I can tell you're adding lazy imports for things that the consumer will always use, which means that there won't be a benefit to making them lazy...
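For contrast, here is a small sketch of the case where a lazy import does follow "only pay for what you use". `describe` is a hypothetical helper made up for illustration; the only assumption is that one code path needs the module and the other does not.

```python
import sys


def describe(value, pretty=False):
    """Hypothetical helper with one rarely taken code path."""
    if pretty:
        # Only callers that ask for pretty output pay for this import.
        import pprint

        return pprint.pformat(value)
    return repr(value)


plain = describe({"a": 1})               # never touches pprint
fancy = describe({"a": 1}, pretty=True)  # triggers the deferred import
pprint_loaded = "pprint" in sys.modules
```

Here a caller that never passes `pretty=True` never imports `pprint` through this function, which is the situation the objection says does not apply to imports used on every decorated class.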
I'm closing this one based on Eli's analysis. |
Another attempt to improve import time of stdlib modules.
Importing dataclasses takes a long time and affects many other modules, so making dataclasses faster is worthwhile.
I use lazy importing for the 4 largest modules (re, copy, inspect, annotationlib); they are also rarely called (1 and 2 times).
CPython configure flags:
Benchmarks:
Running:
pipx install tuna && ./python -X importtime -c 'import dataclasses' 2> import.log && tuna import.log
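As a rough alternative to viewing the log in tuna, the `-X importtime` output can also be read from Python. This is a sketch, not part of the PR: `parse_importtime` and `measure` are hypothetical helper names, and the numbers in `sample` are made up to show the line format rather than taken from any real run.

```python
import subprocess
import sys


def parse_importtime(stderr_text):
    """Map module name -> cumulative import time in microseconds."""
    timings = {}
    for line in stderr_text.splitlines():
        if not line.startswith("import time:"):
            continue
        parts = line[len("import time:"):].split("|")
        if len(parts) != 3 or not parts[0].strip().isdigit():
            continue  # skip the header row
        timings[parts[2].strip()] = int(parts[1])
    return timings


def measure(module="dataclasses"):
    """Import `module` in a fresh interpreter and return its timings."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True, text=True, check=True,
    )
    return parse_importtime(result.stderr)


# Made-up sample in the format that -X importtime writes to stderr.
sample = (
    "import time: self [us] | cumulative | imported package\n"
    "import time:       120 |        120 |   re\n"
    "import time:       300 |        420 | dataclasses\n"
)
parsed = parse_importtime(sample)
```

Calling `measure()` on two builds and comparing the entry for `dataclasses` gives the same kind of before/after figure as the log, though it still measures only the bare import and not the cost of decorating a class.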
Total import time: 0.022s -> 0.008s = x2.75 as fast
dataclasses import time: 0.015s -> 0.001s = x15 as fast
hyperfine: 24.ms -> 10.2ms = x2.4 as fast
Main branch:
PR branch: