
Some questions about profile #501

Open
chairman-lu opened this issue Dec 7, 2024 · 1 comment

Comments

@chairman-lu

When I run code like this:

```python
options = onnxruntime.SessionOptions()
options.enable_profiling = True
session = onnxruntime.InferenceSession(model_path, providers=providers, sess_options=options)
session.run_with_iobinding(io_binding)
profile_file = session.end_profiling()
```

I get a JSON profile file, and visualizing it with edge://tracing/ gives the result below:

[Image: tracing visualization of the profile]

I have two questions:

  1. Why does the first Conv cost so much? Does its time include loading data from host to device?
  2. Excluding the first Conv, summing the latencies of the other ops gives a result significantly greater than the time measured by my timing code below. How can I get the exact run time of every single op?

[Image: the reporter's timing test code]
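For question 2, one way to inspect per-op time is to parse the profile JSON yourself: the file written by `end_profiling()` is a list of chrome-trace events, and the per-node entries can be summed by name. A minimal sketch, assuming the events carry a `cat` of `"Node"` and a `dur` field in microseconds (the sample events below are illustrative, not taken from the actual profile):

```python
import json
from collections import defaultdict

def aggregate_op_durations(events):
    """Sum the 'dur' field (microseconds) of node-level events per name.

    `events` is the list parsed from the JSON that end_profiling() writes;
    load it with json.load(open(profile_file)). Field names follow the
    chrome trace format and may differ across ONNX Runtime versions.
    """
    totals = defaultdict(int)
    for ev in events:
        if ev.get("cat") == "Node" and "dur" in ev:
            totals[ev["name"]] += ev["dur"]
    return dict(totals)

# Illustrative events, showing the aggregation only:
sample = [
    {"cat": "Node", "name": "Conv_0_kernel_time", "dur": 1200},
    {"cat": "Node", "name": "Conv_0_kernel_time", "dur": 300},
    {"cat": "Session", "name": "model_run", "dur": 5000},
]
print(aggregate_op_durations(sample))
```

Summing per name across several profiled runs and dividing by the run count gives a steadier per-op average than a single trace.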

@tianleiwu

The first Conv needs to run all candidate kernels to find the fastest one for the given input shape, so it takes longer than subsequent inference runs with the same input shape.
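Because of this kernel search, benchmarks usually discard a few warm-up runs before timing. A minimal sketch of that pattern, where `run_once` stands in for a call such as `lambda: session.run_with_iobinding(io_binding)` (the helper name is my own, not an ONNX Runtime API):

```python
import time
import statistics

def time_after_warmup(run_once, warmup=3, repeats=20):
    """Discard warm-up runs (which may include kernel selection and
    lazy allocations), then report the median latency in seconds."""
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        run_once()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

# With ONNX Runtime this would be, e.g.:
#   median_s = time_after_warmup(lambda: session.run_with_iobinding(io_binding))
```

The median is reported rather than the mean so that occasional slow runs do not skew the result.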
