[LLaVA] llava_main output is empty #10096
Labels
module: examples
Issues related to demos under examples/
triaged
This issue has been looked at by a team member, triaged, and prioritized into an appropriate module
Comments
Hi @ruitard, thanks for posting this. I can reproduce locally; it looks like we have a bunch of missing operators. Taking a look. cc @larryliu0820 for llava.
lucylq added a commit that referenced this issue on Apr 12, 2025 (merged):
See issue: #10096

Copy llama CMakeLists.txt to link custom ops into the binary: https://github.com/pytorch/executorch/blob/409447d75a1524c1acc8f8ea894c2e13dd723a79/examples/models/llama/CMakeLists.txt#L114

Test plan:

Build:
```
cmake -DPYTHON_EXECUTABLE=python \
    -DCMAKE_INSTALL_PREFIX=${BUILD_DIR} \
    -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} \
    -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON \
    -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON \
    -DEXECUTORCH_BUILD_XNNPACK=ON \
    -DCMAKE_PREFIX_PATH="/home/lfq/.conda/envs/executorch/lib/python3.10/site-packages" \
    -Bcmake-out/examples/models/llava examples/models/llava
cmake --build cmake-out/examples/models/llava/ -j8 --config Debug
```

Run:
```
cmake-out/examples/models/llava/llava_main --model_path=llava.pte --tokenizer_path=tokenizer.bin --image_path=image.pt --prompt="ASSISTANT:" --temperature=0 --seq_len=650
```

Output:
```
I 00:00:00.001282 executorch:cpuinfo_utils.cpp:62] Reading file /sys/devices/soc0/image_version
I 00:00:00.001330 executorch:cpuinfo_utils.cpp:78] Failed to open midr file /sys/devices/soc0/image_version
I 00:00:00.001353 executorch:cpuinfo_utils.cpp:91] Reading file /sys/devices/system/cpu/cpu0/regs/identification/midr_el1
I 00:00:00.001380 executorch:cpuinfo_utils.cpp:100] Failed to open midr file /sys/devices/system/cpu/cpu0/regs/identification/midr_el1
I 00:00:00.001390 executorch:cpuinfo_utils.cpp:116] CPU info and manual query on # of cpus dont match.
I 00:00:00.001397 executorch:main.cpp:77] Resetting threadpool with num threads = 0
I 00:00:00.001412 executorch:multimodal_runner.h:45] Creating Multimodal LLM runner: model_path=llava.pte, tokenizer_path=tokenizer.bin
I 00:00:00.025122 executorch:main.cpp:107] image size(0): 3, size(1): 240, size(2): 336
I 00:00:21.793359 executorch:llava_runner.cpp:142] RSS after loading model: 6123.457031 MiB (0 if unsupported)
I 00:00:23.059576 executorch:text_prefiller.cpp:95] Prefill token result numel(): 32064
I 00:00:33.459186 executorch:llava_runner.cpp:166] RSS after prompt and image prefill: 6255.707031 MiB (0 if unsupported)
ASSISTANT:I 00:00:33.948606 executorch:text_prefiller.cpp:95] Prefill token result numel(): 32064
image captures a basketball game in progress, with several players on the court. One player is in the middle of a dunk, while another player is attempting toPyTorchObserver {"prompt_tokens":616,"generated_tokens":33,"model_load_start_ms":1744415212709,"model_load_end_ms":1744415234476,"inference_start_ms":1744415234477,"inference_end_ms":1744415259787,"prompt_eval_end_ms":1744415246632,"first_token_ms":1744415246632,"aggregate_sampling_time_ms":2883588,"SCALING_FACTOR_UNITS_PER_SECOND":1000}
I 00:00:47.103512 executorch:stats.h:104] Prompt Tokens: 616 Generated Tokens: 33
I 00:00:47.103520 executorch:stats.h:110] Model Load Time: 21.767000 (seconds)
I 00:00:47.103528 executorch:stats.h:117] Total inference time: 25.310000 (seconds) Rate: 1.303832 (tokens/second)
I 00:00:47.103533 executorch:stats.h:127] Prompt evaluation: 12.155000 (seconds) Rate: 50.678733 (tokens/second)
I 00:00:47.103538 executorch:stats.h:136] Generated 33 tokens: 13.155000 (seconds) Rate: 2.508552 (tokens/second)
I 00:00:47.103542 executorch:stats.h:147] Time to first generated token: 12.155000 (seconds)
I 00:00:47.103545 executorch:stats.h:153] Sampling time over 649 tokens: 2883.588000 (seconds)
I 00:00:47.103549 executorch:llava_runner.cpp:178] RSS after finishing text generation: 6255.707031 MiB (0 if unsupported)
```
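For context on why the fix is a CMake change: custom kernels register themselves via static initializers, so if the linker drops the unreferenced custom-ops archive, the operators are silently missing at runtime. A minimal sketch of the whole-archive linking pattern the commit copies from the llama example (target and library names here are illustrative assumptions, not the exact ExecuTorch targets):

```cmake
# Hypothetical sketch, not the verbatim ExecuTorch CMakeLists.
# Force the linker to keep every object in the custom-ops static
# library so its kernel-registration initializers run at startup.
# Without this, the linker discards the seemingly unused objects and
# the runtime cannot find the custom operators.
target_link_libraries(llava_main PRIVATE
    "-Wl,--whole-archive" custom_ops "-Wl,--no-whole-archive")
```

On Apple's linker the equivalent flag is `-force_load`; ExecuTorch wraps this platform difference in a helper in its CMake utilities.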
Resolved with #10127
🐛 Describe the bug
see https://github.com/pytorch/executorch/actions/runs/14393001844/job/40363667480
It only prints the "ASSISTANT:" prefix; no generated text follows.
Versions
latest main branch
cc @mergennachin @iseeyuan @lucylq @helunwencser @tarun292 @kimishpatel @jackzhxng