Skip to content

llava: improve clip_ctx destructor to not memleak load_image_size #12834

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 8, 2025

Conversation

mattjcly
Copy link
Contributor

@mattjcly mattjcly commented Apr 8, 2025

Detected clip memory leak with leaks on mac after building in debug mode from 2dabf75

sudo leaks --atExit -- /Users/matt/Workspace/forks/llama.cpp/build/bin/llama-qwen2vl-cli -m /Users/matt/.cache/lm-studio/models/lmstudio-community/Qwen2-VL-2B-Instruct-GGUF/Qwen2-VL-2B-Instruct-Q4_K_M.gguf --mmproj /Users/matt/.cache/lm-studio/models/lmstudio-community/Qwen2-VL-2B-Instruct-GGUF/mmproj-model-f32.gguf --image /Users/matt/Documents/dice.jpg -p "What is this?" &> leaks_output.txt

Results in:

STACK OF 1 INSTANCE OF 'ROOT LEAK: <malloc in clip_image_size_init>':
9   dyld                                  0x1828990e0 start + 2360
8   llama-qwen2vl-cli                     0x102022978 main + 836  qwen2vl-cli.cpp:565
7   llama-qwen2vl-cli                     0x102023198 load_image(llava_context*, common_params*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 488  qwen2vl-cli.cpp:228
6   llama-qwen2vl-cli                     0x102037eb4 llava_image_embed_make_with_filename + 152  llava.cpp:565
5   llama-qwen2vl-cli                     0x102037d7c llava_image_embed_make_with_bytes + 156  llava.cpp:503
4   llama-qwen2vl-cli                     0x102036a50 llava_image_embed_make_with_clip_img + 280  llava.cpp:430
3   llama-qwen2vl-cli                     0x102036c74 encode_image_with_clip(clip_ctx*, int, clip_image_u8 const*, float*, int*) + 400  llava.cpp:265
2   llama-qwen2vl-cli                     0x1020479c4 clip_image_size_init + 20  clip.cpp:1607
1   libc++abi.dylib                       0x182bd2ec0 operator new(unsigned long) + 32
0   libsystem_malloc.dylib                0x182a57a44 _malloc_zone_malloc_instrumented_or_legacy + 136 
====
    1 (16 bytes) ROOT LEAK: <malloc in clip_image_size_init 0x136f8c170> [16]

This change fixes that, and makes clip_free safter to call by checking if the ctx is nullptr

@ggerganov ggerganov requested a review from ngxson April 8, 2025 17:44
@ngxson
Copy link
Collaborator

ngxson commented Apr 8, 2025

Please note that these c-type API will soon be refactored into C++ wrapper (and some will be migrated directly to C++ types), so we should no longer need these fixes in near future (I'm talking about in 1-2 weeks)

@ngxson ngxson merged commit b32efad into ggml-org:master Apr 8, 2025
51 checks passed
tastelikefeet added a commit to tastelikefeet/llama.cpp that referenced this pull request Apr 10, 2025
* master: (123 commits)
  cuda : add f32 to bf16 copy op (ggml-org#12806)
  llava: improve clip_ctx destructor to not memleak load_image_size (ggml-org#12834)
  llama : fix FA when KV cache is not used (i.e. embeddings) (ggml-org#12825)
  server : fix thread.join() on exit (ggml-org#12831)
  llava: add more helper functions to check projector types in clip context (ggml-org#12824)
  arg : Including limits file on AIX (ggml-org#12822)
  server : webui : Improve Chat Input with Auto-Sizing Textarea (ggml-org#12785)
  Revert "sycl:remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor" (ggml-org#12812)
  gguf-py : support lazy tensor splitting (ggml-org#12809)
  llama : Support llama 4 text-only (ggml-org#12791)
  opencl: better identify Adreno GPU (ggml-org#12760)
  hellaswag: display estimated score confidence interval (ggml-org#12797)
  cuda : fix HIP and MUSA BF16 (#0)
  sync : ggml
  ggml : simplify Arm fp16 CPU logic (ggml/1177)
  CUDA: don't convert BF16 weights to FP32 (ggml/1174)
  cpu: move all the operators into a separate c++ file (except mul_mat) (ggml/1167)
  sycl: remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor (ggml-org#12734)
  ci : no curl on ggml-ci (ggml-org#12796)
  cmake : enable curl by default (ggml-org#12761)
  ...

# Conflicts:
#	common/arg.cpp
#	common/common.cpp
#	common/common.h
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants