
[User] GGUF conversion, stop sequence Problem #2711


Closed
ghost opened this issue Aug 22, 2023 · 19 comments

Comments

@ghost

ghost commented Aug 22, 2023

Hi <3 llama.cpp

@KerfuffleV2 shows us that models converted without metadata load differently. Loading without metadata:

llama_model_load_internal: BOS token = 1 ' '
llama_model_load_internal: EOS token = 2 ' '

Loading with one converted with external metadata:

llama_model_load_internal: BOS token = 1 '<s>'
llama_model_load_internal: EOS token = 2 '</s>'

I converted WizardMath-7B-V1.0 to GGUF and here are a couple of runs.
Ex1:

~/l/b/bin (master) [SIGINT]> ./main -m ~/wizardmath-7b-v1.0.ggmlv3.q4_0.gguf --color -c 2048 --keep -1 -n -1 -t 3 -b 7 -i -r "User:" --in-prefix " " --in-suffix "Assistant:" -f ~/storage/shared/PT/M.txt

main: build = 1015 (226255b)
main: seed  = 1692706079
llama_model_loader: loaded meta data with 15 key-value pairs and 291 tensors from /data/data/com.termux/files/home/wizardmath-7b-v1.0.ggmlv3.q4_0.gguf (version GGUF V1 (latest))
..
llama_model_load_internal: format       = GGUF V1 (latest)
llama_model_load_internal: arch         = llama
llama_model_load_internal: vocab type   = SPM
llama_model_load_internal: n_vocab      = 32001
llama_model_load_internal: n_ctx_train  = 2048
llama_model_load_internal: n_ctx        = 2048
llama_model_load_internal: n_embd       = 4096
llama_model_load_internal: n_head       = 32
llama_model_load_internal: n_head_kv    = 32
llama_model_load_internal: n_layer      = 32
llama_model_load_internal: n_rot        = 128
llama_model_load_internal: n_gqa        = 1
llama_model_load_internal: f_norm_eps   = 5.0e-06
llama_model_load_internal: n_ff         = 11008
llama_model_load_internal: freq_base    = 10000.0
llama_model_load_internal: freq_scale   = 1
llama_model_load_internal: model type   = 7B
llama_model_load_internal: model ftype  = mostly Q4_0
llama_model_load_internal: model size   = 6.74 B
llama_model_load_internal: general.name = wizardmath-7b-v1.0.ggmlv3.q4_0.bin
llama_model_load_internal: BOS token = 1 ''
llama_model_load_internal: EOS token = 2 ''
llama_model_load_internal: LF token  = 13 '<0x0A>'
llama_model_load_internal: ggml ctx size =    0.09 MB
llama_model_load_internal: mem required  = 3615.73 MB (+ 1024.00 MB per state)
llama_new_context_with_model: kv self size  = 1024.00 MB
llama_new_context_with_model: compute buffer total size =    3.49 MB
                                                           
system_info: n_threads = 3 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |                          

main: interactive mode on.
Reverse prompt: 'User:'
Input prefix: ' '
Input suffix: 'Assistant:'                                 
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 2048, n_batch = 7, n_predict = -1, n_keep = 42
                                                           
== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

Below is an instruction that describes a task. Write a response that appropriately completes the request.
                                                           
### Instruction:
Please list 3 movie titles. 
                               
### Response: We are asked to list 3 movie titles, which means we need to come up with at least 3 different movie titles. Here is a list of 3 movie titles for your reference:   
1. Titanic (1997)                                        
2. The Matrix (1999)
3. Toy Story (1995)                                        

These are just some examples, and there are certainly many more movie titles out there. However, these 3 movies have been well-known and popular for a long time, and they represent different genres and styles of filmmaking. Therefore, I believe that these 3 movie titles will not disappoint you.
The answer is: Here are three movie titles: Titanic (1997), The Matrix (1999), and Toy Story (1995).
                                                           
</s>
                                                         
The answer is: Three movie titles are: Titanic (1997), The Matrix (1999), and Toy Story (1995)..                      
</s>

ex2:

### Instruction:
Please list 3 movie titles.

### Response:I'm not sure what you're looking for, but here are some movie titles:

1. The Shawshank Redemption
2. Schindler's List
3. The Godfather

The answer is: Here are three movie titles:
1. The Shawshank Redemption
2. Schindler's List
3. The Godfather.

</s>

The answer is: Here are three movie titles:
1. The Shawshank Redemption
2. Schindler's List
3. The Godfather.

</s>

It appears that, due to the way the model was converted, it's unable to utilise the stop sequence, and thus doesn't return control to the User in this case.

Edit: Error message when trying to include metadata:

python3 convert-llama-ggmlv3-to-gguf.py -i ~/wizardmath-7b-v1.0.ggmlv3.q4_0.bin -o ~/wizardM2.gguf -c 2048 -m ~/storage/shared/downloads/wizardmath             
                                  
* Using config: Namespace(input=PosixPath('/data/data/com.termux/files/home/wizardmath-7b-v1.0.ggmlv3.q4_0.bin'), output=PosixPath('/data/data/com.termux/files/home/wizardM2.gguf'), name=None, desc=None, gqa=1, eps='5.0e-06', context_length=2048, model_metadata_dir=PosixPath('/data/data/com.termux/files/home/storage/shared/downloads/wizardmath'), vocab_dir=None, vocabtype='spm')
                                                          
 === WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
                                                          
* Scanning GGML input file
* GGML model hyperparameters: <Hyperparameters: n_vocab=32001, n_embd=4096, n_mult=5504, n_head=32, n_layer=32, n_rot=128, n_ff=11008, ftype=2>   

Traceback (most recent call last):
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 333, in <module>
    main()
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 323, in main
    (params_override, vocab_override) = handle_metadata(cfg, model.hyperparameters)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 274, in handle_metadata
    import convert
  File "/data/data/com.termux/files/home/llama.cpp/convert.py", line 27, in <module>
    from sentencepiece import SentencePieceProcessor  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ModuleNotFoundError: No module named 'sentencepiece'

Repo & here's the content of ~/storage/shared/downloads/wizardmath:
[Screenshot_20230822_100229: listing of the files in the wizardmath directory]

@klosax
Contributor

klosax commented Aug 22, 2023

ModuleNotFoundError: No module named 'sentencepiece'

I think you need to install the python tokenizer:
pip install sentencepiece

@ghost
Author

ghost commented Aug 22, 2023

Ah, that's likely the issue as to why I can't convert with metadata. Thank you.

I got a crazy error, so I'll troubleshoot with Termux people.

pip install sentencepiece
Collecting sentencepiece
  Downloading sentencepiece-0.1.99.tar.gz (2.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━ 2.6/2.6 MB 1.7 MB/s eta 0:00:00
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: sentencepiece
  Building wheel for sentencepiece (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Building wheel for sentencepiece (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [214 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-aarch64-cpython-311
      creating build/lib.linux-aarch64-cpython-311/sentencepiece
      copying src/sentencepiece/__init__.py -> build/lib.linux-aarch64-cpython-311/sentencepiece
      copying src/sentencepiece/_version.py -> build/lib.linux-aarch64-cpython-311/sentencepiece
      copying src/sentencepiece/sentencepiece_model_pb2.py -> build/lib.linux-aarch64-cpython-311/sentencepiece
      copying src/sentencepiece/sentencepiece_pb2.py -> build/lib.linux-aarch64-cpython-311/sentencepiece
      running build_ext
      Package sentencepiece was not found in the pkg-config search path.
      Perhaps you should add the directory containing `sentencepiece.pc'
      to the PKG_CONFIG_PATH environment variable
      No package 'sentencepiece' found
      -- VERSION: 0.1.99
      -- The C compiler identification is Clang 16.0.6
      -- The CXX compiler identification is Clang 16.0.6
      -- Detecting C compiler ABI info
      -- Detecting C compiler ABI info - done
      -- Check for working C compiler: /data/data/com.termux/files/usr/bin/cc - skipped
      -- Detecting C compile features
      -- Detecting C compile features - done
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /data/data/com.termux/files/usr/bin/c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
      -- Looking for pthread_create in pthreads
      -- Looking for pthread_create in pthreads - not found
      -- Looking for pthread_create in pthread
      -- Looking for pthread_create in pthread - found
      -- Found Threads: TRUE
      -- Not Found TCMalloc: TCMALLOC_LIB-NOTFOUND
      -- Found atomic: /data/data/com.termux/files/usr/lib/libatomic.a
      -- Configuring done (3.1s)
      -- Generating done (0.0s)
      -- Build files have been written to: /data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/build
      [  1%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/bytestream.cc.o
      [  3%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/arena.cc.o
      [  4%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/coded_stream.cc.o
      [  6%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/builder.cc.o
      [  8%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/arenastring.cc.o
      [  9%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/common.cc.o
      [ 11%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/unicode_script.cc.o
      [ 12%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/extension_set.cc.o
      [ 14%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/generated_enum_util.cc.o
      [ 16%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/generated_message_table_driven_lite.cc.o
      [ 17%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/trainer_factory.cc.o
      [ 19%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/generated_message_util.cc.o
      /data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/src/builder.cc:47:15: warning: unused variable 'kMaxUnicode' [-Wunused-const-variable]
      constexpr int kMaxUnicode = 0x10FFFF;
                    ^
      /data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/src/builder.cc:49:23: warning: unused variable 'kDefaultNormalizerName' [-Wunused-const-variable]
      static constexpr char kDefaultNormalizerName[] = "nfkc";
                            ^
      [ 20%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/implicit_weak_message.cc.o
      [ 22%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/trainer_interface.cc.o
      [ 24%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/int128.cc.o
      [ 25%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/io_win32.cc.o
      [ 27%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/message_lite.cc.o
      2 warnings generated.
      [ 29%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/unigram_model_trainer.cc.o
      [ 30%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/parse_context.cc.o
      [ 32%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/repeated_field.cc.o
      [ 33%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/status.cc.o
      [ 35%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/statusor.cc.o
      [ 37%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/stringpiece.cc.o
      [ 38%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/word_model_trainer.cc.o
      [ 40%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/stringprintf.cc.o
      [ 41%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/structurally_valid.cc.o
      [ 43%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/strutil.cc.o
      [ 45%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/time.cc.o
      [ 46%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/wire_format_lite.cc.o
      [ 48%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/zero_copy_stream.cc.o
      [ 50%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/zero_copy_stream_impl.cc.o
      [ 51%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/protobuf-lite/zero_copy_stream_impl_lite.cc.o
      [ 53%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/builtin_pb/sentencepiece.pb.cc.o
      [ 54%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/char_model_trainer.cc.o
      [ 56%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/builtin_pb/sentencepiece_model.pb.cc.o
      [ 58%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/bpe_model.cc.o
      [ 59%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/char_model.cc.o
      [ 61%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/bpe_model_trainer.cc.o
      [ 62%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/sentencepiece_trainer.cc.o
      [ 64%] Building CXX object src/CMakeFiles/sentencepiece_train-static.dir/pretokenizer_for_training.cc.o
      [ 66%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/error.cc.o
      [ 67%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/filesystem.cc.o
      [ 69%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/model_factory.cc.o
      [ 70%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/model_interface.cc.o
      [ 72%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/normalizer.cc.o
      [ 74%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/sentencepiece_processor.cc.o
      [ 75%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/unigram_model.cc.o
      [ 77%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/util.cc.o
      [ 79%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/word_model.cc.o
      [ 80%] Building CXX object src/CMakeFiles/sentencepiece-static.dir/__/third_party/absl/flags/flag.cc.o
      [ 82%] Linking CXX static library libsentencepiece_train.a
      [ 82%] Built target sentencepiece_train-static
      [ 83%] Linking CXX static library libsentencepiece.a
      [ 83%] Built target sentencepiece-static
      [ 85%] Building CXX object src/CMakeFiles/spm_decode.dir/spm_decode_main.cc.o
      [ 87%] Building CXX object src/CMakeFiles/spm_train.dir/spm_train_main.cc.o
      [ 88%] Building CXX object src/CMakeFiles/spm_encode.dir/spm_encode_main.cc.o
      [ 90%] Building CXX object src/CMakeFiles/spm_normalize.dir/spm_normalize_main.cc.o
      [ 91%] Building CXX object src/CMakeFiles/spm_export_vocab.dir/spm_export_vocab_main.cc.o
      [ 93%] Linking CXX executable spm_export_vocab
      ld.lld: error: undefined symbol: __android_log_write
      >>> referenced by common.cc:150 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:150)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      >>> referenced by common.cc:158 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:158)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      make[2]: *** [src/CMakeFiles/spm_export_vocab.dir/build.make:98: src/spm_export_vocab] Error 1
      make[1]: *** [CMakeFiles/Makefile2:286: src/CMakeFiles/spm_export_vocab.dir/all] Error 2
      make[1]: *** Waiting for unfinished jobs....
      [ 95%] Linking CXX executable spm_train
      ld.lld: error: undefined symbol: __android_log_write
      >>> referenced by common.cc:150 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:150)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      >>> referenced by common.cc:158 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:158)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      make[2]: *** [src/CMakeFiles/spm_train.dir/build.make:100: src/spm_train] Error 1
      make[1]: *** [CMakeFiles/Makefile2:260: src/CMakeFiles/spm_train.dir/all] Error 2
      [ 96%] Linking CXX executable spm_decode
      ld.lld: error: undefined symbol: __android_log_write
      >>> referenced by common.cc:150 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:150)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      >>> referenced by common.cc:158 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:158)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      make[2]: *** [src/CMakeFiles/spm_decode.dir/build.make:98: src/spm_decode] Error 1
      make[1]: *** [CMakeFiles/Makefile2:206: src/CMakeFiles/spm_decode.dir/all] Error 2
      [ 98%] Linking CXX executable spm_normalize
      ld.lld: error: undefined symbol: __android_log_write
      >>> referenced by common.cc:150 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:150)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      >>> referenced by common.cc:158 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:158)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      make[2]: *** [src/CMakeFiles/spm_normalize.dir/build.make:100: src/spm_normalize] Error 1
      make[1]: *** [CMakeFiles/Makefile2:233: src/CMakeFiles/spm_normalize.dir/all] Error 2
      [100%] Linking CXX executable spm_encode
      ld.lld: error: undefined symbol: __android_log_write
      >>> referenced by common.cc:150 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:150)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      >>> referenced by common.cc:158 (/data/data/com.termux/files/usr/tmp/pip-install-fp0uprvx/sentencepiece_46b10aa7b9164fc8b7b95ce97999ae64/sentencepiece/third_party/protobuf-lite/common.cc:158)
      >>>               common.cc.o:(google::protobuf::internal::DefaultLogHandler(google::protobuf::LogLevel, char const*, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)) in archive libsentencepiece.a
      clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
      make[2]: *** [src/CMakeFiles/spm_encode.dir/build.make:98: src/spm_encode] Error 1
      make[1]: *** [CMakeFiles/Makefile2:180: src/CMakeFiles/spm_encode.dir/all] Error 2
      make: *** [Makefile:156: all] Error 2
      Traceback (most recent call last):
        File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/data/data/com.termux/files/usr/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
          return _build_backend().build_wheel(wheel_directory, config_settings,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 434, in build_wheel                          return self._build_with_temp_dir(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^                        File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 419, in _build_with_temp_dir
          self.run_setup()                                         File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 507, in run_setup
          super(_BuildMetaLegacyBackend, self).run_setup(setup_script=setup_script)
        File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/build_meta.py", line 341, in run_setup                            exec(code, locals())
        File "<string>", line 167, in <module>                     File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/__init__.py", line 107, in setup
          return distutils.core.setup(**attrs)                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 185, in setup                           return run_commands(dist)
                 ^^^^^^^^^^^^^^^^^^                                File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
          dist.run_commands()                                      File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
          self.run_command(cmd)                                    File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)                             File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()                                            File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/wheel/bdist_wheel.py", line 364, in run
          self.run_command("build")                                File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)                   File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)                             File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()                                            File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 131, in run
          self.run_command(cmd_name)                               File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
          self.distribution.run_command(command)                   File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/dist.py", line 1233, in run_command
          super().run_command(command)                             File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
          cmd_obj.run()                                            File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 88, in run
          _build_ext.run(self)
        File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
          self.build_extensions()
        File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
          self._build_extensions_serial()
        File "/data/data/com.termux/files/usr/tmp/pip-build-env-0buzgxd0/overlay/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
          self.build_extension(ext)
        File "<string>", line 87, in build_extension
        File "/data/data/com.termux/files/usr/lib/python3.11/subprocess.py", line 413, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['./build_bundled.sh', '0.1.99']' returned non-zero exit status 2.
      [end of output]                                      
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for sentencepiece
Failed to build sentencepiece
ERROR: Could not build wheels for sentencepiece, which is required to install pyproject.toml-based projects

@ghost ghost closed this as completed Aug 22, 2023
@KerfuffleV2
Collaborator

models converted without metadata load differently. Loading without metadata:
llama_model_load_internal: BOS token = 1 ' '
llama_model_load_internal: EOS token = 2 ' '
Loading with one converted with external metadata:
llama_model_load_internal: BOS token = 1 '<s>'
llama_model_load_internal: EOS token = 2 '</s>'

Yeah, unfortunately this is a real problem but I'm not sure how to get around it. As far as I can see, the <s> and </s> don't exist in the original GGML model.

Debug output from dumping the vocab items from the GGML model:

* Adding 32002 vocab item(s)
   0: b'\xe2\x96\x81\xe2\x81\x87\xe2\x96\x81'  | b' \xe2\x81\x87 '
   1: b''  | b''
   2: b''  | b''
   3: b'<0x00>'  | b'\x00'

So I don't think there's a way to recover that. I could possibly add arguments to override the value for specific vocab items, but if you're going through that much effort you might as well get the metadata and use that, which is likely to be a lot more reliable.
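
For illustration, a hypothetical sketch of what such per-token override arguments could look like (the --token-override flag below does not exist in the real script; it's purely illustrative):

import argparse

# Hypothetical --token-override flag (NOT part of the actual converter),
# e.g.: --token-override '1=<s>' --token-override '2=</s>'
parser = argparse.ArgumentParser()
parser.add_argument("--token-override", action="append", default=[],
                    help="override a vocab item as ID=TEXT")

def apply_overrides(vocab: list[bytes], overrides: list[str]) -> None:
    # Replace the stored token text for the given IDs in place.
    for item in overrides:
        token_id, text = item.split("=", 1)
        vocab[int(token_id)] = text.encode("utf-8")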

I'm not 100% sure this should be closed since it's a legit issue, but I don't know how to fix it either. (With the k-quants improvements on the horizon, converting GGML models is looking less appealing also.)

I got a crazy error, so I'll troubleshoot with Termux people.

That looks like something specific to compiling on Android.

@klosax
Contributor

klosax commented Aug 22, 2023

I'm not 100% sure this should be closed since it's a legit issue, but I don't know how to fix it either.

We could have default values for BOS, EOS, etc. using the same tokens and token ids as the original LLaMA models.

@KerfuffleV2
Collaborator

We could have default values for BOS, EOS, etc. using the same tokens and token ids as the original LLaMA models.

llama.cpp already seems to know 1 is BOS and 2 is EOS from the output, though. It's just that the actual value of the token is wrong.

@ghost
Author

ghost commented Aug 22, 2023

Yeah, unfortunately this is a real problem but I'm not sure how to get around it. As far as I can see, the <s> and </s> don't exist in the original GGML model.

All good - at least you communicated that beforehand, so we can chat about it. I considered you might override the values, but like you mentioned: obviously, metadata is more reliable. I'm converting 'cuz there's barely any gguf models out there right now.

That looks like something specific to compiling on Android.

Yes, it's not a llama.cpp issue.

@klosax
Contributor

klosax commented Aug 22, 2023

llama.cpp already seems to know 1 is BOS and 2 is EOS from the output, though. It's just that the actual value of the token is wrong.

llama.cpp defaults the token ids to the original token ids, so I suggest also defaulting the tokens:

llama_model_load_internal: BOS token = 1 '<s>'
llama_model_load_internal: EOS token = 2 '</s>'
llama_model_load_internal: UNK token = 0 '<unk>'

Conversion with the special token mapping KVs works by using convert-llama-hf-to-gguf.py:

llama_model_loader: - kv  15:                tokenizer.ggml.bos_token_id u32     
llama_model_loader: - kv  16:                tokenizer.ggml.eos_token_id u32     
llama_model_loader: - kv  17:            tokenizer.ggml.unknown_token_id u32  
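
For reference, a rough sketch of how those key-value pairs can be written with the gguf Python package; the method names follow the gguf-py API of that period, so treat this as illustrative rather than the script's exact code:

import gguf

# Sketch: emit the special-token id KVs that convert-llama-hf-to-gguf.py
# writes (tokenizer.ggml.bos_token_id / eos_token_id / unknown_token_id).
writer = gguf.GGUFWriter("model.gguf", "llama")
writer.add_bos_token_id(1)   # '<s>'
writer.add_eos_token_id(2)   # '</s>'
writer.add_unk_token_id(0)   # '<unk>'
writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.close()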

@ghost
Author

ghost commented Aug 22, 2023

The folks at Termux solved the sentencepiece installation.

Here's a new error, excuse me if it's dumb or obvious, but I don't see what to do:

python3 convert-llama-ggmlv3-to-gguf.py -i ~/wizardmath-7b-v1.0.ggmlv3.q4_0.bin -o ~/wizardM2.gguf -c 2048 -m ~/storage/shared/downloads/Wizardmath                                            

* Using config: Namespace(input=PosixPath('/data/data/com.termux/files/home/wizardmath-7b-v1.0.ggmlv3.q4_0.bin'), output=PosixPath('/data/data/com.termux/files/home/wizardM2.gguf'), name=None, desc=None, gqa=1, eps='5.0e-06', context_length=2048, model_metadata_dir=PosixPath('/data/data/com.termux/files/home/storage/shared/downloads/Wizardmath'), vocab_dir=None, vocabtype='spm')
                                                           
=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===
                                                           
* Scanning GGML input file
* GGML model hyperparameters: <Hyperparameters: n_vocab=32001, n_embd=4096, n_mult=5504, n_head=32, n_layer=32, n_rot=128, n_ff=11008, ftype=2>                                 

Traceback (most recent call last):
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 333, in <module>
    main()
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 323, in main
    (params_override, vocab_override) = handle_metadata(cfg, model.hyperparameters)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 275, in handle_metadata
    assert cfg.model_metadata_dir.is_dir(), 'Metadata dir is not a directory'
AssertionError: Metadata dir is not a directory

Thank you.

@ghost ghost reopened this Aug 22, 2023
@KerfuffleV2
Collaborator

so I suggest also defaulting the tokens:

@klosax

Sure, I can add something like that. So just set the token value to <unk> for token id 0, <s> for token id 1, </s> for token id 2? Are there any GGML LLaMA models that use a different value for those tokens?

AssertionError: Metadata dir is not a directory

~/storage/shared/downloads/Wizardmath ( /data/data/com.termux/files/home/storage/shared/downloads/Wizardmath ) needs to be a directory with tokenizer.model, etc in it.
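
A quick way to sanity-check that directory before re-running the converter (a minimal sketch; the paths are the ones from this thread, and the file list is just the obvious minimum for an spm vocab):

from pathlib import Path

# Check the -m metadata directory before converting (paths are examples).
meta_dir = Path.home() / "storage/shared/downloads/Wizardmath"
for name in ("config.json", "tokenizer.model"):
    path = meta_dir / name
    print(f"{path}: {'OK' if path.exists() else 'MISSING'}")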

@klosax
Contributor

klosax commented Aug 22, 2023

Are there any GGML LLaMA models that use a different value for those tokens?

OpenLLaMA-v2 3b and 7b: same ids and values, but it incorrectly maps unknown to padding:

llama_model_load_internal: BOS token = 1 '<s>'
llama_model_load_internal: EOS token = 2 '</s>'
llama_model_load_internal: PAD token = 0 '<unk>'

@KerfuffleV2
Collaborator

KerfuffleV2 commented Aug 22, 2023

but it incorrectly maps unknown to padding

Just to be clear, you're saying they're doing something weird but it should be like what you pasted in the generated GGUF file?

In other words, if I converted an OpenLLaMA 3b or 7b GGML to GGUF and:

llama_model_load_internal: UNK token = 0 '<unk>'

I would have made a mistake. Is that correct?

@klosax
Contributor

klosax commented Aug 22, 2023

I would have made a mistake. Is that correct?

The mapping of token value and token id is the same, so use the mapping I posted first for the original LLaMA.

The other identifiers (BOS, EOS, PAD, UNK) are the mapping for how each special token id is used:
https://github.com/ggerganov/llama.cpp/blob/519c981f8b65ee6c87c2965539685ced0a17223b/llama.cpp#L703-L710

@KerfuffleV2
Collaborator

The mapping of token value and token id is the same, so use the mapping I posted first for the original LLaMA.

Sorry, I'm still not sure I fully understand.

When converting the GGML file without metadata (which is what we're talking about here), all I have is the token ID and the value (bytes, string value for the token).

Just as an example, dumping the information from the GGML file for openorca-platypus2-13b looks like:

* Adding 32002 vocab item(s)
   0: b'\xe2\x96\x81\xe2\x81\x87\xe2\x96\x81'  | b' \xe2\x81\x87 '
   1: b''  | b''
   2: b''  | b''
   3: b'<0x00>'  | b'\x00'

The part after the | is the original bytes; the part before it is the fixed-up version. Not really relevant for 0, 1, 2, but \x00, for example, is the bytes that were associated with token id 3 and got fixed up to <0x00>.
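
Roughly, that fix-up renders token text that isn't printable UTF-8 in the SentencePiece-style <0xNN> placeholder form. A minimal sketch of the idea (not the converter's exact logic):

def fixup_token(raw: bytes) -> bytes:
    # Sketch: non-printable token bytes become <0xNN> placeholders,
    # e.g. b'\x00' -> b'<0x00>'; printable UTF-8 passes through unchanged.
    try:
        if raw.decode("utf-8").isprintable():
            return raw
    except UnicodeDecodeError:
        pass
    return b"".join(b"<0x%02X>" % (b,) for b in raw)

print(fixup_token(b"\x00"))   # b'<0x00>'
print(fixup_token(b"hello"))  # b'hello'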

Anyway, at least for the model I looked at, the values for tokens 1 and 2 were just 0-length strings. As far as I know, the problem here is with the string value of the tokens, not anything to do with the IDs. Since llama.cpp prints out llama_model_load_internal: BOS token = 1 ' ' it knows 1 = BOS, right?

@klosax
Contributor

klosax commented Aug 22, 2023

Since llama.cpp prints out llama_model_load_internal: BOS token = 1 ' ' it knows 1 = BOS, right?

Yes, correct. The initial problem was the string values of tokens 0, 1 and 2. Use the original LLaMA mapping as the default:

id 0 = <unk>
id 1 = <s>
id 2 = </s>
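
In conversion-script terms, that default could look something like this (a sketch of the idea, not the eventual patch):

# Sketch: if the GGML vocab carries empty strings for the special tokens,
# fall back to the original LLaMA token text.
LLAMA_SPECIAL_TOKEN_DEFAULTS = {0: b"<unk>", 1: b"<s>", 2: b"</s>"}

def apply_default_special_tokens(vocab: list[bytes]) -> None:
    for token_id, text in LLAMA_SPECIAL_TOKEN_DEFAULTS.items():
        if token_id < len(vocab) and vocab[token_id] == b"":
            vocab[token_id] = text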

@klosax
Contributor

klosax commented Aug 22, 2023

When you PR the changes, it would be good to also fix the default UNK mapping in llama.cpp, setting it to 0 instead of -1:
https://github.com/ggerganov/llama.cpp/blob/519c981f8b65ee6c87c2965539685ced0a17223b/llama.cpp#L706

@ghost
Author

ghost commented Aug 22, 2023

AssertionError: Metadata dir is not a directory

~/storage/shared/downloads/Wizardmath ( /data/data/com.termux/files/home/storage/shared/downloads/Wizardmath ) needs to be a directory with tokenizer.model, etc in it.

Oddly, wizardMath doesn't have tokenizer.model, so I want to convert wizardlm-7b-v1.0-uncensored.ggmlv3.q4_0.bin. Here's the error:

python3 convert-llama-ggmlv3-to-gguf.py -i ~/wizardlm-7b-v1.0-uncensored.ggmlv3.q4_0.bin -o ~/wizardLM.gguf -c 2048 -m ~/wizardm        

* Using config: Namespace(input=PosixPath('/data/data/com.termux/files/home/wizardlm-7b-v1.0-uncensored.ggmlv3.q4_0.bin'), output=PosixPath('/data/data/com.termux/files/home/wizardLM.gguf'), name=None, desc=None, gqa=1, eps='5.0e-06', context_length=2048, model_metadata_dir=PosixPath('/data/data/com.termux/files/home/wizardm'), vocab_dir=None, vocabtype='spm') 
                                                  
=== WARNING === Be aware that this conversion script is best-effort. Use a native GGUF model if possible. === WARNING ===                                                        

* Scanning GGML input file                                 
* GGML model hyperparameters: <Hyperparameters: n_vocab=32000, n_embd=4096, n_mult=256, n_head=32, n_layer=32, n_rot=128, n_ff=11008, ftype=2>
Traceback (most recent call last):
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 333, in <module>
    main()
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 323, in main
    (params_override, vocab_override) = handle_metadata(cfg, model.hyperparameters)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/data/com.termux/files/home/llama.cpp/convert-llama-ggmlv3-to-gguf.py", line 292, in handle_metadata
    raise ValueError('Unable to load metadata')
ValueError: Unable to load metadata

Here's the content of wizardm:
[Screenshot_20230822_121054: listing of the files in the wizardm directory]

Edit: I just noticed some of the file extensions changed to .txt after downloading. I'll try to fix it.

@KerfuffleV2
Collaborator

Oddly, wizardMath doesn't have tokenizer.model, so I want to convert wizardlm-7b-v1.0-uncensored.ggmlv3.q4_0.bin

Your link there does have the expected files, including tokenizer.model. Something went wrong with your download, though. The link has config.json, the directory you listed has config.txt. Did you save the .json files as .txt or something?

@ghost
Author

ghost commented Aug 22, 2023

Oddly, wizardMath doesn't have tokenizer.model, so I want to convert wizardlm-7b-v1.0-uncensored.ggmlv3.q4_0.bin

Your link there does have the expected files, including tokenizer.model. Something went wrong with your download, though. The link has config.json, the directory you listed has config.txt. Did you save the .json files as .txt or something?

Yeah, somehow I messed up. I corrected the formats, thanks. It converted!

~/l/b/bin (master)> ./main -m ~/wizardLM.gguf --color -c 2048 --keep -1 -n -1 -t 3 -b 7 -i -r "User" --in-prefix " " --in-suffix "Assistant:" -f ~/storage/shared/PT/M.txt

main: build = 1015 (226255b)
main: seed  = 1692717766
llama_model_loader: loaded meta data with 15 key-value pairs and 291 tensors from /data/data/com.termux/files/home/wizardLM.gguf (version GGUF V1 (latest))
...
llama_model_load_internal: BOS token = 1 '<s>'
llama_model_load_internal: EOS token = 2 '</s>'
llama_model_load_internal: LF token  = 13 '<0x0A>'
llama_model_load_internal: ggml ctx size =    0.09 MB
llama_model_load_internal: mem required  = 3615.73 MB (+ 1024.00 MB per state)
llama_new_context_with_model: kv self size  = 1024.00 MB
llama_new_context_with_model: compute buffer total size =    3.49 MB

system_info: n_threads = 3 / 8 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 0 | VSX = 0 |
main: interactive mode on.
Reverse prompt: 'User'
Input prefix: ' '
Input suffix: 'Assistant:'
sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
generate: n_ctx = 2048, n_batch = 7, n_predict = -1, n_keep = 42


== Running in interactive mode. ==
 - Press Ctrl+C to interject at any time.
 - Press Return to return control to LLaMa.
 - To return control without starting a new line, end your input with '/'.
 - If you want to submit another line, end your input with '\'.

 Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Please list 3 movie titles.

### Response:
 Sure, here are three movie titles for you:
1. The Shawshank Redemption
2. Schindler's List
3. Pulp Fiction
 Thank you.
Assistant: You're welcome! I hope you enjoy watching these movies.
 

Even better, it stopped as expected, so converting with metadata definitely works.

@KerfuffleV2
Collaborator

#2725 should help here even when converting without metadata.

@ghost ghost closed this as completed Aug 23, 2023