README.md (+3 -3)
@@ -24,7 +24,7 @@ What does it mean? You get llama.cpp with a fancy UI, persistent stories, editin
 
 ## Usage
-- [Download the latest release here](https://github.com/LostRuins/koboldcpp/releases/latest) or clone the repo.
+- **[Download the latest .exe release here](https://github.com/LostRuins/koboldcpp/releases/latest)** or clone the git repo.
 - Windows binaries are provided in the form of **koboldcpp.exe**, which is a pyinstaller wrapper for a few **.dll** files and **koboldcpp.py**. If you feel concerned, you may prefer to rebuild it yourself with the provided makefiles and scripts.
 - Weights are not included; you can use the official llama.cpp `quantize.exe` to generate them from your official weight files (or download them from other places).
 - To run, execute **koboldcpp.exe** or drag and drop your quantized `ggml_model.bin` file onto the .exe, and then connect with Kobold or Kobold Lite. If you're not on Windows, run the script **koboldcpp.py** after compiling the libraries.
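To make the run step concrete, here is a minimal sketch; the model filename is a placeholder and 5001 is an assumed example port:

```sh
# Windows: pass a quantized model (and optionally a port) to the executable,
# or simply drag and drop the model file onto koboldcpp.exe
koboldcpp.exe ggml_model.bin 5001

# other platforms: run the script after compiling the libraries
python koboldcpp.py ggml_model.bin 5001
```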
@@ -40,9 +40,9 @@ For more information, be sure to run the program with the `--help` flag.
 - You will have to compile your binaries from source. A makefile is provided; simply run `make`.
 - If you want, you can also link your own install of OpenBLAS manually with `make LLAMA_OPENBLAS=1`.
 - Alternatively, you can link your own install of CLBlast manually with `make LLAMA_CLBLAST=1`; for this you will need to obtain and link the OpenCL and CLBlast libraries.
-- For a full featured build, do `make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1`
 - For Arch Linux: Install `cblas`, `openblas` and `clblast`.
 - For Debian: Install `libclblast-dev` and `libopenblas-dev`.
+- For a full featured build, do `make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 LLAMA_CUBLAS=1`
 - After all binaries are built, you can run the python script with the command `koboldcpp.py [ggml_model.bin] [port]`.
 - Note: Many OSX users have found that using Accelerate is actually faster than OpenBLAS. To try, you may wish to run with `--noblas` and compare speeds.
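A sketch of the build-and-run sequence described in this hunk; the make flags and `--noblas` are taken from the diff, while the model filename and port are placeholders:

```sh
# plain build from source
make

# full featured build, as in the added line above
make LLAMA_OPENBLAS=1 LLAMA_CLBLAST=1 LLAMA_CUBLAS=1

# run the resulting script: koboldcpp.py [ggml_model.bin] [port]
python koboldcpp.py ggml_model.bin 5001

# on OSX, compare Accelerate against OpenBLAS by disabling BLAS
python koboldcpp.py ggml_model.bin --noblas
```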
@@ -65,7 +65,7 @@ For more information, be sure to run the program with the `--help` flag.
 - See https://github.com/ggerganov/llama.cpp/pull/1828/files
 
 ## CuBLAS?
-- You can attempt a CuBLAS build with LLAMA_CUBLAS=1 or using the provided CMake file (best for visual studio users). Note that support for CuBLAS is limited.
+- You can attempt a CuBLAS build with `LLAMA_CUBLAS=1` or with the provided CMake file (best for Visual Studio users). If you use the CMake file to build, copy the generated `koboldcpp_cublas.dll` into the same directory as the `koboldcpp.py` file. If you are bundling executables, you may need to include the CUDA dynamic libraries (such as `cublasLt64_11.dll` and `cublas64_11.dll`) for the executable to work correctly on a different PC. Note that support for CuBLAS is limited.
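A sketch of the copy step from the added line, written in Unix-style shell syntax; the CMake output path and the `CUDA_PATH` location are assumptions, and the CUDA version suffix may differ on your system:

```sh
# place the CMake-generated DLL next to koboldcpp.py (output path assumed)
cp build/Release/koboldcpp_cublas.dll .

# when bundling for another PC, include the CUDA runtime DLLs as well
cp "$CUDA_PATH/bin/cublas64_11.dll" .
cp "$CUDA_PATH/bin/cublasLt64_11.dll" .
```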
 
 ## Considerations
 - For Windows: No installation, single file executable (It Just Works)