README.md (+10 -4)
@@ -1,9 +1,15 @@
 # koboldcpp-ROCM

-To install, run
-```make LLAMA_HIPBLAS=1```
-To use ROCM, set GPU layers with --gpulayers when starting koboldcpp
-Original [llama.cpp rocm port](https://github.com/ggerganov/llama.cpp/pull/1087) by SlyEcho, ported to koboldcpp by yellowrosecx
+To install, navigate to the folder you want to download to in Terminal and run
+```
+git clone https://github.com/YellowRoseCx/koboldcpp-rocm.git -b main --depth 1 && \
+cd koboldcpp-rocm && \
+make LLAMA_HIPBLAS=1 -j4 && \
+./koboldcpp.py
+```
+When the KoboldCPP GUI appears, make sure to select "Use CuBLAS/hipBLAS" and set GPU layers
+
+Original [llama.cpp rocm port](https://github.com/ggerganov/llama.cpp/pull/1087) by SlyEcho, modified and ported to koboldcpp by YellowRoseCx

 Comparison with OpenCL using 6800xt

 | Model | Offloading Method | Time Taken - Processing 593 tokens| Time Taken - Generating 200 tokens| Total Time | Perf. Diff.

 compatgroup.add_argument("--noblas", help="Do not use OpenBLAS for accelerated prompt ingestion", action='store_true')
 compatgroup.add_argument("--useclblast", help="Use CLBlast for GPU Acceleration. Must specify exactly 2 arguments, platform ID and device ID (e.g. --useclblast 1 0).", type=int, choices=range(0,9), nargs=2)
-compatgroup.add_argument("--usecublas", help="Use CuBLAS for GPU Acceleration. Requires CUDA. Select lowvram to not allocate VRAM scratch buffer. Enter a number afterwards to select and use 1 GPU. Leaving no number will use all GPUs.", nargs='*',metavar=('[lowvram|normal] [main GPU ID]'), choices=['normal', 'lowvram', '0', '1', '2'])
+compatgroup.add_argument("--usecublas", help="Use CuBLAS/hipBLAS for GPU Acceleration. Requires CUDA. Select lowvram to not allocate VRAM scratch buffer. Enter a number afterwards to select and use 1 GPU. Leaving no number will use all GPUs.", nargs='*',metavar=('[lowvram|normal] [main GPU ID]'), choices=['normal', 'lowvram', '0', '1', '2'])
 parser.add_argument("--gpulayers", help="Set number of layers to offload to GPU when using GPU. Requires GPU.",metavar=('[GPU layers]'), type=int, default=0)
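
For anyone reviewing how the two GPU flags in this hunk behave on the command line, here is a minimal standalone sketch. It is not the koboldcpp parser: it reproduces only `--usecublas` and `--gpulayers` with the same `nargs`, `choices`, `type`, and `default` shown in the diff, drops the `compatgroup` group, and simplifies the metavars. The `build_parser` and `summarize` helpers and the keys of the returned dict are hypothetical names introduced for illustration.

```python
import argparse


def build_parser():
    """Reduced parser holding only the two GPU flags that appear in the diff."""
    parser = argparse.ArgumentParser(
        description="Sketch of the GPU flags (assumption: not the full koboldcpp parser)"
    )
    # Same nargs and choices as the diffed --usecublas line; metavar simplified.
    parser.add_argument(
        "--usecublas",
        nargs="*",
        metavar="OPTION",
        choices=["normal", "lowvram", "0", "1", "2"],
        help="Use CuBLAS/hipBLAS for GPU acceleration.",
    )
    # Same type and default as the diffed --gpulayers line.
    parser.add_argument(
        "--gpulayers",
        type=int,
        default=0,
        metavar="N",
        help="Number of layers to offload to the GPU.",
    )
    return parser


def summarize(argv):
    """Hypothetical helper: turn the parsed flags into a small settings dict."""
    args = build_parser().parse_args(argv)
    # argparse yields None when the flag is absent and [] when it is given bare.
    enabled = args.usecublas is not None
    options = args.usecublas or []
    # The first numeric option, if any, selects a single GPU.
    main_gpu = next((int(o) for o in options if o.isdigit()), None)
    return {
        "enabled": enabled,
        "lowvram": "lowvram" in options,
        "main_gpu": main_gpu,
        "gpulayers": args.gpulayers,
    }


if __name__ == "__main__":
    print(summarize(["--usecublas", "lowvram", "0", "--gpulayers", "32"]))
```

Because `nargs='*'` accepts zero values, a bare `--usecublas` parses to an empty list rather than `None`, which is how the sketch tells "flag given with no GPU ID" (use all GPUs, per the help text) apart from "flag not given".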