Commit 46f9f88

committed Oct 27, 2023
Merge branch 'master' into develop/personal
* master: (350 commits)
  speculative : ensure draft and target model vocab matches (ggml-org#3812)
  llama : correctly report GGUFv3 format (ggml-org#3818)
  simple : fix batch handling (ggml-org#3803)
  cuda : improve text-generation and batched decoding performance (ggml-org#3776)
  server : do not release slot on image input (ggml-org#3798)
  batched-bench : print params at start
  log : disable pid in log filenames
  server : add parameter -tb N, --threads-batch N (ggml-org#3584) (ggml-org#3768)
  server : do not block system prompt update (ggml-org#3767)
  sync : ggml (conv ops + cuda MSVC fixes) (ggml-org#3765)
  cmake : add missed dependencies (ggml-org#3763)
  cuda : add batched cuBLAS GEMM for faster attention (ggml-org#3749)
  Add more tokenizer tests (ggml-org#3742)
  metal : handle ggml_scale for n%4 != 0 (close ggml-org#3754)
  Revert "make : add optional CUDA_NATIVE_ARCH (ggml-org#2482)"
  issues : separate bug and enhancement template + no default title (ggml-org#3748)
  Update special token handling in conversion scripts for gpt2 derived tokenizers (ggml-org#3746)
  llama : remove token functions with `context` args in favor of `model` (ggml-org#3720)
  Fix baichuan convert script not detecing model (ggml-org#3739)
  make : add optional CUDA_NATIVE_ARCH (ggml-org#2482)
  ...
2 parents 855b808 + 41aee4d commit 46f9f88

210 files changed, +48517 / -18053 lines changed


Diff for: .clang-tidy (+5 lines)
@@ -3,6 +3,7 @@ Checks: >
     bugprone-*,
     -bugprone-easily-swappable-parameters,
     -bugprone-implicit-widening-of-multiplication-result,
+    -bugprone-misplaced-widening-cast,
     -bugprone-narrowing-conversions,
     readability-*,
     -readability-avoid-unconditional-preprocessor-if,
@@ -15,4 +16,8 @@ Checks: >
     -clang-analyzer-security.insecureAPI.DeprecatedOrUnsafeBufferHandling,
     performance-*,
     portability-*,
+    misc-*,
+    -misc-const-correctness,
+    -misc-non-private-member-variables-in-classes,
+    -misc-no-recursion,
 FormatStyle: none
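
In a clang-tidy `Checks` list, a plain entry or glob enables checks and an entry with a leading `-` removes them again, so the hunk above turns on the `misc-*` group minus three of its checks and drops one more `bugprone` check. A minimal sketch of that reading follows; the helper name `classify_checks` is hypothetical and it only classifies entries as text, it does not call clang-tidy.

```shell
#!/bin/bash
# Classify entries of a clang-tidy Checks list.
# A leading '-' means the check (or glob) is removed from the enabled set.
classify_checks() {
    local entry
    # Split on commas, trim surrounding whitespace, drop empty entries.
    tr ',' '\n' | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' |
    while IFS= read -r entry; do
        case "$entry" in
            -*) printf 'disabled %s\n' "${entry#-}" ;;
            *)  printf 'enabled  %s\n' "$entry" ;;
        esac
    done
}

# The entries added by the second hunk of this diff.
classify_checks <<'EOF'
misc-*,
-misc-const-correctness,
-misc-non-private-member-variables-in-classes,
-misc-no-recursion,
EOF
```

With clang-tidy installed, the effective set can be inspected directly with `clang-tidy --list-checks` from the repository root.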

Diff for: .devops/cloud-v-pipeline (+22 lines)
@@ -0,0 +1,22 @@
+node('x86_runner1'){ // Running on x86 runner containing latest vector qemu, latest vector gcc and all the necessary libraries
+    stage('Cleanup'){
+        cleanWs() // Cleaning previous CI build in workspace
+    }
+    stage('checkout repo'){
+        retry(5){ // Retry if the cloning fails due to some reason
+            checkout scm // Clone the repo on Runner
+        }
+    }
+    stage('Compiling llama.cpp'){
+        sh'''#!/bin/bash
+            make RISCV=1 RISCV_CROSS_COMPILE=1 # Compiling llama for RISC-V
+        '''
+    }
+    stage('Running llama.cpp'){
+        sh'''#!/bin/bash
+            module load gnu-bin2/0.1 # loading latest versions of vector qemu and vector gcc
+            qemu-riscv64 -L /softwares/gnu-bin2/sysroot -cpu rv64,v=true,vlen=256,elen=64,vext_spec=v1.0 ./main -m /home/alitariq/codellama-7b.Q4_K_M.gguf -p "Anything" -n 9 > llama_log.txt # Running llama.cpp on vector qemu-riscv64
+            cat llama_log.txt # Printing results
+        '''
+    }
+}
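
The 'Running llama.cpp' stage hard-codes the sysroot, model path, and vector length. As a sketch for trying the same invocation outside Jenkins, the command can be assembled from variables and printed before it is run. The variable names `SYSROOT`, `MODEL`, and `VLEN` and the helper `build_qemu_cmd` are hypothetical; the defaults and all qemu options are copied from the pipeline above.

```shell
#!/bin/bash
# Sketch: assemble the qemu-riscv64 command line used by the pipeline.
# Defaults mirror the Jenkins stage; override them through the environment.
SYSROOT="${SYSROOT:-/softwares/gnu-bin2/sysroot}"
MODEL="${MODEL:-/home/alitariq/codellama-7b.Q4_K_M.gguf}"
VLEN="${VLEN:-256}"   # vector register length in bits

build_qemu_cmd() {
    # Print the command rather than executing it, so it can be reviewed first.
    printf 'qemu-riscv64 -L %s -cpu rv64,v=true,vlen=%s,elen=64,vext_spec=v1.0 ./main -m %s -p "Anything" -n 9\n' \
        "$SYSROOT" "$VLEN" "$MODEL"
}

build_qemu_cmd
```

Printing the command keeps the sketch safe to run on a machine without the RISC-V toolchain; on the runner, the printed line can be executed after `make RISCV=1 RISCV_CROSS_COMPILE=1` has produced `./main`.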
