Commit d217eb7

apicalshark, dependabot[bot], pminev, ykhrustalev, and ggerganov authored
a (#28)
* Temp (#23) * Merge (#21) * merge (#20) * Master1 (#17) * Merge PR (#10) (#11) (#13) Merge --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dennyxbox890 <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump requests from 2.31.0 to 2.32.2 in the pip group across 1 directory Bumps the pip group with 1 update in the / directory: [requests](https://github.com/psf/requests). Updates `requests` from 2.31.0 to 2.32.2 - [Release notes](https://github.com/psf/requests/releases) - [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md) - [Commits](psf/requests@v2.31.0...v2.32.2) --- updated-dependencies: - dependency-name: requests dependency-type: direct:production dependency-group: pip ... Signed-off-by: dependabot[bot] <[email protected]> * Temp (#15) * metal : fix minor string leaks (ggml/1004) * cmake : make it possible linking ggml as external lib (ggml/1003) * sync : ggml * CANN: adjust backend registry refactor. (ggml-org#10158) remove buffer->iface.get_name that used in cann as it was removed in backend registry refactor PR. 
* metal : move dequantize templates to beginning of MSL source (#0) * metal : simplify f16 and f32 dequant kernels (#0) * cuda : clear error after changing peer access (ggml-org#10153) * fix build break on arm64 linux (ggml-org#10166) This fixes the build break from the recent changes to move the CPU backend to separate files ggml-org#10144 * server : clarify /slots endpoint, add is_processing (ggml-org#10162) * server : clarify /slots endpoint, add is_processing * fix tests * ggml : fix q4xx mat mul, increase ggml_aligned_malloc alignment (ggml-org#10167) * ggml : fix gelu tables initialization (ggml-org#10172) * Q6_K AVX improvements (ggml-org#10118) * q6_k instruction reordering attempt * better subtract method * should be theoretically faster small improvement with shuffle lut, likely because all loads are already done at that stage * optimize bit fiddling * handle -32 offset separately. bsums exists for a reason! * use shift * Update ggml-quants.c * have to update ci macos version to 13 as 12 doesnt work now. 
13 is still x86 * ggml : fix arch check in bf16_to_fp32 (ggml-org#10164) * llama : add <|tool_call|> formatting to Granite template (ggml-org#10177) Branch: GraniteToolCallTemplate Signed-off-by: Gabe Goodhart <[email protected]> * metal : add quantized FA support (ggml-org#10149) * metal : add quantized FA (vec) support ggml-ci * metal : add quantized FA (non-vec) support * metal : fix support check ggml-ci * metal : clean-up * metal : clean-up (cont) * metal : fix shared memory calc + reduce smem + comments * metal : float-correctness * metal : minor [no ci] * ggml : adjust is_first_call init value (ggml-org#10193) ggml-ci * metal : fix from ptr buffer name (ggml-org#10189) * server : remove hack for extra parallel slot (ggml-org#10187) ggml-ci * metal : add BF16 support (ggml-org#8439) * ggml : add initial BF16 support ggml-ci * metal : add mul_mat_id BF16 support ggml-ci * metal : check for bfloat support on the Metal device ggml-ci * metal : better var names [no ci] * metal : do not build bfloat kernels when not supported ggml-ci * metal : try to fix BF16 support check ggml-ci * metal : this should correctly check bfloat support --------- Signed-off-by: Gabe Goodhart <[email protected]> Co-authored-by: Plamen Minev <[email protected]> Co-authored-by: Yuri Khrustalev <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]> Co-authored-by: leo-pony <[email protected]> Co-authored-by: Diego Devesa <[email protected]> Co-authored-by: snadampal <[email protected]> Co-authored-by: Xuan Son Nguyen <[email protected]> Co-authored-by: Eve <[email protected]> Co-authored-by: Gabe Goodhart <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Gabe Goodhart <[email protected]> Co-authored-by: dennyxbox890 <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Plamen Minev <[email protected]> Co-authored-by: Yuri Khrustalev <[email protected]> 
Co-authored-by: Georgi Gerganov <[email protected]> Co-authored-by: leo-pony <[email protected]> Co-authored-by: Diego Devesa <[email protected]> Co-authored-by: snadampal <[email protected]> Co-authored-by: Xuan Son Nguyen <[email protected]> Co-authored-by: Eve <[email protected]> Co-authored-by: Gabe Goodhart <[email protected]> * Rename build.yml to build-ci.yml * build.yml * Update build-ci.yml * Update CMakeLists.txt * Update CMakeLists.txt * Update CMakeLists.txt * Delete ggml/src/vulkan-shaders/CMakeLists.txt * Update build.yml * Update build-ci.yml * Update build-ci.yml --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Gabe Goodhart <[email protected]> Co-authored-by: dennyxbox890 <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Plamen Minev <[email protected]> Co-authored-by: Yuri Khrustalev <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]> Co-authored-by: leo-pony <[email protected]> Co-authored-by: Diego Devesa <[email protected]> Co-authored-by: snadampal <[email protected]> Co-authored-by: Xuan Son Nguyen <[email protected]> Co-authored-by: Eve <[email protected]> Co-authored-by: Gabe Goodhart <[email protected]> * Update build-ci.yml --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Gabe Goodhart <[email protected]> Co-authored-by: dennyxbox890 <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Plamen Minev <[email protected]> Co-authored-by: Yuri Khrustalev <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]> Co-authored-by: leo-pony <[email protected]> Co-authored-by: Diego Devesa <[email protected]> Co-authored-by: snadampal <[email protected]> Co-authored-by: Xuan Son Nguyen <[email protected]> Co-authored-by: Eve <[email protected]> Co-authored-by: Gabe Goodhart <[email protected]> * Update build-ci.yml * Update 
build-ci.yml --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Gabe Goodhart <[email protected]> Co-authored-by: dennyxbox890 <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Plamen Minev <[email protected]> Co-authored-by: Yuri Khrustalev <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]> Co-authored-by: leo-pony <[email protected]> Co-authored-by: Diego Devesa <[email protected]> Co-authored-by: snadampal <[email protected]> Co-authored-by: Xuan Son Nguyen <[email protected]> Co-authored-by: Eve <[email protected]> Co-authored-by: Gabe Goodhart <[email protected]> * Bump the pip group across 2 directories with 2 updates (#24) Updates the requirements on [pillow](https://github.com/python-pillow/Pillow) and [aiohttp](https://github.com/aio-libs/aiohttp) to permit the latest version. Updates `pillow` to 11.0.0 - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](python-pillow/Pillow@10.2.0...11.0.0) Updates `aiohttp` to 3.11.7 - [Release notes](https://github.com/aio-libs/aiohttp/releases) - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst) - [Commits](aio-libs/aiohttp@v3.9.3...v3.11.7) --- updated-dependencies: - dependency-name: pillow dependency-type: direct:production dependency-group: pip - dependency-name: aiohttp dependency-type: direct:production dependency-group: pip ... 
Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: apicalshark <[email protected]> * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Update build-ci.yml * Create docker.yml * Create python-lint.yml * Create server.yml * Update requirements.txt --------- Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: Gabe Goodhart <[email protected]> Co-authored-by: dennyxbox890 <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Plamen Minev <[email protected]> Co-authored-by: Yuri Khrustalev <[email protected]> Co-authored-by: Georgi Gerganov <[email protected]> Co-authored-by: leo-pony <[email protected]> Co-authored-by: Diego Devesa <[email protected]> Co-authored-by: snadampal <[email protected]> Co-authored-by: Xuan Son Nguyen <[email protected]> Co-authored-by: Eve <[email protected]> Co-authored-by: Gabe Goodhart <[email protected]>
1 parent 43957ef commit d217eb7

14 files changed: +602 -60 lines changed

.github/workflows/build-ci.yml

+488
Large diffs are not rendered by default.

.github/workflows/build.yml

+1 -8
@@ -7,14 +7,7 @@ on:
         description: 'Create new release'
         required: true
         type: boolean
-  push:
-    branches:
-      - master
-    paths: ['.github/workflows/build.yml', '**/CMakeLists.txt', '**/Makefile', '**/*.h', '**/*.hpp', '**/*.c', '**/*.cpp', '**/*.cu', '**/*.cuh', '**/*.swift', '**/*.m', '**/*.metal']
-  pull_request:
-    types: [opened, synchronize, reopened]
-    paths: ['.github/workflows/build.yml', '**/CMakeLists.txt', '**/Makefile', '**/*.h', '**/*.hpp', '**/*.c', '**/*.cpp', '**/*.cu', '**/*.cuh', '**/*.swift', '**/*.m', '**/*.metal']
-
+
 concurrency:
   group: ${{ github.workflow }}-${{ github.head_ref && github.ref || github.run_id }}
   cancel-in-progress: true

.github/workflows/editorconfig.yml

-27
This file was deleted.

CMakeLists.txt

+17 -2
@@ -3,7 +3,22 @@ project("llama.cpp" C CXX)
 include(CheckIncludeFileCXX)

 #set(CMAKE_WARN_DEPRECATED YES)
-set(CMAKE_WARN_UNUSED_CLI YES)
+#set(CMAKE_WARN_UNUSED_CLI YES)
+
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")
+
+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
+
+#no warn
+#set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS}")

 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

@@ -58,7 +73,7 @@ endif()
 #

 # debug
-option(LLAMA_ALL_WARNINGS "llama: enable all compiler warnings" ON)
+option(LLAMA_ALL_WARNINGS "llama: enable all compiler warnings" OFF)
 option(LLAMA_ALL_WARNINGS_3RD_PARTY "llama: enable all compiler warnings in 3rd party libs" OFF)

 # build
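
The first hunk above replaces `CMAKE_CXX_FLAGS` outright and turns on interprocedural optimization unconditionally. As a point of comparison only (this is not what the commit does), a more conservative variant could append to the existing flags and ask the toolchain whether IPO is supported before enabling it. In the sketch below the project name `flags_sketch` and the variables `IS_MULTI_CONFIG`, `IPO_OK`, and `IPO_MSG` are hypothetical; `check_ipo_supported` is provided by CMake's standard `CheckIPOSupported` module (CMake 3.9 or newer).

```cmake
cmake_minimum_required(VERSION 3.14)
project(flags_sketch C CXX)  # hypothetical project name

# Default to Release only for single-config generators, and only when the
# user did not request a build type.
get_property(IS_MULTI_CONFIG GLOBAL PROPERTY GENERATOR_IS_MULTI_CONFIG)
if(NOT IS_MULTI_CONFIG AND NOT CMAKE_BUILD_TYPE)
    set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
endif()

# Append instead of overwrite, so flags given on the command line or in the
# environment survive. CMAKE_CXX_FLAGS affects only C++ sources, so the C
# flags are extended as well. (GCC/Clang syntax; MSVC needs other flags.)
if(NOT MSVC)
    string(APPEND CMAKE_C_FLAGS   " -Wall -Wextra")
    string(APPEND CMAKE_CXX_FLAGS " -Wall -Wextra")
endif()

# Enable LTO only when the toolchain reports support for it.
include(CheckIPOSupported)
check_ipo_supported(RESULT IPO_OK OUTPUT IPO_MSG)
if(IPO_OK)
    set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
else()
    message(STATUS "IPO/LTO not enabled: ${IPO_MSG}")
endif()
```

One detail worth keeping in mind when reading the hunk: its comment says "O3", but the flag it sets is `-Ofast`, which on GCC and Clang also enables `-ffast-math` and therefore relaxes IEEE floating-point semantics.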

common/CMakeLists.txt

+10
@@ -1,5 +1,15 @@
 # common
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")

+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
 find_package(Threads REQUIRED)

 llama_add_compile_flags()
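
The block added to common/CMakeLists.txt repeats the one added to the root CMakeLists.txt. Ordinary CMake variables set in a parent directory are visible in every directory pulled in with `add_subdirectory`, so a single top-level setting would normally reach the subdirectories without being copied. The sketch below illustrates that inheritance in one file by generating a throwaway subdirectory at configure time; the names `inherit_sketch`, `sub_src`, and `sub_bin` are hypothetical and nothing here is taken from the commit.

```cmake
cmake_minimum_required(VERSION 3.14)
project(inherit_sketch LANGUAGES NONE)  # hypothetical; no compiler needed

# Set the variable once, in the parent directory scope.
set(CMAKE_CXX_FLAGS_RELEASE "-O3")

# Generate a minimal child CMakeLists.txt that only reports what it sees.
# The escaped \${...} is written literally and expanded inside the child.
file(WRITE "${CMAKE_BINARY_DIR}/sub_src/CMakeLists.txt"
     "message(STATUS \"sub sees: \${CMAKE_CXX_FLAGS_RELEASE}\")\n")

# An out-of-tree source directory requires an explicit binary directory.
add_subdirectory("${CMAKE_BINARY_DIR}/sub_src" "${CMAKE_BINARY_DIR}/sub_bin")
```

Configuring this file reports the parent's value from inside the child directory, which is the behavior that makes the per-directory copies in this commit redundant in the usual case.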

examples/CMakeLists.txt

+10
@@ -1,5 +1,15 @@
 # dependencies
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")

+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
 find_package(Threads REQUIRED)

 # third-party

examples/llava/requirements.txt

+1 -1
@@ -1,5 +1,5 @@
 -r ../../requirements/requirements-convert_legacy_llama.txt
 --extra-index-url https://download.pytorch.org/whl/cpu
-pillow~=10.2.0
+pillow~=11.0.0
 torch~=2.2.1
 torchvision~=0.17.1

ggml/CMakeLists.txt

+10
@@ -1,7 +1,17 @@
 cmake_minimum_required(VERSION 3.14) # for add_link_options and implicit target directories.
 project("ggml" C CXX)
 include(CheckIncludeFileCXX)
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")

+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

 if (NOT XCODE AND NOT MSVC AND NOT CMAKE_BUILD_TYPE)

pocs/CMakeLists.txt

+10
@@ -1,5 +1,15 @@
 # dependencies
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")

+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
 find_package(Threads REQUIRED)

 # third-party

pocs/vdot/CMakeLists.txt

+11
@@ -1,3 +1,14 @@
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")
+
+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
 set(TARGET llama-vdot)
 add_executable(${TARGET} vdot.cpp)
 target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})

poetry.lock

+21 -21
Some generated files are not rendered by default.

scripts/sync-ggml.last

+1 -1
@@ -1 +1 @@
-c598cbe30621251e80acbcf3b601589a37c17f4d
+c598cbe30621251e80acbcf3b601589a37c17f4d

src/CMakeLists.txt

+11
@@ -1,3 +1,14 @@
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")
+
+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
 # TODO: should not use this
 if (WIN32)
     if (BUILD_SHARED_LIBS)

tests/CMakeLists.txt

+11
@@ -1,3 +1,14 @@
+#enable O3 flag
+if(NOT CMAKE_BUILD_TYPE)
+    set(CMAKE_BUILD_TYPE Release)
+endif()
+
+set(CMAKE_CXX_FLAGS "-Wall -Wextra -Wno-unused-variable -Wno-unused-function -Wno-error")
+set(CMAKE_CXX_FLAGS_DEBUG "-g")
+set(CMAKE_CXX_FLAGS_RELEASE "-Ofast")
+
+#add lto
+set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE)
 function(llama_test target)
     include(CMakeParseArguments)
     set(options)
