<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>llama-cpp-wasm single thread</title>
<link rel="icon" type="image/png" href="favicon.png" />
<!-- picocss -->
<link
rel="stylesheet"
href="https://cdn.jsdelivr.net/npm/@picocss/pico@2/css/pico.min.css"
/>
</head>
<body>
<header class="container">
<hgroup>
<h1><a href="/">llama-cpp-wasm</a> 🐢 <mark>single thread</mark> wasm32</h1>
<br />
<p> WebAssembly (Wasm) Build and Bindings for <a href="https://github.com/ggerganov/llama.cpp" target="_blank">llama.cpp</a>. </p>
<br />
<p> This demo lets you run LLMs directly in your browser using JavaScript, WebAssembly, and llama.cpp. </p>
<br />
<p> Repository: <a href="https://github.com/tangledgroup/llama-cpp-wasm"> https://github.com/tangledgroup/llama-cpp-wasm </a></p>
<br />
<p> When you click <b>Run</b>, the model is first downloaded and cached in your browser. </p>
</hgroup>
</header>
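<!--
  The "downloaded and cached" step described above could be implemented with the
  browser Cache API, roughly as sketched below. The cache name and helper
  function are illustrative assumptions, not necessarily what llama-cpp-wasm
  actually does internally:

  <script type="module">
    // Fetch a .gguf model, downloading it only on the first visit.
    async function fetchModelCached(url) {
      const cache = await caches.open("llama-cpp-wasm-models"); // hypothetical cache name
      let response = await cache.match(url);
      if (!response) {
        response = await fetch(url);            // download the model once
        await cache.put(url, response.clone()); // persist for later visits
      }
      return new Uint8Array(await response.arrayBuffer());
    }
  </script>
-->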
<main class="container">
<section>
<h2> Demo </h2>
<label for="model"> Model: </label>
<select id="model" name="model" aria-label="Select model" required>
<!-- <option value="https://huggingface.co/Qwen/Qwen1.5-0.5B-Chat-GGUF/resolve/main/qwen1_5-0_5b-chat-q3_k_m.gguf" selected>Qwen/Qwen1.5-0.5B-Chat Q3_K_M (350 MB)</option> -->
<option value="https://huggingface.co/afrideva/TinyMistral-248M-SFT-v4-GGUF/resolve/main/tinymistral-248m-sft-v4.q8_0.gguf">tinymistral-248m-sft-v4 q8_0 (265.26 MB)</option>
<option value="https://huggingface.co/TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf">TinyLlama/TinyLlama-1.1B-Chat-v1.0 Q4_K_M (669 MB)</option>
<option value="https://huggingface.co/Qwen/Qwen1.5-1.8B-Chat-GGUF/resolve/main/qwen1_5-1_8b-chat-q3_k_m.gguf">Qwen/Qwen1.5-1.8B-Chat Q3_K_M (1.02 GB)</option>
<option value="https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b/resolve/main/stablelm-2-zephyr-1_6b-Q4_1.gguf">stabilityai/stablelm-2-zephyr-1_6b Q4_1 (1.07 GB)</option>
<option value="https://huggingface.co/TKDKid1000/phi-1_5-GGUF/resolve/main/phi-1_5-Q4_K_M.gguf">microsoft/phi-1_5 Q4_K_M (918 MB)</option>
<option value="https://huggingface.co/TheBloke/phi-2-GGUF/resolve/main/phi-2.Q3_K_M.gguf">microsoft/phi-2 Q3_K_M (1.48 GB)</option>
<option value="https://huggingface.co/SanctumAI/Phi-3-mini-4k-instruct-GGUF/resolve/main/phi-3-mini-4k-instruct.Q3_K_M.gguf">microsoft/phi-3-mini-4k Q3_K_M (1.96 GB)</option>
<option value="https://huggingface.co/Felladrin/gguf-flan-t5-small/resolve/main/flan-t5-small.Q3_K_M.gguf">google/flan-t5-small Q3_K_M (88.3 MB)</option>
</select>
<label for="prompt"> Prompt: </label>
<textarea id="prompt" name="prompt" rows="5">Suppose Alice originally had 3 apples, then Bob gave Alice 7 apples, then Alice gave Cook 5 apples, and then Tim gave Alice 3x the amount of apples Alice had. How many apples does Alice have now? Let’s think step by step.</textarea>
<label> Result: </label>
<!-- <textarea id="result" name="result" rows="10" autocomplete="off"></textarea> -->
<pre id="result"></pre>
</section>
<section>
<button id="run"> Run </button>
</section>
<section>
<button id="run-progress-loading-model" aria-busy="true" hidden="hidden"> Loading model... </button>
<button id="run-progress-loaded-model" aria-busy="true" hidden="hidden"> Loaded model </button>
<button id="run-progress-generating" aria-busy="true" hidden="hidden"> Generating... </button>
</section>
<section>
<progress id="model-progress" hidden="hidden"></progress>
</section>
</main>
<!-- example -->
<script type="module" src="example-single-thread.js?v=240213-5"></script>
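<!--
  A minimal sketch of how a script like example-single-thread.js might wire this
  page up. The LlamaCpp import, module path, and callback signatures below are
  assumptions for illustration, not the confirmed llama-cpp-wasm API:

  <script type="module">
    import { LlamaCpp } from "./llama-st/llama.js"; // hypothetical module path

    document.getElementById("run").addEventListener("click", () => {
      const modelUrl = document.getElementById("model").value;
      const prompt = document.getElementById("prompt").value;
      const result = document.getElementById("result");
      result.textContent = "";

      // Hypothetical constructor: (modelUrl, onModelLoaded, onToken, onComplete)
      const app = new LlamaCpp(
        modelUrl,
        () => { /* model downloaded and cached; start generation */ },
        (token) => { result.textContent += token; }, // stream tokens into <pre id="result">
        () => { /* generation finished; reset buttons */ },
      );
    });
  </script>
-->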
</body>
</html>