[Config] Enhance ModelRecord #435
Merged
CharlieFRuan added a commit that referenced this pull request on May 30, 2024:
### Changes

Main changes include:

- New prebuilt models:
  - Phi3-mini
  - StableLM-2-zephyr-1.6B
  - Qwen1.5-1.8B
  - Hermes2-Pro-Llama-3-8B
- Updates on `ModelRecord` fields
  - For detail see: #435
- Update all WASMs
  - For detail see: #433
- Update all WASMs to v0.2.39
- Support grammar for Llama3, hence update `examples/json-mode` to use `Llama3` and `Hermes2-pro-Llama3-8B` for function calling in `examples/json-schema`
- Use `loglevel` package:
  - For details see #427
- Fix `index.js.map` issue for Vite
  - #420
- Enhance error handling and ServiceWorker

### TVMjs

TVMjs compiled at apache/tvm@71f7af7

- Main changes include:
  - apache/tvm#17031
  - apache/tvm#17028
  - apache/tvm#17021

### WASM version

- All wasms updated to 0.2.39 via mlc-ai/binary-mlc-llm-libs#123 for new MLC-LLM runtime (mainly grammar)
jingyi-zhao-01 pushed a commit to jingyi-zhao-01/web-llm that referenced this pull request on Dec 8, 2024:
There are three changes to `ModelRecord` this PR brings:

### 1. Update model ids to match HF repo name

We rename `modelId` in `webllm.prebuiltAppConfig` to be exactly the same as the HF repo name. For most models, that means we simply append `-MLC` to the `modelId`. For the low-context version of a model, we use `{HF-repo}-1k`, indicating a 1k context length.

As a result, we rename the Phi2 and Phi1.5 models, since their `modelId` did not match the repo name:

- `Phi2-q4f32_1` → `phi-2-q4f32_1-MLC`
- `Phi1.5-q4f16_1` → `phi-1_5-q4f16_1-MLC`

### 2. Rename `model_url` and `model_lib_url` to `model` and `model_lib`

To better match other platforms of MLC-LLM (e.g. iOS, Android), we rename the `ModelRecord` fields.

### 3. Remove `resolve/main` from `model` URL

Instead of `"https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/"`, we now use `"https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/"`; note that we append the trailing `/` if it is not there.

### Example

As an example, we would have:

```typescript
{
  model: "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC",
  model_id: "Llama-3-8B-Instruct-q4f16_1-MLC",
  model_lib: "path/to/Llama-3-8B-Instruct-q4f16_1-ctx4k_cs1k-webgpu.wasm",
},
```

instead of

```typescript
{
  model_url: "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/",
  model_id: "Llama-3-8B-Instruct-q4f16_1",
  model_lib_url: "path/to/Llama-3-8B-Instruct-q4f16_1-ctx4k_cs1k-webgpu.wasm",
},
```

---------

Co-authored-by: Nestor Qin <[email protected]>
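For anyone carrying a custom app config across this change, the field rename in item 2 can be sketched as a small conversion function. This is only an illustration: `LegacyModelRecord`, `RenamedModelRecord`, and `renameRecordFields` are hypothetical names, not part of web-llm, and the real `ModelRecord` type has more fields than the three shown here.

```typescript
// Hypothetical, trimmed-down shapes for illustration only;
// the actual web-llm ModelRecord carries additional optional fields.
interface LegacyModelRecord {
  model_url: string;
  model_id: string;
  model_lib_url: string;
}

interface RenamedModelRecord {
  model: string;
  model_id: string;
  model_lib: string;
}

// Copies the values across under the new field names.
// It does not touch the model id or the URL contents.
function renameRecordFields(old: LegacyModelRecord): RenamedModelRecord {
  return {
    model: old.model_url,
    model_id: old.model_id,
    model_lib: old.model_lib_url,
  };
}

const renamed = renameRecordFields({
  model_url: "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/",
  model_id: "Llama-3-8B-Instruct-q4f16_1",
  model_lib_url: "path/to/Llama-3-8B-Instruct-q4f16_1-ctx4k_cs1k-webgpu.wasm",
});
console.log(Object.keys(renamed).join(","));
```

The sketch deliberately handles only the rename; updating the id and the URL are separate steps, so each can be checked on its own.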
There are three changes to `ModelRecord` this PR brings:

### 1. Update model ids to match HF repo name

We rename `modelId` in `webllm.prebuiltAppConfig` to be the exact same as the HF repo name. For most models, that means we simply append `-MLC` to the `modelId`. For the low-context version of the model, we would have `{HF-repo}-1k`, suggesting 1k context length.

As a result, we rename Phi2 and phi1.5 models since their `modelId` did not match with the repo name:

- `Phi2-q4f32_1` → `phi-2-q4f32_1-MLC`
- `Phi1.5-q4f16_1` → `phi-1_5-q4f16_1-MLC`
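The id convention above can be sketched as a lookup plus a suffix rule. This is a rough illustration, not code from web-llm: `toRepoStyleModelId` and `EXPLICIT_RENAMES` are hypothetical names, the map holds only the two renames listed above, and the sketch assumes that an old low-context id ended in `-1k`.

```typescript
// The two renames named in this PR; other Phi variants are not covered here.
const EXPLICIT_RENAMES = new Map<string, string>([
  ["Phi2-q4f32_1", "phi-2-q4f32_1-MLC"],
  ["Phi1.5-q4f16_1", "phi-1_5-q4f16_1-MLC"],
]);

// Hypothetical helper: maps an old model id to the repo-style id.
function toRepoStyleModelId(oldId: string): string {
  const explicit = EXPLICIT_RENAMES.get(oldId);
  if (explicit !== undefined) return explicit;

  // Assumption: old low-context ids carried a "-1k" suffix.
  const lowContext = oldId.endsWith("-1k");
  const base = lowContext ? oldId.slice(0, -"-1k".length) : oldId;

  // Append "-MLC" unless the id already matches the repo name.
  const repo = base.endsWith("-MLC") ? base : `${base}-MLC`;

  // Low-context ids follow the {HF-repo}-1k pattern.
  return lowContext ? `${repo}-1k` : repo;
}

console.log(toRepoStyleModelId("Llama-3-8B-Instruct-q4f16_1"));
```

Keeping the irregular renames in an explicit map avoids guessing at casing or punctuation changes that the suffix rule cannot express.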
### 2. Rename `model_url` and `model_lib_url` to `model` and `model_lib`

To better match with other platforms of MLC-LLM (e.g. iOS, Android), we rename the `ModelRecord` fields.

### 3. Remove `resolve/main` from `model` URL

Instead of `"https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/"`, we now make it `"https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/"`; note the trailing `/` will be appended by us if it is not there.

### Example
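As an example, a record for Llama-3-8B-Instruct would now have `model: "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"`, `model_id: "Llama-3-8B-Instruct-q4f16_1-MLC"`, and a `model_lib` path to the WASM, instead of `model_url: "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/"`, `model_id: "Llama-3-8B-Instruct-q4f16_1"`, and a `model_lib_url` path.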
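When updating an existing config by hand, the URL change in item 3 amounts to dropping the `resolve/main` segment and making sure the result ends in `/`. The sketch below illustrates that transformation; `normalizeModelUrl` is a hypothetical helper written for this example, not a web-llm function, and the library's own trailing-slash handling is not shown in this PR text.

```typescript
// Hypothetical helper: converts an old-style model URL to the new style.
function normalizeModelUrl(url: string): string {
  // Drop a trailing "resolve/main" segment, with or without a final slash.
  const stripped = url.replace(/\/resolve\/main\/?$/, "/");
  // Ensure exactly the trailing slash the PR says will be appended if missing.
  return stripped.endsWith("/") ? stripped : `${stripped}/`;
}

console.log(
  normalizeModelUrl(
    "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/",
  ),
);
```

The pattern is anchored to the end of the string, so a URL that merely contains `resolve/main` earlier in its path is left unchanged apart from the trailing slash.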