feat: Model Pull has clear API and CLI to support Huggingface Repos #1242

Closed
4 tasks done
dan-menlo opened this issue Sep 18, 2024 · 7 comments · Fixed by #1365 or #1460
Labels: category: model management (Model pull, yaml, model state)

Comments

@dan-menlo
Contributor

dan-menlo commented Sep 18, 2024

Goal

  • cortex model pull should have clear APIs that support different model repo sources
  • e.g. Huggingface, Cortex Hub

Tasklist

CLI

# Pulls immediately
cortex model pull <huggingface_url_specific_gguf>

# Lets user select quantization using CLI
cortex model pull <huggingface_url> 

# NOT SURE: Do we need an "info" equivalent?
# Gets repo type (e.g. GGUF, in future ONNX, TensorRT-LLM, dumps possible versions)
# Will power the "select quantization"
cortex model info <huggingface_url>
cortex model info <cortex_repo_url>    # Dumps tags
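
To make the two `pull` behaviours above concrete, here is a minimal Python sketch of how a client might tell a URL that points at one specific GGUF file apart from a URL that points at a whole repo. It is not Cortex's implementation: the function name `classify_hf_url`, the returned dictionary shape, and the example owner/repo names are all hypothetical.

```python
from urllib.parse import urlparse


def classify_hf_url(url: str) -> dict:
    """Classify a Hugging Face URL as a single GGUF file or a whole repo.

    Hypothetical helper for illustration only; not part of Cortex.
    """
    parsed = urlparse(url)
    if parsed.netloc not in ("huggingface.co", "www.huggingface.co"):
        raise ValueError(f"not a Hugging Face URL: {url}")

    # Path looks like /<owner>/<repo>[/blob/<branch>/<file>]
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected <owner>/<repo> in URL: {url}")

    repo_id = f"{parts[0]}/{parts[1]}"
    if len(parts) > 2 and parts[-1].lower().endswith(".gguf"):
        # Specific file: the pull can start immediately.
        return {"kind": "file", "repo": repo_id, "filename": parts[-1]}
    # Whole repo: the user still has to pick a quantization.
    return {"kind": "repo", "repo": repo_id}
```

A CLI built on this would download straight away for `"kind": "file"` and show a quantization picker for `"kind": "repo"`.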

API

- How do we represent Huggingface strings?
- How do we handle Huggingface Repos (i.e. where user has to select quantization)? 

Key Questions

  • Does Cortex need an API to get Huggingface Repo metadata?
  • Does this need to be bubbled to Jan, to support the Huggingface Repo page?

Linked Issues

Jan's Requirements

  1. User enters Huggingface URL in import box
  2. User clicks deep link from Huggingface

Cortex should support an API, that can support the following UI:

Image

@dan-menlo dan-menlo added this to Menlo Sep 18, 2024
@dan-menlo dan-menlo converted this from a draft issue Sep 18, 2024
@dan-menlo dan-menlo added the category: model management Model pull, yaml, model state label Sep 18, 2024
@dan-menlo dan-menlo changed the title feat: Model pull from Huggingface has API and CLI feat: Huggingface Model Pull has clear API and CLI Sep 18, 2024
@dan-menlo dan-menlo changed the title feat: Huggingface Model Pull has clear API and CLI feat: Model Pull has clear API and CLI to support Huggingface Repos Sep 18, 2024
@dan-menlo dan-menlo moved this to Scheduled in Menlo Sep 18, 2024
@namchuai
Contributor

@dan-homebrew, are you sure about the command cortex model pull? In cortexjs, I think we're using cortex pull. Also, ollama uses ollama pull.

One thing to note about the API: http://127.0.0.1:1337/v1/models/__MODELID__/pull. The __MODELID__ has to go in the body of the request rather than in the URL path, because Drogon does not play nicely when __MODELID__ contains a slash.
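
The slash problem can be illustrated with a short Python sketch. It shows that substituting a Hugging Face style ID such as `owner/repo` into the path adds an extra path segment, while sending it in a JSON body keeps the route fixed. The body field name `"model"` and the helper names are assumptions made for illustration; the thread does not specify the request schema.

```python
import json
from typing import List, Tuple
from urllib.parse import urlsplit

BASE_URL = "http://127.0.0.1:1337/v1"  # host and port as quoted in the comment above


def pull_url_with_id_in_path(model_id: str) -> str:
    """Naive form: substitute the model ID straight into the path."""
    return f"{BASE_URL}/models/{model_id}/pull"


def pull_request_with_id_in_body(model_id: str) -> Tuple[str, bytes]:
    """Fixed path; the model ID travels in the JSON body instead."""
    # "model" is an assumed field name, not confirmed by the thread.
    body = json.dumps({"model": model_id}).encode("utf-8")
    return f"{BASE_URL}/models/pull", body


def path_segments(url: str) -> List[str]:
    """Split a URL path into its non-empty segments."""
    return [s for s in urlsplit(url).path.split("/") if s]
```

With `owner/repo` in the path, a router sees five segments (`v1`, `models`, `owner`, `repo`, `pull`) instead of the four the route pattern expects, which is the ambiguity that moving the ID into the body avoids.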

@namchuai
Contributor

  • How do we represent Huggingface strings?
    -> Hmm, I don't fully understand your question. Please elaborate.

  • How do we handle Huggingface Repos (i.e. where user has to select quantization)?
    -> There are two cases:

    • "Normally", for a GGUF repo, all quants are placed on the main branch, at the same directory level.
    • cortexso stores each quant in its own branch.

We already handle both cases above and allow the user to select the quant they want to use.
However, for now we only support GGUF, and only repositories where each model is a single GGUF file (a model split across multiple GGUF files is not supported at the moment).
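
The two repo layouts described above could be turned into a list of selectable quants roughly as follows. This is a sketch under assumptions, not the code Cortex uses: the regular expression covers only common llama.cpp-style tags (for example `Q4_K_M`, `Q8_0`, `IQ2_XS`, `F16`), and the function names and file names are hypothetical.

```python
import re
from typing import Dict, List

# Matches a quant tag that sits directly before the ".gguf" extension.
# The tag list is an assumption covering common llama.cpp-style names.
QUANT_RE = re.compile(
    r"(?:^|[-._])((?:IQ|Q)\d+(?:_[A-Z0-9]+)*|BF16|F16|F32)(?=\.gguf$)",
    re.IGNORECASE,
)


def quants_from_files(filenames: List[str]) -> Dict[str, str]:
    """Layout 1: all quants sit side by side on the main branch."""
    found: Dict[str, str] = {}
    for name in filenames:
        match = QUANT_RE.search(name)
        if match:
            found[match.group(1).upper()] = name
    return found


def quants_from_branches(branches: List[str]) -> List[str]:
    """Layout 2: one quant per branch; every non-main branch is a choice."""
    return [b for b in branches if b != "main"]
```

Because the tag must sit immediately before `.gguf`, a split file such as `big-Q4_K_M-00001-of-00002.gguf` is skipped, which mirrors the single-file limitation mentioned above.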

@namchuai
Contributor

  • Does Cortex need an API to get Huggingface Repo metadata?
    -> Yes, mostly the git API. I'm not using a third-party lib at the moment, and I don't think there's a C++ version of the lib.

  • Does this need to be bubbled to Jan, to support the Huggingface Repo page?
    -> I can't decide this. Let's discuss this. cc @louis-jan

@dan-menlo
Contributor Author

@namchuai Can I close this issue? It seems we have already implemented this.

@namchuai
Contributor

@dan-homebrew, this one still needs my last PR, which adds support for stopping a download. The PR is #1460; it needs some testing before I mark it Ready.

@gabrielle-ong
Contributor

QAing: API for Huggingface repo

1. POST models/pull starts download
2. POST models/pull halfway shows downloaded bytes
image

3. DELETE models/pull stops the download

curl --location --request DELETE 'http://127.0.0.1:39281/models/pull' \
--header 'Content-Type: application/json' \
--data '{
    "taskId": <your_download_task_id>
}'
image

4. WebSocket events are emitted on /events when a model pull starts
5. WebSocket events stop on /events when the model pull is stopped
image

Pending task to add DELETE models/pull to Swagger & docs @gabrielle-ong
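
For scripting the stop step in the QA list above, the curl call can be mirrored in Python with only the standard library. This is a sketch that builds the request without sending it; the helper name `build_stop_pull_request` is hypothetical, and the host, port, path, and `taskId` field are taken from the curl example as-is.

```python
import json
import urllib.request


def build_stop_pull_request(
    task_id: str, base_url: str = "http://127.0.0.1:39281"
) -> urllib.request.Request:
    """Build (but do not send) a DELETE models/pull request for a download task."""
    body = json.dumps({"taskId": task_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/models/pull",
        data=body,
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )
```

Against a running server, passing the result to `urllib.request.urlopen(...)` would send it; the task ID is whatever the earlier POST models/pull call reported for the download.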

@gabrielle-ong
Contributor

Closing this huge epic, thanks @namchuai!
Minor UX issue of stdout sent to CLI (inactive terminal), tracking in #1519

Projects
Archived in project
3 participants