QA: release 1.0.9 #1920


Closed · TC117 opened this issue Feb 4, 2025 · 7 comments

Comments


TC117 commented Feb 4, 2025

QA details:

Version: v1.0.9

OS (select one)

  • Windows 11 (online & offline)
  • Ubuntu 24, 22 (online & offline)
  • macOS 14/15, Apple Silicon (online & offline)
  • macOS, Intel (online & offline)

1. Manual QA (CLI)

Installation

  • it should install with local installer (default; no internet required during installation, all dependencies bundled)
  • it should install with network installer
  • it should install 2 binaries (cortex and cortex-server) [mac: binaries in /usr/local/bin]
  • it should install with correct folder permissions
  • it should install with folders: /engines /logs (no /models folder until model pull)
  • it should install with Docker image https://cortex.so/docs/installation/docker/
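A quick spot check for the binary and folder items above, on macOS/Linux. The ~/cortexcpp data-folder path is an assumption; adjust to the actual install location:

```bash
# Verify both binaries are installed (mac: expected in /usr/local/bin)
which cortex cortex-server

# Inspect the data folder layout and permissions
# (~/cortexcpp is an assumed default location; adjust to your install)
ls -ld ~/cortexcpp
ls ~/cortexcpp        # expect engines/ and logs/; no models/ until first pull
```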

Data/Folder structures

  • cortex.so models are stored in cortex.so/model_name/variants/, with .gguf and model.yml files
  • huggingface models are stored in huggingface.co/author/model_name, with .gguf and model.yml files
  • downloaded models are saved in cortex.db with the right fields: model, author_repo_id, branch_name, path_to_model_yaml (view via SQL)
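To verify the cortex.db fields above, a minimal sqlite3 pass works; the db path and the "models" table name are assumptions, so list the tables first:

```bash
# Table name "models" is a guess; list the schema first to confirm it
# (~/cortexcpp/cortex.db is an assumed path)
sqlite3 ~/cortexcpp/cortex.db '.tables'
sqlite3 ~/cortexcpp/cortex.db \
  'SELECT model, author_repo_id, branch_name, path_to_model_yaml FROM models;'
```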

Cortex Update

  • cortex -v should output the current version and check for updates
  • cortex update replaces the app, installer, uninstaller and binary file (without installing cortex.llamacpp)
  • cortex update should update from ~3-5 versions ago to latest (+3 to 5 bump)
  • cortex update should update from the previous version to latest (+1 bump)
  • cortex update -v 1.x.x-xxx should update from the previous version to specified version
  • cortex update should update from previous stable version to latest
  • it should gracefully update when server is actively running
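The update bumps can be walked through with the commands already listed above; the pinned version below is illustrative:

```bash
cortex -v                   # prints current version and checks for updates
cortex update               # +1 bump: previous version -> latest
cortex update -v 1.0.9-rc7  # pin to a specific version (illustrative value)
cortex -v                   # confirm the bump took effect
```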

Overall / App Shell

  • cortex returns helpful text in a timely way (< 5s)
  • cortex or cortex -h displays help commands
  • CLI commands should start the API server, if not running [except …]
  • it should correctly log to cortex-cli.log and cortex.log
  • There should be no stdout from an inactive shell session

Engines

  • llama.cpp should be installed by default
  • it should run gguf models on llamacpp
  • it should list engines
  • it should get engines
  • it should install engines (latest version if not specified)
  • it should install engines (with specified variant and version)
  • it should get default engine
  • it should set default engine (with specified variant/version)
  • it should load engine
  • it should unload engine
  • it should update engine (to latest version)
  • it should update engine (to specified version)
  • it should uninstall engines
  • it should gracefully continue engine installation if interrupted halfway (partial download)
  • it should gracefully handle when users try to CRUD incompatible engines (No variant found for xxx)
  • it should run trtllm models on trt-llm [WIP, not tested]
  • it should handle engine variants [WIP, not tested]
  • it should update engines versions [WIP, not tested]
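A rough pass over the engine CRUD items above. The cortex engines subcommand spellings and the llama-cpp engine name are assumptions inferred from the checklist wording; cortex -h shows the real ones:

```bash
cortex engines list
cortex engines install llama-cpp     # latest version when none is specified
cortex engines get llama-cpp
cortex engines uninstall llama-cpp
```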

Server

  • cortex start should start server and output localhost URL & port number
  • users can access API Swagger documentation page at localhost URL & port number
  • cortex start can be configured with parameters (port, logLevel [WIP]) https://cortex.so/docs/cli/start/
  • it should correctly log to cortex logs (logs/cortex.log, logs/cortex-cli.log)
  • cortex ps should return server status and running models (or no model loaded)
  • cortex stop should stop server
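A minimal server round trip, using the /healthz endpoint listed in the API section below. Port 39281 is an assumption; use whatever URL and port cortex start prints:

```bash
cortex start                          # prints the localhost URL & port
curl http://127.0.0.1:39281/healthz   # 39281 is assumed; use the printed port
cortex ps                             # server status plus running models
cortex stop
```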

Model Pulling

  • Pulling a model should pull the .gguf and model.yml files
  • Model download progress should appear as download bars for each file
  • Model download progress should be accurate (%, total time, download size, speed)

cortex.so

  • it should pull by built-in model_id
  • pull by model_id should recommend the default variant at the top (set in the HF model.yml)
  • it should pull by built-in model_id:variant

huggingface.co

  • it should pull by HF repo/model ID
  • it should pull by full HF url (ending in .gguf)
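A sketch covering the cortex.so and huggingface.co pull variants above; the cortex pull spelling and all model IDs/URLs are illustrative assumptions:

```bash
cortex pull tinyllama            # built-in model_id (illustrative)
cortex pull tinyllama:1b         # model_id:variant (illustrative)
cortex pull author/model_name    # HF repo/model ID (placeholder)
# full HF URL ending in .gguf (placeholder):
cortex pull https://huggingface.co/author/model_name/resolve/main/model.gguf
```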

Interrupted Download

  • it should allow the user to interrupt / stop a download
  • pulling again after an interruption should accurately calculate the remainder of the model file size that needs to be downloaded (Found unfinished download! Additional XGB needs to be downloaded)
  • it should allow the user to continue downloading the remainder after interruption

Model Management

  • it should list downloaded models
  • it should get a local model
  • it should update model parameters in model.yaml
  • it should delete a model
  • it should import models with model_id and model_path
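The management items above presumably map to cortex models subcommands (cortex models start <model> appears verbatim in the next section; the rest, including the import flags, are assumed spellings):

```bash
cortex models list
cortex models get tinyllama          # illustrative model_id
cortex models delete tinyllama
# flag names for import are an assumption; check cortex models -h
cortex models import --model_id my-model --model_path /path/to/model.gguf
```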

Model Running

  • cortex run <cortexso model> - if no local models detected, shows pull model menu
  • cortex run - if local model detected, runs the local model
  • cortex run - if multiple local models detected, shows list of local models (from multiple model sources eg cortexso, HF authors) for users to select (via regex search)
  • cortex run <invalid model id> should gracefully return Model not found!
  • run should autostart server
  • cortex run <model> starts interactive chat (by default)
  • cortex run <model> -d runs in detached mode
  • cortex models start <model>
  • terminating stdin or exit() should exit interactive chat
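Chat-flow spot checks drawn from the items above; model IDs are placeholders:

```bash
cortex run tinyllama           # interactive chat by default; autostarts the server
# inside the chat, exit() or Ctrl+D (terminating stdin) should end the session
cortex run tinyllama -d        # detached mode
cortex run not-a-real-model    # should gracefully print: Model not found!
```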

Hardware Detection / Acceleration [WIP, no need to QA]

  • it should auto offload max ngl
  • it should correctly detect available GPUs
  • it should gracefully detect missing dependencies/drivers
    CPU Extension (e.g. AVX-2, noAVX, AVX-512)
    GPU Acceleration (e.g. CUDA11, CUDA12, Vulkan, sycl, etc)

Uninstallation / Reinstallation

  • it should uninstall 2 binaries (cortex and cortex-server)
  • it should uninstall with 2 options to delete or not delete data folder
  • it should gracefully uninstall when server is still running
  • uninstalling should not leave any dangling files
  • uninstalling should not leave any dangling processes
  • it should reinstall without having conflict issues with existing cortex data folders

--

2. API QA

Checklist for each endpoint

  • Upon cortex start, API page is displayed at localhost:port endpoint
  • Endpoints should support the parameters stated in API reference (towards OpenAI Compatibility)
  • https://cortex.so/api-reference is updated

Endpoints

Chat Completions
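The checklist leaves this section unexpanded. Given the OpenAI-compatibility goal noted above, a request of roughly this shape is the usual smoke test; the exact path, port, and body schema should be confirmed against the Swagger page:

```bash
curl http://127.0.0.1:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "tinyllama", "messages": [{"role": "user", "content": "Hello"}]}'
```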

Engines

  • List engines: GET /v1/engines
  • Get engine: GET /v1/engines/{name}
  • Install engine: POST /v1/engines/install/{name}
  • Get default engine variant/version: GET /v1/engines/{name}/default
  • Set default engine variant/version: POST /v1/engines/{name}/default
  • Load engine: POST /v1/engines/{name}/load
  • Unload engine: DELETE /v1/engines/{name}/load
  • Update engine: POST /v1/engines/{name}/update
  • Uninstall engine: DELETE /v1/engines/install/{name}
  • Remote engine: ...
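The same endpoints exercised with curl; the base URL is the assumed default from the Server section, and llama-cpp is an illustrative engine name:

```bash
BASE=http://127.0.0.1:39281                           # assumed default port
curl $BASE/v1/engines                                 # list
curl $BASE/v1/engines/llama-cpp                       # get
curl -X POST   $BASE/v1/engines/install/llama-cpp     # install
curl           $BASE/v1/engines/llama-cpp/default     # get default variant/version
curl -X POST   $BASE/v1/engines/llama-cpp/load        # load
curl -X DELETE $BASE/v1/engines/llama-cpp/load        # unload
curl -X DELETE $BASE/v1/engines/install/llama-cpp     # uninstall
```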

Pulling Models

  • Pull model: POST /v1/models/pull starts download (websockets)
  • Pull model: websockets /events emitted
  • Stop model download: DELETE /v1/models/pull (websockets)
  • Stop model download: websockets /events stopped
  • Import model: POST /v1/models/import
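A sketch for pull and abort over HTTP. The JSON body field names are assumptions; confirm the schema on the Swagger page, and watch the websocket /events channel for progress:

```bash
# body field names are assumptions; check the Swagger page for the real schema
curl -X POST http://127.0.0.1:39281/v1/models/pull \
  -H "Content-Type: application/json" \
  -d '{"model": "tinyllama"}'
curl -X DELETE http://127.0.0.1:39281/v1/models/pull \
  -H "Content-Type: application/json" \
  -d '{"model": "tinyllama"}'
```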

Running Models

  • List models: GET /v1/models
  • Start model: POST /v1/models/start
  • Stop model: POST /v1/models/stop
  • Get model: GET /v1/models/{id}
  • Delete model: DELETE /v1/models/{id}
  • Update model: PATCH /v1/models/{model} updates model.yaml params
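The model lifecycle endpoints, with the same assumed base URL and body shape:

```bash
BASE=http://127.0.0.1:39281
curl $BASE/v1/models                                   # list
curl -X POST $BASE/v1/models/start \
  -H "Content-Type: application/json" -d '{"model": "tinyllama"}'
curl -X POST $BASE/v1/models/stop \
  -H "Content-Type: application/json" -d '{"model": "tinyllama"}'
curl $BASE/v1/models/tinyllama                         # get
curl -X DELETE $BASE/v1/models/tinyllama               # delete
```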

Threads

  • List threads: GET /v1/threads
  • Get thread by ID: GET /v1/threads/{id}
    ...

Server

  • CORS [WIP]
  • health: GET /healthz
  • terminate server: DELETE /processManager/destroy
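Health and shutdown, as listed (port assumed as above):

```bash
curl http://127.0.0.1:39281/healthz                           # health check
curl -X DELETE http://127.0.0.1:39281/processManager/destroy  # terminates the server
```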

Test list for reference:

TC117 added the type: QA checklist label Feb 4, 2025
github-project-automation bot moved this to Investigating in Menlo Feb 4, 2025
TC117 self-assigned this Feb 4, 2025
TC117 moved this from Investigating to QA in Menlo Feb 4, 2025
TC117 added this to the v1.0.9 milestone Feb 4, 2025
TC117 changed the title from QA: [1.0.9] to QA: release 1.0.9 Feb 4, 2025

TC117 commented Feb 4, 2025

@vansangpfiev the server auto-starts but is not shown on the CLI
cortex-beta - Windows - VM 114
Steps to reproduce:

  • Install with the network installer
  • Run the command to start the server

[screenshot]


TC117 commented Feb 5, 2025

cortex ps does not show loaded models
1.0.9-rc7 - Windows
Steps:

  • Pull a model
  • Start the model via the API
  • On the CLI, run cortex ps to check the loaded model

[screenshot]

vansangpfiev commented Feb 5, 2025

> @vansangpfiev the server auto-starts but is not shown on the CLI - cortex-beta - Windows - VM 114 - Steps to reproduce:
>
>   • Install with the network installer
>   • Run the command to start the server
>
> [screenshot]

[screenshot]

Seems like the server was running at that time. @TC117 Please help verify.

vansangpfiev self-assigned this Feb 5, 2025

TC117 commented Feb 5, 2025

[screenshot]

> Seems like the server was running at that time. @TC117 Please help verify.

Yes, it only happens the first time, so this is just a minor issue.


TC117 commented Feb 5, 2025

@vansangpfiev Can't download new engines; the download stops in the middle
1.0.9-rc7 - Windows - VM 114

[screenshot]


TC117 commented Feb 5, 2025

cURL request fails
macOS / Linux
1.0.9-rc7
[screenshot]


vansangpfiev commented Feb 6, 2025

> cURL request fails - macOS / Linux - 1.0.9-rc7 - [screenshot]

@TC117 This should be fixed by cortex-beta-rc8 and llama-engine v0.1.49-4a0e548.
The root cause is that llama-server does not have execute permission on macOS/Ubuntu after we pull llama-engine from GitHub.

TC117 closed this as completed Feb 6, 2025
TC117 moved this from QA to Completed in Menlo Feb 6, 2025