bug: Model corruption during download leads to loading failure #1139

Closed
Tracked by #1077
imtuyethan opened this issue Sep 2, 2024 · 2 comments
Labels: category: model management (Model pull, yaml, model state); type: bug (Something isn't working)

Comments

@imtuyethan
Contributor

imtuyethan commented Sep 2, 2024

Describe the bug

https://discord.com/channels/1107178041848909847/1279500849470509066

  1. Users are experiencing model loading failures due to corrupted downloads
  2. The error message indicates that tensor data is not within file bounds
  3. Currently, there's no built-in validation to catch this issue after download
  4. The workaround is to manually remove and re-download the model

[Screenshot 2024-09-02 at 6:04:31 PM]

Proposed Solution:

Short-term:

  • Provide clear instructions to users on how to remove and reinstall corrupted models
  • Consider adding this process to the app's troubleshooting guide

Long-term:

  • Implement post-download file validation to detect corrupted or incomplete model files
  • Add error handling to gracefully manage and report download issues
  • Consider implementing resume functionality for interrupted downloads to prevent corruption
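
To make the long-term items concrete, here is a minimal Python sketch of post-download validation plus the header needed to resume an interrupted download. It is an illustration only, not the project's implementation. It assumes the model source publishes an expected byte size and, optionally, a SHA-256 digest for each file; the function names `validate_download` and `resume_headers` are hypothetical.

```python
import hashlib
import os
from typing import Dict, List, Optional


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash the file in chunks so multi-gigabyte models never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_download(path: str, expected_size: int,
                      expected_sha256: Optional[str] = None) -> List[str]:
    """Return a list of problems; an empty list means the file passed.

    Assumption: expected_size and expected_sha256 come from metadata
    published alongside the model (hypothetical, not confirmed here).
    """
    if not os.path.isfile(path):
        return ["file is missing"]
    problems = []
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        problems.append(
            f"size mismatch: expected {expected_size} bytes, got {actual_size}")
    # Only hash when the size matches; a truncated file cannot match anyway.
    elif expected_sha256 and sha256_of_file(path) != expected_sha256.lower():
        problems.append("SHA-256 mismatch")
    return problems


def resume_headers(partial_path: str,
                   expected_size: int) -> Optional[Dict[str, str]]:
    """Pick HTTP headers for continuing an interrupted download.

    Returns None when the file already has the expected size, {} when the
    download must start from scratch, and a Range header otherwise.
    """
    have = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    if have == expected_size:
        return None
    if have == 0 or have > expected_size:
        return {}  # Nothing usable on disk; the caller should overwrite it.
    return {"Range": f"bytes={have}-"}
```

A downloader using this would send the `Range` header, append to the partial file only if the server answers `206 Partial Content` (a plain `200` means the server ignored the range and the file must be rewritten from the start), and then call `validate_download` before marking the model as ready. A non-empty result could be surfaced to the user as a prompt to re-download instead of a load-time failure.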
@imtuyethan imtuyethan added the type: bug Something isn't working label Sep 2, 2024
@freelerobot freelerobot transferred this issue from menloresearch/jan Sep 6, 2024
@freelerobot freelerobot added the category: model management Model pull, yaml, model state label Sep 6, 2024
@dan-menlo
Contributor

Linking this to main issue #1077 and queuing for Sprint 20

@dan-menlo dan-menlo moved this from Planning to Scheduled in Menlo Sep 8, 2024
@freelerobot
Contributor

Closing as a dupe of #1288

@github-project-automation github-project-automation bot moved this from Scheduled to Completed in Menlo Sep 23, 2024
@gabrielle-ong gabrielle-ong added this to the v1.0.0 milestone Oct 3, 2024