Commit ec0cab0

update clip dependencies, readme (#1203)
1 parent 31a71a9 commit ec0cab0

2 files changed: +4, -4 lines changed

Diff for: setup.py

+1-1
@@ -164,7 +164,7 @@ def _parse_requirements_file(file_path):
 _haystack_integration_deps = _parse_requirements_file(_haystack_requirements_file_path)
 _clip_deps = [
     "open_clip_torch==2.20.0",
-    "scipy==1.10.1",
+    "scipy<1.9.2,>=1.8",
     f"{'nm-transformers' if is_release else 'nm-transformers-nightly'}",
 ]
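The new scipy pin is a compound range rather than an exact version: any scipy release at or above 1.8 but strictly below 1.9.2 satisfies it. A minimal sketch of how the constraint behaves, using the third-party packaging library (an illustration only; it is not part of this diff):

# Sketch: checking candidate versions against the constraint introduced in this commit.
# Requires the third-party `packaging` library (pip install packaging).
from packaging.specifiers import SpecifierSet

clip_scipy_spec = SpecifierSet("<1.9.2,>=1.8")

for candidate in ["1.7.3", "1.8.0", "1.9.1", "1.9.2", "1.10.1"]:
    status = "satisfies" if candidate in clip_scipy_spec else "is rejected by"
    print(f"scipy {candidate} {status} {clip_scipy_spec}")

Note that the previously pinned 1.10.1 falls outside the new range. Assuming _clip_deps feeds an extras_require entry in this setup.py (e.g. pip install "deepsparse[clip]"), installing the CLIP extra would now resolve scipy to a 1.8.x, 1.9.0, or 1.9.1 release.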

Diff for: src/deepsparse/clip/README.md

+3-3
@@ -16,7 +16,7 @@ Before you start your adventure with the DeepSparse Engine, make sure that your
 ### Model Format
 By default, to deploy CLIP models using the DeepSparse Engine, it is required to supply the model in the ONNX format. This grants the engine the flexibility to serve any model in a framework-agnostic environment. To see examples of pulling CLIP models and exporting them to ONNX, please see the [sparseml documentation](https://github.com/neuralmagic/sparseml/tree/main/integrations/clip).
 
-For the Zero-shot image classification workflow, two ONNX models are required, a visual model for CLIP's visual branch, and a text model for CLIP's text branch. Both of these models can be produced through the sparseml integration linked above. For caption generation, specific models called CoCa models are required and instructions on how to export CoCa models are also provided in the sparseml documentation above. The CoCa exporting pathway will generate one additional decoder model, along with the text and visual models.
+For the Zero-shot image classification workflow, two ONNX models are required, a visual model for CLIP's visual branch, and a text model for CLIP's text branch. Both of these models can be produced through the sparseml integration linked above. For caption generation, specific models called CoCa models are required and instructions on how to export CoCa models are also provided in the sparseml documentation. The CoCa exporting pathway will generate one additional decoder model, along with the text and visual models.
 
 ### Deployment examples:
 The following example uses pipelines to run the CLIP models for inference. For Zero-shot prediction, the pipeline ingests a list of images and a list of possible classes. A class is returned for each of the provided images. For caption generation, only an image file is required.
@@ -60,8 +60,8 @@ from deepsparse.clip import (
 possible_classes = ["ice cream", "an elephant", "a dog", "a building", "a church"]
 images = ["basilica.jpg", "buddy.jpeg", "thailand.jpg"]
 
-model_path_text = "zeroshot_research/text/model.onnx"
-model_path_visual = "zeroshot_research/visual/model.onnx"
+model_path_text = "zeroshot_research/clip_text.onnx"
+model_path_visual = "zeroshot_research/clip_visual.onnx"
 
 kwargs = {
     "visual_model_path": model_path_visual,
