README.md (+4 −4)

````diff
@@ -236,13 +236,13 @@ marvin.paint("a simple cup of coffee, still warm")
 Learn more about image generation [here](https://askmarvin.ai/docs/images/generation).
 
-## 🔍 Classify images (beta)
+## 🔍 Converting images to data
 
-In addition to text, Marvin has beta support for captioning, classifying, transforming, and extracting entities from images using the GPT-4 vision model:
+In addition to text, Marvin has support for captioning, classifying, transforming, and extracting entities from images using the GPT-4 vision model:
````
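For context, a minimal sketch of what the now-stable API looks like after this change; the image path and labels are hypothetical:

```python
import marvin

# Hypothetical image path; any local image works the same way.
img = marvin.Image.from_path('/path/to/photo.png')

# With vision support out of beta, the top-level functions accept
# images directly; no `marvin.beta` prefix is required.
caption = marvin.caption(img)
label = marvin.classify(img, labels=['dog', 'cat', 'other'])
```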
docs/docs/vision/captioning.md (+38 −9)

````diff
@@ -2,9 +2,6 @@
 
 Marvin can use OpenAI's vision API to process images as inputs.
 
-!!! tip "Beta"
-    Please note that vision support in Marvin is still in beta, as OpenAI has not finalized the vision API yet. While it works as expected, it is subject to change.
-
 <div class="admonition abstract">
   <p class="admonition-title">What it does</p>
   <p>
````
````diff
@@ -18,19 +15,18 @@ Marvin can use OpenAI's vision API to process images as inputs.
 Generate a description of the following image, hypothetically available at `/path/to/marvin.png`:
-    "This is a digital illustration featuring a stylized, cute character resembling a Funko Pop vinyl figure with large, shiny eyes and a square-shaped head, sitting on abstract wavy shapes that simulate a landscape. The whimsical figure is set against a dark background with sparkling, colorful bokeh effects, giving it a magical, dreamy atmosphere."
+    "A cute, small robot with a square head and large, glowing eyes sits on a surface of wavy, colorful lines. The background is dark with scattered, glowing particles, creating a magical and futuristic atmosphere."
 <div class="admonition info">
````
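The call that produces the caption above is elided from this hunk; a minimal sketch, using the hypothetical path from the docs:

```python
import marvin

# The path is hypothetical, as in the docs themselves.
img = marvin.Image.from_path('/path/to/marvin.png')
caption = marvin.caption(img)
print(caption)
```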
````diff
@@ -41,6 +37,23 @@ Marvin can use OpenAI's vision API to process images as inputs.
 </div>
 
+## Providing instructions
+
+The `instructions` parameter offers an additional layer of control, enabling more nuanced caption generation, especially in ambiguous or complex scenarios.
+
+## Captions for multiple images
+
+To generate a single caption for multiple images, pass a list of `Image` objects to `caption`:
+
+```python
+marvin.caption(
+    [
+        marvin.Image.from_path('/path/to/img1.png'),
+        marvin.Image.from_path('/path/to/img2.png')
+    ],
+    instructions='...'
+)
+```
 
 ## Model parameters
````
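The new "Providing instructions" section stops short of a concrete call; a minimal sketch of steering a caption, with a hypothetical image path and instruction text:

```python
import marvin

img = marvin.Image.from_path('/path/to/marvin.png')  # hypothetical path

# Instructions bias the caption toward what you care about.
caption = marvin.caption(
    img,
    instructions='Describe the mood of the scene in a single sentence.'
)
```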
````diff
@@ -53,5 +66,21 @@ You can pass parameters to the underlying API via the `model_kwargs` argument of
 If you are using Marvin in an async environment, you can use `caption_async`:
 
+To generate individual captions for a list of inputs at once, use `.map`. Note that this is different than generating a single caption for multiple images, which is done by passing a list of `Image` objects to `caption`.
+
+```python
+inputs = [
+    marvin.Image.from_path('/path/to/img1.png'),
+    marvin.Image.from_path('/path/to/img2.png')
+]
+result = marvin.caption.map(inputs)
+assert len(result) == 2
+```
+
+(`marvin.caption_async.map` is also available for async environments.)
+
 Mapping automatically issues parallel requests to the API, making it a highly efficient way to work with multiple inputs at once. The result is a list of outputs in the same order as the inputs.
````
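The `caption_async` example referenced in the hunk above is elided; a minimal sketch, assuming it mirrors the synchronous signature:

```python
import asyncio
import marvin

async def main():
    # Mirrors marvin.caption, but awaitable; the path is hypothetical.
    caption = await marvin.caption_async(
        marvin.Image.from_path('/path/to/marvin.png')
    )
    print(caption)

asyncio.run(main())
```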
docs/docs/vision/classification.md (+7 −11)

````diff
@@ -2,10 +2,6 @@
 
 Marvin can use OpenAI's vision API to process images and classify them into categories.
 
-The `marvin.beta.classify` function is an enhanced version of `marvin.classify` that accepts images as well as text.
-
-!!! tip "Beta"
-    Please note that vision support in Marvin is still in beta, as OpenAI has not finalized the vision API yet. While it works as expected, it is subject to change.
 
 <div class="admonition abstract">
   <p class="admonition-title">What it does</p>
````
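With the beta qualifier removed, image classification goes through the top-level function; a minimal sketch, with a hypothetical path and label set:

```python
import marvin

# Hypothetical image and labels; text inputs work identically.
img = marvin.Image.from_path('/path/to/pet.png')
result = marvin.classify(img, labels=['dog', 'cat', 'bird'])
```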
````diff
@@ -36,14 +32,14 @@ The `marvin.beta.classify` function is an enhanced version of `marvin.classify`
@@ -60,15 +56,15 @@ The `marvin.beta.classify` function is an enhanced version of `marvin.classify`
 
 ## Model parameters
-You can pass parameters to the underlying API via the `model_kwargs` and `vision_model_kwargs` arguments of `classify`. These parameters are passed directly to the respective APIs, so you can use any supported parameter.
+You can pass parameters to the underlying API via the `model_kwargs` argument of `classify`. These parameters are passed directly to the API, so you can use any supported parameter.
 
 ## Async support
 
 If you are using Marvin in an async environment, you can use `classify_async`:
 
 ```python
-result = await marvin.beta.classify_async(
+result = await marvin.classify_async(
     "The app crashes when I try to upload a file.",
     labels=["bug", "feature request", "inquiry"]
 )
````
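A minimal sketch of the `model_kwargs` passthrough described above; `temperature` is just one example of a supported API parameter:

```python
import marvin

# Forward any supported API parameter, e.g. pin temperature to 0
# for more deterministic labels.
result = marvin.classify(
    "The app crashes when I try to upload a file.",
    labels=["bug", "feature request", "inquiry"],
    model_kwargs={"temperature": 0.0},
)
```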
````diff
@@ -85,10 +81,10 @@ inputs = [
     "The app crashes when I try to upload a file.",
     "How do I change my password?"
 ]
-result = marvin.beta.classify.map(inputs, ["bug", "feature request", "inquiry"])
+result = marvin.classify.map(inputs, ["bug", "feature request", "inquiry"])
 assert result == ["bug", "inquiry"]
 ```
 
-(`marvin.beta.classify_async.map` is also available for async environments.)
+(`marvin.classify_async.map` is also available for async environments.)
 
 Mapping automatically issues parallel requests to the API, making it a highly efficient way to classify multiple inputs at once. The result is a list of classifications in the same order as the inputs.
````
docs/docs/vision/extraction.md (+6 −10)

````diff
@@ -2,12 +2,8 @@
 
 Marvin can use OpenAI's vision API to process images and convert them into structured data, transforming unstructured information into native types that are appropriate for a variety of programmatic use cases.
 
-The `marvin.beta.extract` function is an enhanced version of `marvin.extract` that accepts images as well as text.
-
-!!! tip "Beta"
-    Please note that vision support in Marvin is still in beta, as OpenAI has not finalized the vision API yet. While it works as expected, it is subject to change.
-
 <div class="admonition abstract">
   <p class="admonition-title">What it does</p>
   <p>
````
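A minimal sketch of the "native types" point above; the input text and expected output are illustrative:

```python
import marvin

# `target` can be a native type; every matching entity is returned
# as a list of that type, in order of appearance.
amounts = marvin.extract(
    "I paid $10 for lunch and $20.50 for dinner.",
    target=float,
    instructions="dollar amounts",
)
# Illustrative result: [10.0, 20.5]
```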
````diff
@@ -37,11 +33,11 @@ The `marvin.beta.extract` function is an enhanced version of `marvin.extract` th
-result = marvin.beta.extract(img, target=str, instructions="dog breeds")
+result = marvin.extract(img, target=str, instructions="dog breeds")
 ```
 
 !!! success "Result"
@@ -50,14 +46,14 @@ The `marvin.beta.extract` function is an enhanced version of `marvin.extract` th
 ```
 
 ## Model parameters
-You can pass parameters to the underlying API via the `model_kwargs` and `vision_model_kwargs` arguments of `extract`. These parameters are passed directly to the respective APIs, so you can use any supported parameter.
+You can pass parameters to the underlying API via the `model_kwargs` argument of `extract`. These parameters are passed directly to the API, so you can use any supported parameter.
 
 ## Async support
 If you are using Marvin in an async environment, you can use `extract_async`:
 
 ```python
-result = await marvin.beta.extract_async(
+result = await marvin.extract_async(
     "I drove from New York to California.",
     target=str,
     instructions="2-letter state codes",
````
````diff
@@ -75,10 +71,10 @@ inputs = [
     "I drove from New York to California.",
     "I took a flight from NYC to BOS."
 ]
-result = marvin.beta.extract.map(inputs, target=str, instructions="2-letter state codes")
+result = marvin.extract.map(inputs, target=str, instructions="2-letter state codes")
 assert result == [["NY", "CA"], ["NY", "MA"]]
 ```
 
-(`marvin.beta.extract_async.map` is also available for async environments.)
+(`marvin.extract_async.map` is also available for async environments.)
 
 Mapping automatically issues parallel requests to the API, making it a highly efficient way to work with multiple inputs at once. The result is a list of outputs in the same order as the inputs.
````
docs/docs/vision/transformation.md (+10 −15)

````diff
@@ -2,10 +2,6 @@
 
 Marvin can use OpenAI's vision API to process images and convert them into structured data, transforming unstructured information into native types that are appropriate for a variety of programmatic use cases.
 
-The `marvin.beta.cast` function is an enhanced version of `marvin.cast` that accepts images as well as text.
-
-!!! tip "Beta"
-    Please note that vision support in Marvin is still in beta, as OpenAI has not finalized the vision API yet. While it works as expected, it is subject to change.
 
 <div class="admonition abstract">
   <p class="admonition-title">What it does</p>
````
````diff
@@ -41,10 +37,10 @@ The `marvin.beta.cast` function is an enhanced version of `marvin.cast` that acc
     state: str = Field(description="2-letter state abbreviation")
     instructions=f"Did I forget anything on my list: {shopping_list}?",
 )
@@ -113,15 +109,14 @@ If the target type isn't self-documenting, or you want to provide additional gui
 ```python
 assert missing_items == ["eggs", "oranges"]
 ```
-
 ## Model parameters
-You can pass parameters to the underlying API via the `model_kwargs` and `vision_model_kwargs` arguments of `cast`. These parameters are passed directly to the respective APIs, so you can use any supported parameter.
+You can pass parameters to the underlying API via the `model_kwargs` argument of `cast`. These parameters are passed directly to the API, so you can use any supported parameter.
 
 ## Async support
 If you are using `marvin` in an async environment, you can use `cast_async`:
 
 ```python
-result = await marvin.beta.cast_async("one", int)
+result = await marvin.cast_async("one", int)
 
 assert result == 1
 ```
````
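The hunks above show only fragments of the doc's `Location` example, so here is a hedged sketch of casting an image to structured data; the `city` field and the image path are assumptions:

```python
import marvin
from pydantic import BaseModel, Field

class Location(BaseModel):
    city: str  # assumed field; only `state` appears in the hunk above
    state: str = Field(description="2-letter state abbreviation")

# With vision support out of beta, `cast` accepts images directly.
img = marvin.Image.from_path('/path/to/postcard.png')  # hypothetical path
location = marvin.cast(img, target=Location)
```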
````diff
@@ -135,10 +130,10 @@ inputs = [
     "I bought two donuts.",
     "I bought six hot dogs."
 ]
-result = marvin.beta.cast.map(inputs, int)
+result = marvin.cast.map(inputs, int)
 assert result == [2, 6]
 ```
 
-(`marvin.beta.cast_async.map` is also available for async environments.)
+(`marvin.cast_async.map` is also available for async environments.)
 
 Mapping automatically issues parallel requests to the API, making it a highly efficient way to work with multiple inputs at once. The result is a list of outputs in the same order as the inputs.
````