
Automatic image blob creation doesn't handle RGBA images with JPEG. #160


Closed
FrostyTheSouthernSnowman opened this issue Jan 2, 2024 · 7 comments · Fixed by #374
Labels
component:python sdk (Issue/PR related to Python SDK), status:awaiting user response (Awaiting a response from the author), status:stale (Issue/PR will be closed automatically if there's no further activity), type:bug (Something isn't working)

Comments

@FrostyTheSouthernSnowman

Description of the bug:

Calling generate_content on a Gemini Pro Vision model with a PNG image fails with KeyError: 'RGBA', which in turn raises OSError: cannot write mode RGBA as JPEG. This seems to indicate that PNG is not supported, but according to the Gemini API docs, PNG is a supported MIME type. Note that the PNG example from that docs page doesn't seem to work either: it uses a contents kwarg to generate_content, but that argument doesn't exist. Modifying the code to use the right arguments gives the error google.api_core.exceptions.InvalidArgument: 400 Request contains an invalid argument.

Actual vs expected behavior:

The expected behavior is for this code:

screenshot = get_screen_data()

prompt = "What are your thoughts on this screenshot? I think"

response = model.generate_content(
    [prompt, screenshot], stream=True
)

response.resolve()

print(response.text)

to work successfully. The code was adapted from the "text from image and text" example in the quickstart. Instead, it raises the KeyError and OSError above. Changing the code to:

screenshot = get_screen_data()

screenshot_data = {
    'mime_type': 'image/png',
    'data': screenshot.tobytes()
}

prompt = "What are your thoughts on this screenshot? I think"

response = model.generate_content(
    [prompt, screenshot_data], stream=True
)

response.resolve()
print(response.text)

raises the 400 error described above. This code was adapted from the Gemini API Overview.
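
One possible reason for the 400 error (just a guess on my part) is that PIL's Image.tobytes() returns raw, unencoded pixel data, so the data field doesn't actually contain a PNG file matching the declared image/png MIME type. A minimal sketch of encoding the screenshot explicitly, assuming screenshot is a PIL Image:

import io

# Encode the PIL image as an actual PNG file in memory, not raw pixel data.
buffer = io.BytesIO()
screenshot.save(buffer, format='PNG')

screenshot_data = {
    'mime_type': 'image/png',
    'data': buffer.getvalue()
}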

Any other information you'd like to share?

#112 is related to this. Specifically, it deals with my second attempt at solving this problem. This issue is about the fact that generate_content doesn't handle PNG by default even though it is supposedly supported.

@FrostyTheSouthernSnowman added the component:python sdk and type:bug labels on Jan 2, 2024
@Andy963
Contributor

Andy963 commented Mar 14, 2024

It seems that the code in the Gemini API Overview is not correct:

model = genai.GenerativeModel('gemini-pro-vision')

cookie_picture = [{
    'mime_type': 'image/png',
    'data': Path('cookie.png').read_bytes()
}]
prompt = "Do these look store-bought or homemade?"

response = model.generate_content(
    model="gemini-pro-vision", # parameter model is no need here
    content=[prompt, cookie_picture]
)
print(response.text)
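
For reference, a minimal sketch of what a working call might look like, assuming the generate_content(contents, ...) signature shown in the traceback later in this thread (the model name goes to the GenerativeModel constructor, and the blob dict is passed directly as a content part):

import google.generativeai as genai
from pathlib import Path

# Assumes genai.configure(api_key=...) has already been called.
model = genai.GenerativeModel('gemini-pro-vision')

# A single blob dict as a content part, not wrapped in an extra list.
cookie_blob = {
    'mime_type': 'image/png',
    'data': Path('cookie.png').read_bytes()
}
prompt = "Do these look store-bought or homemade?"

response = model.generate_content([prompt, cookie_blob])
print(response.text)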

@FrostyTheSouthernSnowman
Author

Definitely seems to be the case

@MarkDaoust
Collaborator

In my tests PNG is working fine.

IDK what your screenshot = get_screen_data() function is.

Can you share a colab that reproduces the problem?

> it seems that the code in Gemini API Overview is not correct,

Thanks, I'm sending a fix for this.

@ya-stack

Hi, I am trying to read an image from an https: URL, but it doesn't seem to work; it shows the error below:
ChatGoogleGenerativeAIError: Invalid argument provided to Gemini: 400 Add an image to use models/gemini-pro-vision, or switch your model to a text model.
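
A sketch of one way to send an image fetched from an https: URL with this SDK directly (not ChatGoogleGenerativeAI), assuming requests is available and the URL and prompt here are placeholders: download the bytes first, since the API needs encoded image data rather than a URL.

import requests
import google.generativeai as genai

# Download the image bytes; the API needs encoded image data, not a URL.
image_bytes = requests.get('https://example.com/image.png').content

model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content([
    "Describe this image.",
    {'mime_type': 'image/png', 'data': image_bytes},
])
print(response.text)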

@FrostyTheSouthernSnowman
Author

> In my tests PNG is working fine.
>
> IDK what your screenshot = get_screen_data() function is.
>
> Can you share a colab that reproduces the problem?
>
> > it seems that the code in Gemini API Overview is not correct,
>
> Thanks, I'm sending a fix for this.

Here's the get_screen_data():

from PIL import ImageGrab

def get_screen_data():
    # primary_monitor_dimensions, draw_mouse, and save_screenshot are
    # defined elsewhere in my code.
    screen = ImageGrab.grab(bbox=(0, 0, *primary_monitor_dimensions))

    screen = draw_mouse(screen)

    # Downscale the screenshot to half size before sending it.
    screen = screen.resize((int(screen.size[0] / 2), int(screen.size[1] / 2)))

    if save_screenshot:
        screen.save('screen.png')

    return screen

ImageGrab comes from PIL.


github-actions bot commented Jun 2, 2024

Marking this issue as stale since it has been open for 14 days with no activity. This issue will be closed if no further activity occurs.

@github-actions bot added the status:stale label on Jun 2, 2024
@MarkDaoust changed the title from "Gemini Pro Vision generate_content doesn't handle PNG by default" to "Automatic image blob creation doesn't handle RGBA images with JPEG." on Jun 3, 2024
@MarkDaoust
Collaborator

This happens because the code that generates the bytes to send tries to create a JPEG file, but the image is RGBA.
Adding a .convert('RGB') before saving it fixes this.

In [13]: model = genai.GenerativeModel(model_name='gemini-pro-vision')

In [14]: model.generate_content([img2, "what's this"])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/Projects/venv3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:650, in _save(im, fp, filename)
    649 try:
--> 650     rawmode = RAWMODE[im.mode]
    651 except KeyError as e:

KeyError: 'RGBA'

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
Cell In[14], line 1
----> 1 model.generate_content([img2, "what's this"])

File ~/Projects/generative-ai-python/google/generativeai/generative_models.py:236, in GenerativeModel.generate_content(self, contents, generation_config, safety_settings, stream, tools, tool_config, request_options)
    233 if not contents:
    234     raise TypeError("contents must not be empty")
--> 236 request = self._prepare_request(
    237     contents=contents,
    238     generation_config=generation_config,
    239     safety_settings=safety_settings,
    240     tools=tools,
    241     tool_config=tool_config,
    242 )
    243 if self._client is None:
    244     self._client = client.get_default_generative_client()

File ~/Projects/generative-ai-python/google/generativeai/generative_models.py:139, in GenerativeModel._prepare_request(self, contents, generation_config, safety_settings, tools, tool_config)
    136 else:
    137     tool_config = content_types.to_tool_config(tool_config)
--> 139 contents = content_types.to_contents(contents)
    141 generation_config = generation_types.to_generation_config_dict(generation_config)
    142 merged_gc = self._generation_config.copy()

File ~/Projects/generative-ai-python/google/generativeai/types/content_types.py:293, in to_contents(contents)
    288     except TypeError:
    289         # If you get a TypeError here it's probably because that was a list
    290         # of parts, not a list of contents, so fall back to `to_content`.
    291         pass
--> 293 contents = [to_content(contents)]
    294 return contents

File ~/Projects/generative-ai-python/google/generativeai/types/content_types.py:256, in to_content(content)
    254     return content
    255 elif isinstance(content, Iterable) and not isinstance(content, str):
--> 256     return protos.Content(parts=[to_part(part) for part in content])
    257 else:
    258     # Maybe this is a Part?
    259     return protos.Content(parts=[to_part(content)])

File ~/Projects/generative-ai-python/google/generativeai/types/content_types.py:256, in <listcomp>(.0)
    254     return content
    255 elif isinstance(content, Iterable) and not isinstance(content, str):
--> 256     return protos.Content(parts=[to_part(part) for part in content])
    257 else:
    258     # Maybe this is a Part?
    259     return protos.Content(parts=[to_part(content)])

File ~/Projects/generative-ai-python/google/generativeai/types/content_types.py:224, in to_part(part)
    220     return protos.Part(function_response=part)
    222 else:
    223     # Maybe it can be turned into a blob?
--> 224     return protos.Part(inline_data=to_blob(part))

File ~/Projects/generative-ai-python/google/generativeai/types/content_types.py:164, in to_blob(blob)
    162     return blob
    163 elif isinstance(blob, IMAGE_TYPES):
--> 164     return image_to_blob(blob)
    165 else:
    166     if isinstance(blob, Mapping):

File ~/Projects/generative-ai-python/google/generativeai/types/content_types.py:89, in image_to_blob(image)
     87 if PIL is not None:
     88     if isinstance(image, PIL.Image.Image):
---> 89         return pil_to_blob(image)
     91 if IPython is not None:
     92     if isinstance(image, IPython.display.Image):

File ~/Projects/generative-ai-python/google/generativeai/types/content_types.py:79, in pil_to_blob(img)
     77     mime_type = "image/png"
     78 else:
---> 79     img.save(bytesio, format="JPEG")
     80     mime_type = "image/jpeg"
     81 bytesio.seek(0)

File ~/Projects/venv3/lib/python3.11/site-packages/PIL/Image.py:2439, in Image.save(self, fp, format, **params)
   2436         fp = builtins.open(filename, "w+b")
   2438 try:
-> 2439     save_handler(self, fp, filename)
   2440 except Exception:
   2441     if open_fp:

File ~/Projects/venv3/lib/python3.11/site-packages/PIL/JpegImagePlugin.py:653, in _save(im, fp, filename)
    651 except KeyError as e:
    652     msg = f"cannot write mode {im.mode} as JPEG"
--> 653     raise OSError(msg) from e
    655 info = im.encoderinfo
    657 dpi = [round(x) for x in info.get("dpi", (0, 0))]
