Image Captioning using Finetuned PaliGemma2 #112

Closed
@sitamgithub-MSIT

Description of the feature request:

PaliGemma2 was released recently, and we already have a dedicated folder for it. Both notebooks in that folder cover fine-tuning, using JAX and Keras. An inference notebook for PaliGemma2 is still missing. I propose adding one that uses Keras with the newly released fine-tuned checkpoint for the image captioning task.

What problem are you trying to solve with this feature?

The PaliGemma2 release includes a checkpoint fine-tuned on the DOCCI dataset, which makes it a good fit for demonstrating image captioning. The notebook could later be extended to multilingual use cases as well.

Any other information you'd like to share?

The notebook will run the 3B PaliGemma2 model in bfloat16 so that it can run on a Colab T4 GPU, using multi-backend Keras and KerasHub.
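A rough sketch of what the core inference cell might look like, to make the proposal concrete. It is not the final notebook code. The preset name `pali_gemma2_ft_docci_3b_448` and the `caption <lang>` prompt prefix are assumptions that should be checked against the KerasHub preset listing and the PaliGemma2 model card. The helpers `build_caption_prompt`, `to_rgb_array`, and `caption_image` are hypothetical names used only for illustration.

```python
import numpy as np

# Assumed preset name for the 3B DOCCI fine-tuned checkpoint; verify it
# against the KerasHub preset listing before relying on it.
PRESET = "pali_gemma2_ft_docci_3b_448"


def build_caption_prompt(lang="en"):
    # Assumed PaliGemma-style task prefix, terminated by a newline.
    return f"caption {lang}\n"


def to_rgb_array(image):
    # Accept an HxWx3 or HxWx4 array-like and return HxWx3 (alpha dropped).
    arr = np.asarray(image)
    if arr.ndim != 3 or arr.shape[-1] not in (3, 4):
        raise ValueError(f"expected an HxWx3 or HxWx4 image, got shape {arr.shape}")
    return arr[..., :3]


def caption_image(image, lang="en", max_length=128):
    # Imported lazily so the helpers above work without Keras installed.
    import keras
    import keras_hub

    # Set the default float type before the model is built so that the
    # weights are created in bfloat16.
    keras.config.set_floatx("bfloat16")
    model = keras_hub.models.PaliGemmaCausalLM.from_preset(PRESET)
    return model.generate(
        {"images": to_rgb_array(image), "prompts": build_caption_prompt(lang)},
        max_length=max_length,
    )
```

Calling `caption_image(img)` downloads the checkpoint on first use, so it needs network access and accepted model terms. Passing a different `lang` changes only the prompt prefix; how well the DOCCI checkpoint handles non-English prefixes is untested here.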

cc: @windmaple
