<!--
  SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  SPDX-License-Identifier: Apache-2.0
-->
# Multimodal Data with NeMo Guardrails

## About Working with Multimodal Data

The NeMo Guardrails toolkit supports adding safety checks to multimodal content: images and text.
The support is for input and output guardrails only.
Depending on the image reasoning model, you can specify the image to check as base64-encoded data or as a URL.

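As a minimal sketch of the two forms, the image part of a message can reference either a public URL or inline base64 data. The OpenAI-style `image_url` content type shown here matches the message format used later in this page; the URL and image bytes are placeholders, not files from the shipped examples.

```python
import base64

# Placeholder bytes stand in for a real image file.
image_bytes = b"\xff\xd8\xff\xe0fake-jpeg"
encoded = base64.b64encode(image_bytes).decode()

# Form 1: reference the image by URL, for models that fetch URLs.
url_part = {
    "type": "image_url",
    "image_url": {"url": "https://example.com/sample.jpg"},
}

# Form 2: embed the image inline as a base64 data URI.
inline_part = {
    "type": "image_url",
    "image_url": {"url": f"data:image/jpeg;base64,{encoded}"},
}
```

Either part is then placed in the `content` list of a user message alongside a text part, as the full example at the end of this page shows.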
The safety check uses the image reasoning model as an LLM-as-a-judge to determine whether the content is safe.
The OpenAI, Llama Vision, and Llama Guard models can accept multimodal input and act as a judge model.

You must ensure that the image size and prompt length do not exceed the maximum context length of the model.

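Base64 encoding inflates an image by roughly one third, which matters when budgeting an encoded image against the model's context length. The following generic sketch (not part of the toolkit) estimates the encoded size from the raw file size; how many tokens that payload consumes still depends on the specific model.

```python
import base64
import math

def base64_length(num_bytes: int) -> int:
    # Base64 encodes every 3 input bytes as 4 output characters,
    # padding the final group, so the encoded size is 4 * ceil(n / 3).
    return 4 * math.ceil(num_bytes / 3)

payload = b"x" * 100_000  # stand-in for a ~100 KB image file
encoded = base64.b64encode(payload)
assert len(encoded) == base64_length(len(payload))
print(len(encoded))  # 133336, about a 33% increase over the raw bytes
```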
## Sample Configuration

1. Create a directory, such as `configs/content_safety_vision`, and add a `config.yml` file with the following content:

   ```{literalinclude} ../../examples/configs/content_safety_vision/config.yml
   :language: yaml
   ```

1. Add a `configs/content_safety_vision/prompts.yml` file with the following content:

   ```{literalinclude} ../../examples/configs/content_safety_vision/prompts.yml
   :language: yaml
   ```

## Example

The following sample code uses the preceding configuration and sends requests to OpenAI endpoints.
The sample image is of a handgun.

1. Set the OpenAI environment variable with your API key:

   ```console
   export OPENAI_API_KEY=<api-key>
   ```

1. Import the required libraries:

   ```{literalinclude} ../../examples/configs/content_safety_vision/demo.py
   :language: python
   :start-after: "# start-prerequisites"
   :end-before: "# end-prerequisites"
   ```

1. Load the vision content safety configuration:

   ```{literalinclude} ../../examples/configs/content_safety_vision/demo.py
   :language: python
   :start-after: "# start-config"
   :end-before: "# end-config"
   ```

1. Send an image reasoning request:

   ```{literalinclude} ../../examples/configs/content_safety_vision/demo.py
   :language: python
   :start-after: "# start-image-reasoning"
   :end-before: "# end-image-reasoning"
   ```

1. Send a potentially unsafe request:

   ```{literalinclude} ../../examples/configs/content_safety_vision/demo.py
   :language: python
   :start-after: "# start-potentially-unsafe"
   :end-before: "# end-potentially-unsafe"
   ```

## Tips for Base64-Encoded Images

Some models, such as the Llama Vision models, do not read an image from a URL.
For these models, encode the image in base64 and provide the encoded image to the model.

The following code sample shows the common Python statements:

```{code-block} python
:emphasize-lines: 11, 23

import base64
import json

from nemoguardrails import RailsConfig
from nemoguardrails.rails.llm.llmrails import LLMRails

config = RailsConfig.from_path("./content_safety_vision")
rails = LLMRails(config)

with open("<path-to-image>", "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode()

messages = [{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "what is the surface color that the object is placed on?",
        },
        {
            "type": "image_url",
            "image_url": {
                "url": f"data:image/jpeg;base64,{base64_image}"
            },
        },
    ],
}]

response = rails.generate(messages=messages)
print(json.dumps(response, indent=2))
```
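The sample above hard-codes `image/jpeg` in the data URI. When the image format can vary, the MIME type can be derived from the file name instead. This is a generic standard-library sketch, not a toolkit API; the `to_data_uri` helper name and the file names are illustrative.

```python
import base64
import mimetypes

def to_data_uri(path: str, data: bytes) -> str:
    # Guess the MIME type from the file extension; fall back to JPEG.
    mime, _ = mimetypes.guess_type(path)
    uri_type = mime or "image/jpeg"
    encoded = base64.b64encode(data).decode()
    return f"data:{uri_type};base64,{encoded}"

# A PNG path yields a data:image/png;base64,... URI.
print(to_data_uri("sample.png", b"\x89PNG")[:22])  # data:image/png;base64,
```

The resulting string can be used directly as the `url` value of an `image_url` content part.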