Multi modal eval fix (#38134)

w-javed · azure-sdk · weshaggard · web-flow · commit 3046e7a374ea · 2024-10-28T23:41:31.000Z
* Initial-Commit-multimodal * Fix * Sync eng/common directory with azure-sdk-tools for PR 9092 (#37713) * Export the subscription data from the service connection * Update deploy-test-resources.yml --------- Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com> Co-authored-by: Wes Haggard <weshaggard@users.noreply.github.com> * Removing private parameter from __call__ of AdversarialSimulator (#37709) * Update task_query_response.prompty remove required keys * Update task_simulate.prompty * Update task_query_response.prompty * Update task_simulate.prompty * Remove private variable and use kwargs * Add experimental tag to adv sim --------- Co-authored-by: Nagkumar Arkalgud <nagkumar@naarkalgworkmac.lan> * Enabling option to disable response payload on writes (#37365) * Initial draft * Adding tests * Renaming parameter * Update container.py * Renaming test file * Fixing LINT issues * Update container.py * Update _base.py * Update _base.py * Fixing tests * Fixing tests * Adding support to disable response payload on write for AIO * Update CHANGELOG.md * Update _cosmos_client.py * Reacting to code review comments * Addressing code review feedback * Addressed CR feedback * Fixing pyLint errors * Fixing pylint errors * Update test_crud.py * Fixing svc regression * Update sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py Co-authored-by: Anna Tisch <antisch@microsoft.com> * Reacting to code review feedback. 
* Update container.py * Update test_query_vector_similarity.py --------- Co-authored-by: Anna Tisch <antisch@microsoft.com> * deprecate azure_germany (#37654) * deprecate azure_germany * update * update * Update sdk/identity/azure-identity/azure/identity/_constants.py Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> * update --------- Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> * Add default impl to handle token challenges (#37652) * Add default impl to handle token challenges * update version * update * update * update * update * Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> * Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> * update * Update sdk/core/azure-core/tests/test_utils.py Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> * Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> * update --------- Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> * Make Credentials Required for Content Safety and Protected Materials Evaluators (#37707) * Make Credentials Required for Content Safety Evaluators * fix a typo * lint, fix content safety evaluator * revert test change * remove credential from rai_service * addFeedRangesAndUseFeedRangeInQueryChangeFeed (#37687) * Add getFeedRanges API * Add feedRange support in query changeFeed Co-authored-by: annie-mac <xinlian@microsoft.com> * Update release date for core (#37723) * Improvements to mindependency dev_requirement conflict resolution (#37669) * during mindependency runs, dev_requirements on local relative paths are now checked for conflict with the targeted set of minimum dependencies * multiple type clarifications within azure-sdk-tools * added tests for new conflict resolution logic --------- Co-authored-by: McCoy Patiño <39780829+mccoyp@users.noreply.github.com> * Need 
to add environment to subscription configuration (#37726) Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com> * Enable samples for formrecognizer (#37676) * multi-modal-changes * fixes * Fix with latest * dict-fix * adding-protected-material * adding-protected-material * adding-protected-material * bumping-version * adding assets * Added image in simulator * Added image in simulator * bumping-version * push-asset * assets * pushing asset * remove-containt-on-key * asset * asset2 * asset3 * asset4 * adding conftest * conftest * cred fix * asset-new * fix * asset * adding multi-modal-without-tests * asset-from-main * asset-from-main * fix * adding one test only * new asset * tests,fix: Sanitizer should replace with enum value not enum name * test-asset * [AutoRelease] t2-containerservicefleet-2024-09-24-42036(can only be merged by SDK owner) (#37538) * code and test * Update CHANGELOG.md * update-testcase --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <119990644+ChenxiJiang333@users.noreply.github.com> Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com> * [AutoRelease] t2-dns-2024-09-25-81486(can only be merged by SDK owner) (#37560) * code and test * update-testcase * Update CHANGELOG.md * Update test_mgmt_dns_test.py --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com> Co-authored-by: ChenxiJiang333 <119990644+ChenxiJiang333@users.noreply.github.com> * [AutoRelease] t2-appconfiguration-2024-10-09-68726(can only be merged by SDK owner) (#37800) * code and test * update-testcase * Update pyproject.toml --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com> Co-authored-by: Yuchao Yan <yuchaoyan@microsoft.com> * code and test (#37855) Co-authored-by: azure-sdk <PythonSdkPipelines> * [AutoRelease] t2-servicefabricmanagedclusters-2024-10-08-57405(can only be merged by SDK owner) (#37768) * code and 
test * update-testcase * update-testcases --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com> * [AutoRelease] t2-containerinstance-2024-10-21-66631(can only be merged by SDK owner) (#38005) * code and test * update-testcase * Update CHANGELOG.md * Update CHANGELOG.md * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com> Co-authored-by: ChenxiJiang333 <119990644+ChenxiJiang333@users.noreply.github.com> * [sdk generation pipeline] bump typespec-python 0.36.1 (#38008) * update version * update package.json * [AutoRelease] t2-dnsresolver-2024-10-12-16936(can only be merged by SDK owner) (#37864) * code and test * update-testcase * Update CHANGELOG.md * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com> Co-authored-by: ChenxiJiang333 <119990644+ChenxiJiang333@users.noreply.github.com> Co-authored-by: Yuchao Yan <yuchaoyan@microsoft.com> * new asset after fix in conftest * asset * chore: Update assets.json * Move perf pipelines to TME subscription (#38020) Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com> * fix * after-comments * fix * asset * new asset with 1 test recording only * chore: Update assets.json * conftest fix * assets change * new test * few changes * removing proxy start * added all tests * asset * fixes * fixes with asset * asset-after-tax * enabling 2 more tests * unit test fix * asset * new asset * fixes per comments * changes by black * merge fix * pylint fix * pylint fix * ground test fix * fixes - pylint, black, mypy * more tests * docstring fixes * doc string fix * asset * few updates after Nagkumar review * fix * fix mypy --------- Co-authored-by: Azure SDK Bot <53356347+azure-sdk@users.noreply.github.com> Co-authored-by: Wes Haggard <Wes.Haggard@microsoft.com> Co-authored-by: Wes Haggard 
<weshaggard@users.noreply.github.com> Co-authored-by: Nagkumar Arkalgud <nagkumar91@users.noreply.github.com> Co-authored-by: Nagkumar Arkalgud <nagkumar@naarkalgworkmac.lan> Co-authored-by: Fabian Meiswinkel <fabianm@microsoft.com> Co-authored-by: Anna Tisch <antisch@microsoft.com> Co-authored-by: Xiang Yan <xiangsjtu@gmail.com> Co-authored-by: Paul Van Eck <paulvaneck@microsoft.com> Co-authored-by: Neehar Duvvuri <40341266+needuv@users.noreply.github.com> Co-authored-by: Annie Liang <64233642+xinlian12@users.noreply.github.com> Co-authored-by: annie-mac <xinlian@microsoft.com> Co-authored-by: Scott Beddall <45376673+scbedd@users.noreply.github.com> Co-authored-by: McCoy Patiño <39780829+mccoyp@users.noreply.github.com> Co-authored-by: kdestin <101366538+kdestin@users.noreply.github.com> Co-authored-by: ChenxiJiang333 <119990644+ChenxiJiang333@users.noreply.github.com> Co-authored-by: ChenxiJiang333 <v-chenjiang@microsoft.com> Co-authored-by: Yuchao Yan <yuchaoyan@microsoft.com>
diff --git a/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_utils.py b/sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_utils.py
@@ -90,24 +90,25 @@ def _store_multimodal_content(messages, tmpdir: str):
 
     # traverse all messages and replace base64 image data with new file name.
     for message in messages:
-        for content in message.get("content", []):
-            if content.get("type") == "image_url":
-                image_url = content.get("image_url")
-                if image_url and "url" in image_url and image_url["url"].startswith("data:image/jpg;base64,"):
-                    # Extract the base64 string
-                    base64image = image_url["url"].replace("data:image/jpg;base64,", "")
-
-                    # Generate a unique filename
-                    image_file_name = f"{str(uuid.uuid4())}.jpg"
-                    image_url["url"] = f"images/{image_file_name}"  # Replace the base64 URL with the file path
-
-                    # Decode the base64 string to binary image data
-                    image_data_binary = base64.b64decode(base64image)
-
-                    # Write the binary image data to the file
-                    image_file_path = os.path.join(images_folder_path, image_file_name)
-                    with open(image_file_path, "wb") as f:
-                        f.write(image_data_binary)
+        if isinstance(message.get("content", []), list):
+            for content in message.get("content", []):
+                if content.get("type") == "image_url":
+                    image_url = content.get("image_url")
+                    if image_url and "url" in image_url and image_url["url"].startswith("data:image/jpg;base64,"):
+                        # Extract the base64 string
+                        base64image = image_url["url"].replace("data:image/jpg;base64,", "")
+
+                        # Generate a unique filename
+                        image_file_name = f"{str(uuid.uuid4())}.jpg"
+                        image_url["url"] = f"images/{image_file_name}"  # Replace the base64 URL with the file path
+
+                        # Decode the base64 string to binary image data
+                        image_data_binary = base64.b64decode(base64image)
+
+                        # Write the binary image data to the file
+                        image_file_path = os.path.join(images_folder_path, image_file_name)
+                        with open(image_file_path, "wb") as f:
+                            f.write(image_data_binary)
 
 
 def _log_metrics_and_instance_results(
diff --git a/sdk/evaluation/azure-ai-evaluation/tests/unittests/data/generated_conv_image_urls.jsonl b/sdk/evaluation/azure-ai-evaluation/tests/unittests/data/generated_conv_image_urls.jsonl
@@ -0,0 +1,2 @@
+{"messages":[{"role":"system","content":[{"type":"text","text":"This is a nature boardwalk at the University of Wisconsin-Madison."}]},{"role":"user","content":[{"type":"text","text":"Can you describe this image?"},{"type":"image_url","image_url":{"url":"https://cdn.britannica.com/68/178268-050-5B4E7FB6/Tom-Cruise-2013.jpg"}}]}]}
+{"messages":[{"role":"system","content":[{"type":"text","text":"This is a nature boardwalk at the University of Wisconsin-Madison."}]},{"role":"user","content":[{"type":"text","text":"Can you describe this image?"},{"type":"image_url","image_url":{"url":"https://cdn.britannica.com/68/178268-050-5B4E7FB6/Tom-Cruise-2013.jpg"}}]}]}
diff --git a/sdk/evaluation/azure-ai-evaluation/tests/unittests/data/generated_conv_images.jsonl b/sdk/evaluation/azure-ai-evaluation/tests/unittests/data/generated_conv_images.jsonl
diff --git a/sdk/evaluation/azure-ai-evaluation/tests/unittests/test_eval_run.py b/sdk/evaluation/azure-ai-evaluation/tests/unittests/test_eval_run.py
@@ -4,8 +4,10 @@
 import time
 from unittest.mock import MagicMock, patch
 from uuid import uuid4
-
+import tempfile
 import jwt
+import pandas as pd
+import pathlib
 import pytest
 from promptflow.azure._utils._token_cache import ArmTokenCache
 
@@ -303,6 +305,30 @@ def test_wrong_artifact_path(
             assert len(caplog.records) == 1
             assert expected_error in caplog.records[0].message
 
+    def test_store_multi_modal_no_images(self, token_mock, caplog):
+        data_path = os.path.join(pathlib.Path(__file__).parent.resolve(), "data")
+        data_file = os.path.join(data_path, "generated_qa_chat_conv.jsonl")
+        data_convo = pd.read_json(data_file, lines=True)
+        with tempfile.TemporaryDirectory() as tmpdir:
+            for value in data_convo["messages"]:
+                ev_utils._store_multimodal_content(value, tmpdir)
+
+    def test_store_multi_modal_image_urls(self, token_mock, caplog):
+        data_path = os.path.join(pathlib.Path(__file__).parent.resolve(), "data")
+        data_file = os.path.join(data_path, "generated_conv_image_urls.jsonl")
+        data_convo = pd.read_json(data_file, lines=True)
+        with tempfile.TemporaryDirectory() as tmpdir:
+            for value in data_convo["messages"]:
+                ev_utils._store_multimodal_content(value, tmpdir)
+
+    def test_store_multi_modal_images(self, token_mock, caplog):
+        data_path = os.path.join(pathlib.Path(__file__).parent.resolve(), "data")
+        data_file = os.path.join(data_path, "generated_conv_images.jsonl")
+        data_convo = pd.read_json(data_file, lines=True)
+        with tempfile.TemporaryDirectory() as tmpdir:
+            for value in data_convo["messages"]:
+                ev_utils._store_multimodal_content(value, tmpdir)
+
     def test_log_metrics_and_instance_results_logs_error(self, token_mock, caplog):
         """Test that we are logging the error when there is no trace destination."""
         logger = logging.getLogger(ev_utils.__name__)
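The change to `_store_multimodal_content` in `_utils.py` can be read as a guard against messages whose `content` is a plain string rather than a list of parts. Before the change, iterating a string would yield single characters, and calling `.get` on a character raises `AttributeError`. The following is a minimal standalone sketch of the guarded traversal, not the SDK's code: the function name `store_multimodal_content_sketch`, the `DATA_URL_PREFIX` constant, the creation of the `images` folder (that part of the real function lies outside the hunk shown), and the sample messages are all assumptions made for illustration.

```python
import base64
import os
import tempfile
import uuid

# Prefix matched by the diff above; only this exact form is rewritten.
DATA_URL_PREFIX = "data:image/jpg;base64,"


def store_multimodal_content_sketch(messages, tmpdir):
    """Hypothetical stand-in for _store_multimodal_content (illustration only)."""
    # Assumption: the real function prepares an images folder before the loop.
    images_folder_path = os.path.join(tmpdir, "images")
    os.makedirs(images_folder_path, exist_ok=True)

    for message in messages:
        content = message.get("content", [])
        # The guard added by the fix: skip string (text-only) content.
        if not isinstance(content, list):
            continue
        for part in content:
            if part.get("type") != "image_url":
                continue
            image_url = part.get("image_url")
            if image_url and "url" in image_url and image_url["url"].startswith(DATA_URL_PREFIX):
                # Strip the prefix to get the base64 payload.
                encoded = image_url["url"][len(DATA_URL_PREFIX):]
                image_file_name = f"{uuid.uuid4()}.jpg"
                # Replace the inline data with a relative file path.
                image_url["url"] = f"images/{image_file_name}"
                with open(os.path.join(images_folder_path, image_file_name), "wb") as f:
                    f.write(base64.b64decode(encoded))


# Made-up sample covering the three cases the new unit tests exercise:
# text-only content, a remote image URL, and an inline base64 image.
messages = [
    {"role": "user", "content": "plain text question"},
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "https://example.com/a.jpg"}},
    ]},
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {
            "url": DATA_URL_PREFIX + base64.b64encode(b"fake-bytes").decode("ascii"),
        }},
    ]},
]

with tempfile.TemporaryDirectory() as tmpdir:
    store_multimodal_content_sketch(messages, tmpdir)
    saved_files = os.listdir(os.path.join(tmpdir, "images"))
    with open(os.path.join(tmpdir, "images", saved_files[0]), "rb") as f:
        saved_bytes = f.read()
```

In this sketch the string message and the remote URL are left untouched, and only the inline base64 image is written to disk and replaced by an `images/<uuid>.jpg` path. Unlike the tests added in this commit, which only check that the call does not raise, a sketch like this makes it easy to assert on the rewritten URL and the saved bytes.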
