Commit 3046e7a

Authored by w-javed, azure-sdk, weshaggard, nagkumar91, and Nagkumar Arkalgud
Multi modal eval fix (#38134)
* Initial-Commit-multimodal * Fix * Sync eng/common directory with azure-sdk-tools for PR 9092 (#37713) * Export the subscription data from the service connection * Update deploy-test-resources.yml --------- Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Wes Haggard <[email protected]> * Removing private parameter from __call__ of AdversarialSimulator (#37709) * Update task_query_response.prompty remove required keys * Update task_simulate.prompty * Update task_query_response.prompty * Update task_simulate.prompty * Remove private variable and use kwargs * Add experimental tag to adv sim --------- Co-authored-by: Nagkumar Arkalgud <[email protected]> * Enabling option to disable response payload on writes (#37365) * Initial draft * Adding tests * Renaming parameter * Update container.py * Renaming test file * Fixing LINT issues * Update container.py * Update _base.py * Update _base.py * Fixing tests * Fixing tests * Adding support to disable response payload on write for AIO * Update CHANGELOG.md * Update _cosmos_client.py * Reacting to code review comments * Addressing code review feedback * Addressed CR feedback * Fixing pyLint errors * Fixing pylint errors * Update test_crud.py * Fixing svc regression * Update sdk/cosmos/azure-cosmos/azure/cosmos/aio/_container.py Co-authored-by: Anna Tisch <[email protected]> * Reacting to code review feedback. 
* Update container.py * Update test_query_vector_similarity.py --------- Co-authored-by: Anna Tisch <[email protected]> * deprecate azure_germany (#37654) * deprecate azure_germany * update * update * Update sdk/identity/azure-identity/azure/identity/_constants.py Co-authored-by: Paul Van Eck <[email protected]> * update --------- Co-authored-by: Paul Van Eck <[email protected]> * Add default impl to handle token challenges (#37652) * Add default impl to handle token challenges * update version * update * update * update * update * Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py Co-authored-by: Paul Van Eck <[email protected]> * Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py Co-authored-by: Paul Van Eck <[email protected]> * update * Update sdk/core/azure-core/tests/test_utils.py Co-authored-by: Paul Van Eck <[email protected]> * Update sdk/core/azure-core/azure/core/pipeline/policies/_utils.py Co-authored-by: Paul Van Eck <[email protected]> * update --------- Co-authored-by: Paul Van Eck <[email protected]> * Make Credentials Required for Content Safety and Protected Materials Evaluators (#37707) * Make Credentials Required for Content Safety Evaluators * fix a typo * lint, fix content safety evaluator * revert test change * remove credential from rai_service * addFeedRangesAndUseFeedRangeInQueryChangeFeed (#37687) * Add getFeedRanges API * Add feedRange support in query changeFeed Co-authored-by: annie-mac <[email protected]> * Update release date for core (#37723) * Improvements to mindependency dev_requirement conflict resolution (#37669) * during mindependency runs, dev_requirements on local relative paths are now checked for conflict with the targeted set of minimum dependencies * multiple type clarifications within azure-sdk-tools * added tests for new conflict resolution logic --------- Co-authored-by: McCoy Patiño <[email protected]> * Need to add environment to subscription configuration (#37726) Co-authored-by: Wes 
Haggard <[email protected]> * Enable samples for formrecognizer (#37676) * multi-modal-changes * fixes * Fix with latest * dict-fix * adding-protected-material * adding-protected-material * adding-protected-material * bumping-version * adding assets * Added image in simulator * Added image in simulator * bumping-version * push-asset * assets * pushing asset * remove-containt-on-key * asset * asset2 * asset3 * asset4 * adding conftest * conftest * cred fix * asset-new * fix * asset * adding multi-modal-without-tests * asset-from-main * asset-from-main * fix * adding one test only * new asset * tests,fix: Sanitizer should replace with enum value not enum name * test-asset * [AutoRelease] t2-containerservicefleet-2024-09-24-42036(can only be merged by SDK owner) (#37538) * code and test * Update CHANGELOG.md * update-testcase --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] t2-dns-2024-09-25-81486(can only be merged by SDK owner) (#37560) * code and test * update-testcase * Update CHANGELOG.md * Update test_mgmt_dns_test.py --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] t2-appconfiguration-2024-10-09-68726(can only be merged by SDK owner) (#37800) * code and test * update-testcase * Update pyproject.toml --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> * code and test (#37855) Co-authored-by: azure-sdk <PythonSdkPipelines> * [AutoRelease] t2-servicefabricmanagedclusters-2024-10-08-57405(can only be merged by SDK owner) (#37768) * code and test * update-testcase * update-testcases --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] 
t2-containerinstance-2024-10-21-66631(can only be merged by SDK owner) (#38005) * code and test * update-testcase * Update CHANGELOG.md * Update CHANGELOG.md * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * [sdk generation pipeline] bump typespec-python 0.36.1 (#38008) * update version * update package.json * [AutoRelease] t2-dnsresolver-2024-10-12-16936(can only be merged by SDK owner) (#37864) * code and test * update-testcase * Update CHANGELOG.md * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> * new asset after fix in conftest * asset * chore: Update assets.json * Move perf pipelines to TME subscription (#38020) Co-authored-by: Wes Haggard <[email protected]> * fix * after-comments * fix * asset * new asset with 1 test recording only * chore: Update assets.json * conftest fix * assets change * new test * few changes * removing proxy start * added all tests * asset * fixes * fixes with asset * asset-after-tax * enabling 2 more tests * unit test fix * asset * new asset * fixes per comments * changes by black * merge fix * pylint fix * pylint fix * ground test fix * fixes - pylint, black, mypy * more tests * docstring fixes * doc string fix * asset * few updates after Nagkumar review * fix * fix mypy --------- Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Nagkumar Arkalgud <[email protected]> Co-authored-by: Nagkumar Arkalgud <[email protected]> Co-authored-by: Fabian Meiswinkel <[email protected]> Co-authored-by: Anna Tisch <[email protected]> Co-authored-by: Xiang Yan <[email protected]> Co-authored-by: Paul Van Eck <[email protected]> 
Co-authored-by: Neehar Duvvuri <[email protected]> Co-authored-by: Annie Liang <[email protected]> Co-authored-by: annie-mac <[email protected]> Co-authored-by: Scott Beddall <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: kdestin <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Yuchao Yan <[email protected]>
1 parent 9592bf7 commit 3046e7a

4 files changed: +49 -19 lines changed

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_utils.py

+19 -18
@@ -90,24 +90,25 @@ def _store_multimodal_content(messages, tmpdir: str):
 
     # traverse all messages and replace base64 image data with new file name.
     for message in messages:
-        for content in message.get("content", []):
-            if content.get("type") == "image_url":
-                image_url = content.get("image_url")
-                if image_url and "url" in image_url and image_url["url"].startswith("data:image/jpg;base64,"):
-                    # Extract the base64 string
-                    base64image = image_url["url"].replace("data:image/jpg;base64,", "")
-
-                    # Generate a unique filename
-                    image_file_name = f"{str(uuid.uuid4())}.jpg"
-                    image_url["url"] = f"images/{image_file_name}"  # Replace the base64 URL with the file path
-
-                    # Decode the base64 string to binary image data
-                    image_data_binary = base64.b64decode(base64image)
-
-                    # Write the binary image data to the file
-                    image_file_path = os.path.join(images_folder_path, image_file_name)
-                    with open(image_file_path, "wb") as f:
-                        f.write(image_data_binary)
+        if isinstance(message.get("content", []), list):
+            for content in message.get("content", []):
+                if content.get("type") == "image_url":
+                    image_url = content.get("image_url")
+                    if image_url and "url" in image_url and image_url["url"].startswith("data:image/jpg;base64,"):
+                        # Extract the base64 string
+                        base64image = image_url["url"].replace("data:image/jpg;base64,", "")
+
+                        # Generate a unique filename
+                        image_file_name = f"{str(uuid.uuid4())}.jpg"
+                        image_url["url"] = f"images/{image_file_name}"  # Replace the base64 URL with the file path
+
+                        # Decode the base64 string to binary image data
+                        image_data_binary = base64.b64decode(base64image)
+
+                        # Write the binary image data to the file
+                        image_file_path = os.path.join(images_folder_path, image_file_name)
+                        with open(image_file_path, "wb") as f:
+                            f.write(image_data_binary)
 
 
 def _log_metrics_and_instance_results(
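
Review note: the hunk above wraps the inner loop in an `isinstance(..., list)` check. A plausible reading is that it guards against messages whose `content` is a plain string rather than a list of parts. The sketch below reproduces that failure mode in isolation. The two helper functions and both sample messages are hypothetical stand-ins written for this note; they are not SDK code.

```python
# Hypothetical messages, written for illustration only.
text_only = {"role": "user", "content": "What is the capital of France?"}
multimodal = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Can you describe this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.jpg"}},
    ],
}


def image_parts_unguarded(message):
    """Pre-fix pattern: assumes every content value is a list of dicts."""
    return [c for c in message.get("content", []) if c.get("type") == "image_url"]


def image_parts_guarded(message):
    """Post-fix pattern: skip any content value that is not a list."""
    content = message.get("content", [])
    if not isinstance(content, list):
        return []
    return [c for c in content if c.get("type") == "image_url"]


try:
    image_parts_unguarded(text_only)
    unguarded_error = None
except AttributeError as exc:
    # Iterating a str yields single characters, and str has no .get().
    unguarded_error = exc

guarded_text = image_parts_guarded(text_only)
guarded_multimodal = image_parts_guarded(multimodal)
```

With string content the unguarded version raises `AttributeError`, while the guarded version returns an empty list and still finds the image part in the multimodal message.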
@@ -0,0 +1,2 @@
+{"messages":[{"role":"system","content":[{"type":"text","text":"This is a nature boardwalk at the University of Wisconsin-Madison."}]},{"role":"user","content":[{"type":"text","text":"Can you describe this image?"},{"type":"image_url","image_url":{"url":"https://cdn.britannica.com/68/178268-050-5B4E7FB6/Tom-Cruise-2013.jpg"}}]}]}
+{"messages":[{"role":"system","content":[{"type":"text","text":"This is a nature boardwalk at the University of Wisconsin-Madison."}]},{"role":"user","content":[{"type":"text","text":"Can you describe this image?"},{"type":"image_url","image_url":{"url":"https://cdn.britannica.com/68/178268-050-5B4E7FB6/Tom-Cruise-2013.jpg"}}]}]}
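
Review note: each record above stores `content` as a list of typed parts, and the image is referenced by an `https` URL rather than inline data. The sketch below, using a shortened hypothetical record and only the standard `json` module, shows how such a line can be walked and why none of its URLs would match the `data:image/jpg;base64,` prefix that `_store_multimodal_content` looks for.

```python
import json

# Shortened, hypothetical record with the same shape as the JSONL lines above.
line = (
    '{"messages":[{"role":"system","content":[{"type":"text","text":"System context."}]},'
    '{"role":"user","content":[{"type":"text","text":"Can you describe this image?"},'
    '{"type":"image_url","image_url":{"url":"https://example.com/photo.jpg"}}]}]}'
)

record = json.loads(line)

# Collect the URL of every image part across all messages.
urls = [
    part["image_url"]["url"]
    for message in record["messages"]
    for part in message["content"]
    if part.get("type") == "image_url"
]

# Only inline base64 JPEG data URLs are candidates for rewriting to files.
needs_rewrite = [u for u in urls if u.startswith("data:image/jpg;base64,")]
```

Here `urls` holds the single remote URL and `needs_rewrite` is empty, so a record of this shape should pass through unchanged.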

sdk/evaluation/azure-ai-evaluation/tests/unittests/data/generated_conv_images.jsonl

+1
Large diffs are not rendered by default.
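
Review note: the contents of this file are not shown, but its name and the `test_store_multi_modal_images` test suggest it carries inline base64 images. Assuming that, the sketch below mirrors the decode-and-write step from the `_utils.py` hunk as a standalone function. `store_inline_image`, `PREFIX`, and the sample bytes are hypothetical names and data for this note, and the sketch slices the prefix off where the SDK code uses `str.replace`.

```python
import base64
import os
import tempfile
import uuid

PREFIX = "data:image/jpg;base64,"


def store_inline_image(image_url: dict, images_dir: str) -> None:
    """Write an inline base64 image into images_dir and point the URL at the file."""
    url = image_url.get("url", "")
    if not url.startswith(PREFIX):
        return  # remote URLs and other schemes are left untouched
    file_name = f"{uuid.uuid4()}.jpg"
    with open(os.path.join(images_dir, file_name), "wb") as f:
        f.write(base64.b64decode(url[len(PREFIX):]))
    image_url["url"] = f"images/{file_name}"


# Placeholder bytes, not a real JPEG; enough to check the round trip.
fake_bytes = b"\xff\xd8\xff\xe0 not a real JPEG"
inline = {"url": PREFIX + base64.b64encode(fake_bytes).decode("ascii")}
remote = {"url": "https://example.com/photo.jpg"}

with tempfile.TemporaryDirectory() as tmpdir:
    store_inline_image(inline, tmpdir)
    store_inline_image(remote, tmpdir)
    written = os.listdir(tmpdir)
    with open(os.path.join(tmpdir, written[0]), "rb") as f:
        round_trip = f.read()
```

Only the inline image produces a file, its URL becomes a relative `images/<uuid>.jpg` path, and the bytes on disk match the decoded payload.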

sdk/evaluation/azure-ai-evaluation/tests/unittests/test_eval_run.py

+27 -1
@@ -4,8 +4,10 @@
 import time
 from unittest.mock import MagicMock, patch
 from uuid import uuid4
-
+import tempfile
 import jwt
+import pandas as pd
+import pathlib
 import pytest
 from promptflow.azure._utils._token_cache import ArmTokenCache
 
@@ -303,6 +305,30 @@ def test_wrong_artifact_path(
         assert len(caplog.records) == 1
         assert expected_error in caplog.records[0].message
 
+    def test_store_multi_modal_no_images(self, token_mock, caplog):
+        data_path = os.path.join(pathlib.Path(__file__).parent.resolve(), "data")
+        data_file = os.path.join(data_path, "generated_qa_chat_conv.jsonl")
+        data_convo = pd.read_json(data_file, lines=True)
+        with tempfile.TemporaryDirectory() as tmpdir:
+            for value in data_convo["messages"]:
+                ev_utils._store_multimodal_content(value, tmpdir)
+
+    def test_store_multi_modal_image_urls(self, token_mock, caplog):
+        data_path = os.path.join(pathlib.Path(__file__).parent.resolve(), "data")
+        data_file = os.path.join(data_path, "generated_conv_image_urls.jsonl")
+        data_convo = pd.read_json(data_file, lines=True)
+        with tempfile.TemporaryDirectory() as tmpdir:
+            for value in data_convo["messages"]:
+                ev_utils._store_multimodal_content(value, tmpdir)
+
+    def test_store_multi_modal_images(self, token_mock, caplog):
+        data_path = os.path.join(pathlib.Path(__file__).parent.resolve(), "data")
+        data_file = os.path.join(data_path, "generated_conv_images.jsonl")
+        data_convo = pd.read_json(data_file, lines=True)
+        with tempfile.TemporaryDirectory() as tmpdir:
+            for value in data_convo["messages"]:
+                ev_utils._store_multimodal_content(value, tmpdir)
+
     def test_log_metrics_and_instance_results_logs_error(self, token_mock, caplog):
         """Test that we are logging the error when there is no trace destination."""
         logger = logging.getLogger(ev_utils.__name__)
