Use base64 format to improve performance of embeddings #211
Feel free to send a PR! Note that both the Python and JS SDKs are also generated, but we are able to add custom code (which will be preserved).
Results of Performance Measurements Using Java SDK
After conducting measurements with the Java SDK, the following results were observed:
Base64 (File Size: 84,390 bytes)
Float (File Size: 209,678 bytes)
Summary of Results:
I generated 10 random strings and measured both the latency and the file size when retrieving their embeddings in two formats: Base64 and Float (Java Double). There was no significant difference in latency across the 10 calls. However, the file sizes differed significantly: the Float output was approximately 2.5 times larger than the Base64 output. Based on these findings, changing the default embedding format could improve network bandwidth efficiency.
Proposed Implementation Plan:
It would be beneficial to retain the option to output results in Base64, as some developers may prefer to work with Base64 values for their specific use cases. This approach aims to balance flexibility for developers against network bandwidth by letting them select the most appropriate data format for their needs.
I used the following sample code for the evaluation.
package com.openai.example;
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import com.azure.identity.AuthenticationUtil;
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.openai.azure.AzureOpenAIServiceVersion;
import com.openai.azure.credential.AzureApiKeyCredential;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.credential.BearerTokenCredential;
import com.openai.credential.Credential;
import com.openai.models.ChatCompletionCreateParams;
import com.openai.models.ChatModel;
import com.openai.models.CreateEmbeddingResponse;
import com.openai.models.EmbeddingCreateParams;
import com.openai.services.async.EmbeddingServiceAsync;
import com.openai.services.async.EmbeddingServiceAsyncImpl;
import com.openai.services.blocking.EmbeddingService;
import com.openai.models.EmbeddingCreateParams.Builder;
import com.openai.models.EmbeddingCreateParams.EncodingFormat;
public final class EmbeddingsExample {
private EmbeddingsExample() {}
public static void main(String[] args) {
OpenAIClient client = OpenAIOkHttpClient.builder()
.baseUrl("https://AOAI-INSTANCE.openai.azure.com")
.credential(AzureApiKeyCredential.create(""))
.azureServiceVersion(AzureOpenAIServiceVersion.getV2024_02_15_PREVIEW()).build();
// Created Dummy data
List<String> poems = new ArrayList<>();
poems.add("In the quiet night, stars whisper secrets, dreams take flight.");
poems.add("Beneath the moon's glow, shadows dance, hearts begin to know.");
poems.add("Waves crash on the shore, time stands still, love forevermore.");
poems.add("Autumn leaves fall, painting the ground, nature's final call.");
poems.add("Morning dew glistens, a new day dawns, hope always listens.");
poems.add("Mountains stand tall, silent guardians, witnessing it all.");
poems.add("In a field of green, flowers bloom bright, a serene scene.");
poems.add("Winter's chill bites, fireside warmth, cozy, long nights.");
poems.add("Spring's gentle breeze, life awakens, hearts find ease.");
poems.add("Sunset hues blend, day meets night, a perfect end.");
EmbeddingService embeddings = client.embeddings();
try {
long startTime = System.nanoTime();
List<String> results = poems.stream().map(poem -> {
EmbeddingCreateParams embeddingCreateParams = EmbeddingCreateParams.builder()
.input(poem)
// .encodingFormat(EncodingFormat.BASE64)
.model("text-embedding-3-small")
.build();
CreateEmbeddingResponse createEmbeddingResponse =
embeddings.create(embeddingCreateParams);
return createEmbeddingResponse.toString();
}).collect(Collectors.toList());
Files.write(Paths.get("result-base64.txt"), results, StandardCharsets.UTF_8,
StandardOpenOption.CREATE, StandardOpenOption.WRITE);
long endTime = System.nanoTime();
long durationInMillis = (endTime - startTime) / 1_000_000;
System.out.println("Execution time: " + durationInMillis + " ms");
} catch (IOException e) {
e.printStackTrace();
}
}
}
This is the actual result data of the above program.
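The size gap reported above can be sanity-checked with simple arithmetic. The sketch below is illustrative only. It assumes that text-embedding-3-small returns 1536 dimensions, that the Base64 payload packs each value as a 4-byte float32, and that each value costs about 13 characters as JSON decimal text (for example `-0.030576246,`). `EmbeddingSizeEstimate` and its methods are hypothetical names, not part of the SDK.

```java
import java.util.Base64;

// Hypothetical helper for a back-of-the-envelope size comparison; not part of the SDK.
final class EmbeddingSizeEstimate {
    private EmbeddingSizeEstimate() {}

    /** Base64 characters (including padding) needed for `dims` float32 values. */
    static int base64Chars(int dims) {
        int rawBytes = dims * Float.BYTES; // 4 bytes per float32
        return 4 * ((rawBytes + 2) / 3);   // Base64 emits 4 characters per 3 input bytes
    }

    /** Rough size of the same vector as JSON decimal text; avgCharsPerValue is an assumption. */
    static int jsonTextChars(int dims, int avgCharsPerValue) {
        return dims * avgCharsPerValue;
    }

    public static void main(String[] args) {
        int dims = 1536; // assumed dimension count for text-embedding-3-small

        int base64 = base64Chars(dims);
        // Cross-check the formula against the JDK encoder on a zero-filled buffer.
        int encoded = Base64.getEncoder()
                .encodeToString(new byte[dims * Float.BYTES])
                .length();
        int text = jsonTextChars(dims, 13); // assumed ~13 characters per value

        System.out.println("Base64 chars (formula): " + base64);
        System.out.println("Base64 chars (encoder): " + encoded);
        System.out.println("JSON text chars (estimate): " + text);
        System.out.println("Estimated ratio: " + (double) text / base64);
    }
}
```

Under these assumptions a single Base64 embedding needs 8,192 characters, which is close to the roughly 8.4 KB per response implied by the measurement above (84,390 bytes for 10 responses). The text estimate depends entirely on the assumed characters per value, so treat the ratio as a plausibility check rather than a measurement.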
I also noticed a bug in the current implementation. If I execute the following code, it runs successfully.
EmbeddingCreateParams embeddingCreateParams = EmbeddingCreateParams.builder()
.input("Sunset hues blend, day meets night, a perfect end.")
// .encodingFormat(EncodingFormat.BASE64)
.model("text-embedding-3-small").build();
CreateEmbeddingResponse createEmbeddingResponse = embeddings.create(embeddingCreateParams);
createEmbeddingResponse.data().forEach(embedding -> {
embedding.embedding().forEach(System.out::println);
});
The output looks like the following:
However, if I use Base64, the following exception occurs:
EmbeddingCreateParams embeddingCreateParams = EmbeddingCreateParams.builder()
.input("Sunset hues blend, day meets night, a perfect end.")
.encodingFormat(EncodingFormat.BASE64)
.model("text-embedding-3-small").build();
CreateEmbeddingResponse createEmbeddingResponse = embeddings.create(embeddingCreateParams);
createEmbeddingResponse.data().forEach(embedding -> {
embedding.embedding().forEach(System.out::println);
});
The actual exception is as follows:
Because currently
Proposal
Would it be better to remove the current
Updated Proposal (3/10)
Based on the actual behavior of the REST response, I have thought about it a bit more. The following are sample REST invocations:
curl -X POST "https://AOAI.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15" \
-H "Content-Type: application/json" \
-H "api-key: $API_KEY" \
-d '{
"input": "SOME DATA",
"model": "text-embedding-3-small"
}'
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [
0.010497476,
-0.030576246,
-0.007097663,
curl -X POST "https://AOAI.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15" \
-H "Content-Type: application/json" \
-H "api-key: $API_KEY" \
-d '{
"input": "SOME DATA",
"model": "text-embedding-3-small",
"encoding_format": "base64"
}'
{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": "m/0rPAl7+ryDk+i77Q0gvHsqizwBYSc79xjtOZrqyjy8THo8XvE0vSxaG7wFrGc8prluuxM/Yrxyk8Y8rOMLu3QIE71tsdI8KpbEus22rDw7y6489iASvJ4I6DwmFjy9Qu0mPet+kTwfv3s8zImJPDYeAz3Pcl48Vx7Hux7HoLy7VB89yi5/vOjcITslA1s7fhNgvNinN7whg9K8FWUkvc56gz1oz149o5tRPS/HQj1HMQY8L8fCPN8rG7zwekc8U9OGPVcEBTxGHiU6wt9jPPOYZLx6MWy8Gz9zvIeqJDvYcu+8SSlhvEkp4TxE3qC8+K+gujQLIjsbJTE76y+HPGPtars/Fm88HFJUPH1qj7vzfiK8bbHSPOZnVbysTNi64jbXvD28KD0y5V89iAwQPHp5lTx/qhM70IW/vLy2Cr1wz2+8w6O6PLcjIb3tDSA9guLyOxQ4gbyLk3m9UDEXPM0YGLoPIUU92uc7vU1AHTz9kRS9oZAVPRpHGDzFZxE8MuXfPMpJhbxSPNM8SF4pvMLFoTosWps8dhNPPGUTrT2VCNc7bE9nOnVP+Ds2HgM83FwIPMdyTTzhhWG9WovuvIzborxbnk+9VgvmPODckLyivTg9I1oKPWcmDj08+FE77mdmPFB4fD1gtQs9o4GPvUOW9zwWeAU9y3YoPKI5Zjtqk7W88T6evBpHmDx7RE29gIgsvIZi+zpKIgA9OF4
As a result, since the response returns both formats under the same "embedding" field, I believe it would be better to handle both within the embedding() method rather than splitting it into two separate methods (the proposal above).
import com.fasterxml.jackson.databind.annotation.JsonDeserialize;
import java.util.List;
import java.util.Optional;
@JsonDeserialize(using = EmbeddingValueDeserializer.class)
public class EmbeddingValue {
private Optional<List<Double>> floatEmbedding = Optional.empty();
private Optional<String> base64Embedding = Optional.empty();
public Optional<List<Double>> getFloatEmbedding() {
return floatEmbedding;
}
public void setFloatEmbedding(Optional<List<Double>> floats) {
this.floatEmbedding = floats;
}
public Optional<String> getBase64Embedding() {
return base64Embedding;
}
public void setBase64Embedding(Optional<String> base64) {
this.base64Embedding = base64;
}
}
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.*;
import com.fasterxml.jackson.databind.node.ArrayNode;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
public class EmbeddingValueDeserializer extends JsonDeserializer<EmbeddingValue> {
@Override
public EmbeddingValue deserialize(JsonParser jp, DeserializationContext ctxt) throws IOException {
EmbeddingValue embeddingValue = new EmbeddingValue();
JsonNode node = jp.getCodec().readTree(jp);
if (node.isArray()) {
List<Double> floats = new ArrayList<>();
for (JsonNode item : (ArrayNode) node) {
floats.add(item.asDouble());
}
embeddingValue.setFloatEmbedding(Optional.of(floats));
} else if (node.isTextual()) {
embeddingValue.setBase64Embedding(Optional.of(node.asText()));
}
return embeddingValue;
}
}
import com.fasterxml.jackson.annotation.JsonProperty;
import java.util.List;
public class EmbeddingResponse {
private String object;
private List<DataItem> data;
public String getObject() {
return object;
}
public void setObject(String object) {
this.object = object;
}
public List<DataItem> getData() {
return data;
}
public void setData(List<DataItem> data) {
this.data = data;
}
public static class DataItem {
private String object;
private int index;
@JsonProperty("embedding")
private EmbeddingValue embeddingValue;
public String getObject() {
return object;
}
public void setObject(String object) {
this.object = object;
}
public int getIndex() {
return index;
}
public void setIndex(int index) {
this.index = index;
}
public EmbeddingValue getEmbeddingValue() {
return embeddingValue;
}
public void setEmbeddingValue(EmbeddingValue embeddingValue) {
this.embeddingValue = embeddingValue;
}
}
}
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;
import java.util.Optional;
public final class EmbeddingsExample {
private EmbeddingsExample() {}
public static void main(String[] args) {
// ...existing code...
// Sample JSON Response
String jsonResponse = "{ \"object\": \"list\", \"data\": [ { \"object\": \"embedding\", \"index\": 0, \"embedding\": [ 0.010497476, -0.030576246 ] } ] }";
try {
ObjectMapper mapper = new ObjectMapper();
EmbeddingResponse customResponse =
mapper.readValue(jsonResponse, EmbeddingResponse.class);
customResponse.getData().forEach(item -> {
EmbeddingValue embeddingValue = item.getEmbeddingValue();
Optional<List<Double>> floatEmbedding = embeddingValue.getFloatEmbedding();
Optional<String> base64Embedding = embeddingValue.getBase64Embedding();
floatEmbedding.ifPresent(floats -> {
System.out.println("Float embedding: " + floats);
});
base64Embedding.ifPresent(base64 -> {
System.out.println("Base64 embedding: " + base64);
});
});
} catch (Exception e) {
e.printStackTrace();
}
// ...existing code...
}
}
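The proposal above keeps the Base64 value as a raw string, so a caller would still need to turn it into numbers. A minimal sketch of that step follows, assuming the payload is a sequence of little-endian 4-byte float32 values. Decoding the first eight bytes of the sample string above under that assumption gives values that match the first two floats of the float response, which is consistent with this layout. `Base64EmbeddingDecoder` is a hypothetical name, not an SDK class, and the input `m/0rPAl7+rw=` is those first eight bytes re-padded for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

// Hypothetical helper; assumes the payload is little-endian float32 values.
final class Base64EmbeddingDecoder {
    private Base64EmbeddingDecoder() {}

    /** Decodes a Base64 embedding string into a float array. */
    static float[] decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        if (raw.length % Float.BYTES != 0) {
            throw new IllegalArgumentException(
                    "Decoded length " + raw.length + " is not a multiple of 4 bytes");
        }
        float[] values = new float[raw.length / Float.BYTES];
        // Interpret every 4 bytes as one little-endian float32.
        ByteBuffer.wrap(raw)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asFloatBuffer()
                .get(values);
        return values;
    }

    public static void main(String[] args) {
        // First eight bytes of the sample response, re-padded as standalone Base64.
        float[] values = decode("m/0rPAl7+rw=");
        for (float value : values) {
            System.out.println(value);
        }
    }
}
```

With a helper like this, an `embedding()` accessor could return numeric values regardless of which encoding was requested, while the raw string remains available to callers who want it. Whether the SDK should expose `float` or `Double` values is a separate design choice that this sketch does not settle.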
First commit to fix Issue openai#211
This commit includes the fix described in Issue openai#211.
* Addressed the issue where Base64 encoding could not be handled.
* Improved performance by using Base64 encoding by default.
When Embeddings are created, they have an EncodingFormat, which can be either float or base64, as seen in openai-java/openai-java-core/src/main/kotlin/com/openai/models/EmbeddingCreateParams.kt (line 903 in 6390dfd).
Unless this is configured by the user, base64 should be the default, as this will provide a significant performance improvement.
Here's a similar issue:
As this SDK is generated, we can't contribute directly to it, but I'd be happy to help fix this.