
Async requests #98


Closed
MiWeiss opened this issue May 27, 2022 · 17 comments · Fixed by #146
Labels
enhancement New feature or request

Comments

@MiWeiss
Contributor

MiWeiss commented May 27, 2022

Exposing async interfaces would allow using this library in a much more modern, performant, and scalable way.

It would be great if the maintainers could say whether they plan to add async methods in the future (i.e., allow for non-blocking API usage). Even stating explicitly that this won't be added would help, as it would let third parties release their own fork or wrapper without the risk of it becoming obsolete moments later :-)

@ArtemBernatskyy

Nice ignore, ggwp

@hallacy added the enhancement (New feature or request) label on Jul 28, 2022
@hallacy
Collaborator

hallacy commented Jul 28, 2022

Hi @MiWeiss! Thanks for the issue.

Adding async support is on the roadmap, though we aren't committing to a timeline for when it'll be released. While I generally agree that an async interface would be good, can you tell me a little more about the performance improvement you'd expect to see if we added it? That'll help give us a sense of how to prioritize it.

@MiWeiss
Contributor Author

MiWeiss commented Jul 28, 2022

Hi @hallacy

Thanks for your answer.

I am not sure how to understand your question (e.g., are you asking about the technical reasons or about use cases?), so please excuse me if my answer misses the target...

Quick motivation:
Async requests make it very easy to perform other tasks while waiting for the response to a request to OpenAI. If the "other tasks" are also I/O- or network-bound (e.g. more requests to OpenAI 😄), I'm likely running them in an async way too, so that the waiting overlaps (the total wait is then approximately the time the slowest task takes). This is naturally much faster than doing all the waiting sequentially.
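
A minimal sketch of that effect, using a hypothetical fetch_completion coroutine as a stand-in for a non-blocking OpenAI request:

import asyncio

# Hypothetical stand-in for a non-blocking OpenAI request
async def fetch_completion(prompt):
    await asyncio.sleep(1)  # simulate one second of network latency
    return f"completion for {prompt!r}"

async def main():
    # The three waits overlap, so this takes about 1 second, not 3
    results = await asyncio.gather(
        fetch_completion("a"),
        fetch_completion("b"),
        fetch_completion("c"),
    )
    print(results)

asyncio.run(main())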

And yes, much of that could be done using threads, but threading has various disadvantages (especially for I/O-bound operations) compared to asyncio. See e.g. this great comment.

See also these FastAPI docs, which provide a detailed yet simple and intuitive motivation for using async requests.

Does this answer your question?

Also, IMHO
It may be nontrivial to change the library so that both async and synchronous requests are supported, both from an implementation perspective (session handling, API design, etc.) and from a documentation perspective (every snippet would need both an async and a sync variant). It might be easier to offer async-openai as a standalone library. That's just my two cents, and I am happy to be proven wrong, though.

@hallacy
Collaborator

hallacy commented Jul 28, 2022

It does! Thank you for the writeup. That comment you linked to was particularly helpful.

I think I agree with that opinion about non-triviality. I can't commit to a timeline, but I'll make a point of bringing this up with the team soon.

@lucasgadams

lucasgadams commented Oct 5, 2022

This is badly needed! Having to make concurrent requests using threading is not good for modern Python. And it's honestly better to start on it earlier, because the whole library will need to be upgraded. Or, of course, a new library could be made for it. The Node.js openai library is async by nature, which is a big advantage. In the meantime, asyncifying function calls with a thread pool works well for me: https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor.
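
A minimal sketch of that run_in_executor workaround, assuming the library's synchronous openai.Completion.create call (prompt and engine values are illustrative):

import asyncio
import functools

import openai

openai.api_key = "sk-..."

async def acomplete(prompt):
    loop = asyncio.get_running_loop()
    # run_in_executor takes no keyword arguments, so bind them with
    # functools.partial; None selects the default ThreadPoolExecutor
    return await loop.run_in_executor(
        None,
        functools.partial(openai.Completion.create, prompt=prompt, engine="text-ada-001"),
    )

async def main():
    # The blocking calls now overlap in worker threads
    results = await asyncio.gather(acomplete("One"), acomplete("Two"))
    print(results)

asyncio.run(main())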

@Andrew-Chen-Wang
Contributor

Andrew-Chen-Wang commented Dec 4, 2022

Happy to commit to this. Based on previous experience with Redis and Django, a standalone package isn't needed; this package would only need to add aiohttp as a dependency. Elasticsearch makes its async dependencies optional, but in my opinion that isn't necessary for this repository. Should be done tomorrow. (In the meantime, you can use asgiref and its sync_to_async.)

@Andrew-Chen-Wang
Contributor

Andrew-Chen-Wang commented Dec 6, 2022

#146 resolves this issue. Please test it, thanks. Usage (note the a prefix, a naming convention also used in the CPython standard library, Django, and other libraries):

import asyncio

import openai

openai.api_key = "sk-..."

async def main():
    await openai.Completion.acreate(prompt="This is a test", engine="text-ada-001")

asyncio.run(main())

In the meantime, you can use asgiref (note the lack of the a prefix):

import asyncio

import openai
from asgiref.sync import sync_to_async

openai.api_key = "sk-..."

async def main():
    await sync_to_async(openai.Completion.create)(prompt="This is a test", engine="text-ada-001")

asyncio.run(main())

@itayzit

itayzit commented Dec 17, 2022

In the meantime, you can also use this lightweight client I wrote (it uses httpx):
https://pypi.org/project/openai-async/

Something like:

pip install openai-async

and then:

import openai_async

# Must be awaited from inside a coroutine
response = await openai_async.complete(
    "<API KEY>",
    timeout=2,
    payload={
        "model": "text-davinci-003",
        "prompt": "Correct this sentence: Me like you.",
        "temperature": 0.7,
    },
)
print(response.json()["choices"][0]["text"].strip())
>>> "I like you."

@danbf

danbf commented Feb 3, 2023

@itayzit @MiWeiss any luck with this in a high-concurrency setting? I'm trying this but not getting the rates I'm hoping for.

@Andrew-Chen-Wang
Contributor

@danbf I recommend you also manually control the aiohttp session:

import openai
from aiohttp import ClientSession

openai.aiosession.set(ClientSession())
# At the end of your program, close the http session
await openai.aiosession.get().close()
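
Put together, a minimal sketch of that pattern as a full program (prompt and engine values are illustrative):

import asyncio

import openai
from aiohttp import ClientSession

openai.api_key = "sk-..."

async def main():
    # Create one shared session up front instead of one per request
    openai.aiosession.set(ClientSession())
    try:
        await openai.Completion.acreate(prompt="This is a test", engine="text-ada-001")
    finally:
        # Close the shared session once the program is done with the API
        await openai.aiosession.get().close()

asyncio.run(main())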

@MiWeiss
Contributor Author

MiWeiss commented Feb 4, 2023

@danbf also note that OpenAI has rate limits in place: see here

@danbf

danbf commented Feb 4, 2023

Thanks @Andrew-Chen-Wang and @MiWeiss, going to use that code hint. And yup, I was aware of the OpenAI rate limit.

@danbf

danbf commented Feb 6, 2023

@Andrew-Chen-Wang @MiWeiss any ideas what to set TCPConnector(limit=XXX) to in order to maximize throughput?

@lucasgadams

You could try setting it to 0 for no limit, but honestly I very much doubt you can query OpenAI with much more than 100 connections without hitting one of their quota limits.
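
For reference, a sketch of raising that cap on the shared session (aiohttp's default per-session limit is 100; limit=0 disables it):

import openai
from aiohttp import ClientSession, TCPConnector

# limit=0 removes aiohttp's per-session connection cap (default: 100);
# OpenAI's rate limits will likely be the real bottleneck
openai.aiosession.set(ClientSession(connector=TCPConnector(limit=0)))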

@jonra1993

jonra1993 commented Apr 13, 2023

It can also be done using asyncer:

import asyncio

import openai
from asyncer import asyncify

openai.api_key = settings.OPENAI_API_KEY  # assumes a settings object holding your key

async def main():
    response = await asyncify(openai.ChatCompletion.create)(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful help desk assistant."},
            {"role": "user", "content": "Which is the capital of Ecuador?"},
            {
                "role": "assistant",
                "content": "The capital of Ecuador is Quito.",
            },
            {"role": "user", "content": "Responde lo mismo pero en español"},
        ],
    )
    print("response", response)

asyncio.run(main())

@vedantroy

@Andrew-Chen-Wang -- would there be any disadvantage to using your PR over asyncer, or vice versa?

@Andrew-Chen-Wang
Contributor

Andrew-Chen-Wang commented May 11, 2023

I would go with whatever openai-python has (i.e., my PR), since the repo sees constant improvements, it's the official package, and all the examples online use it. The performance difference is negligible for this repo, since latency is the biggest bottleneck. It also seems easier to import once (openai) rather than twice (with asyncify).
