Skip to content

Common pattern for reconnections #414

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
pgrandinetti opened this issue May 25, 2018 · 24 comments · Fixed by #983
Closed

Common pattern for reconnections #414

pgrandinetti opened this issue May 25, 2018 · 24 comments · Fixed by #983

Comments

@pgrandinetti
Copy link

In several projects of mines I do something like the following, in order to handle scenario that may happen with connection errors and reconnection attempts:

    async def listen_forever(self):
        while True:
        # outer loop restarted every time the connection fails
            try:
                async with websockets.connect(self.url) as ws:
                    while True:
                    # listener loop
                        try:
                            reply = await asyncio.wait_for(ws.recv(), timeout=***)
                        except (asyncio.TimeoutError, websockets.exceptions.ConnectionClosed):
                            try:
                                pong = await ws.ping()
                                await asyncio.wait_for(pong, timeout=self.ping_timeout)
                                logger.debug('Ping OK, keeping connection alive...')
                                continue
                            except:
                                await asyncio.sleep(self.sleep_time)
                                break  # inner loop
                        # do stuff with reply object
            except socket.gaierror:
                # log something
                continue
            except ConnectionRefusedError:
                # log something else
                continue

and I was wondering whether (1) this makes sense at all, and (2) there is any shortcut already provided in websockets to what seems quite a recurrent behavior (assuming that is correct!)

@aaugustin
Copy link
Member

There is no shortcut in websockets for implementing this.

I've heard about this use case before and I think it's valid.

I'm unsure whether we can provide a good API -- I would expect user code to do stuff when opening the connection or when it dies and I'm not sure how to handle that?

/Cc @RemiCardona

@aaugustin
Copy link
Member

no progress

@pgrandinetti
Copy link
Author

just saw the help wanted tag.
As I mentioned 1.5 years ago (!) I did use that piece of code in a couple of projects. I can help, but would need guidance.

@heckad
Copy link

heckad commented Dec 5, 2019

What needs to do to add this feature?

@aaugustin
Copy link
Member

For this feature, I would suggest:

  • first, a documentation patch describing how the feature would behave exactly;
  • once we agree on that, an implementation with tests.

@pgrandinetti
Copy link
Author

@heckad I placed a minimal reproducible example here, to start with: https://gist.github.com/pgrandinetti/964747a9f2464e576b8c6725da12c1eb

@heckad
Copy link

heckad commented Dec 7, 2019

I suggest doing like in aio_pika connect_robust method which should return RobustConnection or RobustSomethingAnyThing which will reconnect on any error by default. Another way make like in requests.

@filwaline
Copy link

filwaline commented Oct 31, 2020

Here is my attempt to talk with websocket server
Invoke async-generator
https://gist.github.com/filwaline/37309797265079f2ad7fcc47999575b2

@aaugustin
Copy link
Member

An interesting API would be:

async for ws in websockets.connect(...):
    ...

Now the question is: should it reconnect only on connection errors? on or all exceptions?

This isn't an easy question:

  • reconnecting on connection errors seems legitimate while swallowing all exceptions is a Gun Pointed At Feet
  • it seems Quite Hard to tell connection errors apart from other errors with a high degree of certainty

Also, we need exponential backoff to avoid going in a hot loop if the network or the server is down

@MelchiSalins
Copy link

MelchiSalins commented May 13, 2021

Could the behaviour of the reconnection be a parameter passed to the connect method?

async for ws in WebSockets.connect(reconnect_policy=websokets.ALWAYS, exponential_backoff=true):
         ...

or something similar? Just putting it out there. I can see a few issues already with what is proposed but hoping something else can build on top of it.

@adriansev
Copy link

Just a small remark: re-connecting, and retry connecting are different things

  • the former assumes a successful initial connection that was dropped (so, the configuration is correct but a connection problem occurred so let's reconnect )
  • the latter would be when no connection was successful, so let's keep trying to do so
    IMHO, i would say that if session state (THIS_WAS_SUCCESSFUL) can be kept, then always (configurable) retry when there was an initial successful connection and always bailout if no initial connection

I think that this would translate to always retry if ConnectionClosedError and bailout in all other cases.. I'm not sure if there are cases where the session is valid but the connection was closed with ConnectionClosedOK

@aaugustin
Copy link
Member

Hello! I have a proposal here: #983. Sorry it took me so long to figure it out.

There's a wide variety of situations where connections can fail and require reconnecting. I don't have a good way to reproduce them all.

If you have a real life use case and you're able to test this pull request, I would be very interested in your feedback.

The idea would be to run the version of websockets from #983, to replace your reconnection system by:

async for ws in websockets.connect(...):
    ...

Then you can check if the reconnection works. Errors that trigger reconnections are logged; if you see errors in the logs and the connection is still alive, you're good.

Looking forwards to your feedback. Thank you!

@aaugustin
Copy link
Member

@adriansev: you have a good point; indeed there's a different reasoning for errors opening the connections and errors after it's open.

Here's what I ended up doing:

  • Errors opening the connection: retry with exponential backoff and no limit. I don't think depending on an initial successful connection is a good idea. For example, if the network or the server is momentarily down while a script starts, it makes sense to keep trying until the connection eventually works.
  • Errors after the connection is open: I'm logging ConnectionClosed at the debug level (because you're expecting it, else you wouldn't be reconnecting in the first place) and other exceptions at the error level.

aaugustin added a commit that referenced this issue Jun 5, 2021
aaugustin added a commit that referenced this issue Jun 5, 2021
aaugustin added a commit that referenced this issue Jun 5, 2021
aaugustin added a commit that referenced this issue Jun 5, 2021
aaugustin added a commit that referenced this issue Jun 9, 2021
@aaugustin
Copy link
Member

I merged the branch but I'm still interested in feedback if you try it :-)

@adriansev
Copy link

@aaugustin thank you for your work!! So, just to throw a disclaimer: my view is just from the point of view of the client (and a very dumb client, is a REPL interface to a server, so only one message at a time). So, my comments were from this view, and, as such, i continue to think that retrying the connection when no initial connection was done is pointless (maybe my certificate is wrong, or the wireless is disconnected so, i will wait for an answer from the server when in fact there are re-connecting tryouts in the websocket part). IMHO re-connect would be useful for cases when because of load and network a connection is dropped for very short time. So, for me, my fear is that this would convert to user reports like "i started the client, and nothing happens" as opposed to the often "error was thrown, look in the log" and in the end the problem was of certificate permission, mixed certificate and key (new with old) or other pure ssl problems that will threw errors at connection time.
So, i would say that that my use-case is really specific and unique and my opinion does not matter on this...
The only thing that would simplify my code would be if an automatic re-connection at pong timeout and for an exception in send .. just FYI at this moment i do this: https://github.com/adriansev/jalien_py/blob/master/alienpy/alien.py#L514
But, in the end, i just thank you so much for your work! :)

@aaugustin
Copy link
Member

Indeed there's no single behavior that will work for everyone if the initial connection fails. I didn't want to get down the path of trying to tell apart fatal errors from temporary errors. At least a stack trace will be displayed so it isn't a silent failure.

If you started from scratch today, you could probably use this to make the code that reconnects on pong timeout a bit more elegant. But given that you already wrote it, I'm not seeing much value in refactoring it.

@adriansev
Copy link

Well, all depends on "is it worth it?" (in regard to discrimination of types of error and handle them differently) and the range of needs can be quite wide. but, as you say, it throws an exception so it's ok ...
About starting from scratch, well, there could be a knowledge delta in a week frame, in a year time one can be at the 3rd rewrite of the project :)))
But now that you threw the bait with the "elegant" could you please elaborate what would that mean? :)
Also, i do not re-connect on pong (as i cannot know if an underlying ping failed to receive the pong.. that would need an separate thread, or in the same thread/task as ping/pong mechanism to have an callback for re-connection), i only do a re-connection when getting an exception from send
Thank you!

@aaugustin
Copy link
Member

By "elegant" I meant "can you do everything within a context manager" vs. having a global instance of the websockets client and reinitializing it when it fails.

The benefits are those of context managers -- generally they improve the structure of the code and give more clarity to which resource is used where.

@adriansev
Copy link

ooh, yeah, that is excluded in my case, as in "technically impossible" : there is not possible to mix async with non-async modules/functions .. so to solve my conundrum, actually i have a thread with the async main loop and a decorator to "un-syncify" the send/receive function (and a few other function that use directly the websocket)

@aaugustin
Copy link
Member

OK. You need #885, then!

@akhtarshahnawaz
Copy link

akhtarshahnawaz commented Oct 4, 2021

def run():
    try:
        asyncio.get_event_loop().run_until_complete(call_api(json.dumps(msg)))
    except Exception as e:
        print(f'WEBSOCKET CONNECTION CLOSED. RESTARTING..', e)
        time.sleep(2)
        run()

run()

@ghost
Copy link

ghost commented Oct 15, 2021

Hi @aaugustin, I saw that you merged #983 but it is still unclear to me how properly implement the auto-reconnect feature using your library.
Can you please provide a small example on the best way to do it? Or is still necessary to use the pattern mentioned in the first post of this thread?

Thank you!

@aaugustin
Copy link
Member

@coccoinomane
Copy link

Thanks @aaugustin!
For those stumbling on the broken documentation link, here's a working one:
https://websockets.readthedocs.io/en/stable/reference/asyncio/client.html#opening-a-connection

The solution is to do:

async for websocket in websockets.connect(...):
    try:
        ...
    except websockets.ConnectionClosed:
        continue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants