v1.8.3-1 noble stops after receiver.amplify.nginx.com timeout #124

Open
fersk opened this issue Feb 21, 2025 · 1 comment

fersk commented Feb 21, 2025

We recently migrated a new server to Ubuntu 24.04 (noble) and are running agent v1.8.3-1~noble. When a request to receiver.amplify.nginx.com times out, the agent logs the error below and then stops.

2025-02-21 02:30:50,530 [1172] supervisor uncaught exception during run time
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 537, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 461, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/http/client.py", line 1428, in getresponse
    response.begin()
  File "/usr/lib/python3.12/http/client.py", line 331, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/http/client.py", line 292, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/socket.py", line 707, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/gevent/ssl.py", line 625, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/gevent/ssl.py", line 440, in read
    self._wait(self._read_event, timeout_exc=_SSLErrorReadTimeout)
  File "src/gevent/_hub_primitives.py", line 317, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 322, in gevent._gevent_c_hub_primitives.wait_on_socket
  File "src/gevent/_hub_primitives.py", line 313, in gevent._gevent_c_hub_primitives._primitive_wait
  File "src/gevent/_hub_primitives.py", line 314, in gevent._gevent_c_hub_primitives._primitive_wait
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 46, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_hub_primitives.py", line 55, in gevent._gevent_c_hub_primitives.WaitOperationsGreenlet.wait
  File "src/gevent/_waiter.py", line 154, in gevent._gevent_c_waiter.Waiter.get
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 61, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 65, in gevent._gevent_c_greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_gevent_c_greenlet_primitives.pxd", line 35, in gevent._gevent_c_greenlet_primitives._greenlet_switch
TimeoutError: The read operation timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 845, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 472, in increment
    raise reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/urllib3/util/util.py", line 39, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 791, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 539, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 371, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='receiver.amplify.nginx.com', port=443): Read timed out. (read timeout=5.0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/amplify/agent/supervisor.py", line 356, in talk_to_cloud
    context.http_client.post('agent/', data=root_object)
  File "/usr/lib/python3/dist-packages/amplify/agent/common/util/http.py", line 128, in post
    return self.make_request(url, 'post', data=data, timeout=timeout, json=json)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/amplify/agent/common/util/http.py", line 110, in make_request
    raise e
  File "/usr/lib/python3/dist-packages/amplify/agent/common/util/http.py", line 94, in make_request
    r = self.session.post(
        ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='receiver.amplify.nginx.com', port=443): Read timed out. (read timeout=5.0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/amplify/agent/main.py", line 147, in run
    daemon_runner.do_action()
  File "/usr/lib/python3/dist-packages/amplify/agent/common/runner.py", line 42, in do_action
    self.app.run()
  File "/usr/lib/python3/dist-packages/amplify/agent/supervisor.py", line 277, in run
    self.talk_to_cloud(root_object=context.objects.root_object.definition)
  File "/usr/lib/python3/dist-packages/amplify/agent/supervisor.py", line 375, in talk_to_cloud
    self.cloud_talk_delay = exponential_delay(self.cloud_talk_fails)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/amplify/agent/common/util/backoff.py", line 35, in exponential_delay
    return randint(0, period_size - 1)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/random.py", line 336, in randint
    return self.randrange(a, b+1)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/random.py", line 312, in randrange
    istop = _index(stop)
            ^^^^^^^^^^^^
TypeError: 'float' object cannot be interpreted as an integer
2025-02-21 02:30:56,092 [1172] supervisor failed POST "https://receiver.amplify.nginx.com:443/1.4/9890f479ff7c57dce75cc2e09560e9f1/update/", exception: "HTTPSConnectionPool(host='receiver.amplify.nginx.com', port=443): Read timed out. (read timeout=5.0)"
2025-02-21 02:30:56,092 [1172] supervisor [None] post https://receiver.amplify.nginx.com:443/1.4/9890f479ff7c57dce75cc2e09560e9f1/update/ 500 795 0 5.547
2025-02-21 02:30:56,096 [1172] supervisor agent stopped, version=1.8.3-1 pid=1172 uuid=05e5b1d0f22e5658ad8c53582aa99057
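
Reading the traceback above: the ReadTimeout appears to be caught inside talk_to_cloud, but the backoff calculation that runs while handling it raises a TypeError, and that second exception is what reaches the top-level "uncaught exception" handler. A minimal sketch of that pattern follows. The names `backoff_delay`, `timed_out_post`, and the `period_size` argument are hypothetical stand-ins, not the agent's actual code, and the float bound is an assumption about how the value ends up non-integer.

```python
import random
import sys

# Python 3.12 removed randrange()'s automatic float-to-int conversion,
# so randint() with a float bound raises TypeError from that version on.
FLOAT_BOUND_RAISES = sys.version_info >= (3, 12)


def backoff_delay(period_size):
    # Hypothetical stand-in for the agent's exponential_delay().
    return random.randint(0, period_size - 1)


def talk_to_cloud(post, period_size):
    """Return the delay in seconds to wait before the next attempt."""
    try:
        post()
        return 0
    except TimeoutError:
        # The timeout itself is handled here, but an exception raised
        # while computing the delay is not, so it propagates to the caller.
        return backoff_delay(period_size)


def timed_out_post():
    # Simulates the request to the receiver timing out.
    raise TimeoutError("The read operation timed out")


if __name__ == "__main__":
    # Integer bound: the timeout is absorbed and a delay in 0..7 is returned.
    print(talk_to_cloud(timed_out_post, 8))
    try:
        # Float bound: on Python 3.12+ the TypeError escapes the handler.
        talk_to_cloud(timed_out_post, 8.0)
    except TypeError as exc:
        print("escaped the handler:", exc)
```

On Python 3.10 and 3.11 the float call still works and only emits a DeprecationWarning, which would be consistent with the jammy hosts surviving the same timeout.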

We have multiple other servers running 1.8.2-1~jammy on 22.04 jammy and the timeout is handled properly there:

2025-02-21 02:31:09,561 [42374] bridge_manager [bc35d14c1d8530a3eae40b5d885df9bd] post https://receiver.amplify.nginx.com:443/1.4/9890f479ff7c57dce75cc2e09560e9f1/update/ 202 3814 0 0.421
2025-02-21 02:31:19,093 [42374] supervisor [e490f00104cfc3f422d00939cc35c9ac] post https://receiver.amplify.nginx.com:443/1.4/9890f479ff7c57dce75cc2e09560e9f1/agent/ 200 86 5 0.257
2025-02-21 02:32:14,600 [42374] bridge_manager failed POST "https://receiver.amplify.nginx.com:443/1.4/9890f479ff7c57dce75cc2e09560e9f1/update/", exception: "HTTPSConnectionPool(host='re>
2025-02-21 02:32:14,600 [42374] bridge_manager [None] post https://receiver.amplify.nginx.com:443/1.4/9890f479ff7c57dce75cc2e09560e9f1/update/ 500 3817 0 5.008
2025-02-21 02:32:14,601 [42374] bridge_manager failed to push data due to ReadTimeout
2025-02-21 02:33:15,293 [42374] bridge_manager [194bcce98ba9c884c0f6eab09f090ff2] post https://receiver.amplify.nginx.com:443/1.4/9890f479ff7c57dce75cc2e09560e9f1/update/ 202 5144 0 0.657

fersk changed the title from "v1.8.3-1 noble stops when after receiver timeout" to "v1.8.3-1 noble stops after receiver.amplify.nginx.com timeout" on Feb 21, 2025

jasil1414 added a commit to jasil1414/nginx-amplify-agent that referenced this issue Apr 2, 2025
…timeouts

Problem:
- Agent v1.8.3-1~noble crashes on Ubuntu 24.04 with TypeError when receiver.amplify.nginx.com times out
- Error occurs in backoff.py: "TypeError: 'float' object cannot be interpreted as an integer"
- The issue prevents proper exponential backoff during network connectivity problems
- Agent completely stops instead of retrying with the backoff delay

Solution:
- Add explicit integer type conversion to period_size before passing to randint()
- Convert: return randint(0, period_size - 1)
- To: return randint(0, int(period_size) - 1)

This fix allows the agent to properly handle connection timeouts without crashing.
The issue appears to be specific to Python 3.12 used in Ubuntu 24.04 Noble, as
the agent works correctly on Ubuntu 22.04 Jammy with Python 3.10.

Fixes nginxinc#124
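
For anyone patching locally, the cast described in the commit message above can be sketched as follows. This is a hypothetical reimplementation, not the contents of the agent's backoff.py: the function name, the `base_period` and `max_period` parameters, and the way `period_size` becomes a float are all assumptions made for illustration. Only the `int(period_size) - 1` expression comes from the commit.

```python
import random


def exponential_delay_sketch(fails, base_period=5.0, max_period=300):
    """Pick a random delay in seconds that grows with consecutive failures.

    Hypothetical sketch: base_period and max_period are illustrative
    parameters, not settings taken from the agent.
    """
    # Assumption: a float input (here base_period) makes period_size a float.
    period_size = min(base_period * (2 ** fails), max_period)

    # The fix from the commit: cast to int before calling randint(),
    # which rejects float bounds on Python 3.12 and later.
    upper = max(int(period_size) - 1, 0)  # guard against a negative bound
    return random.randint(0, upper)


if __name__ == "__main__":
    for fails in range(8):
        print(fails, exponential_delay_sketch(fails))
```

Note that `int()` truncates toward zero, so a fractional `period_size` shrinks the range slightly rather than raising.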

jaymoh commented Apr 4, 2025

I'm also experiencing this issue after upgrading one of my servers to Ubuntu 24. The timeout was handled fine on Ubuntu 20.04.
