Webhooks randomly fail with 408 timeout #5470


Closed
2 of 7 tasks
sbrl opened this issue Dec 4, 2018 · 25 comments

Comments

@sbrl

sbrl commented Dec 4, 2018

  • Gitea version (or commit ref): 1.6.0 (latest stable release)
  • Git version: git version 2.17.1 (on the server)
  • Operating system: Ubuntu Server 18.04
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • Yes (provide example URL)
    • No
    • Not relevant
  • Log gist: ?

Description

I've set up a webhook between my server and a server at a remote location, and I'm getting random failures. Some webhook deliveries go through fine, and others fail for no obvious reason.

Setup:

  • Git Server:
    • Gitea behind nginx
  • Remote server:

Gitea claims the following in the webhook log:

Delivery: Post https://ci.starbeamrainbowlabs.com/hooks/laminar-config-check: read tcp 5.196.73.75:54504->x.y.z.w:443: i/o timeout

...but the remote server's nginx claims that it's Gitea's fault for not sending a complete request in time:

5.196.73.75 - - [04/Dec/2018:20:48:05 +0000] "POST /hooks/laminar-config-check HTTP/1.1" 408 0 "-" "GiteaServer"
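To see how often this happens, the receiving server's access log can be scanned for 408 responses to Gitea. The following is a minimal sketch that assumes nginx is writing its default "combined" log format, as the line above appears to be; the names `parse_line` and `timed_out_deliveries` are made up for this illustration.

```python
import re

# Pattern for nginx's default "combined" access log format (assumed here).
COMBINED = re.compile(
    r'(?P<addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["bytes"] = int(fields["bytes"])
    return fields


def timed_out_deliveries(lines, agent="GiteaServer"):
    """Yield parsed entries that ended in HTTP 408 and were sent by the given user agent."""
    for line in lines:
        entry = parse_line(line)
        if entry and entry["status"] == 408 and entry["agent"] == agent:
            yield entry


# The log line quoted in this report, used as sample input.
SAMPLE = (
    '5.196.73.75 - - [04/Dec/2018:20:48:05 +0000] '
    '"POST /hooks/laminar-config-check HTTP/1.1" 408 0 "-" "GiteaServer"'
)

if __name__ == "__main__":
    for entry in timed_out_deliveries([SAMPLE]):
        print(entry["time"], entry["path"], entry["status"])
```

Passing an open log file instead of `[SAMPLE]` would list every timed-out delivery with its timestamp, which can then be lined up against the webhook history in Gitea.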

The remote server has a good-quality fibre line. The git server has a stable 100mb/sec connection from KimSufi / OVH. Why the timeout?

Annoyingly, it always works if I hit "test delivery" - it's only genuine push events that are causing problems.

Screenshots

[screenshot: selection_094]

Edit: Nope, it doesn't always work with the "test delivery" button. I've just got this:

[screenshot: selection_095]

...clicking the "test delivery" button again causes the next one to succeed though.

@sbrl
Author

sbrl commented Jan 13, 2019

Anyone at all? @lunny? Someone?

@sbrl sbrl mentioned this issue Jan 16, 2019
3 tasks
@barryp

barryp commented Jan 21, 2019

I'm getting a similar thing sort of randomly

Delivery: Post http://www.home:9100/hooks/deploy-site: read tcp 10.0.0.107:12078->10.0.0.101:9100: i/o timeout

If I try a Test Delivery, it usually works fine, although that could be because the first invocation already did some time-consuming downloads - and the second invocation can complete much quicker. I'm using the same webhook @sbrl is using, although not behind nginx. The script it invokes can take maybe 5-15 seconds, perhaps even longer. It'd be nice to know what the timeout was, and even better if it could be configurable.
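If the receiving script really does take 5-15 seconds, one way to stay under any delivery timeout is for the receiver to acknowledge the request straight away and do the slow work afterwards. The following is only a sketch of that idea using Python's standard library, not how the webhook tool mentioned above works; `make_handler`, `serve` and the `job` callback are hypothetical names.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(job):
    """Build a request handler class that runs job(payload) after replying."""

    class HookHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            payload = self.rfile.read(length)
            # Reply first so the sender is not kept waiting on the slow job.
            self.send_response(202)
            self.send_header("Content-Length", "0")
            self.end_headers()
            # Hand the slow work to a background thread.
            threading.Thread(target=job, args=(payload,), daemon=True).start()

        def log_message(self, fmt, *args):
            pass  # keep the example quiet

    return HookHandler


def serve(job, host="127.0.0.1", port=0):
    """Start the receiver in a background thread and return the server object."""
    server = ThreadingHTTPServer((host, port), make_handler(job))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The trade-off is that the sender only learns that the request was accepted (202), not whether the job later succeeded, so failures have to be reported some other way.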

@sbrl
Author

sbrl commented Jan 23, 2019

I think the timeout is already configurable, @barryp. It's this in custom/app.ini:

[webhook]
DELIVER_TIMEOUT = 60

...60 is the value I have set. If I try the "test" delivery, it works some of the time. If I try it manually with curl from the same host, it always works. I think it's definitely a bug in Gitea.

@sbrl
Author

sbrl commented Jan 23, 2019

I can confirm that this is still an issue with the latest v1.7.0 release of Gitea.

@sbrl
Author

sbrl commented Feb 4, 2019

Furthermore, I have a tcpdump capture that demonstrates the issue. It's an HTTPS connection, but hopefully it still proves useful. Contact me (my email address is bugs at starbeamrainbowlabs dot com) to obtain said dump.

(Note to self: It's in my downloads)

@lunny
Member

lunny commented Feb 5, 2019

I cannot reproduce this on try.gitea.io or on my own instances.

@sbrl
Author

sbrl commented Feb 6, 2019

@lunny Oh dear. That's awkward. Do you have an email address I can send this packet capture to, if you think it'd help?

If further details are needed, I could try doing it over regular http. If you've got any other suggestions as to what could potentially be the cause, I'd love to hear it and I'll investigate.

I'm currently investigating using the git post-receive hook instead and writing a bash script as a workaround.

@joyqat

joyqat commented Mar 1, 2019

Have the same problem
image
It might be a network problem. Does Gitea have a "retry when failed" config?

@lunny
Member

lunny commented Mar 1, 2019

@sbrl You can send me a message on Discord.

@stale

stale bot commented Apr 30, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs during the next 2 weeks. Thank you for your contributions.

@stale stale bot added the issue/stale label Apr 30, 2019
@sbrl
Author

sbrl commented May 1, 2019

Looks like I have issues with GitHub sending Webhooks to the same server, so I think it's probably a network issue - or something else on my end. Thanks for investigating though!

@sbrl sbrl closed this as completed May 1, 2019
@c0h1b4

c0h1b4 commented Aug 5, 2019

I am having this same issue on a Kubernetes deployment of Gitea with Drone.io.

Randomly Delivery: Post https://xxxxx/hook?...: dial tcp: i/o timeout

Sometimes it goes smoothly, sometimes it stalls. Then if I press "Test Delivery" it goes through. An automatic retry with X number of tries would probably fix it.

@lafriks
Member

lafriks commented Aug 5, 2019

That could also be caused by network connectivity issues or DNS problems.
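One rough way to check this from the Gitea host is to time name resolution and the TCP connect separately. The sketch below uses only Python's `socket` module; `probe` is a hypothetical helper name, and note that `socket.getaddrinfo` has no timeout parameter of its own, so only the connect phase is bounded by `timeout`.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Time DNS resolution and TCP connect separately.

    Returns a dict with the seconds spent in each completed phase, plus an
    "error" key naming the phase that failed, if any.
    """
    result = {}

    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return {"error": f"dns: {exc}"}
    result["dns_seconds"] = time.monotonic() - start

    # Try only the first address returned, to keep the sketch short.
    family, socktype, proto, _, sockaddr = infos[0]
    sock = socket.socket(family, socktype, proto)
    sock.settimeout(timeout)
    start = time.monotonic()
    try:
        sock.connect(sockaddr)
        result["connect_seconds"] = time.monotonic() - start
    except OSError as exc:  # includes socket timeouts and refused connections
        result["error"] = f"connect: {exc}"
    finally:
        sock.close()
    return result


if __name__ == "__main__":
    # Placeholder target; substitute the webhook host and port being debugged.
    print(probe("localhost", 80))
```

A slow or failing first phase points at DNS, a slow or failing second phase at connectivity; if both are fast, the delay is more likely in the receiving application.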

@c0h1b4

c0h1b4 commented Aug 6, 2019

No. The issue here is that I'm using a small ARM cluster with Kubernetes. Sometimes the delay is > 5 secs. I fixed it by increasing the webhook timeout to 30 secs. But I believe that an automatic retry with a configurable number of tries would be very welcome.

@sbrl
Author

sbrl commented Aug 7, 2019

What a coincidence, @c0h1b4! My situation was something like this:

Gitea server ----> [ Internet ] -----> Home router ------> Raspberry Pi

If you're having issues with ARM, maybe it's a bug in the Linux networking stack for ARM? I've no clue how to start debugging that though.

I'm pretty sure it's a network issue and nothing to do with Gitea. I know this because I set up a GitHub webhook that pointed to the same place, and it also failed randomly in the exact same fashion.

I ended up writing a PHP proxy script that does the automatic retry that I run on a more stable box in the cloud (incidentally, this is the same box that has the Gitea server running). Said script looks a bit like this: https://pastebin.com/Un9B01s1
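For anyone who wants the same retry idea without PHP, here is a rough Python sketch. It is not a port of the script linked above; `post_once` and `deliver_with_retry` are names made up for this example, and the doubling delay is just one possible choice.

```python
import time
import urllib.request


def post_once(url, body, timeout=10.0):
    """POST body (bytes) to url and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def deliver_with_retry(send, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or the attempts are used up.

    Any OSError is retried. That covers socket timeouts and
    urllib.error.URLError, including the HTTPError raised for 4xx/5xx
    responses. The delay doubles after each failed attempt.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return send()
        except OSError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

A proxy built on this would receive the webhook, answer the sender straight away, and then call something like `deliver_with_retry(lambda: post_once(target_url, payload))`, where `target_url` and `payload` stand for the real destination and request body.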

Very curious indeed.

@src386

src386 commented Oct 14, 2019

I think I have the same issue, in an amd64 Docker environment.
Test delivery never worked (i/o timeout) but using curl with the same url works.
Tested both https (via traefik) and http (direct container to container).

Adding the DELIVER_TIMEOUT = 60 option slightly improves the situation: now I get HTTP 500, which is at least something.

@sbrl
Author

sbrl commented Oct 15, 2019

Have you tried it with GitHub too, @src386?

@deanpcmad

deanpcmad commented Oct 27, 2019

I'm also having this issue and I've tried multiple ways of trying to fix it...

  • Gitea in a docker container at home
  • Gitea running normally at home
  • Gitea on Docker on a VPS
  • Gitea running normally on a VPS
  • Both a clean database and imported database

I get the same i/o timeout issue no matter what I do. Of course webhooks are quite important, especially with CI/CD. I was hoping to move to Gitea this weekend but after spending hours trying to get this to work, I'm not sure now lol

Edit: using the latest stable version of Gitea btw which is 1.9.4

Another edit an hour later: I've sorted it. Turns out Drone checks if there's a .drone.yml file and if there isn't one, it won't allow a webhook? Very odd but glad I've sorted it!

@c0h1b4

c0h1b4 commented Oct 28, 2019

Did you try playing with the DELIVER_TIMEOUT parameter in the config? It helped me fix this issue.

@sbrl
Author

sbrl commented Oct 29, 2019

Many root causes, 1 symptom.

@Veitor

Veitor commented Mar 3, 2020

Same problem.

  • I have set DELIVER_TIMEOUT=30 and deliveries still time out.
  • Running curl <webhook URL> inside my Gitea container is very fast and works without problems.
  • Testing the webhook from another repository works. Both repositories use the same webhook URL, which is my Drone CI address.

How can it time out?

(three screenshots attached)

@Veitor

Veitor commented Mar 3, 2020

OK, I have solved this problem.
My Gitea repository's webhook points at my Drone URL. The catch was that in Drone, under the repository's Settings - Main - Configuration, the YAML file is specified as .drone.yml, while the file in my git repository was named drone.yml!
It worked after I renamed the file... :)

However, this should not have caused a timeout error; the network has no problem!

@deanpcmad

^ Yeah my thoughts too

@sbrl
Author

sbrl commented Mar 6, 2020

Glad you fixed it! :D

Although for future reference if you see these symptoms, it's almost certainly the particular issue I experienced here.

@ryanlelek

This thread helped.

  • increased DELIVER_TIMEOUT to 30 (or 60) but still had issues
  • curl-ing from the container worked fine
  • adding .drone.yml to the repo fixed the problem (if your webhook destination is Drone)

Not scientific, just sharing my vague experience with others who find this via search :)

@go-gitea go-gitea locked and limited conversation to collaborators Nov 24, 2020