OOM caused by numerous crawls #33520
Comments
Could you download a diagnosis report from "admin panel -> monitor -> trace" when the memory goes high? The report contains a heap dump (no sensitive data) and could help locate the problem.
Here is the system notice: it shows an inconsistent system status, as I mentioned earlier in #33311.
If the memory usage is not related to the Gitea process, then maybe you need to figure out which process consumes that memory, for example: a git process, or some other command?
I can't capture the memory usage via the trace admin panel when it spikes, because every time memory consumption goes high (e.g., 7 GiB), the pod is OOM-killed by Kubernetes before I can capture anything.
Is it clear which process consumes that much memory? The Gitea web server process itself, or other processes like "ssh", "git", or "gitea serve/hook"?
If the OOM is caused by crawls, then it isn't a regression: each request consumes memory, some large repos/files consume more, and if there are a lot of requests, they do consume a lot of memory and will eventually lead to OOM. Maybe you could try to stop the crawls and/or require sign-in for your instance. So I think we need to make the problem clearer.
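A minimal sketch of what requiring sign-in could look like, assuming the standard `[service]` section of Gitea's `app.ini`; this is an illustration of the suggestion above, not configuration quoted from the thread:

```ini
; app.ini (sketch): require authentication to view any page, which keeps
; anonymous crawlers from triggering expensive repository views.
[service]
REQUIRE_SIGNIN_VIEW = true
; Optionally also disable self-registration so access stays restricted.
DISABLE_REGISTRATION = true
```

Requiring sign-in is the bluntest option: it also blocks legitimate search-engine indexing, which may or may not be acceptable for a public mirror.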
Most likely, it's from Git because the stack trace shows this: When there are many requests, such as GET requests to view repositories from crawls, memory consumption goes high, and it crashes due to being OOM killed by Kubernetes. |
Also, right now I've rolled back to version 1.23.1 and reduced the cache for last commit messages from 10K to 5K in the configuration.
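For reference, a hedged sketch of what such a change might look like in `app.ini`. The exact key was not stated above; the assumption here is that the value being reduced is `COMMITS_COUNT` under `[cache.last_commit]`:

```ini
; app.ini (sketch): last-commit cache tuning. Mapping the "10K -> 5K" change
; above to COMMITS_COUNT is an assumption; verify against your own setup.
[cache.last_commit]
ENABLED = true
ITEM_TTL = 8760h
COMMITS_COUNT = 5000
```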
TBH, I do not see a related change between 1.23.1 and 1.23.3.
Well, it worked fine for me previously, with over a month of uptime without crashing due to high memory consumption. And now, after rolling back, it still crashes.
Well, as I said above: it can't be a regression and it can't be related to the new version; there are just more crawls now. If you do not have enough resources to support the crawls, maybe you need to block them.
For now, I've enabled it. Blocking these crawls by IP is ineffective because their IPs frequently change.
The problem was solved by blocking their ASNs, which are likely used for abusive AI training (e.g., Facebook Inc. (Meta), Amazon (AWS)). Now only crawlers from Google, used for indexing in their search engine, are allowed via Kubernetes Ingress Nginx. However, I believe it would be beneficial to expand the admin panel with additional features to block crawlers based on IP, User-Agent, and ASN. This would help prevent the high memory consumption, likely due to memory leaks, which can cause crashes.
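A sketch of one related way to filter crawlers at the ingress layer, assuming ingress-nginx and a User-Agent based rule. The thread's actual fix was ASN blocking upstream; the hostname, service name, and bot patterns below are placeholders, not taken from the thread, and snippet annotations may need to be enabled in the ingress-nginx controller:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: gitea
  annotations:
    # Reject requests whose User-Agent matches suspected AI-training crawlers.
    # The pattern list is illustrative only; maintain your own.
    nginx.ingress.kubernetes.io/server-snippet: |
      if ($http_user_agent ~* "(meta-externalagent|facebookexternalhit|Amazonbot|GPTBot|CCBot)") {
        return 403;
      }
spec:
  ingressClassName: nginx
  rules:
    - host: git.example.com            # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: gitea-http       # placeholder service
                port:
                  number: 3000
```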
Proof that blocking the bad crawlers used by Facebook Inc. (Meta) and Amazon (AWS) for AI training has effectively solved the memory usage issue; the instance was previously being abused excessively for profit. Note: memory usage has returned to normal, even with legitimate crawlers such as Google Search and other SEO bots, unlike the abusive AI-training crawlers from large companies such as Facebook Inc. (Meta) and Amazon (AWS).
@wxiaoguang I've resolved this problem by increasing the Redis cache pool size to 500 and switching the session storage from files to Redis with the same pool size of 500, for a total pool size of 1000.
The stats:
However, this solution is only temporary because, without Redis, memory consumption becomes excessive.
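A hedged sketch of that kind of setup in `app.ini`, using Gitea's Redis connection-string format; the Redis host, port, and database below are placeholders, not values from the thread:

```ini
; app.ini (sketch): move cache and sessions to Redis with a larger pool.
[cache]
ADAPTER = redis
HOST = redis://redis.gitea.svc.cluster.local:6379/0?pool_size=500&idle_timeout=180s

[session]
PROVIDER = redis
PROVIDER_CONFIG = redis://redis.gitea.svc.cluster.local:6379/0?pool_size=500&idle_timeout=180s
```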
@wxiaoguang, I've been trying that configuration option, but it seems similar to
For "open source public site", my proposal is I don't run a public site, so I can't comment too much for this problem. |
@wxiaoguang, I run a public site primarily for mirroring repositories. Also, the implementation of #33951 could indeed help reduce resource usage; it's quite similar to a rate limiter, which would be beneficial for managing resource consumption effectively.
#33951 has been merged; does it work for your case?
@wxiaoguang I haven't tried it yet. My Git site is running Gitea 1.23.7, not the nightly build, as I prefer long-term stability since it runs on k8s.
Description
In the latest versions, 1.23.2 and 1.23.3, memory leaks occur. (Update: see below; it is not a memory leak and not a regression.) These OOMs are caused by numerous crawlers, such as those used by Facebook Inc. (Meta), Amazon (AWS), and other entities that fetch data excessively for AI training.
My Gitea self-hosted configuration:
Screenshots
The logs exemplify how these companies use crawlers for their AI training.
Essentially, memory consumption balloons when there are many fetch requests, leading to crashes due to excessive memory usage (courtesy of the Kubernetes OOM killer).