OOM caused by numerous crawls #33520

Open
H0llyW00dzZ opened this issue Feb 6, 2025 · 22 comments
Labels
issue/needs-feedback For bugs, we need more details. For features, the feature must be described in more detail

Comments

@H0llyW00dzZ

H0llyW00dzZ commented Feb 6, 2025

Description

In the latest versions, 1.23.2 and 1.23.3, memory leaks occur. (Update: see below; it is not a memory leak and not a regression.)

These OOMs are caused by numerous crawlers, such as those operated by Facebook Inc. (Meta), Amazon (AWS), and other entities that fetch data excessively for AI training.

My Gitea self-hosted configuration (a rough app.ini sketch follows the list):

  • Sessions using files
  • Cache using Redis with a TTL of 5 hours, and the last commit cache is 10K
  • No SSH
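
For reference, a minimal app.ini sketch of the setup described above. The section and key names are standard Gitea options, but the Redis address is a placeholder, and the assumption is that "10K" refers to the last-commit cache's COMMITS_COUNT threshold:

  [session]
  PROVIDER = file

  [cache]
  ADAPTER = redis
  ; placeholder Redis address
  HOST = redis://redis.gitea.svc:6379/0
  ITEM_TTL = 5h

  [cache.last_commit]
  ; assuming "10K" refers to this threshold
  COMMITS_COUNT = 10000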

Screenshots

Image
Image

The logs show how these companies use crawlers for their AI training.

Image

Essentially, when there are many fetch requests, memory consumption spikes and the pod crashes from excessive memory use (it gets OOM-killed by Kubernetes).

@wxiaoguang
Contributor

Could you download a diagnosis report from "admin panel -> monitor -> trace" when the memory goes high?

The report contains a heap dump (no sensitive data) and could help locate the problem.

@H0llyW00dzZ
Author

> Could you download a diagnosis report from "admin panel -> monitor -> trace" when the memory goes high?
> The report contains a heap dump (no sensitive data) and could help locate the problem.

Here is the system notice:

Image

This is the system status; it is inconsistent, as I mentioned earlier in #33311.

Image

@wxiaoguang
Contributor

Could you download a diagnosis report from "admin panel -> monitor -> trace" when the memory goes high?

The report contains a heap dump (no sensitive data) and could help locate the problem.

@wxiaoguang
Contributor

If the memory usage is not related to the Gitea process, then maybe you need to figure out which process consumes that memory: a git process, or some other command?

@H0llyW00dzZ
Author

> Could you download a diagnosis report from "admin panel -> monitor -> trace" when the memory goes high?
> The report contains a heap dump (no sensitive data) and could help locate the problem.

I can't capture the memory usage via the trace admin panel when it spikes, because every time memory consumption goes high (e.g., 7 GiB), the pod is OOM-killed by Kubernetes.
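
A possible workaround for capturing a profile before the OOM kill (a sketch, not something done in this thread): enable Gitea's built-in pprof endpoint with ENABLE_PPROF = true in the [server] section, then snapshot the heap from outside the pod in a loop so the last snapshot survives the kill. The pod name, the pprof listen address (127.0.0.1:6060), and the availability of wget inside the container are assumptions:

  # snapshot the heap every 60 seconds from outside the pod
  while true; do
    kubectl exec -n gitea gitea-5cb7dff998-xwb5r -- \
      wget -qO- http://127.0.0.1:6060/debug/pprof/heap > heap-$(date +%s).pprof
    sleep 60
  done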

@wxiaoguang
Contributor

> Could you download a diagnosis report from "admin panel -> monitor -> trace" when the memory goes high?
> The report contains a heap dump (no sensitive data) and could help locate the problem.

> I can't capture the memory usage via the trace admin panel when it spikes, because every time memory consumption goes high (e.g., 7 GiB), the pod is OOM-killed by Kubernetes.

Is it clear which process consumes that much memory? The Gitea web server process itself, or other processes like "ssh", "git", or "gitea serve/hook"?
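
One quick way to answer this from outside (a sketch; assumes the busybox tools shipped in the stock gitea container image) is to watch per-process memory inside the pod while it climbs:

  # batch-mode top inside the pod: shows whether gitea, git, or ssh processes hold the memory
  kubectl exec -n gitea gitea-5cb7dff998-xwb5r -- top -b -n 1 | head -n 20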

@wxiaoguang
Contributor

wxiaoguang commented Feb 6, 2025

> The logs show how these companies use crawlers for their AI training.
> Essentially, when there are many fetch requests, memory consumption spikes and the pod crashes from excessive memory use (it gets OOM-killed by Kubernetes).

If the OOM is caused by crawlers, then it isn't a regression: each request consumes memory, and large repos/files consume more, so a lot of requests do consume a lot of memory and can lead to OOM. Maybe you could try to stop the crawlers and/or require sign-in for your instance.
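
For the "stop the crawls" option, a minimal robots.txt is the usual first step, although many AI crawlers ignore it; the user-agent names below are examples, and the serving path (custom/public/robots.txt on recent Gitea versions) is an assumption:

  User-agent: meta-externalagent
  User-agent: facebookexternalhit
  User-agent: Amazonbot
  Disallow: /

  User-agent: *
  Disallow: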

So I think we need to make the problem clearer.

@wxiaoguang changed the title from "Memory Leaks in Versions 1.23.2 and 1.23.3" to "OOM caused by numerous crawls" on Feb 6, 2025
@H0llyW00dzZ
Author

H0llyW00dzZ commented Feb 6, 2025

> Could you download a diagnosis report from "admin panel -> monitor -> trace" when the memory goes high?
> The report contains a heap dump (no sensitive data) and could help locate the problem.

> I can't capture the memory usage via the trace admin panel when it spikes, because every time memory consumption goes high (e.g., 7 GiB), the pod is OOM-killed by Kubernetes.

> Is it clear which process consumes that much memory? The Gitea web server process itself, or other processes like "ssh", "git", or "gitea serve/hook"?

Most likely, it's from Git because the stack trace shows this:

Image

Image

Image

When there are many requests, such as GET requests from crawlers viewing repositories, memory consumption goes up and the pod crashes after being OOM-killed by Kubernetes.

@H0llyW00dzZ
Author

Also, right now I've rolled back to version 1.23.1 and reduced the last-commit cache from 10K to 5K in the app.ini configuration. Let's see if it still crashes.

@wxiaoguang
Contributor

TBH, I do not see a related change between 1.23.1 and 1.23.3:

v1.23.1...v1.23.3

@H0llyW00dzZ
Author

> TBH, I do not see a related change between 1.23.1 and 1.23.3:
> v1.23.1...v1.23.3

Well, it worked fine for me previously, with an uptime of over a month without crashing from high memory consumption.

And now, after rolling back, it still crashes.

h0llyw00dzz@ubuntu-pro:~$ kubectl get pods -n gitea
NAME                     READY   STATUS    RESTARTS      AGE
gitea-5cb7dff998-xwb5r   1/1     Running   1 (40s ago)   10m

h0llyw00dzz@ubuntu-pro:~$ kubectl describe pods -n gitea
Containers:
  gitea:
    Container ID:   containerd://866d173132606a07e7937e7dfb430533cf1e5a8ad515044e496486416f6a485c
    Image:          gitea/gitea:1.23.1
    Image ID:       docker.io/gitea/gitea@sha256:c3be67d5c31694f8c27e5f3ab87630cceadf05abb795ab0ed70ba14b5edfc29c
    Port:           3000/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 06 Feb 2025 18:30:03 +0700
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Thu, 06 Feb 2025 18:20:10 +0700
      Finished:     Thu, 06 Feb 2025 18:30:01 +0700

@wxiaoguang
Contributor

Well, as I said above: it can't be a regression, and it can't be related to the new version.

There are just more crawlers now. If you do not have enough resources to handle the crawls, maybe you need to block them.

@H0llyW00dzZ
Author

> Well, as I said above: it can't be a regression, and it can't be related to the new version.
> There are just more crawlers now. If you do not have enough resources to handle the crawls, maybe you need to block them.

For now, I've enabled REQUIRE_SIGNIN_VIEW to block the crawlers used by companies like Facebook (Meta) and Amazon (AWS) for training their AI. It seems they are likely abusing the crawlers for AI purposes.

Blocking these crawlers by IP is ineffective because their IPs frequently change.
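
For reference, this switch lives in the [service] section of app.ini:

  [service]
  REQUIRE_SIGNIN_VIEW = true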

@H0llyW00dzZ
Author

@wxiaoguang

The problem was solved by blocking the ASNs likely used for abusive AI training (e.g., Facebook Inc. (Meta), Amazon (AWS)). Now only crawlers from Google, used for indexing in its search engine, are allowed through Kubernetes Ingress-Nginx. However, I believe it would be beneficial to expand the admin panel with features to block crawlers by IP, User-Agent, and ASN. This would help prevent the high memory consumption that can cause crashes.
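
A sketch of what such blocking can look like at the Ingress-Nginx layer, rejecting requests by User-Agent via the server-snippet annotation. The user-agent list, host, and service name are examples, snippet annotations must be enabled in the controller, and true ASN blocking is usually done one layer up (firewall or CDN), since plain nginx only sees client IPs:

  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: gitea
    namespace: gitea
    annotations:
      nginx.ingress.kubernetes.io/server-snippet: |
        # example AI-crawler User-Agent block list
        if ($http_user_agent ~* "(meta-externalagent|facebookexternalhit|Amazonbot|GPTBot|ClaudeBot)") {
          return 403;
        }
  spec:
    ingressClassName: nginx
    rules:
      - host: git.example.com
        http:
          paths:
            - path: /
              pathType: Prefix
              backend:
                service:
                  name: gitea-http
                  port:
                    number: 3000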

@H0llyW00dzZ
Author

Proof that blocking the bad crawlers used by Facebook Inc. (Meta) and Amazon (AWS) for AI training has effectively solved the memory usage issue; the instance was previously being crawled excessively for profit.

Image

Note

Memory usage has returned to normal, even with legitimate crawlers such as Google Search and other SEO-related bots still allowed, unlike the abusive AI-training crawlers from large companies such as Facebook Inc. (Meta) and Amazon (AWS).

@H0llyW00dzZ
Author

H0llyW00dzZ commented Feb 15, 2025

@wxiaoguang I've resolved this problem by increasing the Redis cache pool size to 500 and switching the session storage from files to Redis, using the same pool size of 500. This results in a total pool size of 1000.

The Stats:

Redis:
Image

Pods:
Image
Image
Image

However, this is only a temporary workaround; without Redis, memory usage still grows to excessive levels.
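
A sketch of the corresponding app.ini changes; the pool_size and idle_timeout parameters follow Gitea's documented Redis connection-string format, and the Redis address is a placeholder (older releases configure the session provider with a key=value string instead of a URI):

  [cache]
  ADAPTER = redis
  HOST = redis://redis.gitea.svc:6379/0?pool_size=500&idle_timeout=180s

  [session]
  PROVIDER = redis
  PROVIDER_CONFIG = redis://redis.gitea.svc:6379/0?pool_size=500&idle_timeout=180s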

@wxiaoguang
Contributor

In 1.23.7, we have this:

Add a config option to block "expensive" pages for anonymous users (#34024) (#34071)

@wxiaoguang added the issue/needs-feedback label on Apr 9, 2025
@H0llyW00dzZ
Author

H0llyW00dzZ commented Apr 9, 2025

> In 1.23.7, we have this:
> Add a config option to block "expensive" pages for anonymous users (#34024) (#34071)

@wxiaoguang, I've been trying that configuration option, but it seems similar to REQUIRE_SIGNIN_VIEW = true, which may not be ideal for open-source repositories. I think it would be more effective to implement a rate limiter based on IP addresses, user agents, or both for the areas that consume a lot of memory (e.g., example.com/repo/commit/sha1commit). This could reduce resource usage, especially since many AI crawlers use the same IPs and user agents when crawling a site.
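
Until something like that exists in Gitea itself, a rough approximation at the Ingress-Nginx layer is per-client rate limiting via the documented limit annotations, added to the same Ingress object sketched earlier (the numbers are placeholders):

  metadata:
    annotations:
      # requests per second allowed per client IP
      nginx.ingress.kubernetes.io/limit-rps: "5"
      # maximum concurrent connections per client IP
      nginx.ingress.kubernetes.io/limit-connections: "10"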

@wxiaoguang removed the issue/needs-feedback label on Apr 9, 2025
@wxiaoguang
Contributor

> which may not be ideal for open-source repositories.

For "open source public site", my proposal is https://github.com/go-gitea/gitea/pull/33951#discussion_r2032324964

I don't run a public site, so I can't comment much on this problem.

@H0llyW00dzZ
Author

> which may not be ideal for open-source repositories.
>
> For "open source public site", my proposal is https://github.com/go-gitea/gitea/pull/33951#discussion_r2032324964
>
> I don't run a public site, so I can't comment much on this problem.

@wxiaoguang, I run a public site primarily for mirroring repositories. Also, the implementation of #33951 could indeed help reduce resource usage; it's quite similar to a rate limiter, which would help manage resource consumption effectively.

@wxiaoguang
Contributor

#33951 has been merged, does it work for your case?

@wxiaoguang added the issue/needs-feedback label on Apr 20, 2025
@H0llyW00dzZ
Author

> #33951 has been merged, does it work for your case?

@wxiaoguang I haven't tried it yet. My git site is running Gitea 1.23.7, not the nightly build, as I prefer long-term stability since it runs on k8s.
