
bug: fetch gets stuck very frequently #1568


Closed
shmerl opened this issue Jun 27, 2024 · 63 comments
Labels
bug Something isn't working

Comments

@shmerl

shmerl commented Jun 27, 2024

Did you check docs and existing issues?

  • I have read all the lazy.nvim docs
  • I have searched the existing issues of lazy.nvim
  • I have searched the existing issues of plugins related to this issue

Neovim version (nvim -v)

NVIM v0.10.0
Build type: Release
LuaJIT 2.1.1713484068

Operating system/version

Debian testing Linux

Other

  • git: 2.43.0

Describe the bug

Since recent updates to lazy.nvim (main branch), when checking for plugin updates it often doesn't finish the fetch commands, which looks like this:

For a while:
[screenshot]

And then:
[screenshot]

Steps To Reproduce

Just try to run :Lazy check

Expected Behavior

Fetch should complete in reasonable time.

Repro

-- DO NOT change the paths and don't remove the colorscheme
local root = vim.fn.fnamemodify("./.repro", ":p")

-- set stdpaths to use .repro
for _, name in ipairs({ "config", "data", "state", "cache" }) do
  vim.env[("XDG_%s_HOME"):format(name:upper())] = root .. "/" .. name
end

-- bootstrap lazy
local lazypath = root .. "/plugins/lazy.nvim"
if not vim.loop.fs_stat(lazypath) then
  vim.fn.system({ "git", "clone", "--filter=blob:none", "https://github.com/folke/lazy.nvim.git", lazypath, })
end
vim.opt.runtimepath:prepend(lazypath)

-- install plugins
local plugins = {
  "folke/tokyonight.nvim",
  -- add any other plugins here
}
require("lazy").setup(plugins, {
  root = root .. "/plugins",
})

vim.cmd.colorscheme("tokyonight")
-- add anything else here
shmerl added the bug (Something isn't working) label Jun 27, 2024
shmerl changed the title from "bug: feth gets stuck very freqently" to "bug: fetch gets stuck very freqently" Jun 27, 2024
@b0ae989c

Cannot reproduce. I have 78 plugins and git fetch never fails. Is this only related to nvim-dap?

@shmerl
Author

shmerl commented Jun 27, 2024

No, it happens randomly for any plugin, or for several of them.

@Shougo

Shougo commented Jun 27, 2024

Hm... If you disable luarocks support, the issue may be gone.
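
For reference, a minimal sketch of what disabling luarocks support can look like in the lazy.nvim setup; the option names used here (rocks.enabled, pkg.sources) are assumed from the lazy.nvim docs of that time and mirror the config shmerl posts below:

-- Hedged sketch, not an official recommendation: turn off luarocks handling.
require("lazy").setup("plugins", { -- "plugins" is a placeholder spec module
  rocks = {
    enabled = false, -- skip luarocks/hererocks entirely
  },
  pkg = {
    sources = { "lazy" }, -- only read lazy plugin specs, ignore rockspec/packspec
  },
})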

@shmerl
Author

shmerl commented Jun 27, 2024

I think I already disabled it. My config looks like this:

require("lazy").setup({
   import = "plugin_specs", -- see plugin specs in lua/plugin_specs
},
-- lazy.nvim configuration
{
   lockfile = vim.fn.stdpath("data") .. "/lazy/lazy-lock.json", -- lockfile generated after running update.
   pkg = {
      sources = { "lazy" } -- use only lazy specs
   },
   ui = {
...

@dhoizner

I've also been running into this today, with random plugins failing to fetch during update/check/etc. It's almost never the same plugin, and triggering an update on that plugin individually succeeds.

@davidosomething

Same issue; mine gets stuck at fetch:
[screenshot]

@folke
Owner

folke commented Jun 27, 2024

Did you previously have to configure concurrency?
If so: for a couple of hours yesterday, that feature no longer worked and all git processes were spawned at the same time (which works fine on my machine, but can be too much for others).

That feature has since been fixed and works again.
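
For anyone following along, a minimal sketch of what "configure concurrency" means here; the value 4 is arbitrary, and the option names match the configs posted later in this thread:

-- Sketch: cap how many git tasks lazy.nvim spawns in parallel.
require("lazy").setup("plugins", { -- "plugins" is a placeholder spec module
  concurrency = 4, -- limit for :Lazy sync/update/check tasks
  checker = {
    enabled = true,
    concurrency = 4, -- limit for the background update checker
  },
})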

@folke
Owner

folke commented Jun 27, 2024

On Windows, concurrency is set automatically, so for people who had those failures yesterday, that was to be expected.

@afulki

afulki commented Jun 27, 2024

macOS; this issue started for me yesterday, with random plugins:

  Failed (4)
    ○ indent-blankline.nvim  LazyFile 
        fatal: unable to access 'https://github.com/lukas-reineke/indent-blankline.nvim.git/': Failed to connect to github.com port 443 after 75009 ms: Couldn't connect to server
    ● LuaSnip 11.62ms  start
        fatal: unable to access 'https://github.com/L3MON4D3/LuaSnip.git/': Failed to connect to github.com port 443 after 75003 ms: Couldn't connect to server
    ● telescope-file-browser.nvim 12.92ms  start
        fatal: unable to access 'https://github.com/nvim-telescope/telescope-file-browser.nvim.git/': Failed to connect to github.com port 443 after 75002 ms: Couldn't connect to server
    ● telescope-fzf-native.nvim 6.88ms  telescope.nvim
        fatal: unable to access 'https://github.com/nvim-telescope/telescope-fzf-native.nvim.git/': Failed to connect to github.com port 443 after 75003 ms: Couldn't connect to server

@shmerl
Author

shmerl commented Jun 27, 2024

No, I didn't set concurrency, so I assume it's pretty much the default. It's still not working today, so the issue isn't fixed (Linux).

@shmerl
Author

shmerl commented Jun 27, 2024

Looking at processes, I see a bunch of git-remote-https processes hanging around even after I exit nvim. That's probably related.

@folke
Owner

folke commented Jun 27, 2024

Nothing changed related to that. The code is more efficient now, though, so it may spawn more tasks concurrently than before.
Just configure concurrency and you should be good.

@afulki that's clearly an OS issue, not lazy's fault. Set concurrency to a lower number.

folke closed this as not planned (won't fix, can't repro, duplicate, stale) Jun 27, 2024
@davidosomething

davidosomething commented Jun 27, 2024

I changed concurrency at the root level and in checker = { concurrency } to 1:

[screenshot]

It still gets stuck on one fetch, haha, so that didn't fix it.

require("lazy").setup("dko.plugins", {
  change_detection = {
    enabled = false,
  },
  checker = {
    concurrency = 1,
    -- needed to get the output of require("lazy.status").updates()
    enabled = true,
    -- get a notification when new updates are found?
    notify = false,
  },
  concurrency = 1,
  dev = {
    fallback = true,
    patterns = { "davidosomething" },
  },
  ui = { border = require("dko.settings").get("border") },
  performance = {
    rtp = {
      disabled_plugins = {
        "gzip",
        "matchit",
        "matchparen", -- vim-matchup will re-load this anyway
        "netrwPlugin",
        "tarPlugin",
        "tohtml",
        "tutor",
        "zipPlugin",
      },
    },
  },
})

@afulki

afulki commented Jun 27, 2024

@folke That would be fine, except I've not changed concurrency (fairly default LazyVim install). This has been working for over a year with no issues until yesterday or the day before.

@folke
Owner

folke commented Jun 27, 2024

@afulki like I said, the new lazy.nvim is more efficient, so can run more stuff in parallel. Whether you like it or not, you'll have to set concurrency or fix your network settings.

@folke
Owner

folke commented Jun 27, 2024

@davidosomething and the git process is still running?

@davidosomething

It gets killed eventually because of the timeout:

  Failed (1)
    ○ markdown.nvim  markdown 
        Process was killed because it reached the timeout

but usually it looks like this:
[screenshot]

@folke
Owner

folke commented Jun 27, 2024

That all looks fine. It's just a git command that I'm spawning.
I really don't see how I could be doing something wrong there.
Might be GitHub that's acting up.

@afulki

afulki commented Jun 27, 2024

I increased the number of ports using the command:

sudo sysctl -w net.inet.ip.portrange.first=32768

which doubles the number of ephemeral ports available on macOS, from 16k to 32k.

Then I tried again; these are the git processes running:
[screenshot]

@shmerl
Author

shmerl commented Jun 27, 2024

I just set concurrency to 1 - it still gets stuck. So something else is going on.

@dhoizner

Might have been an ephemeral thing for me; I was running into this last night, but everything is fine this morning (no concurrency set, running macOS 14.5).

@shmerl
Author

shmerl commented Jun 27, 2024

In my case these linger around even with concurrency 1:

user    793134  0.0  0.0  10464  4436 ?        S    10:21   0:00 git fetch --recurse-submodules --tags --force --progress
user    793135  0.0  0.0  10452  3368 ?        S    10:21   0:00 /usr/lib/git-core/git remote-https origin https://github.com/preservim/tagbar.git
user    793136  0.0  0.0 100048 11332 ?        S    10:21   0:00 /usr/lib/git-core/git-remote-https origin https://github.com/preservim/tagbar.git

@afulki

afulki commented Jun 27, 2024

It isn't just this; I tried a brew update, and it's having issues too:

fatal: unable to access 'https://github.com/Homebrew/brew/': Failed to connect to github.com port 443 after 75021 ms: Couldn't connect to server

@folke thanks for the assistance, I will dig deeper

@shmerl
Author

shmerl commented Jun 27, 2024

Could it be that GitHub started blocking requests?

@shmerl
Author

shmerl commented Jun 27, 2024

Setting a low timeout sort of works around this a bit.

@shmerl
Author

shmerl commented Jun 27, 2024

The problem, though, is that git processes keep hanging after nvim exits. So lazy.nvim fails to kill them despite saying that it did. That is a bug.

@folke
Owner

folke commented Jun 27, 2024

It's a sub-process that is not detached, so it should be killed by the OS anyway when Neovim exits.

@shmerl
Author

shmerl commented Jun 27, 2024

Well, I set concurrency to 1 and this still happens. So maybe GitHub has some weird dynamic blocking once it detects something?

@shmerl
Author

shmerl commented Jun 27, 2024

To troubleshoot, we'd need some non-GitHub setup to see whether this is reproducible or not.

@shmerl
Author

shmerl commented Jun 27, 2024

OK, setting the checker concurrency to 1 seems to improve things. It still gets stuck if you re-run it a bunch of times in a row, but way less frequently.

   checker = {
      concurrency = 1
   },

I suspect GitHub might have some anti-bot mechanism that's oversensitive.

@folke
Owner

folke commented Jun 27, 2024

Yeah, probably more like max X requests per hour, so concurrency might not be that relevant.

@folke
Owner

folke commented Jun 27, 2024

I wonder what the best option for lazy is here. It's annoying that GitHub doesn't publish their dynamic rate-limiting policies for plain git requests. All I could find is that normally there are no rate limits, but requests do get stalled on purpose when your host is causing degraded performance on a given GitHub server.
So that's why you see the git requests hanging.

@shmerl
Author

shmerl commented Jun 27, 2024

Is there a way to specify a short timeout in the git request to break that stall on the client side? Failures are better than hanging waits.

@folke
Owner

folke commented Jun 27, 2024

@folke
Owner

folke commented Jun 27, 2024

You could lower the lazy git timeout, though.
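
A hedged sketch of that, assuming the git.timeout option (in seconds, default 120) from the lazy.nvim docs; 30 is an arbitrary example, not a recommendation:

-- Sketch: kill stalled git processes sooner instead of waiting out the default timeout.
require("lazy").setup("plugins", { -- "plugins" is a placeholder spec module
  git = {
    timeout = 30, -- seconds before lazy kills a hanging git task
  },
})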

@shmerl
Author

shmerl commented Jun 27, 2024

I tried that, but the I/O blocking still makes it hang. I.e. I want git itself to abort sooner if the server is doing something weird.

@folke
Owner

folke commented Jun 27, 2024

I could change the lazy code to use SIGKILL instead of SIGINT, but that's not really pretty.
Or kill it in two steps: first try SIGINT, then a couple of seconds later SIGKILL.
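
For illustration only (this is not lazy.nvim's actual code), a minimal sketch of that two-step escalation using Neovim's libuv bindings; the 5-second grace period and the plugin path are made up:

-- Sketch: interrupt a spawned git fetch with SIGINT, escalate to SIGKILL later.
local uv = vim.uv or vim.loop

local handle
handle = uv.spawn("git", {
  args = { "fetch", "--tags", "--force", "--progress" },
  cwd = vim.fn.stdpath("data") .. "/lazy/tagbar", -- hypothetical plugin dir
}, function(code, signal)
  if handle and not handle:is_closing() then
    handle:close()
  end
end)

local function stop(proc)
  proc:kill("sigint") -- polite interrupt first
  local timer = uv.new_timer()
  timer:start(5000, 0, function() -- after the grace period, force-kill if still alive
    timer:stop()
    timer:close()
    if not proc:is_closing() then
      proc:kill("sigkill")
    end
  end)
end

-- stop(handle) would run when the task times out or Neovim exits.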

@shmerl
Author

shmerl commented Jun 27, 2024

Yeah, I saw that article. I guess there is no easy way. Could be a missing git feature.

@shmerl
Author

shmerl commented Jun 27, 2024

Hmm

If the data rate of an HTTP operation is lower than GIT_HTTP_LOW_SPEED_LIMIT bytes per second
for longer than GIT_HTTP_LOW_SPEED_TIME seconds, Git will abort that operation. These values
override the http.lowSpeedLimit and http.lowSpeedTime configuration values.

https://git-scm.com/book/en/v2/Git-Internals-Environment-Variables
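
One way to experiment with that from a Neovim config, as a sketch: git child processes spawned by lazy.nvim inherit Neovim's environment, so setting these before lazy runs should apply to its fetches. The thresholds below are arbitrary examples, not recommended values; the same thing can be set persistently with the http.lowSpeedLimit and http.lowSpeedTime git config options mentioned above.

-- Abort git HTTP transfers that drop below 1000 bytes/s for 10 seconds,
-- instead of letting them hang until lazy's own timeout kicks in.
vim.env.GIT_HTTP_LOW_SPEED_LIMIT = "1000" -- bytes per second
vim.env.GIT_HTTP_LOW_SPEED_TIME = "10"    -- seconds below the limit before aborting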

@folke
Owner

folke commented Jun 27, 2024

Can you also check your limits on open file descriptors, max connections, sockets, etc.?

I have all of those raised, so that's maybe why I'm not seeing issues.

I have around 78 plugins.

@shmerl
Author

shmerl commented Jun 27, 2024

I don't think I touched those. Which kernel variables did you increase?

@folke
Owner

folke commented Jun 27, 2024

You can check your limits with ulimit -a.

Your current values are in /proc.

You can find lots of guides online that can help you.

If you did not raise your limits, that may also cause issues.

I'm not home right now.

@shmerl
Author

shmerl commented Jun 27, 2024

I'll take a look at the limits a bit later, but just checking what happens in git when it's stuck that way, it is blocked on I/O like I thought:

strace: Process 839108 attached
read(7

@davepgreene

I'm running into this issue as well, including orphaned git processes.

@shmerl
Author

shmerl commented Jun 28, 2024

So, I checked my limits for open file descriptors:

ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 253357
max locked memory           (kbytes, -l) 8117204
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 253357
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

ulimit -aH
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) unlimited
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 253357
max locked memory           (kbytes, -l) 8117204
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1073741816
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) unlimited
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 253357
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

cat /proc/sys/fs/file-max
9223372036854775807

I tried increasing the soft limit; that didn't really resolve it.

@shmerl
Author

shmerl commented Jun 28, 2024

Random guess, but maybe it's about initiating requests in parallel at the same time (i.e. GitHub not liking that)? I.e. maybe staggering them with small delays (rather than dispatching them completely serially with no concurrency) could mitigate it?

I can try writing some tests to simulate different scenarios.

@shmerl
Author

shmerl commented Jun 28, 2024

Huh, even a simple script like this gets stuck (sometimes), and it's not even parallelized:

#!/bin/bash

# Serially fetch every plugin in lazy.nvim's data directory.
cd "$HOME/.local/share/nvim/lazy"

dirs=( $(find . -maxdepth 1 -mindepth 1 -xtype d) )

for dir in "${dirs[@]}"; do
   echo "$dir"

   cd "$dir"
   git fetch --recurse-submodules --tags --force --progress
   cd ..
done

I doubt it's related to limits. Something weird is most likely going on on GitHub's side.

@shmerl
Author

shmerl commented Jun 28, 2024

Another weird detail: I just tried lazy checks over a VPN, and it's not getting stuck even with full parallelism! Even more reason to suspect GitHub now.

@jyuan0

jyuan0 commented Jun 28, 2024

Even more reason to suspect GitHub now.

I've been having the same problem, and I'm starting to suspect the same.

@shmerl
Author

shmerl commented Jun 28, 2024

After the recent lazy update, I don't see this issue anymore.

@jyuan0

jyuan0 commented Jun 28, 2024

I hadn't updated lazy yet, but it's not happening for me anymore either.

@davepgreene

Yup, same with me. Oh well 🤷‍♂️

@polyzen
Contributor

polyzen commented Jun 29, 2024

It probably was a GitHub issue indeed. I was also getting a lot of timeouts from the GitHub release/tag Atom feeds I was following; those ended around the time of @shmerl's comment.

@bobhenkel

I've had the same issue over the last week. I'm on Debian 12, behind a VPN if that matters.

@bobhenkel

I set concurrency = 1 and that seems to fix the issue. It still seems like a bug to me; I have 16 threads, so it seems odd that it times out. I'll play more with concurrency, though.
Thanks for the tips!

@kah-ell

kah-ell commented Oct 28, 2024

Had this problem for ages and today I finally got around to trying to fix it - this page was a godsend.

Let's just say that if you have over 200 plugins (yeah, just trying out "a couple" right now) and a Threadripper processor, it's absolutely imperative to set that concurrency limit. GitHub starts to throw handshake errors at around 120 plugin checks with the default settings. "concurrency = 8" seems to work fine.

The Lazy UI looks a lot better now too; I just assumed an endless list of "Fetch" lines was the way it's supposed to be. :)
