Network flip flop #429

ottuzzi · 2013-11-11T17:19:07Z

Hi,

Since upgrading from the venerable 3.6.11+ kernel, I have noticed a kind of network flip-flop at boot on all newer kernels. Something like this:

[   22.772034] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[   24.537515] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
[   25.279144] bcm2835-cpufreq: switching to governor ondemand
[   25.279174] bcm2835-cpufreq: switching to governor ondemand
[   25.408311] smsc95xx 1-1.1:1.0 eth0: link down
[   27.001218] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
[   27.833748] smsc95xx 1-1.1:1.0 eth0: link down
[   29.450200] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
[   29.647750] Adding 102396k swap on /var/swap.  Priority:-1 extents:1 across:102396k SS
[   30.266133] smsc95xx 1-1.1:1.0 eth0: link down
[   32.122655] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x4DE1
[   32.954562] smsc95xx 1-1.1:1.0 eth0: link down
[   34.571877] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x4DE1
[   35.386973] smsc95xx 1-1.1:1.0 eth0: link down
[   37.003071] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x4DE1
[   37.819097] smsc95xx 1-1.1:1.0 eth0: link down
[   39.453573] smsc95xx 1-1.1:1.0 eth0: link up, 10Mbps, full-duplex, lpa 0x4C61

The network goes up and down many times and then just stabilizes. Not a major issue, just curiosity on my side. Has anyone seen something similar? If I remember correctly, on 3.6.11+ the network did not switch back and forth so many times...

Thanks
Bye
Piero
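
For anyone who wants to quantify the flapping in a boot log like the one above, here is a minimal sketch in Python. It assumes the dmesg line format shown in this post; the helper names `link_events` and `count_flaps` are made up for illustration and are not part of any tool.

```python
import re

# Matches kernel log lines such as:
# [   24.537515] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
# Assumption: timestamp, driver name, device path, then "<iface>: link up|down".
LINK_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+\S+\s+\S+\s+(\w+): link (up|down)")


def link_events(log_text):
    """Return a list of (timestamp, interface, state) for each link up/down line."""
    events = []
    for line in log_text.splitlines():
        match = LINK_RE.search(line)
        if match:
            events.append((float(match.group(1)), match.group(2), match.group(3)))
    return events


def count_flaps(events):
    """Count link-down events, i.e. how often an established link was lost."""
    return sum(1 for _, _, state in events if state == "down")


SAMPLE = """\
[   24.537515] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
[   25.408311] smsc95xx 1-1.1:1.0 eth0: link down
[   27.001218] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0xCDE1
"""

if __name__ == "__main__":
    events = link_events(SAMPLE)
    print(count_flaps(events), "link drop(s) in", len(events), "link events")
```

Feeding it the full output of dmesg instead of SAMPLE gives a quick count to compare between kernels, cables or power supplies.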

ruuns · 2013-12-15T03:48:40Z

I can confirm this issue on my device running the Arch Linux ARM distribution.

[root@alarmpi ~]# uname -a
Linux alarmpi 3.10.24-1-ARCH #1 PREEMPT Fri Dec 13 01:21:41 CST 2013 armv6l GNU/Linux

dmesg returns the following output at boot up:

[    8.834205] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[   10.401782] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
[   46.150249] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[   46.254166] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[   47.812140] smsc95xx 1-1.1:1.0 eth0: link up, 100Mbps, full-duplex, lpa 0x45E1

popcornmix · 2013-12-15T12:11:31Z

Insufficient power supply would cause this behaviour:
http://elinux.org/R-Pi_Troubleshooting#Troubleshooting_power_problems

ruuns · 2013-12-15T15:02:19Z

I checked the voltage between TP1 and TP2 during boot. It stays between 4.75 V and 4.77 V (with and without HDMI/keyboard connected). The power supply could still be the cause, since my current power cable/supply is only a workaround.

I was also puzzled at first, because the issue seems to occur only at startup and then stabilizes (as if it were waiting for all the capacitors to charge fully). I'll try a more proper power supply.

ottuzzi · 2013-12-28T09:46:03Z

Hi there, it looks like in my case the cause was a faulty network connection, not a Raspberry Pi or kernel problem. So, if ruuns agrees, we can close this issue.

Thanks for your work
Bye
Piero

During the recent conversion of cgroup to kernfs, cgroup_tree_mutex which nests above both the kernfs s_active protection and cgroup_mutex is added to synchronize cgroup file type operations as cgroup_mutex needed to be grabbed from some file operations and thus can't be put above s_active protection.

While this arrangement mostly worked for cgroup, this triggered the following lockdep warning.

======================================================
[ INFO: possible circular locking dependency detected ]
3.15.0-rc3-next-20140430-sasha-00016-g4e281fa-dirty #429 Tainted: G W
-------------------------------------------------------
trinity-c173/9024 is trying to acquire lock:
 (blkcg_pol_mutex){+.+.+.}, at: blkcg_reset_stats (include/linux/spinlock.h:328 block/blk-cgroup.c:455)

but task is already holding lock:
 (s_active#89){++++.+}, at: kernfs_fop_write (fs/kernfs/file.c:283)

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (s_active#89){++++.+}:
  lock_acquire (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
  __kernfs_remove (arch/x86/include/asm/atomic.h:27 fs/kernfs/dir.c:352 fs/kernfs/dir.c:1024)
  kernfs_remove_by_name_ns (fs/kernfs/dir.c:1219)
  cgroup_addrm_files (include/linux/kernfs.h:427 kernel/cgroup.c:1074 kernel/cgroup.c:2899)
  cgroup_clear_dir (kernel/cgroup.c:1092 (discriminator 2))
  rebind_subsystems (kernel/cgroup.c:1144)
  cgroup_setup_root (kernel/cgroup.c:1568)
  cgroup_mount (kernel/cgroup.c:1716)
  mount_fs (fs/super.c:1094)
  vfs_kern_mount (fs/namespace.c:899)
  do_mount (fs/namespace.c:2238 fs/namespace.c:2561)
  SyS_mount (fs/namespace.c:2758 fs/namespace.c:2729)
  tracesys (arch/x86/kernel/entry_64.S:746)

-> #1 (cgroup_tree_mutex){+.+.+.}:
  lock_acquire (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
  mutex_lock_nested (kernel/locking/mutex.c:486 kernel/locking/mutex.c:587)
  cgroup_add_cftypes (include/linux/list.h:76 kernel/cgroup.c:3040)
  blkcg_policy_register (block/blk-cgroup.c:1106)
  throtl_init (block/blk-throttle.c:1694)
  do_one_initcall (init/main.c:789)
  kernel_init_freeable (init/main.c:854 init/main.c:863 init/main.c:882 init/main.c:1003)
  kernel_init (init/main.c:935)
  ret_from_fork (arch/x86/kernel/entry_64.S:552)

-> #0 (blkcg_pol_mutex){+.+.+.}:
  __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182)
  lock_acquire (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
  mutex_lock_nested (kernel/locking/mutex.c:486 kernel/locking/mutex.c:587)
  blkcg_reset_stats (include/linux/spinlock.h:328 block/blk-cgroup.c:455)
  cgroup_file_write (kernel/cgroup.c:2714)
  kernfs_fop_write (fs/kernfs/file.c:295)
  vfs_write (fs/read_write.c:532)
  SyS_write (fs/read_write.c:584 fs/read_write.c:576)
  tracesys (arch/x86/kernel/entry_64.S:746)

other info that might help us debug this:

Chain exists of:
  blkcg_pol_mutex --> cgroup_tree_mutex --> s_active#89

Possible unsafe locking scenario:

      CPU0                    CPU1
      ----                    ----
 lock(s_active#89);
                              lock(cgroup_tree_mutex);
                              lock(s_active#89);
 lock(blkcg_pol_mutex);

*** DEADLOCK ***

4 locks held by trinity-c173/9024:
 #0: (&f->f_pos_lock){+.+.+.}, at: __fdget_pos (fs/file.c:714)
 #1: (sb_writers#18){.+.+.+}, at: vfs_write (include/linux/fs.h:2255 fs/read_write.c:530)
 #2: (&of->mutex){+.+.+.}, at: kernfs_fop_write (fs/kernfs/file.c:283)
 #3: (s_active#89){++++.+}, at: kernfs_fop_write (fs/kernfs/file.c:283)

stack backtrace:
CPU: 3 PID: 9024 Comm: trinity-c173 Tainted: G W 3.15.0-rc3-next-20140430-sasha-00016-g4e281fa-dirty #429
 ffffffff919687b0 ffff8805f6373bb8 ffffffff8e52cdbb 0000000000000002
 ffffffff919d8400 ffff8805f6373c08 ffffffff8e51fb88 0000000000000004
 ffff8805f6373c98 ffff8805f6373c08 ffff88061be70d98 ffff88061be70dd0
Call Trace:
 dump_stack (lib/dump_stack.c:52)
 print_circular_bug (kernel/locking/lockdep.c:1216)
 __lock_acquire (kernel/locking/lockdep.c:1840 kernel/locking/lockdep.c:1945 kernel/locking/lockdep.c:2131 kernel/locking/lockdep.c:3182)
 lock_acquire (arch/x86/include/asm/current.h:14 kernel/locking/lockdep.c:3602)
 mutex_lock_nested (kernel/locking/mutex.c:486 kernel/locking/mutex.c:587)
 blkcg_reset_stats (include/linux/spinlock.h:328 block/blk-cgroup.c:455)
 cgroup_file_write (kernel/cgroup.c:2714)
 kernfs_fop_write (fs/kernfs/file.c:295)
 vfs_write (fs/read_write.c:532)
 SyS_write (fs/read_write.c:584 fs/read_write.c:576)

This is a highly unlikely but valid circular dependency between "echo 1 > blkcg.reset_stats" and cfq module [un]loading. cgroup is going through further locking update which will remove this complication but for now let's use trylock on blkcg_pol_mutex and retry the file operation if the trylock fails.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Sasha Levin <[email protected]>
References: http://lkml.kernel.org/g/[email protected]

Andrey reported the following while fuzzing the kernel with syzkaller:

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 0 PID: 3859 Comm: a.out Not tainted 4.9.0-rc6+ #429
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff8800666d4200 task.stack: ffff880067348000
RIP: 0010:[<ffffffff833617ec>] [<ffffffff833617ec>] icmp6_send+0x5fc/0x1e30 net/ipv6/icmp.c:451
RSP: 0018:ffff88006734f2c0 EFLAGS: 00010206
RAX: ffff8800666d4200 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: dffffc0000000000 RDI: 0000000000000018
RBP: ffff88006734f630 R08: ffff880064138418 R09: 0000000000000003
R10: dffffc0000000000 R11: 0000000000000005 R12: 0000000000000000
R13: ffffffff84e7e200 R14: ffff880064138484 R15: ffff8800641383c0
FS: 00007fb3887a07c0(0000) GS:ffff88006cc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 000000006b040000 CR4: 00000000000006f0
Stack:
 ffff8800666d4200 ffff8800666d49f8 ffff8800666d4200 ffffffff84c02460
 ffff8800666d4a1a 1ffff1000ccdaa2f ffff88006734f498 0000000000000046
 ffff88006734f440 ffffffff832f4269 ffff880064ba7456 0000000000000000
Call Trace:
 [<ffffffff83364ddc>] icmpv6_param_prob+0x2c/0x40 net/ipv6/icmp.c:557
 [< inline >] ip6_tlvopt_unknown net/ipv6/exthdrs.c:88
 [<ffffffff83394405>] ip6_parse_tlv+0x555/0x670 net/ipv6/exthdrs.c:157
 [<ffffffff8339a759>] ipv6_parse_hopopts+0x199/0x460 net/ipv6/exthdrs.c:663
 [<ffffffff832ee773>] ipv6_rcv+0xfa3/0x1dc0 net/ipv6/ip6_input.c:191
 ...

icmp6_send / icmpv6_send is invoked for both rx and tx paths. In both cases the dst->dev should be preferred for determining the L3 domain if the dst has been set on the skb. Fallback to the skb->dev if it has not. This covers the case reported here where icmp6_send is invoked on Rx before the route lookup.

Fixes: 5d41ce2 ("net: icmp6_send should use dst dev to determine L3 domain")
Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

onzulinapps · 2017-01-19T11:00:02Z

Hello, I have the same problem:

smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
IPv6: eth0 link is not ready

How can I resolve it? Regards

shelteroperations · 2018-06-04T20:59:23Z

Same problem since maybe January (RPi 3, Raspbian):

smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup

Ideas???

pelwell · 2018-06-04T21:13:11Z

If that's the only error message then there isn't a problem - it's just a statement of fact. Is your Pi3 malfunctioning in some way?

shelteroperations · 2018-06-04T21:16:04Z

It dies every few days. It's a remote machine, so "dying" may just mean that I can't get into it until someone hard-reboots it. Afterwards, /var/log/messages shows this as the last message before I lost access. I just installed ifplugd and then ran sudo update-rc.d ifplugd enable. I don't know, maybe it's a band-aid, but we'll see.

pelwell · 2018-06-04T21:19:11Z

I can't check right now, but I think that message will be displayed every time the interface is brought up, so seeing it repeatedly in the log could be a symptom of some sort of network flakiness.

shelteroperations · 2018-06-04T22:19:46Z

It seems slow to log in. The latest messages in /var/log/messages:

Jun 4 17:09:35 mana31 kernel: [ 23.726264] random: 7 urandom warning(s) missed due to ratelimiting
Jun 4 17:09:37 mana31 kernel: [ 25.914791] r820t 4-001a: destroying instance
Jun 4 17:09:37 mana31 kernel: [ 25.916363] dvb_usb_v2: 'Realtek RTL2832U reference design:1-1.4' successfully deinitialized and disconnected
Jun 4 17:09:47 mana31 kernel: [ 35.813546] r820t 6-001a: destroying instance
Jun 4 17:09:47 mana31 kernel: [ 35.814084] dvb_usb_v2: 'Realtek RTL2832U reference design:1-1.5' successfully deinitialized and disconnected
Jun 4 17:16:08 mana31 kernel: [ 416.443273] device eth0 entered promiscuous mode
Jun 4 17:21:25 mana31 kernel: [ 733.407713] device eth0 left promiscuous mode
Jun 4 18:00:10 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:00:40 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:01:18 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:01:48 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:05:15 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:05:45 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:07:03 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:07:33 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:09:01 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:09:31 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:10:15 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:10:45 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:10:45 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:11:15 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:11:17 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:11:47 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:15:25 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:15:55 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:16:01 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:16:31 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:17:01 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:18:01 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:18:04 mana31 rsyslogd-2007: action 'action 17' suspended, next retry is Mon Jun 4 18:19:04 2018 [try http://www.rsyslog.com/e/2007 ]
Jun 4 18:18:12 mana31 kernel: [ 4140.721808] device eth0 entered promiscuous mode
Jun 4 18:18:22 mana31 kernel: [ 4150.665603] device eth0 left promiscuous mode

JamesH65 · 2018-06-05T07:09:14Z

If you remove the DVB-T adapter, does it work correctly?

shelteroperations · 2018-06-05T12:09:09Z

There is no HDMI plugged in or any video if that's what you're asking. It runs text-only.

JamesH65 · 2018-06-05T12:38:33Z

The log you posted has a reference to a 'Realtek RTL2832U reference design:1-1.4'. That's a DVB-T adaptor. Is it plugged in? Or is it an artefact of a particular kernel build? I'm not sure why it's appearing in the log.

shelteroperations · 2018-06-05T12:41:11Z

Whoops, I read that too quickly and took it for the Ethernet adapter. I have two RF (radio frequency) USB sticks plugged in. But they've been plugged in for over a year with no problems at all, and they are in active use. I don't know why it keeps crashing.

JamesH65 · 2018-06-05T12:45:51Z

It would be worth removing them and seeing if the problem goes away. It could be an interaction between them. We've had something similar before between wireless and onboard Ethernet: they should have been independent, but it turned out they were not under some very specific circumstances.

shelteroperations · 2018-06-05T13:06:00Z

Would sort of defeat the purpose of the Rpi there then. And as I mentioned no problems in over a year. Unless it's hardware failure. Would it help to install a USB hub so all those signals weren't "right next to" the ethernet port?

shelteroperations · 2018-06-05T13:06:56Z

I also recently uninstalled a bunch of packages that, according to forums or wherever I read it, were unnecessary and just eating memory. I don't have a list of them, but is there any chance I uninstalled something I needed? Is there a message that could help us tell whether that is the case?

JamesH65 · 2018-06-05T13:15:07Z

When diagnosing a problem, you reduce the problem space until the problem goes away. We are simply trying to narrow down what the issue might be, and removing possible causes of the problem helps with that.

You might have uninstalled something important; it's impossible to tell. You need to start afresh with a new SD card, and also try with unusual peripherals removed. Then add things back until the problem reappears.

shelteroperations · 2018-06-05T13:27:00Z

This is a true statement. As I noted, though, it's a remote machine that we're using for something. I can try that next time I'm there if nothing else works. It hasn't crashed since I installed ifplugd as mentioned above, but it's only been a day. Even if that turns out to help, it sounds like a band-aid. As you initially noted, it could also be the Ethernet switch it's plugged into. I can't test any of that remotely right now. I was just wondering if anything software-related jumped out at you. Thank you for your kind help.

pelwell · 2018-06-06T14:08:09Z

It seems you can safely ignore (or even suppress) the "action 17" messages: https://raspberrypi.stackexchange.com/questions/47781/what-is-action-17

shelteroperations · 2018-06-06T22:39:55Z

Thanks.

iiv3 · 2018-12-15T15:25:02Z

After upgrading to "testing" I hit a serious problem: the RPi would obtain an IP address through DHCP but then bring Ethernet down, sometimes multiple times, after which it would not respond, even to disconnecting and reconnecting the LAN cable.

The logs also showed a bunch of "hardware isn't capable of remote wakeup" messages.

After numerous tries I solved the problem by removing "avahi-daemon".

I suspect that the above message is related to the LAN driver not being able to handle multicast properly and doing something wrong with the LAN card.

So, if you have a similar problem, avoid everything that might be using multicast.

Just to be clear, multicast is when a network uses group addresses in the IP range 224.0.0.0 to 239.255.255.255. Besides "avahi", it could be used by "upnp" and even "ntp" stuff.
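
To check whether an address seen in a packet capture or a service config falls in that range, a small sketch using Python's standard ipaddress module may help. The helper name `is_ipv4_multicast` is just for illustration; the two example group addresses are the well-known ones for mDNS (what Avahi uses) and SSDP (what UPnP discovery uses).

```python
import ipaddress


def is_ipv4_multicast(addr):
    """Return True if addr is an IPv4 address in 224.0.0.0 - 239.255.255.255."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and ip.is_multicast


if __name__ == "__main__":
    # 224.0.0.251 is the mDNS group, 239.255.255.250 is the SSDP group,
    # 192.168.1.10 is an ordinary unicast address used here as a counterexample.
    for addr in ["224.0.0.251", "239.255.255.250", "192.168.1.10"]:
        print(addr, is_ipv4_multicast(addr))
```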

lamazze · 2021-09-21T19:07:34Z

After upgrading my Raspberry Pi 3B from Jessie to Buster, I started experiencing "wlan0: carrier lost" problems once a day.
I tried different solutions, but stopping the avahi service apparently solved the problem. Thanks for the suggestion!
Edit: in the end, disabling wicd, avahi-daemon and networking solved all the wifi connection problems.

ghost assigned P33M Nov 11, 2013

P33M closed this as completed Dec 31, 2013
