
MDNS.begin crashes ESP with Fatal exception 28 (LoadProhibitedCause) when used with WiFiManager #4417

Closed
chadham opened this issue Feb 22, 2018 · 21 comments


@chadham

chadham commented Feb 22, 2018

Basic Infos

Wemos NodeMCU dev board, core 2.4

Hardware

Description

Problem description

Settings in IDE

Module: WEMOS NodeMCU
Flash Size: 4MB
CPU Frequency: 80MHz
Flash Mode: qio
Flash Frequency: 40MHz
Upload Using: SERIAL
Reset Method: ck / nodemcu

Sketch

#include <Arduino.h>
#include <ESP8266mDNS.h>
#include <WiFiManager.h>

WiFiManager wifiManager;

void setup() {
  Serial.begin(115200);
  Serial.setDebugOutput(true);
  WiFi.disconnect();
  wifiManager.autoConnect("DEVICE", "12");

  Serial.print("\nConnected to "); Serial.print(WiFi.SSID()); Serial.print(" ,IP address is "); Serial.print(WiFi.localIP()); Serial.print(" and signal strength is "); Serial.println(WiFi.RSSI());
  while (WiFi.status() != WL_CONNECTED){
    Serial.print("#");
  }
  if (!MDNS.begin("hello")) { Serial.println("Error setting up MDNS responder!"); }
  else { Serial.println("Started MDNS"); }
  
}

void loop() {}
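
The wait loop in the sketch above polls WiFi.status() with neither a pause nor a timeout. Below is a minimal sketch of a bounded wait, written as host-side C++ so it can be tried without hardware. `waitUntil`, `FakeLink`, and the parameter names are hypothetical and not part of the ESP8266 core. On the device one would presumably pass a lambda that checks `WiFi.status() == WL_CONNECTED` and another that calls `delay(100)`. This only keeps the wait from spinning; it is not claimed to fix the exception reported here.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper (not part of the ESP8266 core): poll `isReady` up to
// `maxAttempts` times, calling `pause` after each failed poll so that other
// work can run. Returns true as soon as `isReady` succeeds, false on timeout.
bool waitUntil(const std::function<bool()>& isReady,
               const std::function<void()>& pause,
               int maxAttempts) {
  for (int attempt = 0; attempt < maxAttempts; ++attempt) {
    if (isReady()) {
      return true;
    }
    pause();
  }
  return false;
}

// Host-side stand-in for a link that reports "up" on the Nth poll.
struct FakeLink {
  int pollsUntilUp;
  int pauses;
  explicit FakeLink(int n) : pollsUntilUp(n), pauses(0) {}
  bool poll() { return --pollsUntilUp <= 0; }
};
```

With a `FakeLink` that comes up on the third poll, `waitUntil` returns true after two pauses; with one that never comes up, it returns false once `maxAttempts` polls have been made.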

Debug Messages

Connected to TrojanBackdoorVirus ,IP address is 10.0.1.111 and signal strength is -58
Fatal exception 28(LoadProhibitedCause):
epc1=0x40225d9b, epc2=0x00000000, epc3=0x00000000, excvaddr=0x00000098, depc=0x00000000

Exception (28):
epc1=0x40225d9b epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000098 depc=0x00000000

ctx: cont
sp: 3fff0570 end: 3fff09a0 offset: 01a0

stack>>>
3fff0710: 400043e6 00000030 00000016 ffffffff
3fff0720: 402201c8 3fff17bc 3fff135c 3fff2668
3fff0730: 3fff2872 08000000 08000000 00000000
3fff0740: 0000ffff 00042035 00002035 003fe000
3fff0750: 3fff2872 40104456 3fff17bc 3fff2668
3fff0760: 0000002e 3fff0e00 3fff17bc 401043d1
3fff0770: 3fff282c 3fff3ec4 00000049 00001627
3fff0780: 00000027 00000004 00000000 3fff0e35
3fff0790: 3fff0e00 3fff0e00 3fff282c 4010453d
3fff07a0: 4021218b 3fff282c 3fff2872 40212194
3fff07b0: 00000008 3fff07e0 3fff2040 ffff8000
3fff07c0: 00000030 3fff1358 00000090 3fff2880
3fff07d0: 3fff282c 3fff27c8 3fff0e00 40218229
3fff07e0: 005e0001 4000fb00 60000200 3fff0890
3fff07f0: 00000018 3fff27c8 3fff282c 40218cb0
3fff0800: 00000000 400042db 3fffc718 00000000
3fff0810: 3fff0884 00000001 00000000 00000002
3fff0820: 3fff0d60 00000004 000003ff 40219328
3fff0830: 40106c91 3fff1ddc 40248d50 3fff27c4
3fff0840: 00000000 3fff282c 3fff2898 40218cf0
3fff0850: 3fff0e00 3fff0880 00000004 3fff27c4
3fff0860: 00000016 3fff0e00 3fff27c4 40218481
3fff0870: 3fff0e00 3fff0880 00000004 402121f0
3fff0880: 00000494 0104a8c0 3fff0f5c 4021222c
3fff0890: 3fff27c8 3fff0e00 3fff258c 40218552
3fff08a0: 3fff276c 3ffef66c 3fff0950 00000020
3fff08b0: 3fff091c 3fff0e00 3fff27c4 4021872b
3fff08c0: 3fff091c 3ffe97fc 3fff0e00 402187b7
3fff08d0: 3fff0800 00000001 3fff0950 00000001
3fff08e0: 3ffef624 3ffef668 3ffef624 40202586
3fff08f0: 00000000 3ffef668 3fff0940 40100690
3fff0900: 3ffef88c 000002a1 00000000 4020af70
3fff0910: 00000000 3ffef668 3ffef650 fb0000e0
3fff0920: 00000000 3ffef668 3ffef624 3ffef970
3fff0930: 00000000 3ffef668 3ffef624 40203822
3fff0940: 3fff23cc 3fff0970 402021ec 402026a8
3fff0950: 00000000 00000000 3ffef8b0 4020a3d8
3fff0960: 3ffe8ab9 3ffef668 3ffef8b0 40202150
3fff0970: 3ffe92e0 6f01000a 00000000 feefeffe
3fff0980: 3fffdad0 00000000 3ffef968 4020b07c
3fff0990: feefeffe feefeffe 3ffef980 40100710
<<<stack<<<

messages here

stack dump
Decoding 33 results
0x40220184: ieee80211_output_pbuf at ?? line ?
0x40104456: glue2esp_linkoutput at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/glue-esp/lwip-esp.c line 292
0x401043d1: glue2esp_linkoutput at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/glue-esp/lwip-esp.c line 263
0x4010453d: new_linkoutput at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/glue-lwip/lwip-git.c line 240
0x40212147: ethernet_output at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/netif/ethernet.c line 305
0x40212150: ethernet_output at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/netif/ethernet.c line 305
0x401051da: os_printf_plus at ?? line ?
0x40232f26: pp_attach at ?? line ?
0x402181e5: etharp_output_LWIP2 at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/etharp.c line 893
0x40218c6c: ip4_output_if_opt_src at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/ip4.c line 962
0x402192e4: mem_malloc at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/mem.c line 136
0x40106c91: __wrap_spi_flash_read at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/cores/esp8266/core_esp8266_phy.c line 267
0x40248d10: sleep_reset_analog_rtcreg_8266 at ?? line ?
0x40218cac: ip4_output_if_opt at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/ip4.c line 788
0x4021843d: igmp_send at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/igmp.c line 570
0x402121ac: do_memp_malloc_pool at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/memp.c line 231
0x402121e8: memp_malloc at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/memp.c line 231
0x4021850e: igmp_lookup_group at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/igmp.c line 570
0x402186e7: igmp_start_timer at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/igmp.c line 570
: (inlined by) igmp_joingroup_netif at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/igmp.c line 521
0x40218773: igmp_joingroup at /home/david/dev/esp8266/origin/tools/sdk/lwip2/builder/lwip2-src/src/core/ipv4/igmp.c line 570
0x40202542: MDNSResponder::_listen() at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/libraries/ESP8266mDNS/ESP8266mDNS.cpp line 396
0x40100690: free at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/cores/esp8266/umm_malloc/umm_malloc.c line 1737
0x4020af2c: operator delete(void*) at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/cores/esp8266/abi.cpp line 84
0x402037de: MDNSResponder::begin(char const*) at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/libraries/ESP8266mDNS/ESP8266mDNS.cpp line 396
0x402021a8: _M_manager at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/libraries/ESP8266mDNS/ESP8266mDNS.cpp line 396
0x40202664: operator() at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/libraries/ESP8266mDNS/ESP8266mDNS.cpp line 396
: (inlined by) _M_invoke at /Users/mchadha/Library/Arduino15/packages/esp8266/tools/xtensa-lx106-elf-gcc/1.20.0-26-gb404fb9-2/xtensa-lx106-elf/include/c++/4.8.2/functional line 2071
0x4020a394: Print::println(int, int) at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/cores/esp8266/Print.cpp line 87
0x40202110: setup at /Users/mchadha/Documents/Arduino/wifiandMDNStest/wifiandMDNStest.ino line 23
0x4020b038: loop_wrapper at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/cores/esp8266/core_esp8266_main.cpp line 57
0x40100710: cont_norm at /Users/mchadha/Library/Arduino15/packages/esp8266/hardware/esp8266/2.4.0/cores/esp8266/cont.S line 109

@chadham
Author

chadham commented Feb 24, 2018

Confirmed issue remains after erase_flash. Looking for some help here please.

Thanks.

@HugoML

HugoML commented Feb 26, 2018

I have the same issue, but it seems random: my code does not crash consistently.
Has anyone got any ideas? Thanks.

WiFi.mode(WIFI_STA);
WiFi.hostname(myHostName.c_str());
WiFi.begin(strSSID.c_str(), strPassword.c_str());
MDNS.begin(myHostName.c_str());

Exception (28):
epc1=0x40230c23 epc2=0x00000000 epc3=0x00000000 excvaddr=0x00000098 depc=0x00000000

ctx: sys
sp: 3ffff970 end: 3fffffb0 offset: 01a0

stack>>>
3ffffb10: fffffffe 00000000 4010100f 3fff486c
3ffffb20: 4022b050 3fff5c4c 3fff6294 3fff6e08
3ffffb30: 3fff5c12 40237bdb 3fff486c 3fff6bc4
3ffffb40: 00000000 4022bd03 3fff17cc 3fff585c
3ffffb50: 3fff5c12 40104456 3fff5c4c 3fff6e08
3ffffb60: 0000002e 3fff4674 3fff5c4c 401043d1
...

@devyte
Collaborator

devyte commented Feb 27, 2018

The MDNS responder has several issues, and I believe calling begin() multiple times is one of them.
I'm currently working on a rewrite, but it will take a (possibly long) while.

@lawrence-jeff

Also getting Exception (28) on MDNS.begin (on version 2.4).

@chadham
Author

chadham commented Feb 28, 2018

I don't believe my issue is being caused by multiple calls to begin().

If I use hardcoded ssid and password to start wifi, it works fine (every time).
If I use WiFiManager to get the ssid, password, this fails every time.

I've confirmed the WiFi connection is active by checking WiFi.status().

See the discussion in the WiFiManager forum:

tzapu/WiFiManager#537

@debsahu

debsahu commented Mar 3, 2018

I'm having similar issues as well. Calling MDNS.begin() after WiFiManager sometimes causes crashes. Setting the SSID and password manually avoids the crash, but that is not a solution.

@lawrence-jeff

FYI - Didn't change the code but moved back to Core 2.3 and the issue went away

@CZEMacLeod

I don't know if it is related, but I am experiencing almost the same crash with MDNS (and OTA, which calls MDNS) when in AP mode rather than STA mode.

@igrr
Member

igrr commented Mar 3, 2018

@lawrence-jeff Is the issue also present with core 2.4.0 if you choose LwIP v1.4 from tools menu?

@igrr
Member

igrr commented Mar 3, 2018

Observation based on the stack dump: WiFi "station got IP" event is dispatched to mDNS responder, which calls igmp_joingroup, which indirectly calls glue2esp_linkoutput, which calls netif->output with netif == NULL (based on excvaddr=0x00000098).

I think this might have been partially fixed by @d-a-v here: d-a-v/esp82xx-nonos-linklayer@afc9297#diff-bb3aa6bcbd43cb748d0152bb3478ad46.

That is, the crash should probably be fixed. But why we might be calling igmp_joingroup when netif is still down is something I don't understand. This looks like another ordering issue when processing WiFi events. MDNS subscribes to the "Station got IP" event, and when it is dispatched, the netif should definitely be up.

@devyte This does not look like an issue with mDNS library to me. I think it is something related to LwIP or WiFi event handling or both.
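
To illustrate why a small excvaddr such as 0x00000098 suggests a null struct pointer: reading a member through a null pointer touches address 0 plus that member's offset. The host-side sketch below uses a made-up struct, `FakeNetif`, whose layout is an assumption chosen so that one member sits at offset 0x98. It is not the real lwIP `struct netif`, and `faultAddress` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Made-up layout for illustration only; NOT the real lwIP struct netif.
struct FakeNetif {
  unsigned char earlierFields[0x98];  // stand-in for the preceding members
  std::uint32_t laterField;           // member that the faulting load would read
};

// Address that a load of `laterField` would touch if the struct lived at
// `base`. With base == 0 (a null pointer) this is just the member offset.
constexpr std::uintptr_t faultAddress(std::uintptr_t base) {
  return base + offsetof(FakeNetif, laterField);
}

static_assert(faultAddress(0) == 0x98,
              "a null base faults at the member offset");
```

A valid RAM pointer would instead give an address in the 0x3fff.... range seen throughout the stack dump, which is why a fault address this close to zero stands out.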

@d-a-v d-a-v self-assigned this Mar 3, 2018
@lawrence-jeff

@igrr - Yes, if I stay on 2.4 and switch to LwIP 1.4 Pre-built, I do not get the crash and everything seems to work correctly.

@d-a-v
Collaborator

d-a-v commented Mar 3, 2018

2.4.0 is known to have issues with lwip2.
I can't reproduce with latest git master version of core.
I used the above sketch and added a service to discover. I can see it with android's "Bonjour Browser" after connecting to my AP with WiFiManager (no crash).

22:07:01.050 -> scandone
22:07:01.051 -> *WM: 
22:07:01.051 -> *WM: AutoConnect
22:07:01.051 -> *WM: Connecting as wifi client...
22:07:01.051 -> *WM: Using last saved values, should be faster
22:07:01.051 -> *WM: Connection result: 
22:07:01.051 -> *WM: 0
22:07:01.051 -> mode : sta(60:01:94:1a:8b:cd) + softAP(62:01:94:1a:8b:cd)
22:07:01.051 -> add if1
22:07:01.051 -> dhcp server start:(ip:192.168.4.1,mask:255.255.255.0,gw:192.168.4.1)
22:07:01.051 -> bcn 100
22:07:01.183 -> *WM: SET AP STA
22:07:01.183 -> *WM: 
22:07:01.183 -> *WM: Configuring access point... 
22:07:01.183 -> *WM: DEVICE
22:07:01.686 -> *WM: AP IP address: 
22:07:01.686 -> *WM: 192.168.4.1
22:07:01.686 -> *WM: HTTP server started
22:07:02.617 -> add 1
22:07:02.618 -> aid 1
22:07:02.618 -> station: bc:f5:ac:fd:5e:f1 join, AID = 1
22:07:09.699 -> *WM: Request redirected to captive portal
22:07:15.912 -> *WM: Request redirected to captive portal
22:07:15.974 -> *WM: Handle root
22:07:16.876 -> *WM: Request redirected to captive portal
22:07:16.972 -> *WM: Request redirected to captive portal
22:07:17.271 -> *WM: Request redirected to captive portal
22:07:19.564 -> scandone
22:07:19.564 -> *WM: Scan done
22:07:19.564 -> *WM: open
22:07:19.564 -> *WM: -49
22:07:19.644 -> *WM: Sent config page
22:07:19.644 -> *WM: Request redirected to captive portal
22:07:20.760 -> *WM: Request redirected to captive portal
22:07:22.290 -> *WM: Request redirected to captive portal
22:07:27.409 -> *WM: WiFi save
22:07:27.476 -> *WM: Sent wifi save page
22:07:29.471 -> *WM: Connecting to new AP
22:07:29.471 -> *WM: Connecting as wifi client...
22:07:31.774 -> scandone
22:07:32.771 -> state: 0 -> 2 (b0)
22:07:32.771 -> state: 2 -> 3 (0)
22:07:32.771 -> state: 3 -> 5 (10)
22:07:32.771 -> add 0
22:07:32.771 -> aid 4
22:07:32.771 -> cnt 
22:07:32.804 -> 
22:07:32.805 -> connected with open, channel 1
22:07:32.805 -> dhcp client start...
22:07:36.827 -> ip:192.168.1.239,mask:255.255.255.0,gw:192.168.1.254
22:07:36.860 -> *WM: Connection result: 
22:07:36.860 -> *WM: 3
22:07:36.893 -> station: bc:f5:ac:fd:5e:f1 leave, AID = 1
22:07:36.893 -> rm 1
22:07:36.893 -> bcn 0
22:07:36.893 -> del if1
22:07:36.893 -> pm open,type:2 0
22:07:36.893 -> mode : sta(60:01:94:1a:8b:cd)
22:07:36.993 -> 
22:07:36.993 -> Connected to open ,IP address is 192.168.1.239 and signal strength is -54
22:07:37.026 -> Started MDNS

@igrr lwIP routes 224.* to all the interfaces that are up and flagged link up. The lwip2 issue was also a mistaken belief that the interface was up when it was not, which happened when switching from AP to STA (if I remember correctly).

@lawrence-jeff

@d-a-v Does that sketch fail with the 2.4.0 release (and lwip2)?

@d-a-v
Collaborator

d-a-v commented Mar 3, 2018

Yes it does fail all the same (with core at current git master and lwip2 tag arduino-2.4.0).

@d-a-v
Collaborator

d-a-v commented Mar 8, 2018

@lawrence-jeff can you check with 2.4.1?

@schweini

Updating the esp8266 board software via the board manager seems to have fixed this MDNS issue, at least for me.

@comino
Contributor

comino commented Apr 20, 2018

I still have this, or a similar, issue with the latest git version, but only on one specific router (a FritzBox) out of the three I tested.

There is no crash anymore, but there is a watchdog timer reset without a stack dump to look at.
As far as I can see, it hangs inside "igmp_joingroup" on 'while (netif != NULL)', hinting at the same issue that @igrr described in his post in this thread.

I'll leave this here in case someone can confirm it.
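
One way a loop of the form `while (netif != NULL)` could spin until the watchdog fires is if the list it walks never reaches NULL, for example because a node ends up linked to itself. That cause is a guess and is not confirmed anywhere in this thread. The host-side sketch below uses a hypothetical `Node` type (not lwIP's `struct netif`) and a hypothetical `boundedLength` helper to show a walk that reports the problem instead of hanging.

```cpp
#include <cassert>

// Hypothetical singly linked list node; stands in for an interface list entry.
struct Node {
  Node* next;
};

// Walk the list like `while (n != NULL) n = n->next;` but give up once more
// than `maxSteps` nodes have been seen. Returns the number of nodes visited,
// or -1 if the end was not reached within the limit (e.g. the list has a cycle).
int boundedLength(const Node* head, int maxSteps) {
  int count = 0;
  for (const Node* n = head; n != nullptr; n = n->next) {
    if (count == maxSteps) {
      return -1;
    }
    ++count;
  }
  return count;
}
```

A well-formed two-node list yields 2, while a node whose `next` points back at itself yields -1 rather than looping forever.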

@TD-er
Contributor

TD-er commented Aug 28, 2018

On ESPeasy I also get a number of reports about the same watchdog timer resets as described by @comino.
As suggested here by @DittelHome, it may correlate with poor WiFi reception.

I am not able to reproduce it myself, but I've seen watchdog reboots on the one node that is furthest away from the access points. This particular node was known to run extremely stably (100+ days uptime, with downtime only due to power loss) on firmware versions from before ESPeasy started using WiFi events.

@arihantdaga

I am having a similar issue where I get Exception 28 continuously...

@arihantdaga

arihantdaga commented Nov 15, 2018

Often, when using interrupt-based sensors such as the HLW (a power-measurement IC) together with MDNS, calling MDNS.begin() causes the ESP to restart due to a hardware watchdog reset. If I turn off interrupts for that sensor, it runs fine. I think there is definitely something in mDNS that needs to be taken care of. I am using the latest git version and lwIP v2 Higher Bandwidth.

@devyte
Collaborator

devyte commented Dec 5, 2018

Closing in view of #5442 with a full rewrite of mdns.

@devyte devyte closed this as completed Dec 5, 2018