MT6000 crashing (kernel core dump?)

I posted this on the Discord but have not received any replies.

This has been happening for about a week now. The crash happens once or twice a day.
logread.zip (22.9 KB)

What works:

Wireless LAN - I can connect to the GL-MT6000 via wifi (sometimes).

What doesn't work:

WAN: No internet can be accessed when the situation occurs.
LAN: No Ethernet-connected devices can be resolved or can connect to the GL-MT6000.
Wi-Fi: Occasionally this goes down as well. Both 2.4 GHz and 5 GHz cycle off and on, but no clients can connect.

A reboot of the router brings it back online.

System Logs attached.

It does not seem to be thermal related.

Please check the devices 58:1c:f8:7a:e6:bc and 4e:34:76:5a:63:cb
They are repeatedly associating and disassociating, which might be part of the issue.
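To check whether particular clients really are flapping, the association events in the attached log can be tallied per MAC address. The following is a minimal sketch, not a definitive tool: it assumes the log contains MediaTek driver lines of the form `MacTableInsertEntry() ... New Sta:<mac>` and `MacTableDeleteEntry() ... Del Sta:<mac>`, as in the excerpt quoted further down this thread. The function name `count_flaps` is hypothetical.

```python
import re
from collections import Counter

# Patterns assumed from the MediaTek driver lines quoted in this thread;
# other firmware versions may log these events differently.
NEW_STA = re.compile(r"MacTableInsertEntry\(\).*New Sta:([0-9a-f:]{17})", re.I)
DEL_STA = re.compile(r"MacTableDeleteEntry\(\).*Del Sta:([0-9a-f:]{17})", re.I)


def count_flaps(lines):
    """Return (joins, leaves): per-MAC counts of association and removal events."""
    joins, leaves = Counter(), Counter()
    for line in lines:
        match = NEW_STA.search(line)
        if match:
            joins[match.group(1).lower()] += 1
            continue
        match = DEL_STA.search(line)
        if match:
            leaves[match.group(1).lower()] += 1
    return joins, leaves
```

Passing an open log file (for example, the saved output of `logread`) to `count_flaps` gives two counters; a client with a much higher count than the others is a candidate for closer inspection.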

One of those is a Lenovo Thinkpad laptop and the other is a OnePlus 10T+5G phone. I don't see anything unusual in the settings for either.

I'll run the laptop hardwired and turn off wifi for now and see if that makes a difference.

I just looked at the system and kernel logs (router is running fine at the moment) and I'm not seeing the same associating and disassociating behavior from those clients.

There is another person over on the Discord with what seems to be the same problem on his MT6000.

I have the same problem on Google Pixel 7, but not frequently.

The DHCP server does not respect the lease time: it disconnects the Pixel 7 instead of extending the lease. When "auto-connect" is ON, the device is disconnected and then connects back again:

Tue Jul 23 00:02:07 2024 kern.warn kernel: [1876565.902697] 7986@C08L2,ap_peer_disassoc_action() 3645: ASSOC - 1 receive DIS-ASSOC request
Tue Jul 23 00:02:07 2024 kern.warn kernel: [1876565.911216] 7986@C01L2,wifi_sys_disconn_act() 1002:  wdev_idx=2
Tue Jul 23 00:02:07 2024 kern.notice kernel: [1876565.917526] 7986@C08L3,hw_ctrl_flow_v2_disconnt_act() 172: wdev_idx=2
Tue Jul 23 00:02:07 2024 kern.warn kernel: [1876565.924697] 7986@C13L2,MacTableDeleteEntry() 1938: Del Sta:3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.debug kernel: [1876681.567506] entrytb_aid_aquire(): found non-occupied aid:12, allocated from:4
Tue Jul 23 00:04:03 2024 kern.warn kernel: [1876681.574825] 7986@C13L2,MacTableInsertEntry() 1577: New Sta:3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.584421] 7986@C08L3,ap_cmm_peer_assoc_req_action() 1714:  Recv Assoc from STA - 3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.593890] 7986@C08L3,ap_cmm_peer_assoc_req_action() 2241: ASSOC Send ASSOC response (Status=0)...
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.603157] 7986@C01L3,wifi_sys_conn_act() 1115: wdev idx = 2
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.609317] 7986@C08L3,hw_ctrl_flow_v2_connt_act() 215: wdev_idx=2
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.762779] 7986@C15L3,WPABuildPairMsg1() 5310: <=== send Msg1 of 4-way
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.769565] 7986@C15L3,PeerPairMsg2Action() 6303: ===>Receive msg 2
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.776432] 7986@C15L3,WPABuildPairMsg3() 5595: <=== send Msg3 of 4-way
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.783213] 7986@C15L3,PeerPairMsg4Action() 6734: ===>Receive msg 4
Tue Jul 23 00:04:03 2024 kern.warn kernel: [1876681.794853] 7986@C15L2,PeerPairMsg4Action() 7098: AP SETKEYS DONE(rax0) - AKMMap=WPA2PSK, PairwiseCipher=AES, GroupCipher=AES, wcid=9 from 3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.warn kernel: [1876681.794853]
Tue Jul 23 00:04:03 2024 kern.err kernel: [1876681.810692] 7986@C14L1,ReceiveBTMQuery() 1536: Find peer address in BTMPeerList already
Tue Jul 23 00:04:03 2024 daemon.info dnsmasq-dhcp[23830]: DHCPREQUEST(br-lan) 192.168.6.171 3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 daemon.info dnsmasq-dhcp[23830]: DHCPACK(br-lan) 192.168.6.171 3f:fe:3b:c7:c8:c8 Pixel-7
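To see how long a client stays offline between a removal and the next association, the kernel timestamps in square brackets can be compared. This is a rough sketch under the assumption that the log uses the same `Del Sta:` / `New Sta:` line format as the excerpt above; the function name `offline_gaps` is hypothetical.

```python
import re

# Captures the kernel uptime stamp, the event kind and the client MAC.
# The line format is assumed from the excerpt above.
EVENT = re.compile(r"\[\s*(\d+\.\d+)\]\s.*(New|Del) Sta:([0-9a-f:]{17})", re.I)


def offline_gaps(lines):
    """Return a list of (mac, seconds) for each Del Sta -> New Sta pair."""
    last_del = {}
    gaps = []
    for line in lines:
        match = EVENT.search(line)
        if not match:
            continue
        stamp = float(match.group(1))
        kind = match.group(2).lower()
        mac = match.group(3).lower()
        if kind == "del":
            last_del[mac] = stamp
        elif mac in last_del:
            # Client came back: record how long it was gone.
            gaps.append((mac, stamp - last_del.pop(mac)))
    return gaps
```

For the excerpt above, the gap between the `Del Sta` line at 00:02:07 and the `New Sta` line at 00:04:03 comes out at roughly 116 seconds.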

But nobody from GL.iNet has been able to reproduce this problem...

DHCP lease renewal is driven by the client, not the server.

So the client seems to be the issue if it decides to disconnect.

On the Discord, a user named Prime pointed out the issue. This is a known problem in OpenWrt and was fixed in a patch backported to OpenWrt master / kernel 6.6.

We'll need GL.iNet to upgrade the kernel to a patched version in order to get a fix.

This is about the op24 version, right? I think that a month or so ago OpenWrt did add a variety of MediaTek-related patches, for Ethernet as well as Wi-Fi. I'm not sure whether op24 includes the Ethernet one or was forked before it. I'm referring to the commit where they dropped patches and merged the officially added Linux eth_soc variant. Sadly, GitHub still doesn't have a history search in the commit data; I believe it was merged along with a bunch of commits for mt7988 or mt7996 (not the Flint 2 target).

Though I agree it would be nice to have an update again :+1:

-edit-
I see it's in there; this is the full tree. I still think it's good to update it, but that's on GL.iNet :slight_smile:

@alzhao It seems this is a known issue in OpenWrt. Can we get a fix? My MT6000 crashes several times a day. It just happened again:

[35675.698389] ------------[ cut here ]------------
[35675.703009] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 0 timed out
[35675.709995] WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:473 dev_watchdog+0x2d8/0x2e0
[35675.718242] Modules linked in: pppoe ppp_async option wireguard usb_wwan pppox ppp_generic libchacha20poly1305 ipt_REJECT chacha_neon xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_MASQUERADE xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY usbserial slhc rndis_host qmi_wwan poly1305_neon nf_reject_ipv4 nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_conntrack_netlink nf_conncount mtqos mtk_warp_proxy(P) mtfwd mt_wifi(P) libcurve25519_generic libchacha iptable_raw iptable_nat iptable_mangle iptable_filter ipt_ECN ipheth ip_tables huawei_cdc_ncm exfat crc_ccitt cdc_wdm cdc_ncm cdc_ether cdc_acm arptable_filter arpt_mangle arp_tables fuse sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred
[35675.718323]  act_gact xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_nat ip6t_NPT nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb ip6_udp_tunnel udp_tunnel tun vfat fat ntfs nls_utf8 nls_iso8859_1 nls_cp437 shortcut_fe_ipv6 shortcut_fe mtdoops mtk_warp mtkhnat leds_gpio uhci_hcd ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl kmwan ehci_hcd gpio_button_hotplug gl_sdk4_tertf gl_sdk4_black_white_list f2fs ext4 mbcache jbd2 conninfra crc32c_generic crc32_generic gl_sdk4_hw_info
[35675.877580] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P                  5.4.238 #0
[35675.884960] Hardware name: GL.iNet GL-MT6000 (DT)
[35675.889649] pstate: 60000005 (nZCv daif -PAN -UAO)
[35675.894431] pc : dev_watchdog+0x2d8/0x2e0
[35675.898427] lr : dev_watchdog+0x2d8/0x2e0
[35675.902422] sp : ffffffc01001bdb0
[35675.905722] x29: ffffffc01001bdb0 x28: 0000000000000140 
[35675.911019] x27: 00000000ffffffff x26: 0000000000000000 
[35675.916315] x25: 0000000000000003 x24: 0000000000000000 
[35675.921612] x23: 0000000000000001 x22: ffffff803e08b000 
[35675.926908] x21: ffffff803e08b480 x20: ffffffc010ab6000 
[35675.932205] x19: 0000000000000000 x18: 0000000000000000 
[35675.937502] x17: 0000000000000000 x16: 0000000000000000 
[35675.942799] x15: 0000000000000000 x14: ffffffc010b523da 
[35675.948095] x13: 00000000000027b8 x12: ffffffc010b51000 
[35675.953392] x11: ffffffc010ace000 x10: 0000000000000010 
[35675.958689] x9 : 0000000000000000 x8 : 2065756575712074 
[35675.963986] x7 : 696d736e61727420 x6 : 0000000000000001 
[35675.969283] x5 : ffffffc010358d88 x4 : 0000000000000008 
[35675.974580] x3 : 0000000000000004 x2 : 0000000000000004 
[35675.979877] x1 : 0000000000000004 x0 : 000000000000003f 
[35675.985175] Call trace:
[35675.987612]  dev_watchdog+0x2d8/0x2e0
[35675.991262]  call_timer_fn.isra.37+0x20/0x78
[35675.995519]  run_timer_softirq+0x1e0/0x468
[35675.999602]  __do_softirq+0x124/0x260
[35676.003251]  irq_exit+0xb8/0xc8
[35676.006383]  __handle_domain_irq+0x64/0xb8
[35676.010466]  gic_handle_irq+0xc0/0x158
[35676.014201]  el1_irq+0xb8/0x140
[35676.017331]  arch_cpu_idle+0x10/0x18
[35676.020895]  do_idle+0x120/0x148
[35676.024110]  cpu_startup_entry+0x20/0x60
[35676.028022]  secondary_start_kernel+0x148/0x158
[35676.032539] ---[ end trace 906c19c93bc084ac ]---
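To find out how often this happens and on which interface, a saved kernel log can be scanned for the watchdog message. This is a minimal sketch that assumes the message always has the form shown in the trace above (`NETDEV WATCHDOG: <iface> (<driver>): transmit queue <n> timed out`); the function name `find_tx_timeouts` is hypothetical.

```python
import re

# Matches the watchdog line from the trace above, with or without a
# syslog prefix in front of the kernel timestamp.
WATCHDOG = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+NETDEV WATCHDOG: (\S+) \(([^)]+)\): "
    r"transmit queue (\d+) timed out"
)


def find_tx_timeouts(lines):
    """Return a list of (uptime_seconds, interface, driver, queue) tuples."""
    hits = []
    for line in lines:
        match = WATCHDOG.search(line)
        if match:
            hits.append(
                (
                    float(match.group(1)),
                    match.group(2),
                    match.group(3),
                    int(match.group(4)),
                )
            )
    return hits
```

Running it over the output of `dmesg` or `logread` saved to a file lists every transmit timeout with the uptime at which it occurred, which makes it easier to tell whether the hangs always hit the same interface.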

I will push the guys to check.

Great! Thank you for all your help!