I've posted this on the discord but have not seen any replies.
This has been happening for about a week now. The crash happens once or twice a day.
logread.zip (22.9 KB)
What works:
Wireless LAN - I can connect to the GL-MT6000 via wifi (sometimes).
What doesn't work:
WAN: No internet can be accessed when the situation occurs.
LAN: No ethernet connected devices can be resolved or connect to the GL-MT6000.
Wi-Fi: Occasionally this goes down, and both 2.4 GHz and 5 GHz cycle off and on, but no clients can connect.
A reboot of the router brings it back online.
System Logs attached.
It does not seem to be thermal related.
admon
July 24, 2024, 6:41am
3
Please check the devices 58:1c:f8:7a:e6:bc and 4e:34:76:5a:63:cb.
They are repeatedly associating and disassociating, which might be part of the issue.
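A quick way to confirm this from a saved `logread` dump is to count association and removal events per MAC address. Below is a minimal sketch, assuming the driver logs `New Sta:<mac>` and `Del Sta:<mac>` lines (the format seen in the kernel log excerpts elsewhere in this thread). The function name `count_sta_events` and the sample lines are made up for illustration and are not taken from the attached log.

```python
import re
from collections import Counter

# Matches "New Sta:<mac>" (station added) and "Del Sta:<mac>" (station removed).
STA_EVENT = re.compile(
    r"\b(New|Del) Sta:((?:[0-9a-f]{2}:){5}[0-9a-f]{2})", re.IGNORECASE
)

def count_sta_events(lines):
    """Return {mac: Counter({'New': n, 'Del': m})} for the given log lines."""
    events = {}
    for line in lines:
        match = STA_EVENT.search(line)
        if match:
            kind = match.group(1).capitalize()
            mac = match.group(2).lower()
            events.setdefault(mac, Counter())[kind] += 1
    return events

# Hypothetical sample lines, shaped like the MediaTek driver output.
sample = [
    "kern.warn kernel: [100.000000] MacTableInsertEntry() 1577: New Sta:58:1c:f8:7a:e6:bc",
    "kern.warn kernel: [130.000000] MacTableDeleteEntry() 1938: Del Sta:58:1c:f8:7a:e6:bc",
    "kern.warn kernel: [131.000000] MacTableInsertEntry() 1577: New Sta:58:1c:f8:7a:e6:bc",
    "kern.warn kernel: [140.000000] MacTableInsertEntry() 1577: New Sta:4e:34:76:5a:63:cb",
]

counts = count_sta_events(sample)
for mac, c in counts.items():
    print(mac, "associations:", c["New"], "removals:", c["Del"])
```

A client with many more events than the others over the same period is a candidate for the flapping described above. The counts alone do not say why it is flapping.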
One of those is a Lenovo Thinkpad laptop and the other is a OnePlus 10T+5G phone. I don't see anything unusual in the settings for either.
I'll run the laptop hardwired and turn off wifi for now and see if that makes a difference.
I just looked at the system and kernel logs (router is running fine at the moment) and I'm not seeing the same associating and disassociating behavior from those clients.
There is another person over on the discord with what seems to be the same problem on his MT6000.
Renato
July 24, 2024, 3:17pm
5
I have the same problem on Google Pixel 7, but not frequently.
The DHCP server does not respect the lease time, and it disconnects the Pixel 7 instead of extending the lease. When "auto-connect" is ON, the device is disconnected and then connects back again:
Tue Jul 23 00:02:07 2024 kern.warn kernel: [1876565.902697] 7986@C08L2,ap_peer_disassoc_action() 3645: ASSOC - 1 receive DIS-ASSOC request
Tue Jul 23 00:02:07 2024 kern.warn kernel: [1876565.911216] 7986@C01L2,wifi_sys_disconn_act() 1002: wdev_idx=2
Tue Jul 23 00:02:07 2024 kern.notice kernel: [1876565.917526] 7986@C08L3,hw_ctrl_flow_v2_disconnt_act() 172: wdev_idx=2
Tue Jul 23 00:02:07 2024 kern.warn kernel: [1876565.924697] 7986@C13L2,MacTableDeleteEntry() 1938: Del Sta:3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.debug kernel: [1876681.567506] entrytb_aid_aquire(): found non-occupied aid:12, allocated from:4
Tue Jul 23 00:04:03 2024 kern.warn kernel: [1876681.574825] 7986@C13L2,MacTableInsertEntry() 1577: New Sta:3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.584421] 7986@C08L3,ap_cmm_peer_assoc_req_action() 1714: Recv Assoc from STA - 3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.593890] 7986@C08L3,ap_cmm_peer_assoc_req_action() 2241: ASSOC Send ASSOC response (Status=0)...
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.603157] 7986@C01L3,wifi_sys_conn_act() 1115: wdev idx = 2
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.609317] 7986@C08L3,hw_ctrl_flow_v2_connt_act() 215: wdev_idx=2
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.762779] 7986@C15L3,WPABuildPairMsg1() 5310: <=== send Msg1 of 4-way
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.769565] 7986@C15L3,PeerPairMsg2Action() 6303: ===>Receive msg 2
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.776432] 7986@C15L3,WPABuildPairMsg3() 5595: <=== send Msg3 of 4-way
Tue Jul 23 00:04:03 2024 kern.notice kernel: [1876681.783213] 7986@C15L3,PeerPairMsg4Action() 6734: ===>Receive msg 4
Tue Jul 23 00:04:03 2024 kern.warn kernel: [1876681.794853] 7986@C15L2,PeerPairMsg4Action() 7098: AP SETKEYS DONE(rax0) - AKMMap=WPA2PSK, PairwiseCipher=AES, GroupCipher=AES, wcid=9 from 3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 kern.warn kernel: [1876681.794853]
Tue Jul 23 00:04:03 2024 kern.err kernel: [1876681.810692] 7986@C14L1,ReceiveBTMQuery() 1536: Find peer address in BTMPeerList already
Tue Jul 23 00:04:03 2024 daemon.info dnsmasq-dhcp[23830]: DHCPREQUEST(br-lan) 192.168.6.171 3f:fe:3b:c7:c8:c8
Tue Jul 23 00:04:03 2024 daemon.info dnsmasq-dhcp[23830]: DHCPACK(br-lan) 192.168.6.171 3f:fe:3b:c7:c8:c8 Pixel-7
But, nobody from GL-iNet is able to reproduce this problem...
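For anyone who wants to measure how long a client stays gone between the `Del Sta` and the next `New Sta` line, here is a rough sketch that uses the kernel timestamps in square brackets. The helper name `reconnect_gaps` is hypothetical, and the two sample lines are copied from the excerpt above. It only reports the gap and says nothing about what caused the disconnect.

```python
import re

# Kernel timestamp in brackets, then "New Sta:" or "Del Sta:" followed by a MAC.
EVENT = re.compile(
    r"\[\s*(\d+\.\d+)\].*\b(New|Del) Sta:([0-9a-f:]{17})", re.IGNORECASE
)

def reconnect_gaps(lines):
    """Yield (mac, seconds) for each Del Sta followed by a New Sta from the same MAC."""
    last_del = {}
    for line in lines:
        m = EVENT.search(line)
        if not m:
            continue
        ts = float(m.group(1))
        kind = m.group(2).capitalize()
        mac = m.group(3).lower()
        if kind == "Del":
            last_del[mac] = ts
        elif mac in last_del:
            yield mac, ts - last_del.pop(mac)

sample = [
    "Tue Jul 23 00:02:07 2024 kern.warn kernel: [1876565.924697] 7986@C13L2,MacTableDeleteEntry() 1938: Del Sta:3f:fe:3b:c7:c8:c8",
    "Tue Jul 23 00:04:03 2024 kern.warn kernel: [1876681.574825] 7986@C13L2,MacTableInsertEntry() 1577: New Sta:3f:fe:3b:c7:c8:c8",
]

gaps = list(reconnect_gaps(sample))
for mac, seconds in gaps:
    print(f"{mac} was away for {seconds:.1f} s")
```

For the excerpt above this gives a gap of a little under two minutes, which matches the 00:02:07 and 00:04:03 wall-clock times.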
admon
July 24, 2024, 3:43pm
6
DHCP lease renewal is initiated by the client, not the server.
So the client seems to be the issue if it decides to disconnect.
On the discord, a user named Prime pointed out the issue. This has been a known problem in OpenWrt and was fixed in a patch backported to OpenWrt master / kernel 6.6.
We'll need GL.iNet to upgrade the kernel to a patched version in order to get a fix.
opened 09:12PM - 16 Jul 23 UTC
bug
### Describe the bug
While using the router [Zyxel EX5601-T0](https://github.… com/openwrt/openwrt/commit/1c05388ab04c934ec240e8362321908f91381a90) I randomly encountered the following problem:
The Ethernet switch driver (mtk_soc_eth) stops working, for no reason (the crash log is attached in the Actual behavior section).
The following problem seems to be the same as this Issue: #12143
This problem blocks the normal use of the router as well as the functionality of the Ethernet ports.
### OpenWrt version
r23551-e21b4c9636
### OpenWrt target/subtarget
mediatek/filogic
### Device
Zyxel EX5601-T0
### Image kind
Official downloaded image
### Steps to reproduce
_**The mentioned issue is randomly encountered**_, I encountered the issue twice during an active upload stream (for example live video stream) a month apart.
### Actual behaviour
KERNEL LOG:
```
Sun Jul 16 20:31:12 2023 kern.warn kernel: [328426.306756] ------------[ cut here ]------------
Sun Jul 16 20:31:12 2023 kern.info kernel: [328426.311461] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 1 timed out
Sun Jul 16 20:31:12 2023 kern.warn kernel: [328426.318517] WARNING: CPU: 2 PID: 0 at dev_watchdog+0x330/0x33c
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.324427] Modules linked in: pppoe ppp_async nft_fib_inet nf_flow_table_ipv6 nf_flow_table_ipv4 nf_flow_table_inet pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_counter nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack mt7915e mt76_connac_lib mt76 mac80211 cfg80211 slhc nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c crc_ccitt compat crypto_safexcel sha1_generic seqiv md5 des_generic libdes authencesn authenc leds_gpio gpio_button_hotplug
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.383717] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.15.120 #0
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.389879] Hardware name: Zyxel EX5601-T0 (DT)
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.394478] pstate: 40400005 (nZcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.401505] pc : dev_watchdog+0x330/0x33c
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.405586] lr : dev_watchdog+0x330/0x33c
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.409665] sp : ffffffc008c3bdb0
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.413049] x29: ffffffc008c3bdb0 x28: 0000000000000140 x27: 00000000ffffffff
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.420252] x26: 0000000000000000 x25: 0000000000000002 x24: ffffff800085a4c0
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.427454] x23: 0000000000000000 x22: 0000000000000001 x21: ffffffc008af6000
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.434656] x20: ffffff800085a000 x19: 0000000000000001 x18: ffffffc008b0a338
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.441858] x17: ffffffc0372cf000 x16: ffffffc008c38000 x15: 00000000000005b8
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.449060] x14: 00000000000001e8 x13: ffffffc008c3bad8 x12: ffffffc008b62338
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.456262] x11: 712074696d736e61 x10: ffffffc008b62338 x9 : 0000000000000000
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.463464] x8 : ffffffc008b0a2e8 x7 : ffffffc008b0a338 x6 : 0000000000000001
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.470666] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.477867] x2 : ffffff803fdad080 x1 : ffffffc0372cf000 x0 : 000000000000003f
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.485071] Call trace:
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.487591] dev_watchdog+0x330/0x33c
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.491326] call_timer_fn.constprop.0+0x20/0x80
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.496014] __run_timers.part.0+0x208/0x284
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.500354] run_timer_softirq+0x38/0x70
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.504347] _stext+0x10c/0x28c
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.507559] __irq_exit_rcu+0xdc/0xfc
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.511295] irq_exit+0xc/0x1c
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.514423] handle_domain_irq+0x60/0x8c
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.518420] gic_handle_irq+0x50/0x120
Sun Jul 16 20:31:12 2023 kern.debug kernel: [328426.522244] call_on_irq_stack+0x20/0x34
...
...
...
```
**_Actual behaviour_**
Using the following configuration:
- 2.5 Gbps Fiber ONT is connected on the ETH1 (wan) port (ISP: TIM, Italy).
- A PPPoE connection is established through the Fiber ONT on this router.
- All my devices are wired through the Ethernet ports.
- All client network devices lost connection to DHCP server.
- ssh, ping, telnet all got no response from the router.
Upon crashing, all Ethernet ports stop responding: not allowing me to access the OpenWRT GUI via numeric IP/local DNS name.
I can only access the OpenWRT GUI from the Wi-Fi interface.
(The bug affects the mtk_soc_eth ethernet switch).
To restore full system functionality, I had to completely reboot the router.
### Expected behaviour
The network connectivity should remain stable, and the router should not experience timeouts/crash or loss of LAN access.
### Additional info
The issue occurs randomly and is not reproducible consistently.
The log indicates a timeout in the transmit queue of eth1, which is related to the **mtk_soc_eth** driver.
Searching the web with the keyword "(mtk_soc_eth): transmit queue" I came across several similar issues.
Except for this problem, the router has never presented any instability problems.
I should add that I use Stubby for DoT configuration, but I don't think it is the cause of the problem.
### Diffconfig
_No response_
### Terms
- [x] I am reporting an issue for OpenWrt, not an unsupported fork.
xize11
July 24, 2024, 4:03pm
8
This is about the op24 version, right? I think a month or so ago OpenWrt added a variety of MediaTek-related patches, for Ethernet and also for Wi-Fi. I'm not sure whether op24 included the Ethernet one or was forked before it. I'm referring to the commit where they dropped patches and merged the officially added Linux eth_soc variant. Sadly, GitHub still doesn't have a history search in the commit data. I believe it was merged along with a bunch of commits for mt7988 or mt7996 (not the Flint 2 target).
Though I agree it would be nice to have an update again.
-edit-
I see it's in there (this is the full tree), but I think it's still good to update it. That's on GL.iNet, though.
@alzhao It seems this is a known issue in OpenWrt. Can we get a fix? My MT6000 crashes several times a day. It just happened again:
[35675.698389] ------------[ cut here ]------------
[35675.703009] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 0 timed out
[35675.709995] WARNING: CPU: 3 PID: 0 at net/sched/sch_generic.c:473 dev_watchdog+0x2d8/0x2e0
[35675.718242] Modules linked in: pppoe ppp_async option wireguard usb_wwan pppox ppp_generic libchacha20poly1305 ipt_REJECT chacha_neon xt_time xt_tcpudp xt_tcpmss xt_statistic xt_state xt_recent xt_quota xt_pkttype xt_owner xt_nat xt_multiport xt_mark xt_mac xt_limit xt_length xt_hl xt_helper xt_ecn xt_dscp xt_conntrack xt_connmark xt_connlimit xt_connbytes xt_comment xt_addrtype xt_TCPMSS xt_REDIRECT xt_MASQUERADE xt_LOG xt_HL xt_FLOWOFFLOAD xt_DSCP xt_CT xt_CLASSIFY usbserial slhc rndis_host qmi_wwan poly1305_neon nf_reject_ipv4 nf_log_ipv4 nf_flow_table_hw nf_flow_table nf_conntrack_netlink nf_conncount mtqos mtk_warp_proxy(P) mtfwd mt_wifi(P) libcurve25519_generic libchacha iptable_raw iptable_nat iptable_mangle iptable_filter ipt_ECN ipheth ip_tables huawei_cdc_ncm exfat crc_ccitt cdc_wdm cdc_ncm cdc_ether cdc_acm arptable_filter arpt_mangle arp_tables fuse sch_tbf sch_ingress sch_htb sch_hfsc em_u32 cls_u32 cls_route cls_matchall cls_fw cls_flow cls_basic act_skbedit act_mirred
[35675.718323] act_gact xt_set ip_set_list_set ip_set_hash_netportnet ip_set_hash_netport ip_set_hash_netnet ip_set_hash_netiface ip_set_hash_net ip_set_hash_mac ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_hash_ipport ip_set_hash_ipmark ip_set_hash_ip ip_set_bitmap_port ip_set_bitmap_ipmac ip_set_bitmap_ip ip_set nfnetlink ip6table_nat nf_nat ip6t_NPT nf_log_ipv6 nf_log_common ip6table_mangle ip6table_filter ip6_tables ip6t_REJECT x_tables nf_reject_ipv6 ifb ip6_udp_tunnel udp_tunnel tun vfat fat ntfs nls_utf8 nls_iso8859_1 nls_cp437 shortcut_fe_ipv6 shortcut_fe mtdoops mtk_warp mtkhnat leds_gpio uhci_hcd ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl kmwan ehci_hcd gpio_button_hotplug gl_sdk4_tertf gl_sdk4_black_white_list f2fs ext4 mbcache jbd2 conninfra crc32c_generic crc32_generic gl_sdk4_hw_info
[35675.877580] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P 5.4.238 #0
[35675.884960] Hardware name: GL.iNet GL-MT6000 (DT)
[35675.889649] pstate: 60000005 (nZCv daif -PAN -UAO)
[35675.894431] pc : dev_watchdog+0x2d8/0x2e0
[35675.898427] lr : dev_watchdog+0x2d8/0x2e0
[35675.902422] sp : ffffffc01001bdb0
[35675.905722] x29: ffffffc01001bdb0 x28: 0000000000000140
[35675.911019] x27: 00000000ffffffff x26: 0000000000000000
[35675.916315] x25: 0000000000000003 x24: 0000000000000000
[35675.921612] x23: 0000000000000001 x22: ffffff803e08b000
[35675.926908] x21: ffffff803e08b480 x20: ffffffc010ab6000
[35675.932205] x19: 0000000000000000 x18: 0000000000000000
[35675.937502] x17: 0000000000000000 x16: 0000000000000000
[35675.942799] x15: 0000000000000000 x14: ffffffc010b523da
[35675.948095] x13: 00000000000027b8 x12: ffffffc010b51000
[35675.953392] x11: ffffffc010ace000 x10: 0000000000000010
[35675.958689] x9 : 0000000000000000 x8 : 2065756575712074
[35675.963986] x7 : 696d736e61727420 x6 : 0000000000000001
[35675.969283] x5 : ffffffc010358d88 x4 : 0000000000000008
[35675.974580] x3 : 0000000000000004 x2 : 0000000000000004
[35675.979877] x1 : 0000000000000004 x0 : 000000000000003f
[35675.985175] Call trace:
[35675.987612] dev_watchdog+0x2d8/0x2e0
[35675.991262] call_timer_fn.isra.37+0x20/0x78
[35675.995519] run_timer_softirq+0x1e0/0x468
[35675.999602] __do_softirq+0x124/0x260
[35676.003251] irq_exit+0xb8/0xc8
[35676.006383] __handle_domain_irq+0x64/0xb8
[35676.010466] gic_handle_irq+0xc0/0x158
[35676.014201] el1_irq+0xb8/0x140
[35676.017331] arch_cpu_idle+0x10/0x18
[35676.020895] do_idle+0x120/0x148
[35676.024110] cpu_startup_entry+0x20/0x60
[35676.028022] secondary_start_kernel+0x148/0x158
[35676.032539] ---[ end trace 906c19c93bc084ac ]---
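To check whether other crashes in a saved log are the same failure, the watchdog line can be pulled out with a small script. This is only a sketch: `find_tx_timeouts` is a hypothetical helper name, and the two sample lines are the `NETDEV WATCHDOG` lines quoted in this thread (one from my MT6000, one from the OpenWrt issue).

```python
import re

# e.g. "NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 0 timed out"
WATCHDOG = re.compile(
    r"NETDEV WATCHDOG: (\S+) \(([^)]+)\): transmit queue (\d+) timed out"
)

def find_tx_timeouts(lines):
    """Return a list of (interface, driver, queue) tuples, one per watchdog line."""
    hits = []
    for line in lines:
        m = WATCHDOG.search(line)
        if m:
            hits.append((m.group(1), m.group(2), int(m.group(3))))
    return hits

sample = [
    "[35675.703009] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 0 timed out",
    "Sun Jul 16 20:31:12 2023 kern.info kernel: [328426.311461] NETDEV WATCHDOG: eth1 (mtk_soc_eth): transmit queue 1 timed out",
    "[35675.877580] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P 5.4.238 #0",
]

timeouts = find_tx_timeouts(sample)
for iface, driver, queue in timeouts:
    print(f"{iface} ({driver}) stalled on transmit queue {queue}")
```

The same function can be run over the lines of a file saved from `logread` output. Matching interface and driver names across crashes would support the idea that it is the same bug each time, though it does not prove it.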
alzhao
July 25, 2024, 10:22am
10
I will push the guys to check.
Great! Thank you for all your help!