GL-MT6000 Flint 2 randomly stops working

A few months ago I bought a brand-new GL-MT6000 and installed the official OpenWrt from openwrt.org. Ever since, it randomly stops working: sometimes after days, sometimes weeks, sometimes months. I have to unplug the power and plug it back in to get it working again. It has not happened often enough for me to fully figure out what is going on, but last time I think it was the Ethernet interfaces that stopped responding, because the OS itself was still running normally according to the system log I set up to persist on NAND. I meant to check next time whether I could still reach it over Wi-Fi, but now it has failed while I am away for weeks, and it is a disaster! I am really frustrated with this router and wonder if I should get rid of it and go back to more reliable routers.

Has anyone else had such issues, or does anyone have a clue what is going on? I just realized that you can also download the “vanilla” OpenWrt from GL.iNet here https://dl.gl-inet.com/router/mt6000/open. Is this different from the official OpenWrt, and would using it solve my problem?

The OpenWRT 24 firmware version has the GL GUI on top of open source drivers.

As for the official OpenWRT, that's probably a question for their forums.

It does? So what is the difference between this one and this one?

The beta and stable versions are not using OpenWrt version 24.10 as base, but a much older version of OpenWrt.

This older version was forked by MediaTek, the maker of the chip in your router. Chip vendors often build SDKs from such versions and include proprietary drivers, which are different from mt76, OpenWrt's open source driver for MediaTek. The only thing the two drivers have in common is the firmware blobs.

Vendor SDKs are used on a variety of routers, not only in GL firmware; think of brands like TP-Link and a bunch of others. It may not be obvious because the firmware is often locked. Tip: if you are active on the OpenWrt forums, look at the section where people request support for new routers. You will see closely related things there, such as log output, and you will be amazed 👍

Then there is also the GL SDK, which does not track the OpenWrt base version. GL.iNet often maintains newer versions of its GL SDK on the chip vendor firmware, which adds to the confusion.

The GL SDK is the UI: their full dashboard and its logic.

So the OP24 version on the GL page is OpenWrt 24 with the GL UI and features, but with a lower version of the GL SDK.

But OP24 for GL is still different from vanilla OpenWrt because of the heavy modifications by GL.iNet. For support questions, it is usually best to post help topics with the correct maintainer; otherwise developers may spend a lot of time hunting for a bug that was never in their code, and it won't help the community either, because some things may work differently :slight_smile:

Also back to the issue:

It doesn't really sound like a software issue to me, but I could be mistaken; I'm just being optimistic.

Do you use a switch somewhere in between? Can you try a direct wired connection to the Flint 2 and see what happens then?

If that doesn't change anything, what about AdGuard? Do you use a lot of blocklists?

AdGuard can be a tricky one. Even though the Flint 2 has a lot more memory than most routers, it is possible that it is simply too much and the memory fills up.

And the cable itself: have you tried a different one, with no damage or exposed core wiring?

Do you have an IP conflict, or maybe worse, a less obvious one: a conflict where a layer 2 switch's IP clashes with another device? That can make the switch appear offline, as I observed with UniFi switches, while the devices behind it stay online. It can result in very slow speeds, disconnections, or devices only showing up after several reloads.

Nothing else on the switches, like spanning tree (STP)? Mixing different STP versions or flavours can also cause very glitchy and unreliable behaviour, especially under heavy upload/download traffic, where you would not expect it to fail. It could trigger other switches upstream on completely different ports and cause a loop towards the main router.

There is a switch now, but before there wasn’t, and adding the switch didn’t make a difference.

The Ethernet cables are brand-new Cat6.

I disabled STP on everything I could so as far as I know there is no STP.
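One way to double-check this on the router itself: the Linux bridge exposes its STP setting in sysfs, so a small loop can print it for every bridge. This is only a rough sketch; `stp_report` is a made-up helper name, and it only covers bridges on the router, not the external switches.

```shell
#!/bin/sh
# Sketch: print the STP state of every Linux bridge on this machine.
# stp_state values: 0 = STP off, 1 = kernel STP, 2 = userspace STP daemon.
stp_report() {
    for f in /sys/class/net/*/bridge/stp_state; do
        [ -r "$f" ] || continue          # skip if no bridge matched the glob
        br=${f#/sys/class/net/}          # strip the leading sysfs path
        br=${br%%/*}                     # keep only the bridge name
        echo "$br stp_state=$(cat "$f")"
    done
}

stp_report
```

On a typical OpenWrt setup the LAN bridge is named `br-lan`, so a line such as `br-lan stp_state=0` would confirm STP is off there.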

Last time this happened, I found that right when the router stopped working (not sure if just before or at the same time), the Ethernet interface of an iMac plugged into the router started crashing every minute with the netdev watchdog message “transmit timed out, resetting”, and ethtool reported tx_mac_errors: 62. This iMac's Ethernet had the same transmit timeout errors before I had this router, so I thought it might be related and have left it unplugged since. It then didn’t happen again for a while, until now, when I am not there :man_facepalming: So it’s hard to debug. But it seems to be only the Ethernet side, because according to the system log the Wi-Fi devices connected directly to the router were still getting DHCP. So I was hoping that next time I would connect to its Wi-Fi directly and explore the OS in the broken state.
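To see whether the router's own kernel logs similar timeouts, something like the following could be run on it. It is a minimal sketch: `count_tx_timeouts` is a hypothetical helper, and the match string is an assumption based on the message quoted above, since the exact wording differs between drivers and kernel versions.

```shell
#!/bin/sh
# Sketch: count log lines that mention a timeout, reading the log from stdin.
# The pattern 'timed out' is an assumption; adjust it to the actual message.
count_tx_timeouts() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so '|| true' keeps the helper from looking like a failure.
    grep -ci 'timed out' || true
}

# Example with sample text; on the router, pipe in `logread` or `dmesg` instead.
printf '%s\n' 'NETDEV WATCHDOG: eth0: transmit timed out' 'link is up' | count_tx_timeouts
```

On the router this would be used as `logread | count_tx_timeouts` or `dmesg | count_tx_timeouts`.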

No AdGuard; I just have WireGuard and VLANs. The memory usage was always low. I don’t think there is any IP conflict.

Even if it were something like STP, wouldn't it only affect one layer 2 network and not the WAN port, which is a separate interface? But here all of the ports die. The WAN and LAN are on the two 2.5G ports, and IIRC they are separate dedicated interfaces?

Hmm, which firmware do you use now? And which port on the Flint 2 was the iMac connected to?

My guess is that maybe you run the MTK SDK version and the 2.5G port does not like certain traffic from the Mac very much. You can temporarily check whether the 1 Gb ports do the same; if they don't cause issues, you could try op24, which has much newer Ethernet drivers, to see if that is stable.

Just in case: is hardware acceleration off? It doesn't work well on MTK versions.
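On stock OpenWrt, the closest equivalent is probably the firewall's flow offloading setting, which can be read with `uci`. The sketch below assumes that mapping; `offload_state` is a made-up helper name, and an option that is not set is treated as off.

```shell
#!/bin/sh
# Sketch: show the software and hardware flow offloading settings.
# Assumes the OpenWrt firewall config; an unset option is reported as 0 (off).
offload_state() {
    if ! command -v uci >/dev/null 2>&1; then
        echo "uci not found (not an OpenWrt system?)"
        return 0
    fi
    for opt in flow_offloading flow_offloading_hw; do
        val=$(uci -q get "firewall.@defaults[0].$opt")
        echo "$opt=${val:-0}"
    done
}

offload_state
```

If `flow_offloading_hw=1` shows up, turning it off for a while and watching whether the hangs stop would be a cheap test.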

Nope, layer 2 also includes broadcast, ARP and multicast. Here is what happened on my network when STP versions got mixed: on lan1 I had a UniFi switch on a managed VLAN using RSTP, and on port 2 a Zyxel switch using classic STP. The Zyxel sensed high traffic as a loop, and that made OpenWrt think traffic from lan2 was arriving with the router's own source address, so it locked up. I was able to observe this over Wi-Fi.

After turning STP off and trying to replicate it, the issue was gone. It is crazy how such a thing can glitch: you would expect the Zyxel switch not to act on traffic from lan1, but it does because of layer 2.

Only WAN would be unaffected in this scenario; lan1, the 2.5G port, would also be unreachable since it is part of the LAN.

The firmware is OpenWrt 24.10.1 r28597-0425664679

Fortunately I got someone to turn the power off and on to reboot the router. As before, there is nothing in the small log buffer; it just shows the router kept working while offline.

I hope next time it dies is when I’m back so I will connect to it on wifi and see what is going on. What should I check? What commands should I run?

You could try logread first, in case the error is obvious.

And dmesg for kernel logs
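Since the router has to be power-cycled anyway, it may help to save everything to flash first. Below is a rough sketch of that idea; the helper names (`save`, `iface_counters`, `collect_snapshot`) and the default path `/root/hang-snapshot` are made up for illustration, and `ethtool` is only used if it happens to be installed.

```shell
#!/bin/sh
# Sketch: save the router's state to files before power-cycling it.

# Run a command if it is installed and store its output under $SNAP_DIR.
save() {
    name="$1"; shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$SNAP_DIR/$name.txt" 2>&1 || true
    else
        echo "$1 is not installed" >"$SNAP_DIR/$name.txt"
    fi
}

# Print link state and error counters for every interface, read from sysfs.
iface_counters() {
    for path in /sys/class/net/*; do
        [ -d "$path/statistics" ] || continue
        printf '%s state=%s tx_errors=%s rx_errors=%s\n' \
            "${path##*/}" \
            "$(cat "$path/operstate" 2>/dev/null)" \
            "$(cat "$path/statistics/tx_errors" 2>/dev/null)" \
            "$(cat "$path/statistics/rx_errors" 2>/dev/null)"
    done
}

# Collect logs, addresses, routes and counters into one directory.
collect_snapshot() {
    SNAP_DIR="${1:-/root/hang-snapshot}"   # assumed location, change as needed
    mkdir -p "$SNAP_DIR" || return 1
    save logread logread
    save dmesg dmesg
    save addresses ip addr show
    save routes ip route show
    iface_counters >"$SNAP_DIR/counters.txt"
    for path in /sys/class/net/*; do
        [ -d "$path" ] || continue
        save "ethtool-${path##*/}" ethtool -S "${path##*/}"
    done
    echo "$SNAP_DIR"
}
```

After connecting over Wi-Fi in the broken state, calling `collect_snapshot /root/hang-1` would write one text file per command, which can be copied off and compared with a snapshot taken while everything works.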