SFT1200 slowing down to half speed

joet · January 3, 2024, 7:37am

I have an SFT1200 Opal router on a fiber provider (last mile is coax) with 1 Gbps service. Throughput on my network is a dependable 800+ Mbps, and test speeds are consistent when I use a different router (an Eero) instead of the GLiNet Opal. I think I should be able to get consistent speeds around 800 Mbps. However, the Opal slows down to half that and never recovers.

When I reboot I get the full 800mbps bandwidth, every time… and all seems good… for about an hour.

Over the next hour it appears to slowwww down to about HALF that speed, finally resting at 340mbps as the top speed. BUT a reboot of the SFT1200 brings top network speed right back up to the full throttle 800mbps speed?!

What gives? I believe the SFT1200 should deliver some consistent speeds, and it seems the only variable is the SFT1200 Opal router?

bring.fringe18 · January 3, 2024, 4:11pm

Do you have another Cat 5 or better ethernet cable to try? I know it sounds strange but then there’s threads like this:

joet · January 3, 2024, 4:43pm

No wifi enabled, no VPN enabled, high quality cable (tried a few different cables), updated firmware to latest. About 50 clients, but most are not that active

point being a reboot brings speed back to highest possible, so I don’t think it’s the cable

bring.fringe18 · January 3, 2024, 4:52pm

50 Clients?! I really don’t think the Opal has the horsepower to keep up with the SYN/ACK & other polling, etc., going on even with those idle clients.

@hansome : Is there an upper limit for simultaneous client connections on the Opal like there is on the Flint v1 (120 clients)?

joet · January 3, 2024, 4:58pm

50 devices / reservations – About half of those are smarthome outlets or lights, expected minimal traffic I would think – only two active people, 2 laptops 2 phones, a few streaming devices

I’ve also disabled traffic monitoring (which cuts the top speed for sure)

joet · January 3, 2024, 8:59pm

Continuing to look: the CPU load was typically high, above 2, and network speeds are even slower under higher load. So I’ve disabled gl_tertf, the client traffic monitoring and blocking process. Even though that was disabled in the web admin panel, the process was still running and taking up significant CPU

/etc/init.d/gl_tertf disable && /etc/init.d/gl_tertf stop

cpu load is back down to 1 or around 1.2, normal range
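
To see whether load creeps up as throughput drops, the 1-minute load average could be logged at intervals. A minimal sketch, assuming the standard Linux /proc/loadavg format; the function name and the optional file argument are illustrative and not part of the GL.iNet firmware.

```shell
# Print the 1-minute load average.
# The optional argument (default /proc/loadavg) exists only so the
# function can be tried against a saved copy; it is an illustrative choice.
load_1min() {
    # /proc/loadavg looks like: "1.20 1.10 1.05 1/100 1234"
    cut -d ' ' -f 1 "${1:-/proc/loadavg}"
}
```

For example, `while true; do echo "$(date) $(load_1min)"; sleep 60; done` prints one timestamped sample per minute, which can be lined up against the time the speed drops.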

I’ve tried clearing the system log, also restarting the firewall, all with no impact – top speed is 350mbps

rebooting doubles that, for an hour, so it seems some type of system process that accumulates over time is steadily bringing the top available client network speed down

hansome · January 4, 2024, 6:44am

Do you use the same speedtest server?
Does logread show anything abnormal when the speed slows down?

logread -f
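
Since the log can be noisy, it may help to filter the stream down to lines about links and interfaces. A small sketch; the keyword list is a guess at what could be relevant, not a known set of messages emitted by the Opal firmware, and the function name is made up for illustration.

```shell
# Keep only log lines that mention link or interface changes.
# The keyword list is an assumption and will likely need adjusting.
filter_net_events() {
    grep -iE 'link|carrier|netifd|eth[0-9]'
}
```

It would be used as `logread -f | filter_net_events` and left running until the speed drops.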

alzhao · January 4, 2024, 8:12am

50 clients are all on cable?

joet · January 4, 2024, 2:24pm

I tried different servers to see if there was any difference (no difference)

joet · January 4, 2024, 2:49pm

50 devices (IPs) on the network, all via wired Ethernet to the Opal. Opal does not have 2.4 or 5ghz enabled. The wifi devices on the network (some smart outlets mostly) are using a Eero in bridge mode.

Speed tests are via wired devices (not using wifi there), and in some cases directly plugged into the Opal (without a switch in between just to be sure, with the same results). I’m not testing via wifi at all to avoid confusion over potential max data bandwidth, the connections are all 1gig devices.

I’ll restate: after rebooting the speed is great, and then it slows down. I don’t believe I’m expecting miracles from this device based on the specs. A reboot returns a solid 800Mbps result over and over again for about an hour (sometimes less). I’ve disabled all the internal GLiNet traffic monitoring (as much as possible), but it sure feels like something like that is pulling the speed down over time. Once slow it does not recover until the Opal is rebooted again, and the pattern repeats

I don’t see any messages from logread -f

bring.fringe18 · January 4, 2024, 7:30pm

There’s a method to force interface links to only negotiate on GbE in OpenWrt but IDK if that conflicts with GL’s customizations for the Opal. I’d be hesitant to try it given @hansome & @alzhao would be better suited to give insights here of course.

Of course I’m assuming at least one of the connected client’s interface cards can run GbE. I’ve seen Realtek GbE NICs only capable of pushing 600 Mbps… on a good day. OP, do you happen to have a card that features TOE?

In the mean time it might be worth pulling a backup just in case:

McMuckle · January 4, 2024, 11:59pm

I have an SFT-1200 and fibre 1gbs, but I only use the SFT as a backup WAN connection via 5G if the fibre drops. Had a few battles getting that set up reliably, but all good now.

My main router is a Deco M4 mesh and I too battled with slow throughput initially which was eventually resolved (changing the backhaul method from wifi to wired ethernet was my cure), so I feel your pain. Like you, after a reboot I would get good speed, but only for a few minutes. Maddening.

Anyway, I’d be happy to test my fibre with the SFT if I find some time this weekend.

When you run your speed tests when it slows down, is the device you run the speed tests from the only device connected to the SFT-1200 on its LAN side? Everything else is disconnected?

Cheers

Mike

Correction: My main router is a Sonicwall. The Deco is now double natted on the LAN side of the SW

joet · January 5, 2024, 2:22am

Instead of rebooting the router (getting performance back), I’m able to just restart networking

/etc/init.d/network restart

restart() {
        ifdown -a
        sleep 1
        trap '' TERM
        stop "$@"
        start "$@"
}

takes about 10 seconds and doesn’t bring down the router (just a workaround)

(update) now trying just this line…
ubus call network reload (doesn’t have any effect)

which should narrow the focus

alzhao · January 5, 2024, 5:02am

Maybe check out the CPU load when it slows down.

ps -e
top

joet · January 5, 2024, 8:16am

See my prior post: I have been monitoring CPU load. Disabling gl_tertf brought the load from ~2 down to about 1.2, and it now stays in the normal range at all times, but that otherwise had little impact

when the performance is low, resetting the network using ‘/etc/init.d/network restart’ restores full network performance

/etc/init.d/network restart

until it slows down again (and remains slow)

joet · January 6, 2024, 9:00pm

this also works (trying to narrow down the ‘network reset’ action here)
ifdown wan lan && ifup lan wan

oddly enough either command alone does not improve speed

ifdown wan && ifup wan
ifdown lan && ifup lan
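
To repeat these experiments consistently, the bounce could be wrapped in a small helper. This is only a sketch: it calls ifdown and ifup once per interface rather than passing several names to one command, which is an assumption about equivalence and may not reproduce exactly what the combined command does. The function name and the DRY_RUN variable are made up for illustration.

```shell
# Take every named interface down, then bring each one back up.
# With DRY_RUN set to a non-empty value, only print the commands.
bounce_ifaces() {
    for ifc in "$@"; do
        ${DRY_RUN:+echo} ifdown "$ifc"
    done
    for ifc in "$@"; do
        ${DRY_RUN:+echo} ifup "$ifc"
    done
}
```

Running `DRY_RUN=1 bounce_ifaces wan lan` first shows what would be executed; `bounce_ifaces wan`, `bounce_ifaces lan` and `bounce_ifaces wan lan` can then be compared against a speed test after each.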

hansome · January 8, 2024, 1:23pm

check if some other clients establish connections and take some bandwidth away.

cat /proc/net/nf_conntrack|wc -l
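
The raw count does not show which client owns the connections. A sketch that groups tracked connections by originating address, assuming the usual nf_conntrack text layout where the first `src=` field on a line is the original direction; the function name and optional file argument are illustrative.

```shell
# Count tracked connections per originating source address, busiest first.
# Reads /proc/net/nf_conntrack by default; the optional file argument is
# only there so a saved copy can be inspected.
conns_per_source() {
    awk '{
        # The first src= field on each line is the original direction.
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^src=/) { count[substr($i, 5)]++; break }
        }
    }
    END { for (ip in count) print count[ip], ip }' "${1:-/proc/net/nf_conntrack}" |
        sort -rn
}
```

Comparing the output of `conns_per_source | head` right after a reboot and again once the speed has dropped would show whether any one client is piling up connections.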

joet · January 9, 2024, 3:47am

some connections… not anything close to “half bandwidth”

speaking of which, it seems unusual that it’s right about half speed

I also attempted to reinstate “traffic monitoring” and confirmed very low performance when gl_tertf is enabled

joet · January 11, 2024, 12:13am

I have “realtime traffic” graphs which show usage levels for each interface (not related to gl_tertf), using the OpenWrt LuCI admin page. The graphs show there is little other bandwidth being consumed before and after the bandwidth tests for quite a while. There’s a massive spike during the test, which is expected. The number of connections remains more or less consistent. There are not many clients and I’m not pushing the network during the test. This further confirms there isn’t much else going on on the network, like another client or an external agent using bandwidth that would interfere with the test results. That should be evident from the realtime graphs.
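
The same check can be made from the shell by sampling an interface byte counter twice. A sketch assuming the standard Linux sysfs counters under /sys/class/net; the function names are made up, and the interface name is whatever the router actually uses (not something stated in this thread).

```shell
# Convert a byte-counter delta over a time window into Mbit/s.
# Usage: mbps START_BYTES END_BYTES SECONDS
mbps() {
    awk -v a="$1" -v b="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (b - a) * 8 / 1000000 / s }'
}

# Sample an interface receive counter twice and print the rate.
# Usage: rx_rate IFACE [SECONDS]   (the interval defaults to 5)
rx_rate() {
    f="/sys/class/net/$1/statistics/rx_bytes"
    start=$(cat "$f")
    sleep "${2:-5}"
    end=$(cat "$f")
    mbps "$start" "$end" "${2:-5}"
}
```

As a sanity check on the arithmetic, 100,000,000 bytes in one second is 800 Mbit/s, so `mbps 0 100000000 1` prints `800.0`. Running `rx_rate` on the WAN interface while no test is in progress gives the background usage the graphs are showing.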

Resulting speeds are confirmed with two different machines, performing the test on one and then on the other (not at the same time, and always with matching results)

To regain full bandwidth… It’s also odd that restarting the network works (/etc/init.d/network restart, whose script uses ifdown -a). The simple command “ifdown wan lan && ifup wan lan” also works, but it has to include BOTH the wan and lan interfaces.

Running “ifdown wan && ifup wan” multiple times does not work, and following that with multiple runs of “ifdown lan && ifup lan” does not work either.

Using this approach ifdown && ifup must include both interfaces, after which, the speed test uses 100% of the available bandwidth.

I’ve tried restarting a number of services (dns, firewall, mwan3, relayd), but nothing else has any effect when the speed is low. The link still says it’s “full duplex”, but it’s odd that the speed is just about half when it’s low.
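
One way to record the negotiated link state before and after the slowdown is to read it from sysfs. A sketch assuming the standard Linux `speed` and `duplex` attributes; the interface names on the Opal are not given in this thread, and on switch-based routers the value may describe the CPU port rather than an individual LAN port. `SYSFS_NET` and the function name are illustrative.

```shell
# Print negotiated speed (Mbit/s) and duplex for one interface.
# SYSFS_NET is an illustrative override for the sysfs root.
link_state() {
    d="${SYSFS_NET:-/sys/class/net}/$1"
    echo "$1 speed=$(cat "$d/speed") duplex=$(cat "$d/duplex")"
}
```

If `link_state` reports the same speed and duplex in both the fast and the slow state, renegotiation is probably not what the interface bounce is fixing.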

Be great to hear from GLiNet if there were more specific commands to try, to reset. Might narrow down the component of software in question.

hansome · January 11, 2024, 3:30am

Please print CPU interrupt info, may be an issue with SMP affinity.

cat /proc/interrupts
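
The full table can be long, so it may be easier to compare only the network-related rows between the fast and slow states. A sketch that keeps the CPU header row plus any line matching a pattern; the default pattern `eth` is a guess at how the NIC interrupt is labelled on this device, and the function name is made up.

```shell
# Show /proc/interrupts lines matching a pattern, keeping the CPU header
# row so the per-core columns stay readable.
# Usage: irq_lines [PATTERN] [FILE]
irq_lines() {
    awk -v pat="${1:-eth}" 'NR == 1 || $0 ~ pat' "${2:-/proc/interrupts}"
}
```

If nearly all of the counts land on one core, the affinity mask for that IRQ number can be read from /proc/irq/<n>/smp_affinity.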