SFT1200 2.4 GHz Repeater Disconnects

Is this one of the reasons why we disconnect? How could channel 0 have been read from the beacon here?

"lmac[1] vif working channel(36) is different from channel(0) of received beacon frame, notify that we are lost!"

LOG says ...

Sun Nov 3 10:19:57 2024 daemon.info lua: (...pkg-mips_siflower/gl-sdk4-repeater/usr/sbin/repeater:548) <3>CTRL-EVENT-CONNECTED - Connection to dc:2c:6e:ed:04:c4 completed [id=0 id_str=]
Sun Nov 3 10:19:57 2024 user.notice mwan3[20542]: Execute ifdown event on interface wwan (unknown)
Sun Nov 3 10:19:57 2024 user.info mwan3track[20068]: Detect ifdown event on interface wwan (wlan-sta0)
Sun Nov 3 10:19:57 2024 user.notice mwan3track[20068]: Interface wwan (wlan-sta0) is offline
Sun Nov 3 10:19:58 2024 daemon.notice wpa_supplicant[29476]: wlan-sta0: CTRL-EVENT-DISCONNECTED bssid=dc:2c:6e:ed:04:c4 reason=1 locally_generated=1
Sun Nov 3 10:19:58 2024 daemon.info lua: (...pkg-mips_siflower/gl-sdk4-repeater/usr/sbin/repeater:548) <3>CTRL-EVENT-DISCONNECTED bssid=dc:2c:6e:ed:04:c4 reason=1 locally_generated=1
Sun Nov 3 10:19:58 2024 kern.warn kernel: [317172.048909] lmac[1] vif working channel(36) is different from channel(0) of received beacon frame, notify that we are lost!
Sun Nov 3 10:19:58 2024 kern.warn kernel: [317172.060313] lmac[1] {CTXT-0} unlink from {VIF-1}: status=4 nb_vif=2
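To watch for this specific warning in a saved log, a minimal Python sketch could look like the following. The regular expression is written against the single kernel line quoted above; the function name is made up for illustration, and other firmware versions may word the message differently.

```python
import re

# Pattern built from the one lmac warning quoted in the log above (assumption:
# other firmware builds may phrase it differently).
MISMATCH_RE = re.compile(
    r"lmac\[(?P<lmac>\d+)\] vif working channel\((?P<working>\d+)\) "
    r"is different from channel\((?P<beacon>\d+)\) of received beacon frame"
)

def parse_channel_mismatch(line):
    """Return (lmac, working_channel, beacon_channel), or None if no match."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    return int(m["lmac"]), int(m["working"]), int(m["beacon"])

sample = (
    "Sun Nov 3 10:19:58 2024 kern.warn kernel: [317172.048909] lmac[1] vif working "
    "channel(36) is different from channel(0) of received beacon frame, "
    "notify that we are lost!"
)
print(parse_channel_mismatch(sample))
```

Counting how often the beacon channel comes back as 0 versus some other wrong value would show whether the driver is misreading the beacon or really hearing a different AP.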

So does that mean that you get disconnections because the uplink router/modem disconnects the repeater?

Which of the two devices (uplink router or repeater) is responsible is hard to find out.

  • The GL.inet repeater claims that a beacon with the wrong (zero) channel was received, and says it has lost the connection.
    The uplink router (Mikrotik) says ..
    disconnected, ok, signal strength -57/
    attempts to associate/
    connected signal strength -57/
    reassociating/
    disconnected, ok, signal strength -57/

In Mikrotik language, "reassociating" means that the device associates while still associated.

This happens on all my Mikrotiks hAPac2, hAPax2, hAPac3, wAPac, hAP Lite .... no other client device shows any problems on those AP, only the GL.inet SFT1200.

Same problem with my friend. He is testing at our 30 AP large monitored Mikrotik campus installation. Performance on all connections is excellent, except the SFT1200.

So who is dropping the connection here ? I think it is the GL.inet, not the uplink router.
2.4 GHz is actually unusable, 5 GHz drops too frequently under load.

EDIT: It was exceptionally bad this evening. A friend at a remote location had SFT1200 connections that lasted only 1 or 2 seconds, even on the 5 GHz band, where sessions used to last 5 minutes. Ideal for doing some tweaking on the Mikrotiks, which I manage remotely. Guess what ... the session was stable for 8 hours! No disconnects. OK, this was on the 5 GHz band; 2.4 GHz is stable if 802.11g only is forced there. There is no equivalent on 5 GHz (n or n/ac makes no difference).

So tweaked a lot (RTS/CTS, AMSDU size, only 6Mbps basic rate, multicast helper), and the "multicast helper" set to "Full" made it 100% stable. Not only no-disconnects, but 100%CCQ and the dynamic interface rate goes up to the max possible : 400Mbps for 40MHz width/2 stream/short guard. All other tweaks turned off, it remains stable 100% with only the "multicast helper" set to full, on Mikrotik
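The 400 Mbps figure matches the standard OFDM rate arithmetic. A small Python sketch, assuming the link negotiated VHT MCS 9 (256-QAM, rate 5/6) with 108 data subcarriers on 40 MHz and a 3.6 µs symbol for short guard interval; these PHY parameters are my assumption, the post only gives width, streams and guard interval:

```python
from fractions import Fraction

def vht_phy_rate_mbps(data_subcarriers, bits_per_subcarrier, coding_rate,
                      streams, symbol_us):
    """PHY rate in Mbps = data bits per OFDM symbol / symbol duration in us."""
    bits_per_symbol = data_subcarriers * bits_per_subcarrier * coding_rate * streams
    return float(bits_per_symbol / symbol_us)

# Assumed: VHT MCS 9, 40 MHz channel, 2 spatial streams
short_gi = vht_phy_rate_mbps(108, 8, Fraction(5, 6), 2, Fraction(36, 10))
long_gi = vht_phy_rate_mbps(108, 8, Fraction(5, 6), 2, Fraction(4))
print(short_gi, long_gi)
```

With these inputs the short guard interval gives 400 Mbps and the long one 360 Mbps, so a reported 400 Mbps means the link sits at the top rate for that configuration.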

In Mikrotik, the multicast helper set to "default" (which is the default setting) actually means the helper is OFF!
Multicast helper on "Full" converts multicasts into separate unicasts.
I don't know whether this is done for beacons as well.

Mikrotik - SFT1200 : there clearly is a multicast/broadcast problem. Wrong channel info as read from beacon, forces disconnects. But yes beacon/multicast does not do ACK-NACK or retransmits.

Have to test this helper workaround on the 2.4 GHz band, and compare it to the previous 802.11g workaround.

EDIT 2

Previous ... now this is on 4.3.21, which supports DFS channels: scanning, selecting, receiving and repeating/transmitting on channel 132, with country BE (Europe/ETSI). There are more changes than listed in the release notes. The previous 2.4 GHz workaround of forcing "802.11g" does not help anymore??? Does everything have to be tested again???

And we have new error types now:
Sat Nov 9 14:21:10 2024 kern.warn kernel: [ 9333.012255] lmac[1] Error, msdu index reach maximum limit, 31
Sat Nov 9 14:21:10 2024 kern.warn kernel: [ 9333.018094] lmac[1] Error, buffer is not uploaded to host buffer completely

Damned GL.inet SFT1200 device. It claims a lot of external IP addresses as pointing to its own uplink MAC address, even when used in WLAN-connected repeater mode, set as router (aka WISP mode). This feature does exist for the LAN ethernet interface, as "Drop-in gateway", but it's not activated here, and it does not apply to the WLAN uplink interface. My Hotspot (Mikrotik) dynamic mode is triggered to NAT some of the leaked SFT1200 LAN addresses (192.168.8.x) to its normal Hotspot range (which the SFT WLAN address is part of, as it got it from that Mikrotik's DHCP). It sees too many MAC/IP links in DHCP and denies access.

Only wireless is used in this setup, no ethernet cable attached.


What Mikrotik is the uplink?
Mikrotik have a proprietary WiFi called Nv2 and a predecessor called Nstreme.
The sft1200 only has access to proper non-proprietary WiFi.

If a Mikrotik uplink finds a Mikrotik downstream they will converse in their own proprietary communication over the WiFi connection.

Yes, good remark, but the GL.inet SFT1200 will not even see the Nv2 or Nstreme wifi signals. They are not 802.11 (CSMA/CA) but use time-division multiplexed access: a proprietary TDMA implementation by Mikrotik.

I'm very well aware of the Mikrotik wifi settings (I made 3000+ replies in the Mikrotik forum, on wifi performance and stability problems, with tuning suggestions and workarounds)

To answer your question, the SFT has uplink to hAP ac2 with ROS 6.49.13 , a very stable version. I also test with ROS 7.12 and the newer Wifi drivers on hAPac3 and hAP ax2.

For the WLAN drivers, my current tuning for a stable SFT wifi connection is:

  • set Multicast Helper to FULL
  • configure data rates, reduce Basic rates A/G to 6 Mbps only
  • VHT Supported MCS is set to MCS 0-9 for the first two streams, but the third is set to "none"
  • VHT Basic MCS is set to "none" (!!!)
  • AMSDU Limit and Threshold are both set to 2300 (default is 8192)

The newer Wifi driver cannot be fine-tuned like the legacy WLAN driver.

By the way: with the newest release, the SFT1200 does respond on DFS channels as repeater, with country set to Belgium. It still may not select DFS channels on its own, as it seems not to be able to do radar detection. (As repeater, the uplink AP has already done that radar detection.)

This works on the 5 GHz connection. The 2.4GHz is still a total disaster, unless the AP limits to "legacy" (or G only) , and does not use "n". Only problem is SFT1200 in that area.


Status so far

  • DFS use with 5 GHz can clearly be seen in the log now. (v4.3.21)
  • disconnects with OpenWRT are all over the OpenWRT forum, with many tests and replies
  • Most replies after tests state: it is a broadcast/multicast problem (not received packet), and if it is for the ARP packet then there are repeated disconnects
  • " ARP/broadcast issues for wireless clients" in forum and github. Only solved since version 23.03
  • Use a SSH session on the SFT1200 and issue "ip neigh" command. And you will see the STALE state of ARP entries for AP, gateway and DNS server.
  • We learned to avoid the 802.11b rates (1Mbps ...) because of the beacon overhead (30 beacon senders at 1Mbps will use 50% of all airtime), but maybe here we should enable them, as failed broadcast/multicast transmission might be the ultimate cause of the disconnects.
  • that MSDU error : larger than 31 , is also still there in the log
  • "disconnect reason=2" ... caused by stale ARP entry ?
  • and every time the SFT1200 (re)connects to the uplink AP, its MAC address appears in the ARP table of the uplink AP with both the WAN and the LAN IP address (local IP address leaking?)
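The beacon airtime remark in the list above can be checked with rough arithmetic. A Python sketch, assuming a 200-byte beacon, a 102.4 ms beacon interval, all 30 APs on one channel, and ignoring contention and inter-frame spacing; those numbers are my assumptions, not measurements from this setup:

```python
import math

BEACON_INTERVAL_US = 102_400  # assumed default: 100 time units of 1024 us

def beacon_airtime_us(frame_bytes, rate_mbps):
    """Approximate on-air time of one beacon at 1 Mbps DSSS or an OFDM rate."""
    if rate_mbps == 1:
        # DSSS long preamble + PLCP header take 192 us, payload goes at 1 Mbps
        return 192 + frame_bytes * 8
    # OFDM: 20 us preamble + SIGNAL, then 4 us symbols of 4 * rate bits each;
    # 16 SERVICE bits and 6 tail bits are added to the payload
    bits_per_symbol = 4 * rate_mbps
    symbols = math.ceil((16 + frame_bytes * 8 + 6) / bits_per_symbol)
    return 20 + 4 * symbols

def beacon_share(n_aps, frame_bytes, rate_mbps):
    """Fraction of airtime used by beacons from n_aps on the same channel."""
    return n_aps * beacon_airtime_us(frame_bytes, rate_mbps) / BEACON_INTERVAL_US

print(f"1 Mbps: {beacon_share(30, 200, 1):.1%}")
print(f"6 Mbps: {beacon_share(30, 200, 6):.1%}")
```

Under these assumptions 30 beacon senders at 1 Mbps take a bit over half the airtime, and under a tenth at 6 Mbps, which is why the low rates were disabled in the first place.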

"ip neigh" ... used it here: GL-SFT1200:/proc/sys/net/ipv4/neigh/wlan-sta0#

LOG lines:
Mon Nov 25 20:57:57 2024 daemon.info lua: (...pkg-mips_siflower/gl-sdk4-repeater/usr/sbin/repeater:548) <3>WPA: Group rekeying completed with b8:69:f4:[edit] [GTK=CCMP]
Mon Nov 25 20:57:57 2024 kern.info kernel: [30245.505752] hb-fmac 17800000.wifi-hb wlan-sta0: Add key for vif(0), key index : 1
Mon Nov 25 22:08:20 2024 kern.warn kernel: [34468.551984] lmac[1] Error, msdu index reach maximum limit, 31
Mon Nov 25 22:08:20 2024 kern.warn kernel: [34468.557780] lmac[1] Error, buffer is not uploaded to host buffer completely
Mon Nov 25 22:08:20 2024 kern.warn kernel: [34468.564763] lmac[1] Error, msdu index reach maximum limit, 31
Mon Nov 25 22:08:20 2024 kern.warn kernel: [34468.570546] lmac[1] Error, buffer is not uploaded to host buffer completely
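For the "ip neigh" check from the status list, the output can be filtered in a script instead of reading it by eye. A Python sketch; the sample output below is hypothetical (documentation-range addresses and made-up MACs), only the interface name wlan-sta0 comes from the thread. Note that STALE by itself is a normal neighbour-cache state, so it is only suspicious if the entries stay that way while traffic is flowing.

```python
def stale_neighbours(ip_neigh_output, dev="wlan-sta0"):
    """Return the IP addresses whose neighbour entry on `dev` is STALE."""
    stale = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        # Expected shape: <ip> dev <ifname> lladdr <mac> <STATE>
        if len(fields) < 4 or fields[-1] != "STALE":
            continue
        if "dev" in fields and fields[fields.index("dev") + 1] == dev:
            stale.append(fields[0])
    return stale

# Hypothetical `ip neigh` output, for illustration only
sample_output = """\
192.0.2.1 dev wlan-sta0 lladdr 02:00:00:00:00:01 STALE
192.0.2.53 dev wlan-sta0 lladdr 02:00:00:00:00:02 REACHABLE
192.168.8.120 dev br-lan lladdr 02:00:00:00:00:03 STALE
"""
print(stale_neighbours(sample_output))
```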


Doing more tests ... want to fix this!

Ordered a competing cheap CUDY TR1200 ...

And compared performance ... horrible. The GL-SFT1200 only gets 10 Mbps down and 6 Mbps up as best values in all my different attempts (2.4 and 5 GHz), while the Cudy TR1200, connecting to the exact same Mikrotiks from the same spot, gets the full ISP speed I have (50 Mbps down, 10 Mbps up) on all connections.

The SFT 1200 when doing OOKLA Speedtest has this in its LOG ...
"txq 216, try to reduce ps skbs to advoid ps station block ndevq"
Don't know what this means.


skbs aka socket buffers:
http://oldvger.kernel.org/~davem/skb.html

So ps I'm going to guess is packet something.
If you get past the avoid, ndevq, network device queue?

So it's reporting a bottleneck at the wireless station: the packet socket buffers need to be reduced.

Don't tempt me on the TR1200, beautiful design ruined by a 16mb flash chip, 16mb isn't big enough these days.

Thanks for the link.

I eventually downgraded the Firmware to 3.216. They (GL.inet) eliminated some things from the firmware package (like hotspot, etc) to make 4.x fit. But still this seems to be too heavy/big for the poor SFT1200.

Performance now with firmware 3.216 is 50% of the Cudy TR1200. This is MUCH better than the 4.3.21 firmware on the SFT1200 OPAL.


Tests continue on one remote SFT1200 with 4.3.21 and one with 3.216.

  • Exploring all config files via SSH "cat". SSH "iw list" command is very interesting: all info on MSDU,A-MSDU, MPDU, A-MPDU is there
  • 4.3.21 fails after hours of up-time with 5 GHz for no apparent reason. (Is a remote idle site, but with 10 competing SSID around)
  • the 2.4GHz link on the LAN side of the SFT (SFT as AP), to another Mikrotik (as station) is stable as a rock

24 hours passed

  • when reconnecting happens SFT picks just another SSID in its known network list, so it did not come back to the operational SSID
  • SFT1200 as station does pick up the channel to set the wifi interface (or would not connect), but is wrong on the channel width. It sets 20MHz for the 5GHz band, and 40MHz for the 2.4 GHz band. Infrastructure setting there is just the opposite
  • Wrong channel width leading to that warning/error : "siwifi_calculate_legrate invalid legrate" or not?
  • At reconnect start time the SFT uses the 192.168.8.1 LAN IP on its WWAN connection. Only later does it use the DHCP-assigned IP.
  • mwan3 has very strict values in the tests for interface up/down. (low # bad pings for down, versus 8 good pings for re-enable as up)
  • any glitch in the uplink connection or in internet will trigger the interface down, and start the long mwan3 recovery
  • (added the local gateway as test-IP, to reduce downs by internet glitches)
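The effect of such asymmetric thresholds can be sketched with a tiny state machine in Python. This is a simplified model, not mwan3track's actual code: it assumes the interface goes down after 3 consecutive failed pings (the figure quoted later in this thread) and comes back up only after 8 consecutive good ones.

```python
def track(results, down=3, up=8):
    """Return the online/offline state after each ping (True = reply received)."""
    online = True
    fails = oks = 0
    states = []
    for ok in results:
        if ok:
            oks, fails = oks + 1, 0
        else:
            fails, oks = fails + 1, 0
        if online and fails >= down:
            online = False   # short glitch is enough to drop the interface
        elif not online and oks >= up:
            online = True    # recovery needs a long run of good pings
        states.append(online)
    return states

# 5 good pings, a 3-ping glitch, then the link is fine again
pings = [True] * 5 + [False] * 3 + [True] * 10
states = track(pings)
print(states.count(False), "of", len(states), "intervals spent offline")
```

In this model a glitch of 3 ping intervals keeps the interface marked offline for 8 intervals, before counting whatever the reconnect itself costs.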

@bruce any chance of 3.218 for the Opal?

Don't see 3.218 for Opal.
Tested 3.216, much (!) better than 4.3.21.
But returned to 4.3.21 because expected Zerotier to be in there. It's not.

Also started looking with the Android GL.inet APP.
Hmmm, I had not seen the "Force 20MHz bandwidth for 2.4G(hz)" option explained in the web interface. It's there if you click.
Seems exactly what's needed.
Tested.
No help so far.
2.4GHz still disconnects rapidly.

The LOG says every time ...
"Sun Dec 1 13:28:43 2024 kern.warn kernel: [54211.328067] lmac[0] Error, msdu index reach maximum limit, 31
Sun Dec 1 13:28:43 2024 kern.warn kernel: [54211.333921] lmac[0] Error, buffer is not uploaded to host buffer completely
Sun Dec 1 13:28:48 2024 daemon.notice wpa_supplicant[12187]: wlan-sta0: CTRL-EVENT-DISCONNECTED bssid=b8:69:f4:95:6d:fc reason=2"
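To check whether the disconnect really follows the msdu error every time, the log can be scanned for that pairing. A Python sketch under my own assumptions: the 10-second window and the function names are arbitrary choices, and the reason texts are the standard IEEE 802.11 meanings for codes 1 to 3.

```python
import re
from datetime import datetime

# Standard IEEE 802.11 reason codes (only the first few)
REASONS = {
    1: "unspecified reason",
    2: "previous authentication no longer valid",
    3: "deauthenticated because sending station is leaving",
}
DISCONNECT_RE = re.compile(r"CTRL-EVENT-DISCONNECTED bssid=(\S+) reason=(\d+)")

def timestamp(line):
    """Parse the leading 'Sun Dec 1 13:28:43 2024' part of a syslog line."""
    return datetime.strptime(" ".join(line.split()[:5]), "%a %b %d %H:%M:%S %Y")

def disconnects_after_msdu_error(lines, window_s=10):
    """List (bssid, reason, delay_s) for disconnects soon after an msdu error."""
    last_error = None
    hits = []
    for line in lines:
        if "msdu index reach maximum limit" in line:
            last_error = timestamp(line)
            continue
        m = DISCONNECT_RE.search(line)
        if m and last_error is not None:
            delay = (timestamp(line) - last_error).total_seconds()
            if 0 <= delay <= window_s:
                hits.append((m.group(1), int(m.group(2)), delay))
    return hits

log = [
    "Sun Dec 1 13:28:43 2024 kern.warn kernel: [54211.328067] lmac[0] Error, msdu index reach maximum limit, 31",
    "Sun Dec 1 13:28:43 2024 kern.warn kernel: [54211.333921] lmac[0] Error, buffer is not uploaded to host buffer completely",
    "Sun Dec 1 13:28:48 2024 daemon.notice wpa_supplicant[12187]: wlan-sta0: CTRL-EVENT-DISCONNECTED bssid=b8:69:f4:95:6d:fc reason=2",
]
hits = disconnects_after_msdu_error(log)
for bssid, reason, delay in hits:
    print(bssid, REASONS.get(reason, "other"), f"{delay:.0f}s after msdu error")
```

On the three lines quoted above this reports one disconnect with reason 2, five seconds after the error.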

The question is: what is the "msdu index"? Is it the index of an MSDU in the A-MSDU? The A-MSDU is rather small on the uplink MT router, with the same numbers as shown by the "iw list" command in an SSH session on the SFT1200.

Maybe it is the MPDU index in the A-MPDU. That's done in the driver or hardware, and not visible to the OS. Mikrotik has small MPDUs (A-MSDU), so there could be many in an A-MPDU.

Will test with MPDU aggregation OFF in Mikrotik.

EDIT:

  • after searching in github ... this seems to be about the AMSDU(MPDU) index into the AMPDU package, processed by the driver.
  • reducing AMSDU size from 8192 to 3839 will do nothing (that is the size Mikrotik uses anyway)
  • reducing it further will put more AMSDU/MPDU blocks into one AMPDU, so will make it worse
  • stopping the AMPDU aggregation in the Mikrotik (remove the tick marks in the HT tab) stops the error message, but this lack of aggregation hurts the performance.
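The "smaller blocks means more blocks per AMPDU" reasoning can be put into numbers. A rough Python sketch, assuming the 802.11n maximum A-MPDU length of 65,535 bytes and ignoring delimiters, padding and MAC headers. Whether the driver's limit of 31 really counts these subframes is not confirmed anywhere in this thread.

```python
MAX_AMPDU_BYTES = 65_535  # assumed 802.11n (HT) maximum A-MPDU length

def subframes_per_ampdu(mpdu_bytes, max_ampdu_bytes=MAX_AMPDU_BYTES):
    """Upper bound on equally sized MPDUs that fit into one A-MPDU."""
    return max_ampdu_bytes // mpdu_bytes

# 8192 and 3839 are the AMSDU sizes named above, 2300 is the tuned limit,
# 1500 stands for a plain Ethernet-sized frame without A-MSDU
for size in (8192, 3839, 2300, 1500):
    print(size, subframes_per_ampdu(size))
```

Under these assumptions the count goes from 7 subframes at 8192 bytes to 17 at 3839, 28 at 2300 and 43 at 1500, so shrinking the AMSDU does push the count toward and past 31.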

Further testing and Google searches ...

The 2.4 GHz uplink connection has now been quite stable for 24 hours, even with speed-test induced traffic load.

The Google search result for "GL.inet disconnects" is not reassuring. All models seem affected.

What's different in this setup?

  • ISP router (non Mikrotik) [ not so easy to track disconnects as with MT log]
  • re-installed firmware 4.3.21 on SFT
  • 2.4GHz and 5 GHz SSID are different names
  • there is only one AP with that ISP based SSID name
  • no other known wifi networks in the SFT list
  • ISP modem router has 5 connections (3 Mikrotiks (wired), SFT1200 and Cudy TR1200)
  • no network acceleration on SFT1200
  • 2.4GHz set on 20MHz. Force 20MHz bandwidth for 2.4G [repeater options]
  • Allow auto network switching disabled [repeater options]

I know that sometimes the "client roaming" gives unstable wifi connections with MT-station, while the station was scanning on a quality dependent interval for a better AP.
With SFT and the multiple AP condition, the interval between disconnects was constant.

Is this "station roaming" disabled now on the SFT?
Getting 30Mbps download, better than ever with the SFT on 2.4 GHz.
Signal -58dBm, noise -88dBm , not static but always around that value.

If there is only one saved SSID it seems better.

MWAN3 may not be the problem.

So in your last test the 2.4G is stable. Did you manipulate the Mikrotik 2.4G wifi settings as in your previous post?

Thanks for the reaction. I was close to the point of giving up, after all these tests and no progress.

Tests are difficult. Results are not consistent. Sometimes it goes well, then suddenly it stops. No apparent reason.

This night the SFT1200 was connected to the ISP (B-BOX 3 Sagem, VDSL modem) on 2.4GHz.

The 5 GHz Mikrotik SSIDs were also in the known network list.

SFT1200 was free to select both bands (Band selection is on AUTO)

"Allow Switching to Other Saved Networks" was DISABLED

"Force 20MHz Bandwidth for 2.4G" is ON

The ISP is not giving management access to their B-BOX modem

No traffic action (no clients) on the SFT1200 was done, but SFT is connected to Goodcloud.

There was no interruption these last 10 hours.

Wifi station setting on N

Mode: Client | SSID: WIFI-2.4-2987
BSSID: 6E:83:C4:49:DA:42
Encryption: WPA2 PSK (CCMP)
Channel: 6 (2.437 GHz)
Tx-Power: 0 dBm
Signal: -59 dBm | Noise: -90 dBm
Bitrate: 0.0 Mbit/s | Country: BE

The same SSID of the B-BOX is nowhere else on another AP. Laboratory conditions and no traffic load; however, other wifi networks are around (my Mikrotiks, and neighbor wifi devices).

Cudy was also connected to the B BOX 2.4GHz

Looks perfect.

I had these good results before in a multi-Mikrotik situation, but then with wifi on the SFT set to "legacy". This could however not be repeated with version 4.3.21.

Where does it go wrong? What makes that switch from good to very bad? Tests are not consistent so far, except "Cudy TR1200" is always good.


This remains stable with the B-BOX 2.4GHz connection, with the settings on the SFT1200 as above.

But special LUCI setting to "legacy" was removed, it is N for the 2.4GHz interface.

This SFT setting is/was ...
"Allow Switching to Other Saved Networks" is/was DISABLED

"Force 20MHz Bandwidth for 2.4G" is/was ON

The uplink SSID is offered by just 1 AP only.
Connection tracking is set to OFF, and the mode is set to Failover (which cannot happen: no check, and no ethernet or USB connection).

Country was set to BE before in Advanced setting (LUCI)

The wireless interfaces were set before the "repeater" was started: 40 MHz for the 5GHz band, and 20 MHz for the 2.4 GHz band (as all my APs around are set).

Time to move on?

Switched 50 minutes ago to the hAP ax2 as uplink AP, instead of the B-BOX.

Same connections ... CuDY TR1200 and GL.inet SFT1200 to the 2.4GHz (wifi2) interface.
This is not the WLAN1 of Mikrotik, as AX hardware only works with the Qualcomm wifi drivers. [package qcom_wifi]

My laptop connection is to the 5GHz interface on the hAP ax2 (managing the travel routers is done via an open port 80 on their WLAN interface)

Ookla Speedtest shows relative low speeds, like 10 Mbps down, for both travel routers.
But no instability or disconnects for more than one hour now.
These qcom_wifi drivers don't give me the detailed information as the legacy MT WLAN drivers.
But result seems to be identical to the B-BOX experiment.

One hour is not enough, I need days of confirmation with this intermittent disconnect problem.

Need more test setups also.

Only then I will try the Mikrotik WLAN environment with 10 AP's @home , and 35 AP's @holiday resort, all with the same SSID name. (The typical hotel setup !)

Problem here is, that when it works, it keeps working for other setups, and when it fails, it also fails in previous considered good setups. Some things are stored or cached. Can be seen as "cannot connect to network" happens even on the laptop, until deletion of the known network, and needs reentry of the credentials.

Lock BSSID is added for a reason, however SFT1200 is not in the supported list. ;-(


Almost there ... my conclusions so far ... the SFT1200 repeater/router based implementation (WISP) has a serious flaw in its design. It is made to fail by design, AFAIK, and as far as my tests go.

The GOOD, the BAD and the UGLY. (1966 ‧ Western)

The problem is that the wifi connection does not stick even when GOOD; the SFT does everything to find yet another connection when not needed, which is BAD; and when it just tries one, it destroys the complete setup, and that's UGLY.

GOOD
There is no reason to disconnect at -53dBm. Yes, we do not like "sticky client" behaviour, but this is a table-top router, not a smartphone in the pocket. But the SFT does not validate the wifi connection on wifi quality but on some MWAN3 tests, which include the internet connection and are very strict: 3 bad pings (timeout) have to be countered by 8 good pings. Countering does not undo the disconnect action, which still happens. So we don't use the GOOD here, as we should.

BAD
Even with a good wifi connection the SFT goes for "client roaming", searching for a potentially better wifi connection. It scans for APs with the same SSID (but maybe a different BSSID), and even scans for other SSIDs in its known wifi network list. This 'promiscuity' is unwanted, and not productive. Locking the BSSID is an option, but it is not available in the SFT as a setting, only at the "switch network" moment; that is only a partial solution, and has negative side effects. Roaming is wanted only if the connection (AP) has really gone out of reach.

UGLY
And then it gets very ugly. What the SFT implementation does when the connection is broken (mostly for no wifi reason) is very destructive. It reloads many things, like the firewall, breaking existing SRC-NAT or masquerade connections, destroying return paths through NAT, stopping existing sessions. Requiring 8 good pings before restarting. And it takes 'forever' to reconnect.

Normal client devices will also see disconnects from the same uplink AP (that's what happens in the wifi world), but that interruption lasts about 2 seconds if the client switches to yet another AP. The TCP session and the UDP NAT path remain, and do not have to be re-initiated. You might have a voice call glitch, but most applications will just resend the packet in time.
This does not work for the UGLY disconnect destruction in the SFT.

Experience teaches. The SFT might behave when only one wifi connection (one SSID, one AP) can be made, and there are no other tempting known SSIDs around. If more APs have this same SSID then the SFT will fail to work. And that is mostly the case on the 2.4GHz band (more overlapping areas); it might be so @home, but it certainly is in the hotel, holiday resort, city network, event wifi, etc.

In other words this flawed implementation in the SFT1200 of a WISP (wifi uplink with NAT) is totally useless in the many cases where you need it. (Even If we use 'BSSID lock' this is only a partial workaround, but it might be a real case saver). The WISP implementation has to be as resilient as possible, and should avoid, not enforce reconnects, and handle reconnects to the same SSID without destructions.


The newer v4.6 or v4.7 firmware of the SFT1200 will merge this feature code to achieve lock BSSID. Development of the v4.6/4.7 firmware is planned, but the release time is uncertain, so please wait. Thank you.
