Flint 2 / GL-MT6000 Multi-Wan failover works, but fail back does not

The Multi-Wan failover functionality on the Flint 2 / GL-MT6000 appears to work correctly in terms of quickly switching from WAN1 to WAN2 when WAN1 is detected as "down".

However, when WAN1 comes back up, not all existing TCP/UDP streams are migrated from WAN2 back to WAN1.

From searching, it appears at one time (and possibly currently on other devices), there was an option called Forced Refresh Streams.

It's referenced in the latest Multi-Wan documentation under the Failover section:

That option is nowhere to be found in the Dashboard of the Flint 2 / GL-MT6000, and others have also needed it and noticed its absence:
https://old.reddit.com/r/GlInet/comments/1dthh52/noob_question_where_is_the_forced_refresh_streams/

Suffice to say, it looks like that option / functionality is needed (but missing) in order to have Multi-Wan failover (and then fail back) work as expected.

Can anyone shed any additional light on this? Is there a hidden setting somewhere in the Dashboard or in Luci that can achieve / force the desired functionality?

Thanks.

Once a session has been established on WAN2, it is not cut (failed) back to WAN1 when WAN1 comes back up.
New sessions, however, are established over WAN1 once WAN1 is working again.

If an established session were cut (failed) back, it would be disconnected. If the session is an audio/video conference, the connection would drop and reconnect, and the conference would be interrupted at that moment.

Likewise, if you are watching a live broadcast and WAN1 comes back but is intermittent and unstable, the video experience would be very poor.

That is the reason why Multi-WAN does not cut (fail) established sessions back from WAN2 to WAN1, while new sessions go to WAN1.

Forced Refresh Streams was removed a long time ago; the docs probably have not been updated to reflect that. We will update them.

Hi Bruce,

Thank you for the reply and thorough explanation — everything you said makes sense.

However, it would be nice to have Forced Refresh Streams back as an advanced option — maybe with a disclaimer like "Do not use unless you know what you're doing".

My particular use case does not involve any media or VOIP streaming, so having access to Forced Refresh Streams would solve my problem.

This is the second Flint 2 I have, and I really wanted to use it for this application, which requires 24/7 uptime AND high bandwidth usage. In this case, failing over to something like a secondary Starlink connection from a primary DOCSIS cable connection is better than having no secondary at all. But the need to fail back to the primary as soon as it's available again is absolutely paramount.

For now, I've had to switch over to a Ubiquiti EdgeRouter X, which does indeed fail back the way I need.

Thanks again for your time and consideration on this matter.

So the TL;DR is if WAN1 dies, it will switch to WAN2 but will not switch back to WAN1 unless WAN2 dies or if the router is manually switched back to WAN1?

Is that accurate?

WAN2 will not switch back to WAN1 upon connection restore?

That is generally accurate.

However I should mention that I'm not sure if NEW connections are established over WAN1 once it's back up. It's quite possible they may be.

For sure in my situation, existing TCP/UDP streams are not broken on WAN2 and migrated/reestablished on WAN1 once it's back up — and THAT is what I need.

I would not imagine connections would be broken and re-established on WAN1 when the interface is back online. That would be disruptive.

However I would like to know if NEW connections would be made on WAN1 after the connection has been restored. I would like to think this is the case.

I guess I can test this when I get home.

"Disruptive" is a relative term.

For video conferencing, VOIP, etc, sure, it would be disruptive.

But for applications (P2P, etc) that account for those kinds of connection breaks, it's not disruptive at all.

And that's my point / issue — it would be nice if the option was made available to the end user. Default it to off, but having it available would allow me to continue to use my new Flint 2 in this case.

Anyway, I'm in a pre-production environment and can't really afford the time to swap out devices again just to test if NEW connections are established over WAN1 once it's back up. As indicated, I need current connections automatically broken on WAN2, allowing for them to reestablish themselves over WAN1.

No.
After WAN1 is restored, newly established sessions/links will use WAN1.

Sessions that were already established will continue to run over WAN2 at first.
When those sessions end, WAN2 is no longer used, and subsequent new sessions go over WAN1.

This ensures that established sessions are not suddenly disconnected, which would affect work or business.
In addition, WAN1 may be restored but not yet stable, so to be conservative, sessions already established on WAN2 stay connected until they end.

Hi, I'm curious whether it is possible to use just one ISP for Multi-WAN over the Ethernet ports on the Flint 2.

Let's say LAN ports 1 and 2 of the main router connect to WAN ports 1 and 2 of the Flint 2. Will the Flint 2 get two Ethernet connections at the same time, or will it treat them as a single Ethernet connection? :thinking:

Thanks :blush:

  1. The ISP link needs to support multiple WAN IPs for one ISP on one port.

  2. GL firmware does not currently support multiple WAN PPPoE sessions on one port. You can configure that through LuCI.

In this case, the Flint 2 can obtain two WANs, but their bandwidth cannot be combined.

I ran into this same issue on my AC1300. My use case is that I’m using cellular tethering for backup so when WAN1 goes down, I fail over to cellular on WAN2. However, I pay for cellular data so I’d really like all sessions to fail back to WAN1, even if that means interrupting a stream. I’d much rather have a couple seconds of interruption on an established stream than have it use cellular data for hours/days that I have to pay for. I agree with the above comments that it would be a great option that defaults to off, but could be toggled on for use cases where failing back is more important than maintaining the existing sessions.

Thanks, -mike

Generally, a session/connection does not last more than several hours, except for things like P2P downloads or uninterrupted meetings.

You can watch the cellular traffic statistics and the Clients page to determine which client's and which application's sessions are not being interrupted.

If a session really does not end on its own, you can force all sessions back to WAN1 by reconnecting the cellular connection.

Hi bruce,

An analogy to my use case is using the router in remote stores where the point of sale (PoS) system and the store’s custom music stream are being pulled from the internet. If the primary WAN1 connection goes down, the router fails over to cellular.

I agree that the PoS system will make new connections and establish new sessions and move back to WAN1 when it comes back online pretty quickly. The issue is with the music stream.

The stream uses Secure Reliable Transport (SRT) which is a UDP NACK protocol. Once the stream is established, it won’t drop unless the store manually switches to another feed or for some reason the stream is interrupted. This means it could run for hours or days (or ideally weeks) without interruption. The stream runs at 0.5Mb/s so it can quickly add up to GB of data which can get expensive over a cellular connection if it runs for days or weeks (or the stream bandwidth gets throttled with “unlimited” cellular plans).

Obviously we could build something into our software to force the stream to drop and reconnect periodically, but given that WAN1 drops should be rare, we don’t want to interrupt the SRT UDP stream periodically for no reason if it never failed over to the cellular connection. We’d prefer if the router could just do that for us when WAN1 is stable.

Other devices I’ve worked with have a “fail-back after X seconds of higher priority connection being available.” This handles the majority of the cases mentioned above where WAN1 might be unstable and you don’t want to flap sessions between connections. I could, for example, configure it to fail-back all connections after 600s of WAN1 being available.
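
The hold-down behavior described here can be approximated outside the firmware. The following is only a sketch, not a GL.iNet feature: it probes WAN1 with `ping`, counts continuous seconds of availability, and flushes conntrack once the threshold is reached after an outage. The device name `eth1`, the probe target `1.1.1.1`, and all variable and function names are assumptions for illustration; check the actual WAN1 device on your router.

```shell
#!/bin/sh
# Sketch of a "fail back after N seconds of stable WAN1" hold-down timer.
# Hypothetical helper, not part of GL.iNet firmware.

HOLD_SECS="${HOLD_SECS:-600}"     # required continuous WAN1 uptime before failing back
INTERVAL="${INTERVAL:-10}"        # seconds between probes
WAN1_IF="${WAN1_IF:-eth1}"        # ASSUMED WAN1 device name
PROBE_IP="${PROBE_IP:-1.1.1.1}"   # ASSUMED probe target

# Returns 0 if the probe target answers a ping sent out of WAN1.
wan1_up() {
    ping -c 1 -W 2 -I "$WAN1_IF" "$PROBE_IP" >/dev/null 2>&1
}

# next_stable STABLE UP: print the updated stable-seconds counter.
# UP is 1 if the latest probe succeeded, 0 otherwise (which resets the counter).
next_stable() {
    if [ "$2" -eq 1 ]; then
        echo $(( $1 + $INTERVAL ))
    else
        echo 0
    fi
}

# Loop forever; flush conntrack once per outage, after WAN1 has been stable.
watch_failback() {
    stable=0
    flushed=1   # nothing to fail back until WAN1 has been seen down
    while :; do
        if wan1_up; then up=1; else up=0; flushed=0; fi
        stable=$(next_stable "$stable" "$up")
        if [ "$flushed" -eq 0 ] && [ "$stable" -ge "$HOLD_SECS" ]; then
            conntrack -F      # reset tracked sessions so they re-establish on WAN1
            flushed=1
        fi
        sleep "$INTERVAL"
    done
}

# To run in the background on the router:
#   watch_failback &
```

Because a failed probe resets the counter to zero, a flapping WAN1 never triggers the flush, which is the case the hold-down is meant to cover. Note that the flush resets every tracked session on the router, not just those on the backup link.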

Thanks for considering the use case.

-mike

Please SSH into the router and run this to see whether the session can be interrupted:

conntrack -F

Bruce,

Yes, conntrack -F does interrupt the session and move it back to the WAN. I ran two tests:

Without conntrack:

  1. Enable WAN and tethering
  2. Establish the SRT audio stream
  3. Unplug the WAN cable
  4. Hear an audio dropout for ~5 seconds
  5. Audio restored via the tethered connection
  6. Plug in the WAN cable
  7. Wait 30 seconds
  8. Unplug the tether cable
  9. Hear an audio dropout for ~5 seconds
  10. Audio restored via the WAN connection

With conntrack:

  1. Enable WAN and tethering
  2. Establish the SRT audio stream
  3. Unplug the WAN cable
  4. Hear an audio dropout for ~5 seconds
  5. Audio restored via the tethered connection
  6. Plug in the WAN cable
  7. Wait 30 seconds
  8. Run conntrack -F
  9. Hear an audio dropout for ~2 seconds
  10. Audio restored via the WAN connection
  11. Unplug the tether cable
  12. No audio drop (because the session had already moved to WAN)

FWIW, this is the SRT audio stream session listed in conntrack (remote IP removed):

udp      17 179 src=192.168.8.208 dst=137.x.x.x sport=53009 dport=5010 packets=17385 bytes=1251836 src=137.x.x.x dst=192.168.150.40 sport=5010 dport=53009 packets=39155 bytes=22513331 [ASSURED] mark=0 use=1
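
Given an entry like the one above, a narrower alternative to flushing the whole table may be to delete only the matching flow, leaving other sessions untouched. This is a minimal sketch, assuming the same `conntrack` CLI used for `conntrack -F`; the port 5010 is taken from the entry above, and `drop_flows` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: delete only selected conntrack entries instead of flushing everything.
# Assumes the conntrack CLI (conntrack-tools) is installed on the router.

# drop_flows PROTO DPORT: delete tracked flows with this protocol and
# original-direction destination port.
drop_flows() {
    proto="$1"
    dport="$2"
    # -D deletes entries matching the filter; -p and --dport narrow the match.
    conntrack -D -p "$proto" --dport "$dport"
}

# Example, run manually on the router:
#   conntrack -L -p udp --dport 5010    # inspect the matching entries first
#   drop_flows udp 5010                 # then delete only those flows
```

Whether the stream then recovers as quickly as with a full flush was not tested in this thread.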

-mike

You can add the command conntrack -F to this script:

/etc/hotplug.d/kmwan/kmwan_status_change
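
One way this could look is sketched below. How the kmwan hotplug event exposes the interface name and its new state is not shown in this thread, so the function takes them as plain arguments; the logical name `wan`, the state string `online`, and the function name `flush_on_failback` are hypothetical placeholders. Inspect the existing script on your router to see what it actually receives before adapting this.

```shell
# Sketch of a guard to add to /etc/hotplug.d/kmwan/kmwan_status_change
# ASSUMPTION: the event's interface name and new state are available to the
# script in some form; the values used here are hypothetical placeholders.

PRIMARY_IF="${PRIMARY_IF:-wan}"   # ASSUMED logical name of WAN1

# flush_on_failback IFACE STATE: flush conntrack only when the primary
# interface reports that it is back online.
flush_on_failback() {
    iface="$1"
    state="$2"
    if [ "$iface" = "$PRIMARY_IF" ] && [ "$state" = "online" ]; then
        logger -t failback "primary $iface is back, flushing conntrack"
        conntrack -F
    fi
}
```

Guarding the flush this way avoids resetting every session on unrelated events, such as the backup link changing state.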

Dear GL.iNet team.
As an owner of several GL.iNet devices and a person who has recommended them to family and friends (you can check in GoodCloud how many devices are assigned to me), I want to express my deep dissatisfaction that the team introduced this backward-incompatible change without giving users an option to choose.

But I found a solution. TP-Link Omada and Festa have this option available (“link backup” in Omada routers reflects the current GL.iNet behavior, while “always link primary” forcibly restores the primary WAN). Each user can choose the strategy that is best for their use case.

Thus I’m switching to TP-Link. I wish this team the best of luck with frustrating their customers by introducing backward-incompatible changes to their systems after users have decided to buy their devices.

Hello,

Sincerely sorry for the inconvenience.

If the link is forcibly switched, all existing sessions will be interrupted, which may cause streaming media or conferences to drop or lag.
In addition, if the primary link goes down again after a while, the streaming media or conference will be interrupted a second time when it switches to the backup link.

Being forced onto an unstable main link may give some users a bad network experience.

Once a session ends, newly created sessions automatically use the restored main link. This keeps streaming media or conference links working without interruption.

But we'll review your request again with the PM team.

Hi,

I vote for bringing back Forced Refresh Streams.

I use and enjoy multiple GL.iNet devices.

Had Spitz (GL-X750V2) and upgraded to Spitz Plus (GL-X2000). I use the Spitz cellular option ONLY for FailOver/backup when my rural primary cable modem internet WAN (Ethernet 1) goes down. (Unfortunately about once a month.)

Cell data is charged by the amount consumed and is, on average, expensive.

When WAN 1 comes back up, I want Forced Refresh Streams to stop sending any traffic over Cellular.

Spitz (GL-X750V2) had/has Forced Refresh Streams as an OPTION and it worked great.

Spitz Plus (GL-X2000) does not have that option and it is costing me money/time logging in and disconnecting cellular.

I believe RV owners might have the same use case.

Please bring back Forced Refresh Streams as an option.

Thanks

Thanks for your suggestion!

We will evaluate this request with the PM team again.