Brume 3 (MT5000) broken DDNS bug (and partial fix)

So, for the third time in a month I’ve been woken up by a customer emergency caused by a failed VPN tunnel. All three were due to Brume 3s whose DDNS suddenly stopped working (separate customers and routers). Rebooting the router doesn’t help, but disabling and re-enabling DDNS does fix it (after a few minutes for DNS propagation).

I haven’t figured out what’s killing DDNS to begin with, but I have found why it doesn’t self-heal after a basic router reboot: in short, GL added a new (broken) flock wrapper in this latest firmware.

Symptom: After any reboot, DDNS shows "enabled" in the admin panel but the daemon is not running.

Root cause: A change was made to the hotplug script sometime between builds 224 (SlateAX 4.8.2) and 294 (Brume3 4.8.4) that introduced a flock wrapper.

Old version (working):
/etc/init.d/gl_ddns restart &

New version (broken, build 294):
flock -n /var/run/ddns/glddns.lock /etc/init.d/gl_ddns restart && rm /var/run/ddns/glddns.lock &

The flock call requires /var/run/ddns/ to exist in order to create the lock file. But /var is tmpfs and is wiped on every reboot. The new DDNS init script's boot() function is a no-op (return 0), so nothing recreates this directory at boot. The directory is only created when the DDNS updater script actually runs, but the updater can't run because flock fails first.

The result is a chicken-and-egg problem that repeats on every boot:

  1. boot() returns 0 > nothing happens
  2. WAN comes up > hotplug fires
  3. flock -n /var/run/ddns/glddns.lock fails silently (no such file or directory)
  4. The restart command inside the flock never executes
  5. DDNS daemon never starts, with no log files and no error in the GUI
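
For anyone who wants to see steps 3 and 4 without touching a router, here is a minimal sketch that runs the same shape of command in a scratch directory. The scratch paths and the marker file are hypothetical stand-ins (the marker plays the role of /etc/init.d/gl_ddns restart); only the structure of the flock line is taken from the build 294 script.

```shell
#!/bin/sh
# Sketch only: reproduces the failure mode in a scratch directory, not on a router.
# Skip cleanly if flock is not installed on this machine.
command -v flock >/dev/null 2>&1 || { echo "flock not installed, skipping"; exit 0; }

scratch=$(mktemp -d)
lock="$scratch/ddns/glddns.lock"   # parent directory "ddns" has NOT been created
marker="$scratch/restart-ran"      # hypothetical stand-in for the restart command

# Same shape as the build 294 hotplug line, minus the trailing "&".
flock -n "$lock" touch "$marker" && rm -f "$lock"
flock_status=$?

if [ -e "$marker" ]; then wrapped_ran=yes; else wrapped_ran=no; fi
echo "flock exit status: $flock_status, wrapped command ran: $wrapped_ran"

rm -rf "$scratch"
```

Because flock cannot create the lock file in a directory that does not exist, it exits non-zero before the wrapped command is ever started, which matches what the hotplug line does after a reboot.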

How to reproduce:

  1. enable DDNS in GUI
  2. reboot Brume 3
  3. check GUI and DDNS still shows enabled, but:
    /var/run/ddns/ - does not exist
    /var/log/ddns/ - does not exist
    DDNS process - not running
    Syslog DDNS entries - zero
  4. in DDNS GUI: Disable > Apply > Enable > Apply
  5. DDNS is working again.
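
The checks in step 3 can be run over SSH with something like the sketch below. The function name count_missing_dirs is made up for illustration; the two directory paths are the ones listed above, and logread is the stock OpenWrt log reader (the check is skipped where it is not available). The process check is left out because the exact process name isn't given here.

```shell
#!/bin/sh
# Sketch only: count_missing_dirs is a hypothetical helper, not part of the firmware.
# Prints one line per path and stores the number of missing directories
# in the variable missing_count.
count_missing_dirs() {
    missing_count=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "present: $dir"
        else
            echo "MISSING: $dir"
            missing_count=$((missing_count + 1))
        fi
    done
}

count_missing_dirs /var/run/ddns /var/log/ddns
echo "missing directories: $missing_count"

# Count syslog lines that mention ddns (case-insensitive), if logread exists.
if command -v logread >/dev/null 2>&1; then
    echo "syslog lines mentioning ddns: $(logread | grep -ci ddns)"
fi
```

On an affected router right after a reboot, both directories should be reported as missing; after the disable/enable cycle in step 4 they should be present.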

Conclusion: DDNS is broken after every Brume 3 reboot.

FIX: Add mkdir -p /var/run/ddns before the flock call in /etc/hotplug.d/iface/95-gl_ddns
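
A minimal sketch of why the one-line fix is enough, again in a scratch directory so it can be run anywhere. The scratch paths and marker file are hypothetical; on the router the equivalent change is a `mkdir -p /var/run/ddns` line placed directly above the existing flock line. This is my workaround, not an official GL patch, and a firmware update may overwrite the hotplug script.

```shell
#!/bin/sh
# Sketch only: same command shape as the hotplug line, with the directory created first.
# Skip cleanly if flock is not installed on this machine.
command -v flock >/dev/null 2>&1 || { echo "flock not installed, skipping"; exit 0; }

scratch=$(mktemp -d)
lock_dir="$scratch/ddns"           # stands in for /var/run/ddns
lock="$lock_dir/glddns.lock"
marker="$scratch/restart-ran"      # hypothetical stand-in for the restart command

# The proposed fix: make sure the lock directory exists.
# mkdir -p succeeds whether or not the directory is already there.
mkdir -p "$lock_dir"

flock -n "$lock" touch "$marker" && rm -f "$lock"
flock_status=$?

if [ -e "$marker" ]; then wrapped_ran=yes; else wrapped_ran=no; fi
echo "flock exit status: $flock_status, wrapped command ran: $wrapped_ran"

rm -rf "$scratch"
```

With the directory in place, flock can create and take the lock, so the wrapped command runs and the lock file is removed afterwards.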

I'm still not positive what broke it to start with. I have set up a monitoring script for now.

Thanks!

Edits - formatting and late night spelling.

@bruce I just experienced this on another Brume 3 today. I can now confirm on 4 separate Brume 3s that, after being initialized, GL DDNS is broken upon any subsequent reboot.

The only current fix is to disable (apply) and re-enable Dynamic DNS in the GUI after each reboot.

Posted with workarounds:
https://www.reddit.com/r/GlInet/comments/1s5gruc/important_notice_for_any_brume3_glmt5000_users/

This is a HUGE issue, especially for this unit, as its main use case is probably as a VPN server.

Did you open a proper ticket with gl.inet support about this?

Agreed, but what is a proper ticket? They have no “official” issue tracker. If they’d give us a proper GitHub then I’d raise an issue there. As it stands, even posting here takes more effort than engaging with them should. No one should have to email anything.

That said, I do know it’s reported - and they’ve actually had this issue tracked for over a month and didn’t think it was a priority.

The lack of response here is regrettably astounding. This should be a “same day fix, new firmware push” type of issue. Currently the Brume 3, their premier VPN gateway, is functionally broken for its primary purpose.

@will.qiu @bruce someone at GL should be ahead of this. It’s impacting your customers severely right now.