So.. third time in a month I’ve been woken up by customer emergencies when the VPN tunnel fails. All 3 were due to Brume 3s with DDNS that suddenly stopped working (separate customers / routers). Rebooting the router doesn’t help, but disabling and re-enabling DDNS does fix it (after a few minutes for DNS propagation).
I haven’t figured out what’s killing DDNS to begin with, but I have found why it doesn’t self-heal after a basic router reboot.. in short, GL added a new (broken) flock wrapper in this latest firmware.
Symptom: After any reboot, DDNS shows "enabled" in the admin panel but the daemon is not running.
Root cause: A change was made to the hotplug script sometime between builds 224 (SlateAX 4.8.2) and 294 (Brume3 4.8.4) that introduced a flock wrapper.
Old version (working):
/etc/init.d/gl_ddns restart &
New version (broken, build 294):
flock -n /var/run/ddns/glddns.lock /etc/init.d/gl_ddns restart && rm /var/run/ddns/glddns.lock &
The flock call requires /var/run/ddns/ to exist in order to create the lock file. But /var is tmpfs and is wiped on every reboot. The new DDNS init script's boot() function is a no-op (return 0), so nothing recreates this directory at boot. The directory is only created when the DDNS updater script actually runs, but it can't run because flock fails first.
The result is a repeating chicken-and-egg problem:
- boot() returns 0 > nothing happens
- WAN comes up > hotplug fires
- flock -n /var/run/ddns/glddns.lock fails silently (no such file or directory)
- The restart command inside the flock never executes
- DDNS daemon never starts, and no log files + no error in the GUI
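For anyone who wants to see the failure without rebooting a router, here is a minimal sketch that reproduces it in a scratch directory. The temp directory stands in for the freshly wiped /var, and echo stands in for the real restart command; both are my stand-ins, not anything from the GL scripts.

```shell
#!/bin/sh
# Sketch: reproduce the flock failure in a scratch directory (not the real /var/run).
base=$(mktemp -d)              # stand-in for the freshly wiped tmpfs /var
lock="$base/ddns/glddns.lock"  # parent directory "ddns" does not exist yet

# Same shape as the build 294 hotplug line; "echo" stands in for the restart.
if flock -n "$lock" echo "restart ran" 2>/dev/null; then
    result="ran"
else
    result="skipped"           # flock could not create the lock file
fi
echo "without directory: $result"

# Create the directory and try again with the identical flock call.
mkdir -p "$base/ddns"
if flock -n "$lock" echo "restart ran"; then
    result2="ran"
else
    result2="skipped"
fi
echo "with directory: $result2"
rm -rf "$base"
```

On a box with flock installed, the first attempt should report "skipped" and the second should run the command, which is the whole bug in two calls.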
How to reproduce:
- enable DDNS in GUI
- reboot Brume 3
- check GUI and DDNS still shows enabled, but:
/var/run/ddns/ - does not exist
/var/log/ddns/ - does not exist
DDNS process - not running
Syslog DDNS entries - zero
- in DDNS GUI: Disable > Apply > Enable > Apply
- DDNS is working again.
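The post-reboot checks above can be run over SSH with something like the following rough sketch. The report_dir helper is just a name I made up, the "ddns" match pattern for the process and log checks is an assumption (I haven't confirmed the exact process name), and logread is the standard OpenWrt log reader.

```shell
#!/bin/sh
# Report whether a directory exists, in the same wording as the checklist.
report_dir() {
    if [ -d "$1" ]; then
        echo "$1 - exists"
    else
        echo "$1 - does not exist"
    fi
}

report_dir /var/run/ddns
report_dir /var/log/ddns

# Process check: the "[d]dns" pattern is an assumption and keeps grep from
# matching itself. It can still match this script if its file name contains "ddns".
if ps w 2>/dev/null | grep -q '[d]dns'; then
    echo "DDNS process - running"
else
    echo "DDNS process - not running"
fi

# Count syslog lines mentioning ddns, if logread is available (OpenWrt).
if command -v logread >/dev/null 2>&1; then
    echo "Syslog DDNS entries - $(logread | grep -ci ddns)"
fi
```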
Conclusion: DDNS is broken on every Brume3 reboot.
FIX: Add mkdir -p /var/run/ddns before the flock call in /etc/hotplug.d/iface/95-gl_ddns
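As a sketch of what the patched step looks like, here it is wrapped in a function so it can be tried off-router. The function name and its arguments are mine for illustration; in the actual hotplug script the fix is just the one mkdir line ahead of the existing flock line, with the hard-coded paths.

```shell
#!/bin/sh
# Sketch of the patched hotplug step. ddns_locked_restart is a hypothetical
# wrapper; $1 is the lock directory, the remaining arguments are the command.
ddns_locked_restart() {
    lock_dir="$1"
    shift
    mkdir -p "$lock_dir" || return 1   # the added line: recreate the wiped directory
    # Same flock pattern as build 294: run the command under a non-blocking
    # lock, then remove the lock file if the command succeeded.
    flock -n "$lock_dir/glddns.lock" "$@" && rm "$lock_dir/glddns.lock"
}

# On the router the equivalent of the patched line would be:
#   ddns_locked_restart /var/run/ddns /etc/init.d/gl_ddns restart &
```

Note that a firmware upgrade will most likely overwrite /etc/hotplug.d/iface/95-gl_ddns, so the edit may need to be reapplied afterwards.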
Still not positive what broke it to start with. I have set up a monitoring script for now.
Thanks!
Edits - formatting and late night spelling.