Device: GL-MT6000 (Flint 2) Firmware: GL.iNet 4.8.4 (OpenWrt 21.02-SNAPSHOT) Severity: High — silently breaks DNS resolution for one specific LAN client after the WAN public IP changes
Summary
After the ISP rotated the public IP assigned to the router (PPPoE WAN), one specific LAN client lost the ability to resolve DNS over UDP/53 — to any upstream server. TCP/443 outbound from the same client kept working perfectly. Other clients on the same VLAN were unaffected. Reboot of the client, reboot of the router, conntrack -F, and nf_conntrack_helper=1 did not fix the issue. Only adding a DoH proxy on the client (bypassing UDP/53 through TCP/443) restored DNS.
Packet captures show the query reaches Cloudflare, the response reaches the router's kernel (conntrack registers packets=1 bytes=333 on the reverse tuple), but no chain in the iptables filter table ever sees a packet with dst=<client IP> — meaning the reverse-NAT / UN-DNAT step silently drops or mis-rewrites the response for this specific flow.
Affected device
-
Model: GL-MT6000 (Flint 2)
-
Firmware: GL.iNet 4.8.4 (factory firmware)
-
Underlying OS: OpenWrt 21.02-SNAPSHOT
-
Tailscale: 1.80.3 (factory bundle, subnet-router + exit-node)
-
AdGuard Home: enabled, dns-port 3053, listening on the router
Network topology (relevant)
-
WAN: PPPoE on
pppoe-wan, dynamic public IP (ISP: RDS Digi) -
LAN: 4 VLANs (Trusted 192.168.1.0/24, Crypto 192.168.30.0/24, IoT 192.168.40.0/24, Guest 192.168.9.0/24)
-
Affected client: NAS QNAP TS-673A, MAC
24:5E:BE:83:43:3C, IP192.168.1.222, connected by cable to lan2 -
Working clients (same VLAN Trusted): laptop on Wi-Fi (different MAC), other LAN devices
Symptom (from the affected client)
$ getent hosts api.cloudflare.com
# (empty, no result)
$ curl https://api.cloudflare.com/
# HTTP 000 / 4-second timeout (DNS resolution fails)
$ curl --resolve api.cloudflare.com:443:104.16.132.229 https://api.cloudflare.com/
# HTTP 403 t=0.038s -- Direct IP works fine — TCP/443 outbound is OK
$ curl https://1.1.1.1/dns-query?name=api.cloudflare.com # DoH
# {"Status":0, ...} -- DoH (TCP/443) works perfectly
conntrack on the affected client shows all UDP/53 outbound connections as [UNREPLIED], regardless of the upstream server (Flint 2's own dnsmasq at 192.168.1.1, public 1.1.1.1, public 9.9.9.9 — all fail identically).
From any other client on the same VLAN (e.g. Wi-Fi laptop), DNS via UDP/53 to the same upstreams works fine.
Diagnostic data (gathered on the router)
Outbound query path
Counter rules added on the router for the affected client's UDP/53 traffic:
| Chain | Table | Filter | Pkts |
|---|---|---|---|
| PREROUTING | mangle | udp dport 53 src 192.168.1.222 |
OK (e.g. 19 pkts) |
| FORWARD | filter | (same flow at FORWARD) | OK |
| POSTROUTING | mangle | (same flow leaving WAN) | OK |
Query leaves the router toward the upstream resolver normally.
Inbound response path (the problem)
After flushing conntrack and triggering 5 fresh queries from the client:
| Chain | Table | Filter | Pkts | Notes |
|---|---|---|---|---|
| PREROUTING | mangle | udp sport 53 (any) |
6 | Response arrives |
| FORWARD | mangle | udp sport 53 (any) |
6 | Passes mangle FORWARD |
| FORWARD | filter | udp sport 53 dst 192.168.1.222 |
0 | Never seen with the un-NATted destination |
| POSTROUTING | mangle | udp sport 53 (any) |
6 | Goes back out (?) |
filter FORWARD never sees a packet whose destination is the original LAN client. The kernel's conntrack table does register the response (packets=1 bytes=333 on the reverse tuple, target dst=<router public IP>), confirming the packet reached the host, but the un-DNAT step that should rewrite dst to 192.168.1.222 is not producing the rewritten packet for FORWARD.
Counter on the affected client's INPUT chain
| Chain | Table | Filter | Pkts |
|---|---|---|---|
| INPUT | filter | udp sport 53 (any) |
0 |
| INPUT | mangle | udp sport 53 (any) |
0 |
No UDP/53 reply ever reaches the affected client's network stack.
Attempted fixes that did NOT resolve the issue
-
conntrack -Fon the router (twice) — no effect. -
sysctl -w net.netfilter.nf_conntrack_helper=1— no effect (was 0 by default; flushed conntrack after). -
Reboot of the affected NAS — no effect.
-
Reboot of the router — no effect.
-
Removing the affected MAC from
src_mac10/src_mac11ipsets (VPN policy) — confirmed not related (counters inTUNNEL10/11_ROUTE_POLICYchains showed 0 packets matched for this client at the time of test). -
Per-container DNS override on the affected client's Docker daemon — irrelevant, the host itself fails.
Diagnostic snapshot details
-
iptables-saveoutput captured (~12 KB, available on request). -
conntrack -Lshows ESTABLISHED flow on the reverse tuple but no FORWARD/INPUT delivery. -
dmesgshows:nf_conntrack: default automatic helper assignment has been turned off for security reasons and CT-based firewall rule not found.(documented OpenWrt warning, but enabling helpers did not fix it.)
The LAN client's DNS started failing at the same time as a forced PPPoE reconnection (router uptime at the time of the bug was 17 h, public IP had changed mid-day).
Suspected root cause
The nf_conntrack_nat IP-rewrite step, on the reply for an outbound UDP/53 flow originated by a specific MAC/IP combination, is silently dropped after a public-IP rotation on the WAN. Other LAN clients (different MAC, same VLAN) on the same router/WAN do not suffer this — strongly indicating a per-flow conntrack inconsistency that survives across conntrack -F. Possibly a bug in the GL.iNet-specific pre_dns_deal_conn_zone chain or the dns_dispatcher / policy_redirect NAT chains added by the AdGuard / VPN-policy modules, in combination with a stale CT zone entry pinned to the old WAN IP.
Workaround applied (works perfectly)
On the affected NAS:
-
Run
klutchell/dnscrypt-proxyDocker container in--network hostmode, listening on127.0.0.2:5354, upstream Cloudflare DoH (TCP/443). -
Configure
/etc/dnsmasq.confwith:server=127.0.0.2#5354+no-resolv. -
/etc/resolv.conf→nameserver 127.0.1.1(local dnsmasq).
Result: DNS resolves correctly via DoH; getent hosts api.cloudflare.com → HTTP 301, 51 ms. TCP/443 path is the only working DNS path for this client.
Severity / Impact
-
DNS-dependent services on the affected client (Cloudflare tunnels, Tailscale registration, Docker daemon registry pulls) were all silently failing with cascading errors.
-
Diagnosis took ~5 hours due to the silent drop and the inconsistency between the working conntrack registration vs. the missing FORWARD packet.
-
A user without packet-level diagnostic skill would have no way to recover other than factory-resetting the router.
Suggested investigation areas
-
Behaviour of
pre_dns_deal_conn_zone(raw PREROUTING) and theCT zone 40960assignment when the public IP changes. -
Interaction between the GL.iNet
policy_redirect/dns_dispatcherNAT chains and the kernel's UN-NAT for UDP/53 replies post-IP rotation. -
Whether
conntrack -Factually clears thenatmappings or only thefilterstate-tracking table; the symptom suggests the NAT mapping is pinned to the old public IP. -
Whether a
Tailscale + AdGuard + VPN-policy-by-MACconfiguration (all enabled) is exercising a code path that triggers this.
Test environment available
I can reproduce on demand and gather any specific data (tcpdump -w, iptables-save -c, conntrack -L, dmesg, etc.) if needed. Happy to run instrumented tests if engineering wants to dig in. Thanks!