I have a Slate AX (GL-AXT1800) running firmware 4.5.16. When Tailscale > Custom Exit Node is enabled, internet access from the guest wifi network starts dropping 50% of packets. However, internet access from the main wifi network, as well as access to the Tailscale network devices and advertised subnets on the Tailscale network, continues to work.
I have reviewed similar threads on this, though it's not always clear whether the other issues relate to internet access via the main wifi network (which is routed to the exit node via Tailscale) or via the guest wifi network (which should go directly to the internet):
- Tailscale Exit Node Not Working - #18 by chico_r
- Tailscale brings down internet on Flint AX-1800 4.4.5
- GL-MT3000 stops working when connecting to Tailscale Exit Node
- Tailscale: No internet after enabling custom exit node - #6 by T309
I am seeing this issue when using firmware versions 4.5.0, 4.5.16 and the 4.6.0 beta. It doesn't occur with 4.4.6, but that firmware has a very old version of Tailscale that doesn't support setting specific IPs for Tailscale nodes so I can't use this version.
I have reproduced this issue with my AXT1800 from a factory default config running 4.5.16, using a network with internet access connected to the WAN port and using the default subnets for both main and guest wifi networks (192.168.8.0/24 and 192.168.9.0/24 respectively). A quick diagram of the test setup is as follows:
[Diagram: test setup]
Here are the steps that I took to set up the test network:
- Reset AXT1800 to factory defaults, change password and sync time with browser
- Enable the 5G guest network
- Check that the internet is reachable from a device on each of the main and guest wifi networks by pinging 4.2.2.2 (ok)
- Enable Tailscale and connect the AXT1800 to the Tailscale network using the device bind link. Approve the new device on the Tailscale admin console
- Enable Tailscale > Remote Access LAN option. Check internet is still reachable from both main and guest wifi networks (ok)
- Enable Tailscale > Custom Exit Node
At this point, the main wifi network could still reach the internet via the exit node. But the guest wifi network started dropping 50% of packets to the internet.
The exit node on my Tailscale network advertises 5 internal routes into Tailscale. It is a small VM on my network running a recent version of Tailscale (1.66.4). It is also used by other devices as a remote access gateway and exit node, and those work fine. The 5 advertised routes are all within 192.168.0.0/16, and do not overlap with the default 192.168.8.0/24 and 192.168.9.0/24 networks used by the AXT1800. There is a route for 192.168.8.0/24 on the network beyond the Tailscale exit node to the AXT1800 via Tailscale, and I can ping the AXT1800 at 192.168.8.1 from a host on one of those networks via the Tailscale network. After Tailscale is connected, hosts associated to the main wifi network can access servers on the 5 internal networks without NAT, so the Tailscale part of the configuration is working as expected.
Access to the internet from the main wifi network also works fine when Custom Exit Node is both on and off. When Custom Exit Node is off, the internet is accessed directly, and when the setting is on, the internet is reached via the exit node as expected. This is verified using a Google search for my ip address showing the public IP address that is expected for each path.
However, from a host connected to the guest wifi network, I see 50% packet loss once Custom Exit Node is enabled. In the ping output below from such a host, Custom Exit Node was enabled after icmp_seq=6:
% ping 4.2.2.2
PING 4.2.2.2 (4.2.2.2): 56 data bytes
64 bytes from 4.2.2.2: icmp_seq=0 ttl=52 time=21.915 ms
64 bytes from 4.2.2.2: icmp_seq=1 ttl=52 time=42.802 ms
64 bytes from 4.2.2.2: icmp_seq=2 ttl=52 time=18.757 ms
64 bytes from 4.2.2.2: icmp_seq=3 ttl=52 time=36.834 ms
64 bytes from 4.2.2.2: icmp_seq=4 ttl=52 time=18.757 ms
64 bytes from 4.2.2.2: icmp_seq=5 ttl=52 time=18.471 ms
64 bytes from 4.2.2.2: icmp_seq=6 ttl=52 time=18.894 ms
64 bytes from 4.2.2.2: icmp_seq=7 ttl=52 time=18.552 ms
Request timeout for icmp_seq 8
64 bytes from 4.2.2.2: icmp_seq=9 ttl=52 time=18.525 ms
Request timeout for icmp_seq 10
64 bytes from 4.2.2.2: icmp_seq=11 ttl=52 time=18.538 ms
Request timeout for icmp_seq 12
64 bytes from 4.2.2.2: icmp_seq=13 ttl=52 time=18.968 ms
Request timeout for icmp_seq 14
64 bytes from 4.2.2.2: icmp_seq=15 ttl=52 time=93.318 ms
Request timeout for icmp_seq 16
64 bytes from 4.2.2.2: icmp_seq=17 ttl=52 time=18.844 ms
Request timeout for icmp_seq 18
64 bytes from 4.2.2.2: icmp_seq=19 ttl=52 time=19.271 ms
^C
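To make the loss pattern easier to see, a saved copy of the ping output can be summarised by sequence-number parity. This is only a rough sketch: it assumes the macOS-style ping output format shown above, and the function name summarise_ping and the file name ping.log are made up for illustration.

```shell
# Summarise a saved ping log: count replies and timeouts,
# split by even/odd icmp_seq.
summarise_ping() {
    awk '
        /icmp_seq=/ {                       # reply line: "... icmp_seq=7 ttl=..."
            split($0, a, "icmp_seq=")
            split(a[2], b, " ")
            ok[b[1] % 2]++
        }
        /^Request timeout for icmp_seq/ {   # loss line: "Request timeout for icmp_seq 8"
            lost[$NF % 2]++
        }
        END {
            printf "even seq: %d ok, %d lost\n", ok[0], lost[0]
            printf "odd seq: %d ok, %d lost\n", ok[1], lost[1]
        }
    ' "$1"
}
```

Run as summarise_ping ping.log on the output above, this reports 4 ok and 6 lost for even sequence numbers and 10 ok and 0 lost for odd ones. In other words, once Custom Exit Node is enabled, every second request is lost rather than a random half.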
When Custom Exit Node is disabled, the IP rule and route tables are as follows (192.168.5.0/24 is the network connected to the WAN port):
root@GL-AXT1800:~# ip rule
0: from all lookup local
48: from all to 192.168.5.0/24 lookup main
49: from all to 192.168.8.0/24 lookup main
50: from all to 100.100.100.100 lookup 52
1099: from all fwmark 0x80000/0xc0000 lookup main
1100: from all lookup main suppress_prefixlength 0
1101: not from all fwmark 0x8000/0xc000 lookup 8000
5210: from all fwmark 0x80000/0xff0000 lookup main
5230: from all fwmark 0x80000/0xff0000 lookup default
5250: from all fwmark 0x80000/0xff0000 unreachable
5270: from all lookup 52
32766: from all lookup main
32767: from all lookup default
root@GL-AXT1800:~# ip route show
default via 192.168.5.1 dev eth0 proto static src 192.168.5.245 metric 10
100.64.0.0/10 dev tailscale0 scope link
192.168.5.0/24 dev eth0 proto static scope link metric 10
192.168.8.0/24 dev br-lan proto kernel scope link src 192.168.8.1
192.168.9.0/24 dev br-guest proto kernel scope link src 192.168.9.1
The output of ip route show table 52 contains the expected Tailscale network host routes and the subnet routes advertised by the exit node.
When Custom Exit Node is enabled, the output of ip rule is as follows:
0: from all lookup local
47: from 192.168.9.0/24 lookup main
48: from all to 192.168.5.0/24 lookup main
49: from all to 192.168.8.0/24 lookup main
50: from all to 100.100.100.100 lookup 52
1099: from all fwmark 0x80000/0xc0000 lookup main
1100: from all lookup main suppress_prefixlength 0
1101: not from all fwmark 0x8000/0xc000 lookup 8000
5210: from all fwmark 0x80000/0xff0000 lookup main
5230: from all fwmark 0x80000/0xff0000 lookup default
5250: from all fwmark 0x80000/0xff0000 unreachable
5269: from all fwmark 0x80000/0x80000 lookup main
5270: from all lookup 52
32766: from all lookup main
32767: from all lookup default
I note that extra rules with priority 47 and 5269 are added when Custom Exit Node is enabled.
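For anyone wanting to repeat the comparison, the two listings can be diffed mechanically. This is a small sketch that assumes each listing was saved to a text file with ip rule > file. The function name added_rules and the file names rules_off.txt and rules_on.txt are made up for illustration.

```shell
# Print the rules that appear in the second saved listing but not in the first.
# $1: listing captured with Custom Exit Node disabled
# $2: listing captured with Custom Exit Node enabled
added_rules() {
    # -F fixed strings, -x whole-line match, -v invert: keep lines of $2 absent from $1
    grep -Fxv -f "$1" "$2"
}
```

Running added_rules rules_off.txt rules_on.txt on the two listings above should print only the priority 47 and 5269 rules.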
I used tcpdump running on the internet gateway connected to the WAN port of the AXT1800 and saw that only 50% of the ping request packets from a host on the guest wifi network via the AXT1800 were reaching the gateway. I could see the responses from the internet for those requests. Note that the timestamps are 2 seconds apart, even though ping is sending a request every second:
# tcpdump -ni igb3_vlan5 host 4.2.2.2
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on igb3_vlan5, link-type EN10MB (Ethernet), capture size 262144 bytes
10:14:50.882955 IP 192.168.5.245 > 4.2.2.2: ICMP echo request, id 3847, seq 353, length 64
10:14:50.897594 IP 4.2.2.2 > 192.168.5.245: ICMP echo reply, id 3847, seq 353, length 64
10:14:52.894171 IP 192.168.5.245 > 4.2.2.2: ICMP echo request, id 3847, seq 355, length 64
10:14:52.909108 IP 4.2.2.2 > 192.168.5.245: ICMP echo reply, id 3847, seq 355, length 64
10:14:54.907378 IP 192.168.5.245 > 4.2.2.2: ICMP echo request, id 3847, seq 357, length 64
10:14:54.922108 IP 4.2.2.2 > 192.168.5.245: ICMP echo reply, id 3847, seq 357, length 64
When Custom Exit Node is enabled, it looks to me as though IP rule 47 should send all guest wifi network traffic directly to the internet and avoid all Tailscale routing. But something else seems to be happening to routed packets before they hit this rule. I can't see anything obvious in the iptables rules that would cause half of the packets to be dropped, but the rulebase is quite difficult to follow.
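One way to check the routing decision separately from iptables is to ask the kernel which route it would pick for a forwarded guest packet. The sketch below makes several assumptions: that the full iproute2 ip binary is installed on the router, that the guest bridge is br-guest as in the route table above, and that 192.168.9.100 is a guest client address (a made-up value for illustration). The function name guest_route_check is also hypothetical.

```shell
# Ask the kernel which route it would select for a packet forwarded
# from the guest bridge, with and without the 0x80000 firewall mark.
guest_route_check() {
    src="${1:-192.168.9.100}"   # hypothetical guest client address
    dst="${2:-4.2.2.2}"
    # Unmarked packet arriving on the guest bridge
    ip route get "$dst" from "$src" iif br-guest
    # Same lookup carrying the mark that rules 1099 and 5269 match on
    ip route get "$dst" from "$src" iif br-guest mark 0x80000
}
```

If rule 47 works the way I expect, both lookups should return the default route via eth0. If either one returns tailscale0 or a route from table 52, the problem is more likely in the routing decision than in the firewall rules.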
I have also tried updating Tailscale on the AXT1800 to the latest version (currently 1.68.1) using the script from the thread "Script: Update Tailscale on (nearly) all devices", but this didn't make a difference.
Has anyone seen this issue before, and if so are there any workarounds? I have tried the suggestions from similar posts, namely modifying the WAN firewall zone covered devices, MSS clamping and masquerading settings, but none of these helped with this issue.