GL-MT6000 Flint 2 - policy-based routing with VPN breaks connectivity to some destinations

Strange issue. I'm trying to route traffic from a specific LXC (Linux container) with MAC address BC:24:11:E4:F4:CF (IP 192.168.0.6) through a WireGuard VPN on a GL-MT6000. When I select "VPN Policy Based on the Client Device", it doesn't fully work as expected; I suspect some requests are lost. When I select the "Auto Detect" proxy mode, everything works, but then all devices go through the WireGuard VPN, which is something I want to avoid.

Here are my tests, run from the LXC container:
[No VPN]

  1. checking torrent tracker1 port: curl -v http://188.120.242.106:2710/ - OK
  2. checking torrent tracker2 port: curl -v http://93.158.213.92:1337 - OK

[WireGuard with policy-based routing]

  1. checking torrent tracker1 port: curl -v http://188.120.242.106:2710/ - Could not connect to server
  2. checking torrent tracker2 port: curl -v http://93.158.213.92:1337 - OK

So one might think the tracker is blocking VPN IPs, right?
However:
[WireGuard with Auto Detect proxy mode]

  1. checking torrent tracker1 port: curl -v http://188.120.242.106:2710/ - OK
  2. checking torrent tracker2 port: curl -v http://93.158.213.92:1337 - OK

And:
[WireGuard with policy-based routing, plus manually added IP routes]
ip route add table wg_vpn default dev wgclient
ip rule add from 192.168.0.6 lookup wg_vpn

  1. checking torrent tracker1 port: curl -v http://188.120.242.106:2710/ - OK
  2. checking torrent tracker2 port: curl -v http://93.158.213.92:1337 - OK
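
To double-check that the manual workaround above actually took effect on the router, something like this should show the rule and the table contents (a sketch; it assumes the table name wg_vpn is registered in /etc/iproute2/rt_tables, otherwise a numeric table id is needed):

# confirm the rule and the per-table default route exist
ip rule show
ip route show table wg_vpn
# ask the kernel which interface a packet from the container would use
ip route get 188.120.242.106 from 192.168.0.6 iif br-lan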

I'm kind of at a loss; IP- and MAC-based routing and firewalls are not my strong side, and even for most of the above I had to use ChatGPT.
How is this possible? How can some traffic go through the VPN while similar traffic to another destination is rejected? And why does it fail in policy-based routing but work in "Auto"?

Maybe, when using "VPN Policy Based on the Client Device" mode, you forgot to click the Plus sign?

If it still doesn't work, please export a log to check.
http://192.168.8.1/#/logview

No, I haven't forgotten.

I'm checking the exit point after the VPN is up by running "curl https://ipleak.net/json/" on the affected LXC, so I can see that traffic indeed leaves through the AirVPN WireGuard server in CZ:

root@qbittorrent:~# curl https://ipleak.net/json/
{
    "as_number": 9009,
    "isp_name": "M247 Europe SRL",
    "country_code": "CZ",
    "country_name": "Czech Republic",
    "continent_code": "EU",
    "continent_name": "Europe",
    "city_name": "Prague",
    "postal_code": null,
    "postal_confidence": null,
    "latitude": "50.08333206176758",
    "longitude": "14.466667175292969",
    "accuracy_radius": 1,
    "time_zone": "Europe\/Prague",
    "metro_code": null,
    "level": "min",
    "country_confidence": 100,
    "city_confidence": 100,
    "region_confidence": 100,
    "cache": 1745689144,
    "ip": "185.156.174.27",
    "type": "AirVPN Server (Exit 1, Markab)",
    "reverse": "",
    "query_text": "185.156.174.27",
    "query_type": "myip",
    "query_date": 1745689144

Then I test connectivity to the two trackers; one fails, the second is OK:

root@qbittorrent:~# curl -v http://188.120.242.106:2710
*   Trying 188.120.242.106:2710...
* connect to 188.120.242.106 port 2710 failed: Connection refused
* Failed to connect to 188.120.242.106 port 2710 after 0 ms: Couldn't connect to server
* Closing connection 0
curl: (7) Failed to connect to 188.120.242.106 port 2710 after 0 ms: Couldn't connect to server
root@qbittorrent:~# curl -v http://93.158.213.92:1337
*   Trying 93.158.213.92:1337...
* Connected to 93.158.213.92 (93.158.213.92) port 1337 (#0)
> GET / HTTP/1.1
> Host: 93.158.213.92:1337
> User-Agent: curl/7.88.1
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 302 Found
< Content-Length: 0
< Location: https://opentrackr.org/
< 
* Closing connection 0

If I change the VPN mode to "Global Proxy" or "Auto Detect", I see that my traffic still exits at the same AirVPN server, but now both tests are OK.

root@qbittorrent:~# curl https://ipleak.net/json/
{
    "as_number": 9009,
    "isp_name": "M247 Europe SRL",
    "country_code": "CZ",
    "country_name": "Czech Republic",
    "continent_code": "EU",
    "continent_name": "Europe",
    "city_name": "Prague",
    "postal_code": null,
    "postal_confidence": null,
    "latitude": "50.08333206176758",
    "longitude": "14.466667175292969",
    "accuracy_radius": 1,
    "time_zone": "Europe\/Prague",
    "metro_code": null,
    "level": "min",
    "country_confidence": 100,
    "city_confidence": 100,
    "region_confidence": 100,
    "cache": 1745689936,
    "ip": "185.156.174.27",
    "type": "AirVPN Server (Exit 1, Markab)",
    "reverse": "",
    "query_text": "185.156.174.27",
    "query_type": "myip",
    "query_date": 1745689936
root@qbittorrent:~# curl -v http://188.120.242.106:2710
*   Trying 188.120.242.106:2710...
* Connected to 188.120.242.106 (188.120.242.106) port 2710 (#0)
> GET / HTTP/1.1
> Host: 188.120.242.106:2710
> User-Agent: curl/7.88.1
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 404 Not Found
< 
* Closing connection 0
root@qbittorrent:~# curl -v http://93.158.213.92:1337
*   Trying 93.158.213.92:1337...
* Connected to 93.158.213.92 (93.158.213.92) port 1337 (#0)
> GET / HTTP/1.1
> Host: 93.158.213.92:1337
> User-Agent: curl/7.88.1
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 302 Found
< Content-Length: 0
< Location: https://opentrackr.org/
< 
* Closing connection 0

I can't wrap my head around how this is possible, unless policy-based routing misses some small but important thing that causes this issue.
Logs for the first case (policy-based routing) are attached.

logread(1).tar

OK. I did some more digging and I found the issue, though I don't know how to resolve it.
So, we have policy-based routing to WireGuard for one specific MAC address, and some requests fail while others don't.

I test with these commands from the affected LXC container BC:24:11:E4:F4:CF (IP 192.168.0.6):

root@qbittorrent:~# curl -v http://93.158.213.92:1337
*   Trying 93.158.213.92:1337...
* Connected to 93.158.213.92 (93.158.213.92) port 1337 (#0)
> GET / HTTP/1.1
> Host: 93.158.213.92:1337
> User-Agent: curl/7.88.1
> Accept: */*
> 
* HTTP 1.0, assume close after body
< HTTP/1.0 302 Found
< Content-Length: 0
< Location: https://opentrackr.org/
< 
* Closing connection 0
root@qbittorrent:~# curl -v http://188.120.242.106:2710
*   Trying 188.120.242.106:2710...
* connect to 188.120.242.106 port 2710 failed: Connection refused
* Failed to connect to 188.120.242.106 port 2710 after 0 ms: Couldn't connect to server
* Closing connection 0
curl: (7) Failed to connect to 188.120.242.106 port 2710 after 0 ms: Couldn't connect to server

Now.
Let's see what is happening on the router. Hmm, I'm not sure why this is; traffic FROM the router should not go to WireGuard at all... but here I'm seeing some kind of split routing, while that is not enabled?

root@GL-MT6000:~# ip route get 188.120.242.106
188.120.242.106 dev eth1 src 184.146.71.32 uid 0
    cache
root@GL-MT6000:~# ip route get 93.158.213.92
93.158.213.92 dev wgclient table 8000 src 10.166.108.151 uid 0
    cache

Everything that is tagged with mark 0x8000 is regular traffic and goes to the main routing table;
the rest goes to table 8000 (why the table uses the same number as the regular-traffic mark is confusing), and table 8000 is the WireGuard routing table.
Technically the last two rules should not matter?

root@GL-MT6000:~# ip rule list
0: from all lookup local
1: from all iif lo lookup 16800
1099: from all fwmark 0x80000/0xc0000 lookup main
1100: from all lookup main suppress_prefixlength 0
1101: not from all fwmark 0x8000/0xc000 lookup 8000
32766: from all lookup main
32767: from all lookup default

root@GL-MT6000:~# ip route show table 8000
default dev wgclient scope link

root@GL-MT6000:~# ip route show table main
default via 142.124.38.166 dev eth1 proto static src 184.146.71.32 metric 10
128.0.0.0/2 dev eth1 proto static scope link metric 10
192.168.0.0/24 dev br-lan proto kernel scope link src 192.168.0.1
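
For reference, the 0x8000 mark that rule 1101 matches is set by the firmware in the iptables mangle table; a way to peek at those rules (just a sketch, the exact chain layout on GL firmware may differ) is:

# list mangle rules that set or match the policy mark
iptables -t mangle -S | grep -iE 'mark|0x8000'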

128.0.0.0/2 - some kind of splitting artefact? I don't understand it.
However, if I run this on the router:
ip route del 128.0.0.0/2 dev eth1
then the connectivity test from the container works fine (probably still through the WAN and not through WireGuard).
ip route add 128.0.0.0/2 dev eth1 scope link metric 10 - and the test from the container fails again.
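
One way to see which rule wins for each destination, without editing any routes (a sketch based on the rule list above), is to ask the kernel directly:

# simulate an unmarked packet from the container (what policy routing sends to the VPN)
ip route get 188.120.242.106 from 192.168.0.6 iif br-lan   # expected: dev eth1 (caught by 128.0.0.0/2 in main)
ip route get 93.158.213.92 from 192.168.0.6 iif br-lan     # expected: dev wgclient via table 8000
# for comparison, regular traffic carries the 0x8000 mark and is excluded from table 8000
ip route get 188.120.242.106 mark 0x8000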

There is some flaw in the logic; I can't pinpoint where, as iptables and routing are not my area of expertise.
I think you can easily reproduce the issue: I suspect that with any VPN and policy-based routing, if you try to reach those two IP addresses you will get the same results.

And some more output to show that packets end up on the wrong interface (eth1 instead of wgclient).
This is where I'm totally lost, as I don't fully understand the packet flow in Linux. This was suggested by an LLM.

# Add comprehensive logging
iptables -t mangle -I PREROUTING 1 -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "1_PRE_MANGLE: "
iptables -t nat -I PREROUTING 1 -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "2_PRE_NAT: "
iptables -I FORWARD 1 -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "3_FORWARD: "
iptables -t mangle -I ROUTE_POLICY 1 -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "4_ROUTE_POL: "
iptables -t mangle -I POSTROUTING 1 -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "5_POST_MANGLE: "
iptables -t nat -I POSTROUTING 1 -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "6_POST_NAT: "
root@GL-MT6000:~# logread -f | grep -E "1_PRE_MANGLE|2_PRE_NAT|3_FORWARD|4_ROUTE_POL|5_POST_MANGLE|6_POST_NAT"

Sat Apr 26 20:49:57 2025 kern.warn kernel: [ 5747.309407] 1_PRE_MANGLE: IN=br-lan OUT= MAC=94:83:c4:a6:72:0a:bc:24:11:e4:f4:cf:08:00 SRC=192.168.0.6 DST=188.120.242.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=56561 DF PROTO=TCP SPT=57940 DPT=2710 WINDOW=64240 RES=0x00 SYN URGP=0
Sat Apr 26 20:49:57 2025 kern.warn kernel: [ 5747.329829] 4_ROUTE_POL: IN=br-lan OUT= MAC=94:83:c4:a6:72:0a:bc:24:11:e4:f4:cf:08:00 SRC=192.168.0.6 DST=188.120.242.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=56561 DF PROTO=TCP SPT=57940 DPT=2710 WINDOW=64240 RES=0x00 SYN URGP=0
Sat Apr 26 20:49:58 2025 kern.warn kernel: [ 5747.350160] 2_PRE_NAT: IN=br-lan OUT= MAC=94:83:c4:a6:72:0a:bc:24:11:e4:f4:cf:08:00 SRC=192.168.0.6 DST=188.120.242.106 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=56561 DF PROTO=TCP SPT=57940 DPT=2710 WINDOW=64240 RES=0x00 SYN URGP=0
Sat Apr 26 20:49:58 2025 kern.warn kernel: [ 5747.370332] 3_FORWARD: IN=br-lan OUT=eth1 MAC=94:83:c4:a6:72:0a:bc:24:11:e4:f4:cf:08:00 SRC=192.168.0.6 DST=188.120.242.106 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=56561 DF PROTO=TCP SPT=57940 DPT=2710 WINDOW=64240 RES=0x00 SYN URGP=0
^C
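
Side note: the temporary LOG rules can be removed again afterwards by repeating the same commands with -D instead of -I 1, for example:

# remove the temporary logging rules again
iptables -t mangle -D PREROUTING -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "1_PRE_MANGLE: "
iptables -t nat -D PREROUTING -s 192.168.0.6 -d 188.120.242.106 -j LOG --log-prefix "2_PRE_NAT: "
# ...and likewise for the remaining four LOG rules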

Very insightful observation.

I reproduced and confirmed the issue; it's caused by a route conflict.

These two ip rules mean:
1100: from all lookup main suppress_prefixlength 0 - consult the main routing table first, except for its default route
1101: not from all fwmark 0x8000/0xc000 lookup 8000 - traffic not marked 0x8000 goes through the VPN
So this entry in the main table:

128.0.0.0/2 dev eth1 proto static scope link metric 10 

takes precedence over

root@GL-MT6000:~# ip route show table 8000
default dev wgclient scope link 

I set up my MT6000's upstream gateway (OpenWrt) LAN such that my MT6000 gets the IP
128.0.0.223 and the route:
128.0.0.0/2 dev eth1 proto static scope link metric 10
exactly the same as yours.

Why does Auto Detect mode let the data pass? Because the lan-to-wan forwarding in the firewall is opened when the VPN is on.

  • Solution:
    If possible, change the gateway device's IP, since the current one is not in a regular private IP range.

So, are you saying that the fact that my GL-MT6000 has a public IP confuses the routing?
This is my current setup: ISP <-> modem (DMZ enabled) <-> GL-MT6000 (in the modem's DMZ), so the router gets a public IP.
When I first installed the router it was like this: ISP <-> modem (DMZ disabled) <-> GL-MT6000 (gets a private IP from the modem). I changed it to DMZ because I did not want to do double port forwarding (on the modem + on the router). IIRC that setup actually worked just fine.

This is what is happening:
1099: from all fwmark 0x80000/0xc0000 lookup main
1100: from all lookup main suppress_prefixlength 0
1101: not from all fwmark 0x8000/0xc000 lookup 8000

A packet meant for the VPN (no 0x8000 mark) skips rule 1099 and SHOULD fall through to 1101, but it is caught by rule 1100 first, because the main table has "128.0.0.0/2 dev eth1 proto static scope link metric 10", and that route covers the destination 188.120.242.106 (as well as the router's own IP range). 93.158.213.92 lies outside 128.0.0.0/2 and therefore falls through to the VPN table - which is exactly why one tracker works and the other doesn't.
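
A quick way to confirm which /2 block each destination falls into (assuming OpenWrt's ipcalc.sh helper is available on the router):

ipcalc.sh 188.120.242.106/2    # NETWORK=128.0.0.0 -> covered by the stray 128.0.0.0/2 route
ipcalc.sh 93.158.213.92/2      # NETWORK=64.0.0.0  -> not covered, falls through to rule 1101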

So there are several solutions.

  1. As you suggested - use a private IP for the router. I will probably go this way, simply because it's easiest for me. However, consider that the flaw in the routing logic still remains, and you will get customers complaining if they ever end up with a router IP anywhere in the 128.0.0.0/2 range.
  2. Shuffle the routing rules like this (see the sketch after this list):
    1089: not from all fwmark 0x8000/0xc000 lookup 8000
    1099: from all fwmark 0x80000/0xc0000 lookup main
    1100: from all lookup main suppress_prefixlength 0
  3. Remove 128.0.0.0/2 from the main table. Could you please explain the purpose of this strange route? "128.0.0.0/2 dev eth1 proto static scope link metric 10"
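
For option 2, a rough runtime sketch (not persistent across reboots, and the firmware may well re-create its own rules) would be:

# move the "unmarked traffic -> table 8000" rule above the main-table lookups
ip rule del priority 1101
ip rule add not fwmark 0x8000/0xc000 lookup 8000 priority 1089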

Issue Analysis: Large Subnet Routes Affecting VPN Routing

Problem Description

We've identified an issue where netifd automatically adds a direct (on-link) route covering an extremely large subnet when it receives certain IP addresses from DHCP. For example, when assigned an address like 128.0.39.144/2, the very short prefix makes the system create a direct route covering a huge IP range (128.0.0.0/2).

This causes two problems:

  1. The router tries to ARP for remote IPs within this range instead of forwarding them to the gateway
  2. VPN routes may be overridden by this large subnet route

When netifd receives an IP address, it automatically creates a subnet route.
For example, a 128.0.39.144/2 address creates a direct route for 128.0.0.0/2, which covers all IPs from 128.0.0.0 to 191.255.255.255. Any traffic to these IPs will try to use direct delivery instead of proper gateway routing.

Solution: Hotplug Script

The simplest solution is to add a hotplug script that removes these problematic routes. Here's a script you can add to your system:

cat > /etc/hotplug.d/iface/99-remove-large-subnet-routes << 'EOF'
#!/bin/sh

[ "$ACTION" = "ifup" ] || exit 0
[ "$INTERFACE" = "wan" ] || exit 0
[ -n "$DEVICE" ] || exit 0

# Get IP addresses of the interface
IP_INFO=$(ip addr show dev $DEVICE | grep "inet " | awk '{print $2}')

for IP_CIDR in $IP_INFO; do
    # Use ipcalc.sh to get network and prefix
    eval $(ipcalc.sh $IP_CIDR)
    
    # If prefix is smaller than 8, remove the corresponding route
    if [ -n "$PREFIX" ] && [ "$PREFIX" -lt 8 ]; then
        logger -t remove-large-subnet "Removing large subnet route: $NETWORK/$PREFIX via $DEVICE"
        ip route del "$NETWORK/$PREFIX" dev "$DEVICE" scope link
    fi
done

exit 0
EOF

chmod +x /etc/hotplug.d/iface/99-remove-large-subnet-routes
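
To exercise the script without a reboot, one could re-trigger the WAN ifup event and check the result (a sketch; it assumes the logical interface is named wan, as in the script):

# re-run the hotplug handlers for wan and verify the /2 route is gone
ifup wan
logread | grep remove-large-subnet
ip route show | grep "/2 "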

After further analysis, I arrived at the report above.
Initially, I assumed the address was in a special private subnet, but it turns out to be a public IP.
I believe your case is uncommon, but it's very valuable.

As a temporary workaround, I’ve added a hotplug script. However, to resolve the issue properly at its source, a patch to netifd will be required. I’ll conduct more testing on this.

Reordering the rules as suggested does appear to help, but introduces a side effect:
direct routes (i.e., entries in the main routing table except the default route) may get overridden by the VPN route.
This can effectively render rule 1100 ineffective.


Well, I'm glad I was able to help by finding the issue. I only recently started using this router (moved away from Asus) and so far I like it very much. I also like that I can get sensible replies from support.


After further testing and investigation, I found that adding an IP address always generates a corresponding direct route. For example:

root@GL-MT6000:/tmp# ip addr add 1.2.3.4/24 dev eth1
root@GL-MT6000:/tmp# 
root@GL-MT6000:/tmp# ip route
default via 192.168.106.1 dev eth1 proto static src 192.168.106.223 metric 10 
1.2.3.0/24 dev eth1 proto kernel scope link src 1.2.3.4 
192.168.8.0/24 dev br-lan proto kernel scope link src 192.168.8.1 
192.168.106.0/24 dev eth1 proto static scope link metric 10 

And the default route depends on the direct route:

root@GL-MT6000:~# ip route
default via 128.0.0.1 dev eth1 proto static src 128.0.39.144 metric 10 
128.0.0.0/2 dev eth1 proto kernel scope link src 128.0.39.144 
128.0.39.0/24 dev eth1 proto static scope link metric 10 
192.168.8.0/24 dev br-lan proto kernel scope link src 192.168.8.1 
root@GL-MT6000:~# 
root@GL-MT6000:~# ip route del default via 128.0.0.1 dev eth1 proto static src 128.0.39.144 metric 10
root@GL-MT6000:~# ip route del 128.0.0.0/2 dev eth1 proto kernel scope link src 128.0.39.144
root@GL-MT6000:~# ip route add default via 128.0.0.1 dev eth1 proto static src 128.0.39.144 metric 10
RTNETLINK answers: Network unreachable
root@GL-MT6000:~#

Given this behavior, patching netifd may not be a reliable or viable solution.
Does the hotplug script workaround function as expected? Are there any known side effects?

TBH I did not go the script route; I changed the network configuration from
ISP <-> modem (DMZ enabled) <-> GL-MT6000 in DMZ (gets a public IP from the modem)
to
ISP <-> modem (DMZ enabled) <-> GL-MT6000 in DMZ (gets a private IP from the modem)
and in this configuration everything works as expected.

However, during troubleshooting, while the router still had the public IP, I tried removing the offending 128.0.0.0/2 route manually and it fixed the issue.