Here is my network topology for reference:
[External PC] xx.xx.xx.xx
|
| (WAN)
|
[VPS] yy.yy.yy.yy (Global IP) / 10.0.0.254 (WireGuard IP)
| iptables DNAT: yy.yy.yy.yy -> 10.0.0.2
|
| (WireGuard tunnel)
|
[GL-BE9300] 10.0.0.2 (WireGuard IP) / 192.168.60.254 (LAN IP)
| Port Forwarding: 10.0.0.2 -> 192.168.60.201
|
| (LAN)
|
[Server] 192.168.60.201
This specific topology may be required to reproduce the issue. The problem only manifests when an external PC reaches the server through DNAT on the VPS, then the WireGuard tunnel to the GL-BE9300, and finally port forwarding to the LAN server.
With this setup, port forwarding works when connecting directly from the VPS (10.0.0.254), because the return packets go back through the WireGuard tunnel without needing the correct fwmark. It fails for the external PC (xx.xx.xx.xx), whose traffic is DNAT'd through the VPS, because those return packets must be routed back via wgclient1, and that only happens when the correct fwmark is set.
I also analyzed how the conntrack mark is used in the routing process.
In v4.8.4, the following nftables rules exist to synchronize the conntrack mark (ctmark) with the packet mark (fwmark):
iifname "wgclient1" ... ct state new ct mark set ct mark & 0xffff1fff | 0x00001000 (wgclient1_in_new_connmark)
ct mark & 0x0000f000 != 0x00000000 jump mark_ct_to_meta
ct mark & 0x0000f000 == 0x00001000 meta mark set meta mark & 0xffff1fff | 0x00001000 (wgclient1_mark_ct_to_meta)
This means:
- New connections from wgclient1 are assigned ctmark 0x1000
- The mark_ct_to_meta chain propagates ctmark to fwmark
- The ip rule fwmark 0x1000/0xf000 matches, and return packets are correctly routed via wgclient1
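To double-check my reading of these rules, the mark handling can be traced with plain integer arithmetic. This is only a rough Python sketch, not the firmware's code: the function names are made up for illustration, the elided "..." parts of the rules are not modeled, and only the masks and values are taken from the rules quoted above.

```python
# Values copied from the quoted v4.8.4 rules; everything else is illustrative.
CLEAR_MASK = 0xFFFF1FFF  # mask used in "& 0xffff1fff"
WG_MARK = 0x00001000     # value OR-ed in for wgclient1
NIBBLE_MASK = 0x0000F000 # mask used in the jump condition and the ip rule


def set_ct_mark_new(ct_mark: int) -> int:
    """Mimic: ct mark set ct mark & 0xffff1fff | 0x00001000."""
    return (ct_mark & CLEAR_MASK) | WG_MARK


def mark_ct_to_meta(ct_mark: int, meta_mark: int) -> int:
    """Mimic the jump condition and the wgclient1_mark_ct_to_meta rule."""
    if (ct_mark & NIBBLE_MASK) == 0:
        return meta_mark  # jump to mark_ct_to_meta is not taken
    if (ct_mark & NIBBLE_MASK) == WG_MARK:
        return (meta_mark & CLEAR_MASK) | WG_MARK
    return meta_mark  # no other rule is modeled in this sketch


# New connection arriving on wgclient1 with an empty mark.
ct = set_ct_mark_new(0x0)
fw = mark_ct_to_meta(ct, 0x0)
print(hex(ct), hex(fw), (fw & NIBBLE_MASK) == WG_MARK)
```

With a ctmark of 0x1000 the fwmark ends up carrying 0x1000 as well, which is what the ip rule expects. Feeding a ctmark of 0x10000 into mark_ct_to_meta leaves the fwmark untouched, because the jump condition is never satisfied.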
In v4.9.0, the observed conntrack mark was 0x10000 instead of 0x1000. The ip rule still uses fwmark 0x1000/0xf000, and 0x10000 & 0xf000 = 0x0000, which is not 0x1000, so the rule does not match. As a result, return packets are not routed via wgclient1 and go out via WAN instead, making the service unreachable from the external PC.
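The mismatch is easy to verify by hand. Here is a small Python check, assuming the usual fwmark semantics where a rule written as value/mask matches when (mark & mask) equals value. The helper name is hypothetical; the two mark values are the ones observed in v4.8.4 and v4.9.0.

```python
def fwmark_rule_matches(mark: int, value: int = 0x1000, mask: int = 0xF000) -> bool:
    """Return True when (mark & mask) == value, as for 'fwmark 0x1000/0xf000'."""
    return (mark & mask) == value


# 0x1000 is the mark seen in v4.8.4, 0x10000 the one seen in v4.9.0.
for observed in (0x1000, 0x10000):
    masked = observed & 0xF000
    print(hex(observed), "masked:", hex(masked), "match:", fwmark_rule_matches(observed))
```

The first value matches and the second does not, since bit 16 lies outside the 0xf000 mask.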
The issue appears to be that the port_forward kernel module in v4.9.0 is setting the conntrack mark to 0x10000 instead of 0x1000, breaking the routing for WireGuard client setups.