4.x Wireguard REKEY-TIMEOUT troubleshooting

Hi All,

Many users have reported that WireGuard on firmware 4.x gets stuck in REKEY-TIMEOUT loops. This thread is a summary to help you put your issue in the correct category and troubleshoot in the right direction.

I will list some cases for which I have solutions and some that I am not sure about.

Please note that REKEY-TIMEOUT is a generic message and does not say why the handshake failed. WireGuard writes very little information to the log.

Scenario 1: Wireguard does not connect at all

  • Most cases fall into this scenario. If WireGuard does not connect at all, the usual reason is that the server is not reachable or the config is not correct.
  • If this is your own server, make sure you are doing the port forward correctly.
  • Some ISPs block UDP ports, so your port forward may not work. Some ISPs have an advanced firewall, so you need extra settings to make your port forward work normally.
  • Some users also report that copying and pasting the config caused problems, but I don’t have a clue why.
  • If this happens to you, please try the same WireGuard config on your phone or on Windows to make sure it works.
  • If you use Flint or Slate AX as a WireGuard server on firmware 4.0 or 4.1 and you activate the VPN client and server at the same time, you cannot connect to the VPN server from the Internet. This is called VPN cascading; please upgrade to 4.2.0.
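
For the "config not correct" and copy-and-paste cases above, a quick offline check of the config text can rule out the obvious mistakes. This is only a minimal sketch: the function name check_wg_config and the list of required fields are my own assumptions, and I do not know that these checks catch whatever actually went wrong for the users who reported copy-and-paste trouble. The one hard fact used is that WireGuard keys are 32 bytes encoded as base64.

```python
import base64
import binascii

# Fields whose value must be a 32-byte key encoded as base64.
KEY_FIELDS = {"privatekey", "publickey", "presharedkey"}
# Fields a client config normally needs (assumption made for this sketch).
REQUIRED_FIELDS = ("privatekey", "publickey", "endpoint", "allowedips")


def check_wg_config(text):
    """Return a list of human-readable problems found in a WireGuard config."""
    problems = []
    seen = set()
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines, comments and section headers such as [Peer].
        if not line or line.startswith("#") or line.startswith("["):
            continue
        if not raw.isascii():
            problems.append(f"line {lineno}: non-ASCII character (possible copy/paste damage)")
        if "=" not in line:
            problems.append(f"line {lineno}: expected 'Key = Value'")
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        seen.add(key.lower())
        if key.lower() in KEY_FIELDS:
            try:
                decoded = base64.b64decode(value, validate=True)
            except (binascii.Error, ValueError):
                decoded = b""
            if len(decoded) != 32:
                problems.append(f"line {lineno}: {key} is not a 32-byte base64 key")
    for required in REQUIRED_FIELDS:
        if required not in seen:
            problems.append(f"missing {required}")
    return problems
```

An empty list only means the text looks well-formed. It says nothing about whether the server is reachable, so the phone or Windows test above is still worth doing.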

Scenario 2: Wireguard connects but has intermittent breaks and cannot reconnect by itself

Some users reported such cases. I believe this happens but it is very difficult for me to replicate this issue.

  • Several users reported that this problem resolved by itself, so I could not investigate further.
  • One user said it is related to his 5G network.
  • One kind user told me this may be related to 4G/5G networks, which have MTU limitations. He needed to set the WireGuard MTU to 1324 or lower to make it connect reliably. If this is your case, please lower the MTU to see if it works better.
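
To make the MTU advice above concrete, here is a small sketch of the arithmetic. The header sizes are the standard ones (20-byte IPv4 or 40-byte IPv6 header, 8-byte UDP header, 32 bytes of WireGuard data-message overhead). The function name wireguard_mtu and the example path MTU values are my own assumptions, not something the user reported.

```python
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8
# WireGuard data message: 4 bytes type/reserved + 4 receiver index
# + 8 counter + 16 authentication tag.
WG_OVERHEAD = 32


def wireguard_mtu(path_mtu, outer_ipv6=False):
    """Largest tunnel MTU that fits in one packet of size path_mtu."""
    ip_header = IPV6_HEADER if outer_ipv6 else IPV4_HEADER
    return path_mtu - ip_header - UDP_HEADER - WG_OVERHEAD


print(wireguard_mtu(1500))                   # IPv4 endpoint on a 1500-byte path
print(wireguard_mtu(1500, outer_ipv6=True))  # IPv6 endpoint on the same path
print(wireguard_mtu(1384))                   # a smaller, mobile-style path
```

On a normal 1500-byte path this gives 1440 over IPv4 and 1420 over IPv6 (the usual WireGuard default). A tunnel MTU of 1324 would fit an IPv4 path MTU of 1384, but whether that was the reporter's real path MTU is not known.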

What to do when you meet this problem:

  • Make sure the WireGuard config is valid by using the exact same config on your phone or PC.
  • Make sure the WireGuard server side is set up correctly, e.g. port forward.
  • If you want to bring this issue up to me, please give me this info:
  1. What is the router model and firmware version?
  2. Are you using a 4G network? What is your ISP?
  3. Are you using IPv6?
  4. Are you using DDNS in the WireGuard config endpoint?
  5. Is this your own WireGuard server (a GL.iNet router, Netgear or Asus) or a commercial WireGuard service?
  6. Are you using VPN policy, AdGuard Home etc.?

If this is your own Wireguard server, the easiest way is to send one config to me to try out directly, if possible.


Case update:

One user's case: he is using Flint as a WireGuard server and cannot connect to it from the Internet. This is because he is using both the VPN client and the VPN server at the same time on Flint.

Using the VPN server and client at the same time is called VPN cascading. Flint firmware 4.0 and 4.1 do not support this. Please upgrade to firmware 4.2.

This needs to be pinned.

Other issues:

  1. Your WireGuard IP cannot be within the 192.168.8.X or 192.168.9.X range for default GL.iNet setups. Either change the WireGuard range, or change the LAN.
  2. If you’re running your own server, and you can connect to (ping) the Wireguard endpoint, and you’re using 0.0.0.0/0 as your “Allowed IPs”, make sure the outbound port forwarding rules on your server are correct. See this guide for help.
  3. If you are testing and it works from your phone but not from your router, try connecting your router through your phone. Sometimes trying to access your public IP from your LAN behaves in strange ways.
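
The address-range conflict in item 1 is easy to check with Python's standard ipaddress module. This is a minimal sketch: the function name overlapping_subnets is made up for illustration, and the default list simply contains the two ranges named above, so pass your own LAN subnets if you changed them.

```python
import ipaddress


def overlapping_subnets(wg_address, lan_subnets=("192.168.8.0/24", "192.168.9.0/24")):
    """Return the LAN subnets that overlap the WireGuard interface address.

    wg_address is the Address value from the config, e.g. "10.0.0.2/24".
    """
    wg_net = ipaddress.ip_interface(wg_address).network
    return [s for s in lan_subnets if wg_net.overlaps(ipaddress.ip_network(s))]


print(overlapping_subnets("192.168.8.2/24"))  # conflicts with the default LAN
print(overlapping_subnets("10.0.0.2/24"))     # no conflict
```

If the returned list is not empty, either the WireGuard range or the LAN range has to change.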

Where can we find firmware V4.2? I’m unable to find it in the download center & on GitHub.

~Edit: nvm, found it😄

Case update:

The ISP blocked incoming UDP traffic on the modem.

Even if you set up port forwarding, it does not work.

Case update: StarVPN DDNS bug. StarVPN provides WireGuard configs whose endpoint is a DDNS hostname, but that hostname may resolve to servers that do not work at all. Users should hardcode the endpoint to a working IP or ask StarVPN to fix the bug. See: Slate AX (GL-AXT1800) Wireguard Issue (REKEY-TIMEOUT) - #34 by hectorricardo
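
For the DDNS case, the two steps (see what the hostname resolves to, then hardcode a working IP) can be sketched like this. The function names, the hostname vpn.example.net and the address 203.0.113.10 are placeholders I made up; which of the resolved addresses actually works still has to be tested by hand.

```python
import re
import socket

# Matches "Endpoint = host:port" on one line and keeps the prefix and the port.
ENDPOINT_RE = re.compile(
    r"^([ \t]*Endpoint[ \t]*=[ \t]*)(\S+):(\d+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def resolve_all(hostname, port=51820):
    """List every address the DDNS hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_DGRAM)
    return sorted({info[4][0] for info in infos})


def pin_endpoint(config_text, ip):
    """Replace the host part of each Endpoint line with a fixed IP, keeping the port."""
    return ENDPOINT_RE.sub(lambda m: f"{m.group(1)}{ip}:{m.group(3)}", config_text)


config = "[Peer]\nEndpoint = vpn.example.net:51820\nAllowedIPs = 0.0.0.0/0\n"
print(pin_endpoint(config, "203.0.113.10"))
```

Keep in mind that a hardcoded IP stops following the DDNS record, so the config has to be edited again if the provider moves the server.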

I had a new case where the preshared key was dropped when the config was uploaded to the router. This may be caused by old firmware; 4.2 works fine.

Hi alzhao, I have been having a dialog with hansome here in this thread but with no satisfactory outcome as yet. I would like to keep my setup simple.

Some questions on the above:

a. If I have a perfectly working iPhone WG config, does that mean that Scenario 1 is not my issue? Or it could still be?

b. If so, then can you please provide an example of a port forward on the Server; and why do I need it if it works on the iPhone with an App but not on the GLiNET Device set up as a client?

c. If you suspect UDP port is blocked (PCCW?), can you please give examples of these “extra settings” you refer to please; and again, why do I need it if it works on the iPhone with an App but not on the GLiNET Device set up as a client?

k.

So you finally worked it out before my reply.

Yes thank you, all good now.

  1. What is the router model and firmware version: GL-AXT1800, 4.1.0 release5
  2. Are you using 4G network? What is your ISP? Not 4G network, ISP “itissalat al maghrib”. Connected to ISP router by ethernet
  3. Are you using IPV6? no
  4. Are you using DDNS in the wireguard config end point? no
  5. Is this your own wireguard server (a GL.iNet router, Netgear or Asus) or a commercial Wireguard service? wireguard server on AWS lightsail instance
  6. Are you using vpn policy, adguard home etc? Using adguard home, though issue persists even when it’s off.

I followed this link to set up my wireguard server on AWS lightsail https://www.cyberciti.biz/faq/install-set-up-wireguard-on-amazon-linux-2/

The VPN works on the GL-AXT1800; it’s just that every day at the same time the handshake with the wireguard server fails and I see REKEY-TIMEOUT in the client logs on the router. For a while the failure happened every day at 10 AM UTC; now it happens at about 2 PM UTC, which I think changed because I reset the router and recreated the wireguard server one day around that time. It seems the failure happens every 24 hours. Until then everything works perfectly, and once it fails I’m forced to turn off the vpn client, allow non-vpn traffic through the router and let it sit for a few minutes before turning it all on again.

Wireguard server logs right before and after the vpn stopped working. Peer 2 is the glinet router, I’ve hidden the ip as a precaution

[Mar 2 13:33] wireguard: wg0: Receiving keepalive packet from peer 2 (hidden ip)
[ +11.240752] wireguard: wg0: Receiving keepalive packet from peer 2 (hidden ip)
[ +20.099812] wireguard: wg0: Receiving keepalive packet from peer 2 (hidden ip)
[ +11.560462] wireguard: wg0: Receiving keepalive packet from peer 2 (hidden ip)
[  +0.007109] wireguard: wg0: Receiving handshake initiation from peer 2 (hidden ip)
[  +0.005997] wireguard: wg0: Sending handshake response to peer 2 (hidden ip)
[  +0.006059] wireguard: wg0: Keypair 3717 destroyed for peer 2
[  +0.004523] wireguard: wg0: Keypair 3719 created for peer 2
[  +0.118935] wireguard: wg0: Receiving keepalive packet from peer 2 (hidden ip)
[Mar 2 13:34] wireguard: wg0: Retrying handshake with peer 2 (hidden ip) because we stopped hearing back after 15 seconds
[  +0.013659] wireguard: wg0: Sending handshake initiation to peer 2 (hidden ip)
[  +5.106470] wireguard: wg0: Handshake for peer 2 (hidden ip) did not complete after 5 seconds, retrying (try 2)
[  +0.252731] wireguard: wg0: Sending handshake initiation to peer 2 (hidden ip)
[  +5.379406] wireguard: wg0: Handshake for peer 2 (hidden ip) did not complete after 5 seconds, retrying (try 3)
[  +0.010842] wireguard: wg0: Sending handshake initiation to peer 2 (hidden ip)
[  +5.365275] wireguard: wg0: Handshake for peer 2 (hidden ip) did not complete after 5 seconds, retrying (try 4)
[  +0.015024] wireguard: wg0: Sending handshake initiation to peer 2 (hidden ip)
[  +0.496969] wireguard: wg0: Retrying handshake with peer 2 (hidden ip) because we stopped hearing back after 15 seconds
[  +4.864131] wireguard: wg0: Handshake for peer 2 (hidden ip) did not complete after 5 seconds, retrying (try 2)
[  +0.014704] wireguard: wg0: Sending handshake initiation to peer 2 (hidden ip)
[  +5.361419] wireguard: wg0: Handshake for peer 2 (hidden ip) did not complete after 5 seconds, retrying (try 3)
[  +0.014232] wireguard: wg0: Sending handshake initiation to peer 2 (hidden ip)
[  +5.361896] wireguard: wg0: Handshake for peer 2 (hidden ip) did not complete after 5 seconds, retrying (try 4)
[  +0.015637] wireguard: wg0: Sending handshake initiation to peer 2 (hidden ip)
[  +5.360471] wireguard: wg0: Handshake for peer 2 (hidden ip) did not complete after 5 seconds, retrying (try 5)

I used the same setup link to create another vpn client config to use on my iphone and I’ve never had issues on the iphone and the vpn. I can still use the vpn through my phone even when the glinet router is not working.

I believe the VPN client starts to work again (after I re-enable it) once the WG server zeroes out all keys for the peer after 540 seconds.

[Mar 2 13:54] wireguard: wg0: Zeroing out all keys for peer 2 (hidden ip), since we haven't received a new one in 540 seconds
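
The numbers in these server logs line up with WireGuard's fixed protocol timers, which helps when reading them. Below is a minimal sketch that records those timers and pulls the retry counts out of a log in the format shown above. The function name failed_handshake_tries is my own, and the regular expression is written against these log lines only, so other kernel versions may word things differently.

```python
import re

# Fixed WireGuard protocol timers, in seconds.
REKEY_TIMEOUT = 5        # wait before retrying an unanswered handshake
KEEPALIVE_TIMEOUT = 10   # silence allowed before a keepalive is expected
REJECT_AFTER_TIME = 180  # age after which a session key is no longer used

# "stopped hearing back after 15 seconds" is KEEPALIVE_TIMEOUT + REKEY_TIMEOUT.
# "Zeroing out all keys ... in 540 seconds" is 3 * REJECT_AFTER_TIME.

RETRY_RE = re.compile(
    r"Handshake for peer (\d+) .* did not complete after (\d+) seconds, "
    r"retrying \(try (\d+)\)"
)


def failed_handshake_tries(log_text):
    """Map each peer id to the highest handshake retry count seen in the log."""
    worst = {}
    for match in RETRY_RE.finditer(log_text):
        peer, tries = int(match.group(1)), int(match.group(3))
        worst[peer] = max(worst.get(peer, 0), tries)
    return worst
```

Note that in the log above it is the server that keeps sending handshake initiations and getting no reply. That is consistent with the server's packets no longer reaching the router, although the log alone does not prove where they are lost.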

Hi, does toggling the wireguard client “Enable” button work? Or do you have to do “a process of resetting the router, recreating a wireguard server”?
How about changing the listen port?


@hansome

I don’t need to reset the router or recreate the wireguard server. All I have to do is the following:

  1. Turn off the vpn client on the router.
  2. Enable non-vpn traffic.
  3. Wait 30 seconds to 2 minutes.
  4. Turn on the vpn client on the router (at this point, the router is working again but the traffic is not going through my wireguard server).
  5. Block non-vpn traffic.
  6. The router starts working with all traffic going through the wireguard server.

Also to note, if I reboot the router at any time, the next time the vpn client stops working is exactly 24 hours after the reboot. Today it rebooted automatically at 8 AM, so I expect it to stop working again at 8 AM on Tuesday.

I will try changing the listen port when it next fails to see if that does anything.

I’ve also updated to the latest beta (previously on stable) and will see if that prevents the failure on Tuesday.

@hansome

I just encountered the vpn not working at 9:50 PM UTC+1 on Monday. I thought the reboot I scheduled in the morning would move the issue to Tuesday morning, but it didn’t.

I changed the listen port this time, and the vpn client started working again immediately.

Many thanks for the feedback.

Please try the following command:

sed -i 's/echo "ListenPort.*$/:/g' /lib/netifd/proto/wgclient.sh

This makes the script ignore the ListenPort parameter. If that works, we’ll consider a way to merge it into the firmware.
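
For anyone who wants to see what that sed substitution does before running it on a router, here is the same replacement reproduced in Python on a made-up sample. The two sample lines are hypothetical; I have not copied them from the real /lib/netifd/proto/wgclient.sh, so the actual script will look different.

```python
import re


def drop_listen_port(script_text):
    """Same substitution as the sed command: everything from
    echo "ListenPort to the end of that line becomes ':' (the shell no-op)."""
    return re.sub(r'echo "ListenPort.*$', ":", script_text, flags=re.MULTILINE)


# Hypothetical excerpt, only to show the effect of the substitution.
sample = (
    '\techo "ListenPort=${listen_port}" >> "${wg_cfg}"\n'
    '\techo "PrivateKey=${private_key}" >> "${wg_cfg}"\n'
)
print(drop_listen_port(sample))
```

Only the ListenPort line is turned into a no-op and the other lines are left alone. Since sed -i edits the file in place, it is sensible to keep a copy of the original script first.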

@hansome

I sshed into the router and ran the command. I encountered the same issue tonight but I think we should wait one more day because I think I should have restarted netifd. But I was able to confirm that once again, changing the listen port fixed it right away.

I’ve rebooted the router now. Hopefully tomorrow we can confirm that command addresses the issue.

@hansome

OK, the issue just happened again, so it doesn’t look like ignoring ListenPort helped. But there was a difference: with the latest beta firmware, all I have to do is disable and then enable the wireguard vpn client for it to work. No need to change the listen port anymore. Also with the beta firmware, when the vpn stops working, it still has the light blue circle showing it’s “working” when it really isn’t. On the latest stable, it would turn yellow in the same situation, which makes it clearer that the vpn client is not connected to the tunnel.

Thanks for that info. So what is the vpn outage duration then?

The duration is until the vpn client is turned off and reenabled. @hansome