January 28, 2024 · networking bug

Deep Dive into WireGuard Stickiness

While setting up failover networking with an active WireGuard connection, I noticed that the tunnel never switched back to the primary connection once it was restored, i.e. it kept using the low-quality route. What gives? Running ip route get after restoring the primary connection shows that the kernel does indeed prefer eth0:

yoonsik@cloud:~$ ip -6 route get 2602:XXXX:: mark 0xca6c
2602:XXXX:: from :: via fe80::XXXX dev eth0 proto ra src 2607:XXXX:XXXX:: metric 1024 hoplimit 64 pref medium
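
To script this check, the outgoing device can be pulled out of the ip route get output. The following is a minimal sketch: the helper name route_dev is hypothetical, and it assumes only that the interface name follows the dev keyword, as it does in the output above.

```sh
# Hypothetical helper: print the interface that follows the "dev" keyword
# in `ip route get` output read from stdin (first match only).
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Example (addresses elided as in the post):
#   ip -6 route get 2602:XXXX:: mark 0xca6c | route_dev
```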

After some searching, I learned that WireGuard has a feature called "sticky sockets" (2018 presentation) that prevents route flapping by caching and preserving the previously used route, specifically the source address.


In theory, I could reset the connection (e.g. systemctl restart wg-quick) after each failure, but this would be annoying and would result in dropped packets. Surely there must be a better way? systemctl reload wg-quick was a bit quicker but still caused dropped packets. I decided to check the kernel source code to understand when WireGuard updates the route. It turns out that when a handshake initiation goes unanswered for REKEY_TIMEOUT and the retransmit timer fires, WireGuard clears the cached source address:

// timers.c
static void wg_expired_retransmit_handshake(struct timer_list *timer)
{
    ...
    // We clear the endpoint address src address, in case this is the cause of trouble.
    wg_socket_clear_peer_endpoint_src(peer);
    wg_packet_send_queued_handshake_initiation(peer, true);
    ...
}

Where else is wg_socket_clear_peer_endpoint_src called? It is used in the set_port function, and only runs if the new port is different.

// netlink.c
static int set_port(struct wg_device *wg, u16 port)
{
    struct wg_peer *peer;

    if (wg->incoming_port == port)
        return 0;
    list_for_each_entry(peer, &wg->peer_list, peer_list)
        wg_socket_clear_peer_endpoint_src(peer);
    if (!netif_running(wg->dev)) {
        wg->incoming_port = port;
        return 0;
    }
    return wg_socket_init(wg, port);
}

I tried setting a new source port every ten seconds or so, and even toggling away from and back to the original source port in quick succession, but this still led to significant packet loss. Taking a closer look at wg_socket_clear_peer_endpoint_src, we can see that it calls dst_cache_reset_now, the immediate variant of dst_cache_reset, as its lower-level function:

// socket.c
void wg_socket_clear_peer_endpoint_src(struct wg_peer *peer)
{
    ...
    
    dst_cache_reset_now(&peer->endpoint_cache);
    write_unlock_bh(&peer->endpoint_lock);
}

dst_cache_reset is also called when setting the endpoint for a peer, unless endpoint_eq reports that the new endpoint equals the current one.

// socket.c
void wg_socket_set_peer_endpoint(struct wg_peer *peer, const struct endpoint *endpoint)
{
    if (endpoint_eq(endpoint, &peer->endpoint))
        return;
    ...
    dst_cache_reset(&peer->endpoint_cache);
    ...
}

This equality function checks both destination and source addresses to see if the newly suggested route is the same as the cached route. Even if we update the peer with the same endpoint, Linux will suggest the new preferred source address, so the equality check fails and the cache is reset. That makes the following command the most efficient way I found to reset the source address without disturbing the connection:

# replace with peer public key and IPv6 endpoint
wg set wg1 peer mXXXXQ= endpoint "[2602:XXXX::]:51820"

Looping this command 100 times a second showed no packet loss, meaning it can run in the background safely without affecting the quality of the WireGuard connection. I was able to automate this process by inserting the following PostUp and PreDown commands into my wg1.conf file:
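
One way to reproduce that kind of stress test is sketched below. The helper name repeat_cmd is hypothetical, and the idea of watching for loss with a ping to the peer's tunnel address in a second terminal is an assumption about how to measure it, not necessarily how the original test was run.

```sh
# Hypothetical helper: run a command COUNT times, sleeping INTERVAL seconds
# after each run. Fractional intervals such as 0.01 assume a sleep that
# accepts them (GNU coreutils does; POSIX only guarantees whole seconds).
repeat_cmd() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@"
        sleep "$interval"
        i=$((i + 1))
    done
}

# Example: at most about 100 calls per second, 1000 calls in total
# (key and address elided as in the post):
#   repeat_cmd 1000 0.01 wg set wg1 peer mXXXXQ= endpoint "[2602:XXXX::]:51820"
```

The actual rate will be somewhat lower than 100 per second, since each wg set call takes time on top of the sleep.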

[Interface]
...
PostUp = nohup sh -c 'while true; do wg set %i peer mXXXX= endpoint "[2602:XXXX::]:51820" >/dev/null 2>&1; sleep 1; done' & echo $! > /tmp/%i_watchdog.pid
PreDown = sh -c 'kill "$(cat /tmp/%i_watchdog.pid)"' &

[Peer]
PublicKey = mXXXX=
...

The PostUp command starts a "no-hangup" while-loop that calls the wg set command every second, and saves the PID of the background process to a temporary file. The PreDown command uses the PID file to kill the background process. If you have multiple peers, you will need multiple wg set commands in the PostUp block.
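
For the multi-peer case, one alternative to hard-coding each peer is to re-apply whatever endpoints WireGuard currently reports. The sketch below assumes the tab-separated output of wg show <interface> endpoints (public key, then endpoint, with "(none)" for peers that have no endpoint). The helper name refresh_endpoints and the WG_CMD override are hypothetical. Note that it re-applies the currently reported endpoint, which may differ from the configured one if a peer has roamed.

```sh
# Hypothetical helper: read "public-key<TAB>endpoint" lines on stdin and
# re-apply each endpoint on the given interface.
# WG_CMD defaults to wg; set WG_CMD=echo for a dry run.
refresh_endpoints() {
    iface=$1
    while IFS="$(printf '\t')" read -r key endpoint; do
        # Skip blank lines and peers without a known endpoint.
        [ -n "$key" ] || continue
        [ -n "$endpoint" ] || continue
        [ "$endpoint" = "(none)" ] && continue
        ${WG_CMD:-wg} set "$iface" peer "$key" endpoint "$endpoint"
    done
}

# Example: refresh every peer of wg1 once.
#   wg show wg1 endpoints | refresh_endpoints wg1
```

Calling this from the watchdog loop in place of the single wg set command would cover all peers with one PostUp line.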