Resetting a Corrupted Docker Network on Windows

Docker for Windows relies on several Windows Containers features to provide networking to Docker containers. Occasionally, usually after a bad shutdown or after the host VM is killed, those networks become corrupted. An example of the error you might encounter is:

Error response from daemon: HNS failed with error : The object already exists.

These kinds of errors, where HNS (Host Networking Service) complains that an object already exists, can be fixed by removing the corrupted virtual network (its vNIC and vSwitch) from your Windows host. You can list the virtual networks with the following PowerShell commands:

PS C:\> get-containernetwork

Name Id                                   Subnets          Mode
---- --                                   -------          ----
nat  e6c458e8-63af-40ce-b9d6-602508092c0e {172.22.32.0/20} NAT
PS C:\> get-vmswitch

Name SwitchType NetAdapterInterfaceDescription
---- ---------- ------------------------------
nat  Internal
PS C:\> get-netnat

Name                             : H9833117d-8b6b-4651-8c4e-b557ba7cfbe2
ExternalIPInterfaceAddressPrefix :
InternalIPInterfaceAddressPrefix : 172.25.240.0/20
IcmpQueryTimeout                 : 30
TcpEstablishedConnectionTimeout  : 1800
TcpTransientConnectionTimeout    : 120
TcpFilteringBehavior             : AddressDependentFiltering
UdpFilteringBehavior             : AddressDependentFiltering
UdpIdleSessionTimeout            : 120
UdpInboundRefresh                : False
Store                            : Local
Active                           : True

You can also see how a container is attached to the virtual network by running docker inspect on it:

PS C:\> docker inspect my-container-name

[... lots of other stuff ...]

"Networks": {
    "nat": {
        "IPAMConfig": null,
        "Links": null,
        "Aliases": null,
        "NetworkID": "d4819cde5841b85885670b0188cfadf05455cc74c1d81d90f7fa544ea4b550dd",
        "EndpointID": "e7c846df220163a28c6f5bdf044330fa11896377a1055af07077e2f02df280ce",
        "Gateway": "172.22.32.1",
        "IPAddress": "172.22.39.25",
        "IPPrefixLen": 16,
        "IPv6Gateway": "",
        "GlobalIPv6Address": "",
        "GlobalIPv6PrefixLen": 0,
        "MacAddress": "00:15:5d:e0:1b:15",
        "DriverOpts": null
    }
}
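Since docker inspect prints a lot of unrelated detail, it can help to pull out only the network fields. The following is a minimal sketch, not part of Docker itself: Get-ContainerNetworkSummary is a hypothetical helper name, and it assumes the inspect output has the usual NetworkSettings.Networks layout shown above.

```powershell
# Hypothetical helper: summarise the per-network settings found in
# `docker inspect` JSON output.
function Get-ContainerNetworkSummary {
    param(
        [Parameter(Mandatory)] [string] $InspectJson
    )

    # docker inspect returns a JSON array, one element per inspected object.
    $containers = $InspectJson | ConvertFrom-Json

    foreach ($container in $containers) {
        $networks = $container.NetworkSettings.Networks

        # Each property of "Networks" is one attached network (for example "nat").
        foreach ($prop in $networks.PSObject.Properties) {
            [pscustomobject]@{
                Container = $container.Name
                Network   = $prop.Name
                IPAddress = $prop.Value.IPAddress
                Gateway   = $prop.Value.Gateway
                NetworkID = $prop.Value.NetworkID
            }
        }
    }
}

# Usage: docker emits one string per output line, so join them before parsing.
# $json = (docker inspect my-container-name) -join "`n"
# Get-ContainerNetworkSummary -InspectJson $json
```

If you only need the raw JSON for the networks, docker inspect --format '{{json .NetworkSettings.Networks}}' my-container-name should give a similar view without any helper.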

The error might crop up when you try to start your container. For example:

PS C:\> docker start my-container-name
Error response from daemon: failed to create endpoint my-container-name on network nat: HNS failed with error : The object already exists.
Error: failed to start containers: my-container-name
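If you start containers from a script, you may want to recognise this particular failure so you can tell it apart from other start errors. Here is a rough sketch; Test-HnsObjectExistsError is a hypothetical helper name, and it assumes the wording of the error matches the message shown above.

```powershell
# Hypothetical helper: does this text look like the HNS
# "object already exists" failure?
function Test-HnsObjectExistsError {
    param(
        [Parameter(Mandatory)] [AllowEmptyString()] [string] $Message
    )

    # Match the two stable fragments rather than the whole sentence,
    # because the container and network names vary.
    return ($Message -match 'HNS failed with error') -and
           ($Message -match 'The object already exists')
}

# Usage: capture stdout and stderr from docker, then classify the failure.
# $output = (docker start my-container-name 2>&1 | Out-String)
# if ($LASTEXITCODE -ne 0 -and (Test-HnsObjectExistsError -Message $output)) {
#     Write-Warning 'The nat network looks corrupted; a reset may be needed.'
# }
```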

To fix this, we need to reset the nat network. If we try to remove the network through Docker, we'll get this error:

PS C:\> docker network rm nat
Error response from daemon: nat is a pre-defined network and cannot be removed

Yipes – so how do we reset this network?

The clue is that the network is not managed by Docker; it is managed by the Windows Containers services (vNIC, vSwitch, and so on). To reset it, we need to stop the Docker service (so that the Windows Containers services don't complain about the network being in use) and then remove the virtual network nat.

After we've removed the virtual network, we can restart Docker. When the Docker service sees that the default network nat does not exist, it will ask the Windows Containers services to re-create it, and we're back in business!

PS C:\> stop-service docker
PS C:\> remove-containernetwork -name nat
PS C:\> start-service docker
PS C:\> docker restart my-container-name
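If you hit this often, the same steps can be wrapped in a small function. This is only a sketch under a few assumptions: Reset-DockerNatNetwork is a hypothetical name, the session is elevated, and the Get-ContainerNetwork and Remove-ContainerNetwork cmdlets used above are available on your host. The one addition to the manual steps is a try/finally, so that Docker is started again even if the removal fails.

```powershell
# Hypothetical wrapper around the manual reset steps.
# Assumes an elevated session and the Containers cmdlets used earlier.
function Reset-DockerNatNetwork {
    param(
        [string] $NetworkName = 'nat'   # default network name from the walkthrough
    )

    $ErrorActionPreference = 'Stop'     # treat cmdlet errors as terminating

    # Stop Docker so the network is no longer reported as in use.
    Stop-Service docker
    try {
        # Only attempt the removal if the network is still listed.
        $network = Get-ContainerNetwork | Where-Object { $_.Name -eq $NetworkName }
        if ($network) {
            Remove-ContainerNetwork -Name $NetworkName
        } else {
            Write-Warning "No container network named '$NetworkName' found; nothing removed."
        }
    }
    finally {
        # Always bring Docker back, even if the removal failed.
        Start-Service docker
    }

    # Docker should re-create the default network on startup; list it to confirm.
    Get-ContainerNetwork | Where-Object { $_.Name -eq $NetworkName }
}
```

After calling Reset-DockerNatNetwork, restart the affected container as before with docker restart my-container-name.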