"read: connection reset by peer" errors on Teleport Proxy in AWS when using NLB

Issue:

Users and admins may notice an influx of sporadic “read: connection reset by peer” errors on the Teleport Proxy originating from the network load balancer when running in AWS.

Log(s):

Aug 16 19:49:04 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:22619: read: connection reset by peer
Aug 16 19:49:14 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:47961: read: connection reset by peer
Aug 16 19:51:04 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:63845: read: connection reset by peer
Aug 16 19:51:14 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:13943: read: connection reset by peer
Aug 16 19:52:44 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:39664: read: connection reset by peer
Aug 16 19:53:14 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:50548: read: connection reset by peer
Aug 16 19:53:57 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:56575: read: connection reset by peer
Aug 16 19:58:15 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:13857: read: connection reset by peer
Aug 16 19:58:44 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:54184: read: connection reset by peer
Aug 16 20:00:15 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: ERRO             read tcp 172.31.43.50:3023->172.31.43.253:19349: read: connection reset by peer

Analysis:

AWS NLB health check requests are known to be bursty. An NLB uses multiple distributed health checkers to evaluate target health. Each of these health checkers makes its own request to the target at the interval you specify, so in every interval you will see one request from each of the distributed probes rather than a single request. The target health is then evaluated based on how many of the probes were successful.

As long as there are no actual connection instability issues, these errors can be safely ignored.
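
One way to check that condition against the proxy log, as an added example (not Teleport's own code): it separates resets whose peer is a load balancer node from resets coming from any other address. The function name and `nlb_node_ips` are hypothetical, and the node IPs are assumed to be known to the operator.

```python
import re
from collections import Counter

# Matches the Teleport proxy error format shown in the log sample above.
RESET_RE = re.compile(
    r"read tcp (?P<local_ip>[\d.]+):(?P<local_port>\d+)->"
    r"(?P<peer_ip>[\d.]+):(?P<peer_port>\d+): read: connection reset by peer"
)


def triage_resets(lines, nlb_node_ips):
    """Split reset errors into load-balancer-sourced ones and all others.

    nlb_node_ips: private IPs of the NLB nodes (assumption: supplied by the
    operator, for example read from the load balancer's network interfaces).
    Returns (Counter keyed by (listener_port, peer_ip), list of other lines).
    """
    from_nlb = Counter()
    other = []
    for line in lines:
        m = RESET_RE.search(line)
        if not m:
            continue  # not a reset error; out of scope here
        if m["peer_ip"] in nlb_node_ips:
            from_nlb[(int(m["local_port"]), m["peer_ip"])] += 1
        else:
            other.append(line)  # reset from a non-NLB peer: worth a look
    return from_nlb, other


# Two lines copied from the log sample above, plus one constructed line whose
# peer (172.31.50.12) is made up to represent a non-NLB source.
SAMPLE = [
    "Aug 16 19:49:04 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: "
    "ERRO             read tcp 172.31.43.50:3023->172.31.43.253:22619: read: connection reset by peer",
    "Aug 16 19:49:14 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: "
    "ERRO             read tcp 172.31.43.50:3023->172.31.43.253:47961: read: connection reset by peer",
    "Aug 16 19:50:00 ip-172-31-43-50.us-west-2.compute.internal /usr/bin/teleport[9157]: "
    "ERRO             read tcp 172.31.43.50:3023->172.31.50.12:40112: read: connection reset by peer",
]

if __name__ == "__main__":
    noise, suspicious = triage_resets(SAMPLE, {"172.31.43.253"})
    for (port, peer), count in noise.items():
        print(f"{count} resets on :{port} from NLB node {peer}")
    for line in suspicious:
        print("non-NLB reset:", line)
```

If client IP preservation is disabled on the target group, client connections also arrive from the NLB node addresses, so the per-node count then includes client traffic as well as health checks.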

Solution(s):

Though this is not a connectivity issue, it does make logs harder to read at times. An issue has been raised to address the bursty logging as a result of these health checks and is being evaluated at the time of writing. This article will be updated once it is resolved.