Downloads (GET): [Errno 104] Connection reset by peer #907
Description
Affected versions: 1.6.1 and 2.0.0 (from EPEL and EPEL-testing repositories)
Distribution: Amazon Linux 2017.03
Kernel: 4.9.32-15.41.amzn1.x86_64
First of all, I am not sure whether this is a bug, a change in AWS S3 limits (I will also open a ticket with AWS), or something else entirely.
A couple of weeks ago this started happening far more frequently than ever before.
We run several s3cmd processes in parallel (20 in preproduction, 100 in production) to download files from a bucket.
Until two weeks ago, this produced an occasional "[Errno 104] Connection reset by peer" (maybe once every month or two), but s3cmd 1.6.1 was able to retry and continue.
Then we suddenly started getting more and more errors and s3cmd 1.6.1 was not able to recover:
Traceback (most recent call last):
  File "/usr/bin/s3cmd", line 2919, in <module>
    rc = main()
  File "/usr/bin/s3cmd", line 2841, in main
    rc = cmd_func(args)
  File "/usr/bin/s3cmd", line 548, in cmd_object_get
    response = s3.object_get(uri, dst_stream, destination, start_position = start_position, extra_label = seq_label)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 647, in object_get
    response = self.recv_file(request, stream, labels, start_position)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1393, in recv_file
    return self._http_400_handler(request, response, self.recv_file, request, stream, labels, start_position)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1058, in _http_400_handler
    return fn(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1427, in recv_file
    data = http_response.read(this_chunk)
  File "/usr/lib64/python2.6/httplib.py", line 587, in read
    s = self.fp.read(amt)
  File "/usr/lib64/python2.6/socket.py", line 383, in read
    data = self._sock.recv(left)
  File "/usr/lib64/python2.6/ssl.py", line 215, in recv
    return self.read(buflen)
  File "/usr/lib64/python2.6/ssl.py", line 136, in read
    return self._sslobj.read(len)
error: [Errno 104] Connection reset by peer
I reviewed the instances: no system packages were upgraded recently, and there were no configuration changes, instance type changes, or anything similar.
We decided to try upgrading to s3cmd 2.0.0. It is able to retry, but we still see too many errors, as well as a lot of connections in TIME_WAIT, even when there are no more files to download.
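In case it helps anyone reproduce the TIME_WAIT observation, here is a minimal sketch of how such connections could be counted on Linux. It assumes the usual `/proc/net/tcp` layout, where the fourth column is the socket state and `06` means TIME_WAIT; the function name `count_time_wait` is made up for this example and is not part of s3cmd.

```python
# Hex state code used for TIME_WAIT in /proc/net/tcp (assumed standard Linux layout).
TIME_WAIT_STATE = "06"


def count_time_wait(proc_net_tcp_text):
    """Count TIME_WAIT sockets in text formatted like /proc/net/tcp."""
    count = 0
    # The first line is a column header, so skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        # Columns: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == TIME_WAIT_STATE:
            count += 1
    return count
```

On a Linux host the input would be the contents of `/proc/net/tcp` (IPv6 sockets are listed separately in `/proc/net/tcp6`), for example `count_time_wait(open("/proc/net/tcp").read())`.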
Has somebody else experienced this?
I see a lot of reports about the error for uploads, but none for downloads.
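To make clear what I mean by "able to retry", the behaviour I would expect around the failing read can be sketched roughly as below. This is only an illustration, not s3cmd's actual code: `read_with_retry`, `read_chunk`, and the default attempt count and delay are hypothetical, and it is written for Python 3, where socket and SSL errors are subclasses of `OSError` (the traceback above comes from Python 2.6).

```python
import errno
import time


def read_with_retry(read_chunk, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call read_chunk(), retrying with exponential backoff on ECONNRESET.

    read_chunk is a hypothetical zero-argument callable that performs one
    network read and returns the data.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return read_chunk()
        except OSError as exc:
            # Re-raise anything that is not a connection reset (errno 104 on
            # Linux), and give up once the final attempt has failed.
            if exc.errno != errno.ECONNRESET or attempt == max_attempts:
                raise
            # Back off: base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

A real download would also need to resume from the last byte received rather than restart the whole read, which this sketch does not attempt.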