Downloads (GET): [Errno 104] Connection reset by peer  #907

@juliogonzalez

Affected versions: 1.6.1 and 2.0.0 (from EPEL and EPEL-testing repositories)
Distribution: Amazon Linux 2017.03
Kernel: 4.9.32-15.41.amzn1.x86_64

First of all, I am not sure whether this is a bug, a change in AWS limits for S3 (I will also open a ticket with AWS), or something else entirely.

A couple of weeks ago this started happening far more frequently than ever before.

We have several s3cmd processes running in parallel (20 in preproduction, 100 in production) to download files from a bucket.

Until two weeks ago, this produced an occasional "[Errno 104] Connection reset by peer" (maybe once a month, or once every couple of months), but s3cmd 1.6.1 was able to retry and continue.

Then we suddenly started getting more and more errors, and s3cmd 1.6.1 was no longer able to recover:

Traceback (most recent call last):
  File "/usr/bin/s3cmd", line 2919, in <module>
    rc = main()
  File "/usr/bin/s3cmd", line 2841, in main
    rc = cmd_func(args)
  File "/usr/bin/s3cmd", line 548, in cmd_object_get
    response = s3.object_get(uri, dst_stream, destination, start_position = start_position, extra_label = seq_label)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 647, in object_get
    response = self.recv_file(request, stream, labels, start_position)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1393, in recv_file
    return self._http_400_handler(request, response, self.recv_file, request, stream, labels, start_position)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1058, in _http_400_handler
    return fn(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1427, in recv_file
    data = http_response.read(this_chunk)
  File "/usr/lib64/python2.6/httplib.py", line 587, in read
    s = self.fp.read(amt)
  File "/usr/lib64/python2.6/socket.py", line 383, in read
    data = self._sock.recv(left)
  File "/usr/lib64/python2.6/ssl.py", line 215, in recv
    return self.read(buflen)
  File "/usr/lib64/python2.6/ssl.py", line 136, in read
    return self._sslobj.read(len)
error: [Errno 104] Connection reset by peer
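
To make the failure mode concrete: the reset surfaces while the response body is being read in chunks. Below is a minimal sketch of the kind of retry-and-resume loop I would expect around that read. It is not s3cmd's actual code; `download_with_resume`, `open_stream` and `write` are hypothetical names, and `open_stream(offset)` is assumed to reopen the object at a byte offset (for example with an HTTP Range request).

```python
import errno
import socket
import time


def download_with_resume(open_stream, write, max_retries=5, base_delay=1.0,
                         chunk_size=64 * 1024, sleep=time.sleep):
    """Read an object in chunks, resuming after "connection reset" errors.

    open_stream(offset) is a hypothetical callable that returns a file-like
    object positioned at byte `offset`. write(data) stores each chunk.
    Returns the total number of bytes received.
    """
    offset = 0
    retries = 0
    while True:
        try:
            stream = open_stream(offset)
            while True:
                data = stream.read(chunk_size)
                if not data:
                    return offset  # end of object
                write(data)
                offset += len(data)
                retries = 0  # progress was made, so reset the retry budget
        except socket.error as exc:
            # Only retry on ECONNRESET (errno 104 on Linux), and only while
            # the retry budget lasts; anything else is re-raised unchanged.
            if exc.errno != errno.ECONNRESET or retries >= max_retries:
                raise
            retries += 1
            sleep(base_delay * 2 ** (retries - 1))  # exponential backoff
```

The point of tracking `offset` is that a retry continues from the last byte written instead of starting the file over, which matters with many parallel downloads.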

I reviewed the instances: no system packages were upgraded recently, and there were no configuration changes, instance type changes or anything similar.

We decided to try upgrading to s3cmd 2.0.0. It is able to retry, but we still see too many errors, as well as a lot of connections in TIME_WAIT, even when there are no more files to download.
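
In case it helps anyone compare numbers, here is a small sketch for counting TCP connection states such as TIME_WAIT on Linux. It assumes the usual `/proc/net/tcp` layout, where the fourth whitespace-separated column (`st`) holds the state as a hex code and `06` means TIME_WAIT. `count_tcp_states` is just a name I made up for illustration.

```python
from collections import Counter

# A few of the TCP state codes used in the "st" column of /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "08": "CLOSE_WAIT"}


def count_tcp_states(proc_net_tcp_text):
    """Count connections per TCP state from /proc/net/tcp-style text."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated lines
        state = fields[3]
        # Unknown codes are kept as their raw hex value.
        counts[TCP_STATES.get(state, state)] += 1
    return counts
```

On a live host this could be called as `count_tcp_states(open("/proc/net/tcp").read())["TIME_WAIT"]`. IPv6 sockets are listed separately in `/proc/net/tcp6`.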

Has somebody else experienced this?

I see a lot of reports about the error for uploads, but none for downloads.
