Object download failed with a checksum mismatch error. Downloading the same object through gsutil works fine.
./gcs-download-object.py
Traceback (most recent call last):
  File "./gcs-download-object.py", line 29, in <module>
    download_blob('##REDACTED##',
  File "./gcs-download-object.py", line 20, in download_blob
    blob.download_to_filename(destination_file_name)
  File "/usr/local/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 1184, in download_to_filename
    client.download_blob_to_file(
  File "/usr/local/lib/python3.8/site-packages/google/cloud/storage/client.py", line 719, in download_blob_to_file
    blob_or_uri._do_download(
  File "/usr/local/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 956, in _do_download
    response = download.consume(transport, timeout=timeout)
  File "/usr/local/lib/python3.8/site-packages/google/resumable_media/requests/download.py", line 171, in consume
    self._write_to_stream(result)
  File "/usr/local/lib/python3.8/site-packages/google/resumable_media/requests/download.py", line 120, in _write_to_stream
    raise common.DataCorruption(response, msg)
google.resumable_media.common.DataCorruption: Checksum mismatch while downloading:
  ##REDACTED##
The X-Goog-Hash header indicated an MD5 checksum of:
  lAhluFgTEwcNJDvTSap2fQ==
but the actual MD5 checksum of the downloaded contents was:
  61Kz/FQdqRvwqacGuwuFIA==
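For anyone who wants to cross-check the two values above against a local copy (for example the one fetched with gsutil), here is a minimal sketch. It assumes the value in the X-Goog-Hash header is the base64 encoding of the raw MD5 digest, which is how GCS reports MD5 hashes. `md5_base64` is a hypothetical helper name of my own, not part of the SDK.

```python
import base64
import hashlib


def md5_base64(path, chunk_size=8 * 1024 * 1024):
    """Return the base64-encoded MD5 digest of the file at `path`.

    Hypothetical helper for illustration only; not part of google-cloud-storage.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in fixed-size chunks so a very large file never has to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```

Calling `md5_base64("M31_POL-COL_ESFUHD_06_24.mov")` on the gsutil copy and comparing the result with the two strings in the error message should show which of them, if either, matches the bytes on disk.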
The code itself is pretty straightforward:
#!/usr/bin/env python3.8
from google.cloud import storage


def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"

    storage_client = storage.Client()

    bucket = storage_client.bucket(bucket_name)

    # Construct a client side representation of a blob.
    # Note `Bucket.blob` differs from `Bucket.get_blob` as it doesn't retrieve
    # any content from Google Cloud Storage. As we don't need additional data,
    # using `Bucket.blob` is preferred here.
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)

    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )


download_blob('##REDACTED##',
              'remedia/mezzanines/Live/2018-06-24/M31_POL-COL_ESFUHD_06_24.mov', 'M31_POL-COL_ESFUHD_06_24.mov')
The file size is 2.3 TB, if that matters.
The installed package versions are as follows:
pip3.8 list
Package                  Version
------------------------ ---------
boto3                    1.17.13
botocore                 1.20.13
cachetools               4.2.1
certifi                  2020.12.5
cffi                     1.14.5
chardet                  4.0.0
google-api-core          1.26.0
google-auth              1.27.0
google-cloud-core        1.6.0
google-cloud-storage     1.36.0
google-crc32c            1.1.2
google-resumable-media   1.2.0
googleapis-common-protos 1.52.0
idna                     2.10
jmespath                 0.10.0
packaging                20.9
pip                      19.2.3
protobuf                 3.15.1
pyasn1                   0.4.8
pyasn1-modules           0.2.8
pycparser                2.20
pyparsing                2.4.7
python-dateutil          2.8.1
pytz                     2021.1
requests                 2.25.1
rsa                      4.7.1
s3transfer               0.3.4
setuptools               41.2.0
six                      1.15.0
urllib3                  1.26.3
I'm able to reproduce this issue for this file. I have downloaded several hundred other objects with the same SDK, so I'm not sure why it's failing on this one.