
Conversation

@kevgeo (Contributor) commented Feb 8, 2024

This PR fixes: #27488.

It fixes the timeout error when synchronizing large files by using the resumable rewrite method instead of a single copy call.
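For context, here is a minimal sketch of the resumable-rewrite pattern (not the operator's actual code): the google-cloud-storage Blob.rewrite call returns a (token, bytes_rewritten, total_bytes) triple, and passing the token back resumes the copy, so no single request has to move the whole object within the timeout window. FakeBlob below is a stand-in so the loop can run without GCS credentials; the chunk size and object size are illustrative.

```python
class FakeBlob:
    """Stand-in for a GCS blob that rewrites 8 MiB per request."""

    CHUNK = 8 * 1024 * 1024

    def __init__(self, size):
        self.size = size

    def rewrite(self, source, token=None):
        # Mirrors the (token, bytes_rewritten, total_bytes) triple
        # returned by google.cloud.storage.Blob.rewrite.
        done = (token or 0)
        done = min(done + self.CHUNK, source.size)
        next_token = None if done >= source.size else done
        return next_token, done, source.size


def rewrite_blob(destination, source):
    """Loop until the rewrite token is exhausted, resuming each time."""
    token, rewritten, total = destination.rewrite(source)
    while token is not None:
        token, rewritten, total = destination.rewrite(source, token=token)
    return rewritten, total


src = FakeBlob(20 * 1024 * 1024)  # a 20 MiB "object"
dst = FakeBlob(0)
rewritten, total = rewrite_blob(dst, src)
print(rewritten == total)  # True: the copy completes across resumed calls
```

By contrast, copy_blob issues one server-side copy request that must finish in a single call, which is where the 30-second timeout bites on large objects.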

This PR also fixes a pre-existing bug where synchronizing from a subdirectory created an empty subdirectory in the destination.
For example, synchronizing the files under the subdirectory bigdata2 of the source bucket would unnecessarily create an empty bigdata2 folder, as seen in the image. If I understand the docs correctly, this should not happen.
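The stray folder typically comes from the zero-byte "directory placeholder" object (bigdata2/) that a prefixed listing of the source bucket can return alongside the real files. A hedged sketch of skipping that entry when mapping source names to destination names (the function and names are illustrative, not the operator's actual implementation):

```python
def destination_objects(source_names, source_prefix, destination_prefix=""):
    """Map object names under source_prefix to destination names,
    skipping the zero-byte placeholder entry (e.g. "bigdata2/") that
    would otherwise materialise as an empty folder."""
    out = []
    for name in source_names:
        if name == source_prefix:  # the directory placeholder itself
            continue
        relative = name[len(source_prefix):]
        out.append(destination_prefix + relative)
    return out


names = ["bigdata2/", "bigdata2/a.csv", "bigdata2/b.csv"]
print(destination_objects(names, "bigdata2/"))  # ['a.csv', 'b.csv']
```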

[Screenshot: destination bucket showing an unnecessary empty bigdata2 folder]

The behaviour of GCSSynchronizeBucketsOperator otherwise remains the same.



@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Feb 8, 2024
@kevgeo kevgeo requested a review from dirrao February 15, 2024 13:56
Update comment to be more clear


Development

Successfully merging this pull request may close these issues.

GCSSynchronizeBucketsOperator fails on 30-second timeout on large files
