Skip to content

docs: Add snippets for upload_chunks_concurrently and add chunk_size#1135

Merged
andrewsg merged 6 commits intomainfrom
tm-mpu-sample
Oct 9, 2023
Merged

docs: Add snippets for upload_chunks_concurrently and add chunk_size#1135
andrewsg merged 6 commits intomainfrom
tm-mpu-sample

Conversation

@andrewsg
Copy link
Contributor

Adds a snippet for upload_chunks_concurrently; also adds chunk_size to an existing snippet, and runs snippets_test.py through black for formatting purposes.

@andrewsg andrewsg requested a review from a team as a code owner September 30, 2023 00:04
@andrewsg andrewsg requested review from a team and ohmayr September 30, 2023 00:04
@snippet-bot
Copy link

snippet-bot bot commented Sep 30, 2023

Here is the summary of changes.

You are about to add 1 region tag.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: storage Issues related to the googleapis/python-storage API. labels Sep 30, 2023
@@ -0,0 +1,56 @@
# Copyright 2022 Google LLC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2023

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix

# 5 GiB.
# chunk_size = 32 * 1024 * 1024 (32 MiB)

# The maximum number of processes to use for the operation. The performance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the rule of thumb here: "number of cores your CPU has"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, workloads with small files benefit from many times that number and workloads with large files max out the NIC below that, so the number of cores is not a good starting place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that would go into docs instead of sample comments?

BLOB_NAME = "test_file.txt"

with tempfile.NamedTemporaryFile() as file:
file.write(b"test")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't exercise multiple chunks; recommend increasing the object size to test chunk_size

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's sufficient to test the snippet. The feature itself is not under test here - it is fully covered in the integration tests, with appropriately-sized test files.

source_filename,
destination_blob_name,
chunk_size=32 * 1024 * 1024,
processes=8,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it might be easier to understand as "workers" for this and the other samples, I'll change it.

@andrewsg andrewsg merged commit 3a0f551 into main Oct 9, 2023
@andrewsg andrewsg deleted the tm-mpu-sample branch October 9, 2023 22:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the googleapis/python-storage API. size: m Pull request size is medium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants