Use Azure upload method instead of our own implementation#26751
Use Azure upload method instead of our own implementation#26751dadoonet merged 12 commits intoelastic:masterfrom
Conversation
We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote. And well... Let's just read the doc!
While working on #26751, I found that we are passing the container name on every single method although we don't need it as it is stored within the blobstore object already. This PR simplifies a bit that part of the code.
imotov
left a comment
There was a problem hiding this comment.
LGTM. Thanks for finding it out! Can we backport it? I think it would be great to switch to this sooner rather then later. It looks like we are missing quite a bit of logic by uploading the blob our old way.
That was a missing part in elastic#23405.
|
@imotov Thanks for the review. I did some manual testings this morning and it does not work. Apparently the file I'm digging... Probably something stupid on my end. :) |
While working on elastic#26751 and doing some manual integration testing I found that this elastic#22858 removed an important line of our code: `AzureRepository` overrides default `initializeSnapshot` method which creates metadata files and do other stuff. But with PR elastic#22858, I wrote: ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } } ``` instead of ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } super.initializeSnapshot(snapshotId, indices, clusterMetadata); } ``` As we never call `super.initializeSnapshot(...)` files are not created and we can't restore what we saved. Closes elastic#26777.
While working on #26751 and doing some manual integration testing I found that this #22858 removed an important line of our code: `AzureRepository` overrides default `initializeSnapshot` method which creates metadata files and do other stuff. But with PR #22858, I wrote: ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } } ``` instead of ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } super.initializeSnapshot(snapshotId, indices, clusterMetadata); } ``` As we never call `super.initializeSnapshot(...)` files are not created and we can't restore what we saved. Closes #26777.
While working on #26751 and doing some manual integration testing I found that this #22858 removed an important line of our code: `AzureRepository` overrides default `initializeSnapshot` method which creates metadata files and do other stuff. But with PR #22858, I wrote: ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } } ``` instead of ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } super.initializeSnapshot(snapshotId, indices, clusterMetadata); } ``` As we never call `super.initializeSnapshot(...)` files are not created and we can't restore what we saved. Closes #26777.
While working on #26751 and doing some manual integration testing I found that this #22858 removed an important line of our code: `AzureRepository` overrides default `initializeSnapshot` method which creates metadata files and do other stuff. But with PR #22858, I wrote: ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } } ``` instead of ```java @OverRide public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) { if (blobStore.doesContainerExist(blobStore.container()) == false) { throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " + " creating an azure snapshot repository backed by it."); } super.initializeSnapshot(snapshotId, indices, clusterMetadata); } ``` As we never call `super.initializeSnapshot(...)` files are not created and we can't restore what we saved. Closes #26777.
This commit: * removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository * merges 2 IT classes so all IT tests are ran from one single class * We don't remove/create anymore the container between each single test but only for the test suite
|
@imotov I worked on IT so we can now pass them when needed (still a manual operation). I tested everything manually:
# Clean test env
curl -XDELETE localhost:9200/foo?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1/snap1?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1?pretty
# Create data
curl -XPUT localhost:9200/foo/doc/1?pretty -H 'Content-Type: application/json' -d '{
"foo": "bar"
}'
curl -XPOST localhost:9200/foo/_refresh?pretty
# Create repository using default account
curl -XPUT localhost:9200/_snapshot/my_backup1?pretty -H 'Content-Type: application/json' -d '{
"type": "azure"
}'
# Backup
curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1?pretty&wait_for_completion=true"
# Delete existing index
curl -XDELETE localhost:9200/foo?pretty
# Restore using default account
curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1/_restore?pretty&wait_for_completion=true"
# Check
curl -XGET localhost:9200/foo/_search?pretty
# Remove backup
curl -XDELETE localhost:9200/_snapshot/my_backup1/snap1?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1?prettyEverything is correct. I'm going to test with a bigger dataset now and check everything works. Thanks! |
|
I tested with much more data (300mb) and everything is working well. |
|
@dadoonet would it make sense to base this tests on |
We already have nice tests written. Let's just use them.
|
@imotov Great! I did not remember about that class. Yeah. Definitely better using it as well. I pushed new changes. |
imotov
left a comment
There was a problem hiding this comment.
LGTM. Thanks for moving the tests! Now we just need to figure out how to get it into CI to run on a regular basis.
| import org.junit.BeforeClass; | ||
|
|
||
| import java.net.URISyntaxException; | ||
| import java.net.UnknownHostException; |
* Use Azure upload method instead of our own implementation We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote. And well... Let's just read the doc! * Adapt integration tests to secure settings That was a missing part in #23405. * Simplify all the integration tests and *extends ESBlobStoreRepositoryIntegTestCase tests * removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository * merges 2 IT classes so all IT tests are ran from one single class * We don't remove/create anymore the container between each single test but only for the test suite Backport of #26751 in 6.x branch
) * Use Azure upload method instead of our own implementation We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote. And well... Let's just read the doc! * Adapt integration tests to secure settings That was a missing part in elastic#23405. * Simplify all the integration tests and *extends ESBlobStoreRepositoryIntegTestCase tests * removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository * merges 2 IT classes so all IT tests are ran from one single class * We don't remove/create anymore the container between each single test but only for the test suite Backport of elastic#26751 in 6.0 branch
|
Backported to 6.0 as well with 9aa5595 |
) * Use Azure upload method instead of our own implementation We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote. And well... Let's just read the doc! * Adapt integration tests * Simplify all the integration tests and extends ESBlobStoreRepositoryIntegTestCase tests * removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository * merges 2 IT classes so all IT tests are ran from one single class * We don't remove/create anymore the container between each single test but only for the test suite Backport of elastic#26751 in 5.6 branch
…26839) * Use Azure upload method instead of our own implementation We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote. And well... Let's just read the doc! * Adapt integration tests * Simplify all the integration tests and extends ESBlobStoreRepositoryIntegTestCase tests * removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository * merges 2 IT classes so all IT tests are ran from one single class * We don't remove/create anymore the container between each single test but only for the test suite Backport of #26751 in 5.6 branch
While working on #26751, I found that we are passing the container name on every single method although we don't need it as it is stored within the blobstore object already. This commit simplifies a bit that part of the code. It also removes `repositoryName` from AzureBlobStore which was not used anymore. Also we move some properties in AzureBlobContainer to `private` members.
While working on #26751, I found that we are passing the container name on every single method although we don't need it as it is stored within the blobstore object already. This commit simplifies a bit that part of the code. It also removes `repositoryName` from AzureBlobStore which was not used anymore. Also we move some properties in AzureBlobContainer to `private` members. Backport of #26752 in 6.x branch
We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container
Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote.
And well... Let's just read the doc!