Use Azure upload method instead of our own implementation by dadoonet · Pull Request #26751 · elastic/elasticsearch

dadoonet · 2017-09-22T10:31:06Z

We are not following the Azure documentation about uploading blobs to Azure storage: https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container

Instead, we are using our own implementation, which might cause some trouble: in rare cases, a blob is not committed immediately after we close the stream. Using the standard implementation provided by the Azure team should let us benefit from all the magic the Azure SDK team already wrote.

And well... Let's just read the doc!
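To illustrate the idea of handing the whole stream to an SDK-provided upload call instead of managing an output stream ourselves, here is a minimal sketch. It does not use the Azure SDK: `BlockBlob`, `InMemoryBlobStore` and `UploadSketch` are hypothetical stand-ins written only so the example runs on its own, and the `upload(InputStream, long)` shape is assumed to resemble the upload call shown in the linked Azure documentation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class UploadSketch {
    // The caller passes the stream and its size to the upload call
    // and does no stream management of its own.
    static void writeBlob(InMemoryBlobStore store, String blobName, InputStream data, long size)
            throws IOException {
        store.blob(blobName).upload(data, size);
    }

    public static void main(String[] args) throws IOException {
        InMemoryBlobStore store = new InMemoryBlobStore();
        byte[] payload = "snapshot metadata".getBytes(StandardCharsets.UTF_8);
        writeBlob(store, "master.dat-temp", new ByteArrayInputStream(payload), payload.length);
        System.out.println("blob committed: " + store.committed.containsKey("master.dat-temp"));
    }
}

// Hypothetical stand-in for an SDK blob handle (not the real Azure type).
interface BlockBlob {
    void upload(InputStream source, long length) throws IOException;
}

// In-memory fake, used only so this sketch can run without Azure.
class InMemoryBlobStore {
    final Map<String, byte[]> committed = new HashMap<>();

    BlockBlob blob(String name) {
        return (source, length) -> {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            long remaining = length;
            while (remaining > 0) {
                int read = source.read(buffer, 0, (int) Math.min(buffer.length, remaining));
                if (read == -1) {
                    throw new IOException("stream ended before " + length + " bytes were read");
                }
                out.write(buffer, 0, read);
                remaining -= read;
            }
            // The blob is recorded only once the whole upload has succeeded.
            committed.put(name, out.toByteArray());
        };
    }
}
```

In this sketch the blob only appears in `committed` after the full upload returns, which is the behaviour the description hopes to get from the SDK's own implementation.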

While working on #26751, I found that we are passing the container name to every single method although we don't need to, as it is already stored within the blobstore object. This PR simplifies that part of the code a bit.
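The simplification described above can be sketched roughly as follows. This is not the plugin's code: `ContainerScopedBlobStore` and its methods are hypothetical names, and the in-memory set stands in for the remote storage. The point is only that the container name is captured once in the constructor, so callers no longer pass it to each method.

```java
import java.util.HashSet;
import java.util.Set;

class SignatureSketch {
    public static void main(String[] args) {
        ContainerScopedBlobStore store = new ContainerScopedBlobStore("my-container");
        store.writeBlob("master.dat");
        // Callers only name the blob; the store already knows its container.
        System.out.println(store.blobExists("master.dat"));
    }
}

// Hypothetical blob store that remembers its container name.
class ContainerScopedBlobStore {
    private final String container;
    private final Set<String> blobs = new HashSet<>(); // stand-in for remote storage

    ContainerScopedBlobStore(String container) {
        this.container = container;
    }

    String container() {
        return container;
    }

    // Before the simplification this would have looked like
    // writeBlob(String container, String blob).
    void writeBlob(String blob) {
        blobs.add(key(blob));
    }

    // Before: blobExists(String container, String blob).
    boolean blobExists(String blob) {
        return blobs.contains(key(blob));
    }

    private String key(String blob) {
        return container + "/" + blob;
    }
}
```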

imotov

LGTM. Thanks for finding it out! Can we backport it? I think it would be great to switch to this sooner rather than later. It looks like we are missing quite a bit of logic by uploading the blob our old way.

That was a missing part in elastic#23405.

dadoonet · 2017-09-25T09:52:14Z

@imotov Thanks for the review. I did some manual testing this morning and it does not work.

Apparently the file master.dat-temp is not written to the Azure container...
I'm getting an exception saying that the container does not exist, although I can see it in the Azure web interface...

I'm digging... Probably something stupid on my end. :)

While working on elastic#26751 and doing some manual integration testing, I found that elastic#22858 removed an important line of our code: `AzureRepository` overrides the default `initializeSnapshot` method, which creates metadata files and does other stuff. But with PR elastic#22858, I wrote:

```java
@Override
public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
    if (blobStore.doesContainerExist(blobStore.container()) == false) {
        throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
            " creating an azure snapshot repository backed by it.");
    }
}
```

instead of

```java
@Override
public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
    if (blobStore.doesContainerExist(blobStore.container()) == false) {
        throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
            " creating an azure snapshot repository backed by it.");
    }
    super.initializeSnapshot(snapshotId, indices, clusterMetadata);
}
```

As we never call `super.initializeSnapshot(...)`, the files are not created and we can't restore what we saved.

Closes elastic#26777.
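The effect of the missing `super` call can be reproduced with a small self-contained sketch. The classes below are hypothetical stand-ins, not the Elasticsearch types: `BaseRepository` plays the role of the parent class that writes metadata, the `meta-<id>.dat` file name is made up for illustration, and the container check is assumed to always pass.

```java
import java.util.ArrayList;
import java.util.List;

class SuperCallSketch {
    public static void main(String[] args) {
        BrokenAzureRepository broken = new BrokenAzureRepository();
        broken.initializeSnapshot("snap1");
        System.out.println("broken wrote: " + broken.writtenFiles);

        FixedAzureRepository fixed = new FixedAzureRepository();
        fixed.initializeSnapshot("snap1");
        System.out.println("fixed wrote: " + fixed.writtenFiles);
    }
}

// Hypothetical parent: the default implementation writes the metadata.
class BaseRepository {
    final List<String> writtenFiles = new ArrayList<>();

    void initializeSnapshot(String snapshotId) {
        writtenFiles.add("meta-" + snapshotId + ".dat"); // illustrative file name
    }

    // Assumption for this sketch: the container always exists.
    boolean containerExists() {
        return true;
    }
}

// Override that validates the container but forgets to delegate to the parent.
class BrokenAzureRepository extends BaseRepository {
    @Override
    void initializeSnapshot(String snapshotId) {
        if (containerExists() == false) {
            throw new IllegalArgumentException("container does not exist");
        }
        // No super.initializeSnapshot(snapshotId): no metadata is written.
    }
}

// Override that validates first, then lets the parent do its work.
class FixedAzureRepository extends BaseRepository {
    @Override
    void initializeSnapshot(String snapshotId) {
        if (containerExists() == false) {
            throw new IllegalArgumentException("container does not exist");
        }
        super.initializeSnapshot(snapshotId);
    }
}
```

In the broken variant the override silently replaces the parent's behaviour, so the snapshot appears to succeed while the metadata needed for a restore is never written.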

This commit:

* removes the IT `testForbiddenContainerName()` as it is useless: the plugin no longer creates the container but expects the user to have created it before registering the repository
* merges 2 IT classes so all IT tests are run from a single class
* no longer removes/creates the container between each test, but only once for the test suite

dadoonet · 2017-09-26T09:36:18Z

@imotov I worked on the ITs so we can now run them when needed (still a manual operation).
I tried to simplify them and remove things that are not needed.

I tested everything manually:

Install elasticsearch 7.0.0-alpha1-SNAPSHOT
Install repository-azure plugin
Run the following test:

# Clean test env
curl -XDELETE localhost:9200/foo?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1/snap1?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1?pretty

# Create data
curl -XPUT localhost:9200/foo/doc/1?pretty -H 'Content-Type: application/json' -d '{
 "foo": "bar"
}'
curl -XPOST localhost:9200/foo/_refresh?pretty

# Create repository using default account
curl -XPUT localhost:9200/_snapshot/my_backup1?pretty -H 'Content-Type: application/json' -d '{
 "type": "azure"
}'

# Backup
curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1?pretty&wait_for_completion=true"

# Delete existing index
curl -XDELETE localhost:9200/foo?pretty

# Restore using default account
curl -XPOST "localhost:9200/_snapshot/my_backup1/snap1/_restore?pretty&wait_for_completion=true"

# Check
curl -XGET localhost:9200/foo/_search?pretty

# Remove backup
curl -XDELETE localhost:9200/_snapshot/my_backup1/snap1?pretty
curl -XDELETE localhost:9200/_snapshot/my_backup1?pretty

Everything is correct. I'm going to test with a bigger dataset now and check everything works.
Could you give a final review on the code please as I changed some code recently?

Thanks!

dadoonet · 2017-09-26T09:53:06Z

I tested with much more data (300 MB) and everything is working well.
LMK! :)

imotov · 2017-09-26T19:18:16Z

@dadoonet would it make sense to base these tests on ESBlobStoreRepositoryIntegTestCase? I think this base class has most of the tests that we want to run against a repo to ensure that it behaves reasonably. If you find it lacking something, I think it would make sense to extend it so all other repos would benefit.

We already have nice tests written. Let's just use them.

dadoonet · 2017-09-26T20:21:42Z

@imotov Great! I did not remember about that class. Yeah. Definitely better using it as well.

I pushed new changes.

imotov

LGTM. Thanks for moving the tests! Now we just need to figure out how to get it into CI to run on a regular basis.

imotov · 2017-09-27T18:31:52Z

...tory-azure/src/test/java/org/elasticsearch/repositories/azure/AzureSnapshotRestoreTests.java

+import org.junit.BeforeClass;

 import java.net.URISyntaxException;
+import java.net.UnknownHostException;


* Use Azure upload method instead of our own implementation

  We are not following the Azure documentation about uploading blobs to Azure storage: https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container

  Instead, we are using our own implementation, which might cause some trouble: in rare cases, a blob is not committed immediately after we close the stream. Using the standard implementation provided by the Azure team should let us benefit from all the magic the Azure SDK team already wrote.

  And well... Let's just read the doc!

* Adapt integration tests to secure settings

  That was a missing part in #23405.

* Simplify all the integration tests and extend the ESBlobStoreRepositoryIntegTestCase tests

  * removes the IT `testForbiddenContainerName()` as it is useless: the plugin no longer creates the container but expects the user to have created it before registering the repository
  * merges 2 IT classes so all IT tests are run from a single class
  * no longer removes/creates the container between each test, but only once for the test suite

Backport of #26751 in 6.x branch

dadoonet · 2017-09-28T11:52:40Z

I have backported it to 6.x already.

I'm planning to backport it to 6.0 as well, but that's a bit harder as some PRs, like #23518 and #23405, have not been merged to 6.0.

dadoonet · 2017-09-29T13:59:02Z

Backported to 6.0 as well with 9aa5595

dadoonet · 2017-10-03T13:30:03Z

Backported to 5.6 with 28f17a7 (see #26839)

While working on #26751, I found that we are passing the container name to every single method although we don't need to, as it is already stored within the blobstore object. This commit simplifies that part of the code a bit. It also removes `repositoryName` from AzureBlobStore, which was not used anymore, and makes some properties in AzureBlobContainer `private` members.

Backport of #26752 in 6.x branch

dadoonet added :Plugin Repository Azure >bug v7.0.0 labels Sep 22, 2017

dadoonet self-assigned this Sep 22, 2017

dadoonet requested a review from imotov September 22, 2017 10:31

dadoonet mentioned this pull request Sep 22, 2017

Simplify Azure blobStore method signatures #26752

Merged

imotov approved these changes Sep 22, 2017

dadoonet added 2 commits September 25, 2017 11:27

Adapt integration tests to secure settings

cb78313

That was a missing part in elastic#23405.

Adapt integration tests to secure settings

ba66cd3

dadoonet mentioned this pull request Sep 25, 2017

Azure snapshots can not be restored anymore #26777

Closed

dadoonet added 3 commits September 25, 2017 17:17

Writing a blob needs to write under the right blob key name

1ebe4d7

Remove non needed IT class

d32f358

Fix metadata files are not written

08078d4

dadoonet mentioned this pull request Sep 25, 2017

Azure snapshots can not be restored anymore #26778

Merged

Merge branch 'master' into pr/use-azure-upload-method

8369f5b

dadoonet added 2 commits September 26, 2017 09:48

Merge branch 'master' into pr/use-azure-upload-method

8aa357a

dadoonet added 2 commits September 26, 2017 22:16

Extends ESBlobStoreRepositoryIntegTestCase tests

e53d22d

We already have nice tests written. Let's just use them.

Remove non needed test

bb204b9

imotov approved these changes Sep 27, 2017

Remove unused import

7f0e7ff

dadoonet added v6.0.0 v6.1.0 and removed review labels Sep 28, 2017

dadoonet merged commit 1ccb497 into elastic:master Sep 28, 2017

dadoonet deleted the pr/use-azure-upload-method branch September 28, 2017 11:15

dadoonet added v6.1.0 and removed v6.1.0 v6.0.0 labels Sep 28, 2017

dadoonet added the v6.0.0 label Sep 29, 2017

dadoonet mentioned this pull request Sep 29, 2017

Use Azure upload method instead of our own implementation (#26751) #26839

Merged

lcawl added v6.0.0-rc2 and removed v6.0.0 labels Oct 30, 2017

lcawl removed the v6.1.0 label Dec 12, 2017

clintongormley added :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs and removed :Plugin Repository Azure labels Feb 14, 2018

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019

dadoonet merged 12 commits into elastic:master from dadoonet:pr/use-azure-upload-method
