Rename index folder to index_uuid by areek · Pull Request #16442 · elastic/elasticsearch

areek · 2016-02-04T05:24:17Z

Following up #16217 , This PR uses ${data.paths}/nodes/{node.id}/indices/{index.uuid}
instead of ${data.paths}/nodes/{node.id}/indices/{index.name} pattern to store index
folder on disk.
This way we avoid collision between indices that are named the same (deleted and recreated).

Closes #13265
Closes #13264
Closes #14932
Closes #15853
Closes #14512

areek · 2016-02-04T05:56:47Z

This is still a work in progress but would be good to have it reviewed.
IMO, we should rename pre-3.0 index folder when loading the global state for the first time in GatewayMetaState. This means once the indices are renamed they can't be read by 2.x. WDYT @bleskes @s1monw?

bleskes · 2016-02-04T07:43:22Z

core/src/main/java/org/elasticsearch/env/NodeEnvironment.java

can we always use the Index class to refer to an index? we should get in this habit so we always have both name and uuid available.

bleskes · 2016-02-04T07:45:22Z

@areek you work in amazing speeds :)

I scanned through this and I think this is the right direction. My only feedback is that we want to have the index folder be "INDEXNAME-UUID", for example if the index name is test and the uuid is ax5fd the folder will be test-ax5fd . The goal here is to make something both human readable and unique.

bleskes · 2016-02-04T08:07:25Z

we should rename pre-3.0 index folder when loading the global state for the first time in GatewayMetaState.

Agreed - see 2.x's MultiDataPathUpgrader.upgradeMultiDataPath for an example (which I think you alread saw :) ).

This means once the indices are renamed they can't be read by 2.x.

Correct, but Simon pointed out that this is easy to reverse by renaming the folders back. I do think we should double check how the dangling index logic in 2.x will behave with these renamed folders (to 2.x, it's weird that a folder name is different than the what the index state file in it says).

s1monw · 2016-02-04T15:30:14Z

I scanned through this and I think this is the right direction. My only feedback is that we want to have the index folder be "INDEXNAME-UUID", for example if the index name is test and the uuid is ax5fd the folder will be test-ax5fd . The goal here is to make something both human readable and unique.

maybe bikeshedding but should it be test_ax5fd?

dakrone · 2016-02-04T15:30:46Z

My only feedback is that we want to have the index folder be "INDEXNAME-UUID", for example if the index name is test and the uuid is ax5fd the folder will be test-ax5fd . The goal here is to make something both human readable and unique.

Yes please, I was going to leave the same feedback, ${index-name}-${uuid} would be great. Or with a _ like Simon said.

bleskes · 2016-02-04T15:31:47Z

maybe bikeshedding but should it be test_ax5fd?

sure :)

s1monw · 2016-02-04T16:37:31Z

core/src/main/java/org/elasticsearch/cluster/metadata/MetaData.java

this should really be hasIndex(Index index) no?

this has been removed

areek · 2016-02-05T03:39:02Z

Thanks for the feedback @bleskes, @s1monw and @dakrone! I have changed the index folder name to ${index-name}_${uuid} as suggested.

Would like to get some thoughts on the state of DanglingIndicesState see #16442 (comment).

Agreed - see 2.x's MultiDataPathUpgrader.upgradeMultiDataPath for an example (which I think you alread saw :) ).

Thanks for the pointer @bleskes. I will start on the bwc after we are happy with the rename.

clintongormley · 2016-02-05T05:10:14Z

I thought we were going to deal with some of the issues around using null bytes, control chars, and non-ascii chars in index names with this PR as well? (eg the pathIdentifier could replace any chars in the name that don't match /\w|\-/

Happy to leave that for a separate PR though.

bleskes · 2016-02-05T07:49:42Z

(eg the pathIdentifier could replace any chars in the name that don't match /\w|-/

seems like an easy one to add here... I don't see why not. @areek any objections?

areek · 2016-02-05T21:52:12Z

I thought we were going to deal with some of the issues around using null bytes, control chars, and non-ascii chars in index names with this PR as well?

Do you mean validating the index name upon index creation or replacing illegal characters on upgrade? We should do both, IMO tighten index name validation and rename pre-3x indices if they contain anything other then alphanumeric or hyphen characters in this pr. one problem is we might have collisions in index names for old indices, if we replace illegal characters on upgrade, index metadata is still keyed by index name rather than the index object. Thoughts?

bleskes · 2016-02-06T10:06:50Z

Do you mean validating the index name upon index creation or replacing illegal characters on upgrade?

I think the idea is to make sure that index names are not bound by the limitation of the file system. This will make us free to decided what we want to enforce from ES on the higher levels, without having to take this OS/FS specific limitations. Since we always add a unique identifier to the folder name, it's OK to just clean up the index name part and replace potential troublesome chars. I think a simple replace in the getPathIdentifier() utility method. Nothing more.

clintongormley · 2016-02-29T13:01:44Z

Just a note: users will no longer be able to import dangling indices under a different name, just by renaming the index folder. Not sure there is anything we can do about this short of providing a command line tool to rename the index.

bleskes · 2016-02-29T13:12:28Z

users will no longer be able to import dangling indices under a different name, just by renaming the index folder

Correct. I think we're good with that. I believe that the current plan is to rename this folder to the right name on node start, which is how we plan to deal with BWC. Correct @areek ?

Another dangling indices limitation is that we may not be able to import a (Correctly named) folder because the cluster already has another index of the same name. We should decide how to deal with it. Options are:

Ignore and leave the folder as is (log warnings though).
Wait a bit (2h I believe was an option before), log warnings and eventually delete.
Delete immediately as it is clear the index was replaced.

I'm inclined to go with (1) . Ideas?

clintongormley · 2016-03-02T08:50:13Z

I'm good with option 1

bleskes · 2016-03-14T10:30:56Z

core/src/main/java/org/elasticsearch/gateway/DanglingIndicesState.java

"[{}] can not be imported as a dangling index, as there is already another index with the same name but a different uuid. local index will be ignored (but not deleted)"?

bleskes · 2016-03-14T10:48:15Z

Thx @areek . I made another round and it looks really good. Left some comments, most importantly around the the custom path case and validating state versions in advance.

bleskes · 2016-03-14T21:15:33Z

core/src/main/java/org/elasticsearch/common/util/IndexFolderUpgrader.java

can we also check that this location exists? I'm not sure we can guarantee that the hasCustomDataPath is created... @dakrone - any opinions on this?

aah, so when we have the index state but there is no data yet?

Yeah, I think if the settings are there but the index has not been opened yet (don't ask, it's a weird but valid use case), the location that the settings point to may not exist

areek · 2016-03-14T21:16:09Z

Thanks @bleskes and @s1monw for the feedback, I addressed all the comments!

bleskes · 2016-03-14T21:20:49Z

core/src/main/java/org/elasticsearch/gateway/DanglingIndicesState.java

I'm confused by this danglingIndices.conainsKey() - isn't supposed to be filtered out by the excludeIndexPathIds?

This was left over, we don't need the danglingIndices.conainsKey() check anymore

bleskes · 2016-03-14T21:28:50Z

Thx @areek . Left two questions about hasCustomDataPath strictness and the dangling indices logic. O.w. looks good to me.

…x name

Since elastic#16442 is merged we should be able to reenable this test as a followup of elastic#15853 - all issues blocking it have been resolved I guess.

Reenable CreateIndexIT#testCreateAndDeleteIndexConcurrently Since #16442 is merged we should be able to reenable this test as a followup of #15853 - all issues blocking it have been resolved I guess.

…cknowledgment (#18602) Index deletion requests currently use a custom acknowledgement mechanism that wait for the data nodes to actually delete the data before acknowledging the request to the client. This was initially put into place as a new index with same name could only be created if the old index was wiped as we used the index name as data folder on the data nodes. With PR #16442, we now use the index uuid as folder name which avoids collision between indices that are named the same (deleted and recreated). This allows us to get rid of the custom acknowledgment mechanism altogether and rely on the standard cluster state-based acknowledgment instead. Closes #18558

areek added review WIP :Distributed/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. v5.0.0-alpha1 labels Feb 4, 2016

bleskes reviewed Feb 4, 2016
View reviewed changes

s1monw reviewed Feb 4, 2016
View reviewed changes

areek force-pushed the enhancement/index_uuid_rename branch 3 times, most recently from 19ff090 to d13dcb2 Compare February 9, 2016 04:05

clintongormley added resiliency >enhancement and removed :Internal labels Feb 13, 2016

clintongormley mentioned this pull request Feb 14, 2016

Restrict index names to ASCII #15553

Closed

bleskes reviewed Mar 14, 2016
View reviewed changes

areek force-pushed the enhancement/index_uuid_rename branch from d6a018f to 770cdd4 Compare March 15, 2016 03:04

areek added 4 commits March 14, 2016 23:24

Add upgrader to upgrade old indices to new naming convention

35f7cfb

remove redundant getters in MetaData

3daa83b

use index uuid as folder name to decouple index folder name from inde…

2b18a3c

…x name

adapt tests to use index uuid as folder name

c3078f4

areek force-pushed the enhancement/index_uuid_rename branch from 770cdd4 to c3078f4 Compare March 15, 2016 03:24

areek merged commit c3078f4 into elastic:master Mar 15, 2016

s1monw mentioned this pull request Mar 15, 2016

Reenable CreateIndexIT#testCreateAndDeleteIndexConcurrently #17103

Merged

areek mentioned this pull request Mar 15, 2016

Standardize state format type for global and index level metadata #17123

Merged

jasontedor mentioned this pull request Mar 22, 2016

Es goes insane when doing bulk-upsert with index from template where index-name has ":" in it #17242

Closed

clintongormley mentioned this pull request Mar 31, 2016

_rename index feature request for a read-only index #17426

Closed

ywelsch mentioned this pull request Jun 1, 2016

Acknowledge index deletion requests based on standard cluster state acknowledgment #18602

Merged

clintongormley added :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Distributed/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. labels Feb 13, 2018

Conversation

areek commented Feb 4, 2016

Uh oh!

areek commented Feb 4, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bleskes commented Feb 4, 2016

Uh oh!

bleskes commented Feb 4, 2016

Uh oh!

s1monw commented Feb 4, 2016

Uh oh!

dakrone commented Feb 4, 2016

Uh oh!

bleskes commented Feb 4, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

areek commented Feb 5, 2016

Uh oh!

clintongormley commented Feb 5, 2016

Uh oh!

bleskes commented Feb 5, 2016

Uh oh!

areek commented Feb 5, 2016

Uh oh!

bleskes commented Feb 6, 2016

Uh oh!

clintongormley commented Feb 29, 2016

Uh oh!

bleskes commented Feb 29, 2016

Uh oh!

clintongormley commented Mar 2, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bleskes commented Mar 14, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

areek commented Mar 14, 2016

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bleskes commented Mar 14, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants