Skip to content

Rename index folder to index_uuid#16442

Merged
areek merged 4 commits intoelastic:masterfrom
areek:enhancement/index_uuid_rename
Mar 15, 2016
Merged

Rename index folder to index_uuid#16442
areek merged 4 commits intoelastic:masterfrom
areek:enhancement/index_uuid_rename

Conversation

@areek
Copy link
Copy Markdown
Contributor

@areek areek commented Feb 4, 2016

Following up #16217 , This PR uses ${data.paths}/nodes/{node.id}/indices/{index.uuid}
instead of ${data.paths}/nodes/{node.id}/indices/{index.name} pattern to store index
folder on disk.
This way we avoid collision between indices that are named the same (deleted and recreated).

Closes #13265
Closes #13264
Closes #14932
Closes #15853
Closes #14512

@areek areek added review WIP :Distributed/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. v5.0.0-alpha1 labels Feb 4, 2016
@areek
Copy link
Copy Markdown
Contributor Author

areek commented Feb 4, 2016

This is still a work in progress but would be good to have it reviewed.
IMO, we should rename pre-3.0 index folder when loading the global state for the first time in GatewayMetaState. This means once the indices are renamed they can't be read by 2.x. WDYT @bleskes @s1monw?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we always use the Index class to refer to an index? we should get in this habit so we always have both name and uuid available.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Feb 4, 2016

@areek you work in amazing speeds :)

I scanned through this and I think this is the right direction. My only feedback is that we want to have the index folder be "INDEXNAME-UUID", for example if the index name is test and the uuid is ax5fd the folder will be test-ax5fd . The goal here is to make something both human readable and unique.

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Feb 4, 2016

we should rename pre-3.0 index folder when loading the global state for the first time in GatewayMetaState.

Agreed - see 2.x's MultiDataPathUpgrader.upgradeMultiDataPath for an example (which I think you alread saw :) ).

This means once the indices are renamed they can't be read by 2.x.

Correct, but Simon pointed out that this is easy to reverse by renaming the folders back. I do think we should double check how the dangling index logic in 2.x will behave with these renamed folders (to 2.x, it's weird that a folder name is different than the what the index state file in it says).

@s1monw
Copy link
Copy Markdown
Contributor

s1monw commented Feb 4, 2016

I scanned through this and I think this is the right direction. My only feedback is that we want to have the index folder be "INDEXNAME-UUID", for example if the index name is test and the uuid is ax5fd the folder will be test-ax5fd . The goal here is to make something both human readable and unique.

maybe bikeshedding but should it be test_ax5fd?

@dakrone
Copy link
Copy Markdown
Member

dakrone commented Feb 4, 2016

My only feedback is that we want to have the index folder be "INDEXNAME-UUID", for example if the index name is test and the uuid is ax5fd the folder will be test-ax5fd . The goal here is to make something both human readable and unique.

Yes please, I was going to leave the same feedback, ${index-name}-${uuid} would be great. Or with a _ like Simon said.

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Feb 4, 2016

maybe bikeshedding but should it be test_ax5fd?

sure :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should really be hasIndex(Index index) no?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this has been removed

@areek
Copy link
Copy Markdown
Contributor Author

areek commented Feb 5, 2016

Thanks for the feedback @bleskes, @s1monw and @dakrone! I have changed the index folder name to ${index-name}_${uuid} as suggested.

Would like to get some thoughts on the state of DanglingIndicesState see #16442 (comment).

Agreed - see 2.x's MultiDataPathUpgrader.upgradeMultiDataPath for an example (which I think you alread saw :) ).

Thanks for the pointer @bleskes. I will start on the bwc after we are happy with the rename.

@clintongormley
Copy link
Copy Markdown
Contributor

I thought we were going to deal with some of the issues around using null bytes, control chars, and non-ascii chars in index names with this PR as well? (eg the pathIdentifier could replace any chars in the name that don't match /\w|\-/

Happy to leave that for a separate PR though.

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Feb 5, 2016

(eg the pathIdentifier could replace any chars in the name that don't match /\w|-/

seems like an easy one to add here... I don't see why not. @areek any objections?

@areek
Copy link
Copy Markdown
Contributor Author

areek commented Feb 5, 2016

I thought we were going to deal with some of the issues around using null bytes, control chars, and non-ascii chars in index names with this PR as well?

Do you mean validating the index name upon index creation or replacing illegal characters on upgrade? We should do both, IMO tighten index name validation and rename pre-3x indices if they contain anything other then alphanumeric or hyphen characters in this pr. one problem is we might have collisions in index names for old indices, if we replace illegal characters on upgrade, index metadata is still keyed by index name rather than the index object. Thoughts?

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Feb 6, 2016

Do you mean validating the index name upon index creation or replacing illegal characters on upgrade?

I think the idea is to make sure that index names are not bound by the limitation of the file system. This will make us free to decided what we want to enforce from ES on the higher levels, without having to take this OS/FS specific limitations. Since we always add a unique identifier to the folder name, it's OK to just clean up the index name part and replace potential troublesome chars. I think a simple replace in the getPathIdentifier() utility method. Nothing more.

@areek areek force-pushed the enhancement/index_uuid_rename branch 3 times, most recently from 19ff090 to d13dcb2 Compare February 9, 2016 04:05
@clintongormley
Copy link
Copy Markdown
Contributor

Just a note: users will no longer be able to import dangling indices under a different name, just by renaming the index folder. Not sure there is anything we can do about this short of providing a command line tool to rename the index.

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Feb 29, 2016

users will no longer be able to import dangling indices under a different name, just by renaming the index folder

Correct. I think we're good with that. I believe that the current plan is to rename this folder to the right name on node start, which is how we plan to deal with BWC. Correct @areek ?

Another dangling indices limitation is that we may not be able to import a (Correctly named) folder because the cluster already has another index of the same name. We should decide how to deal with it. Options are:

  1. Ignore and leave the folder as is (log warnings though).
  2. Wait a bit (2h I believe was an option before), log warnings and eventually delete.
  3. Delete immediately as it is clear the index was replaced.

I'm inclined to go with (1) . Ideas?

@clintongormley
Copy link
Copy Markdown
Contributor

I'm good with option 1

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"[{}] can not be imported as a dangling index, as there is already another index with the same name but a different uuid. local index will be ignored (but not deleted)"?

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Mar 14, 2016

Thx @areek . I made another round and it looks really good. Left some comments, most importantly around the the custom path case and validating state versions in advance.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we also check that this location exists? I'm not sure we can guarantee that the hasCustomDataPath is created... @dakrone - any opinions on this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aah, so when we have the index state but there is no data yet?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think if the settings are there but the index has not been opened yet (don't ask, it's a weird but valid use case), the location that the settings point to may not exist

@areek
Copy link
Copy Markdown
Contributor Author

areek commented Mar 14, 2016

Thanks @bleskes and @s1monw for the feedback, I addressed all the comments!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by this danglingIndices.conainsKey() - isn't supposed to be filtered out by the excludeIndexPathIds?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was left over, we don't need the danglingIndices.conainsKey() check anymore

@bleskes
Copy link
Copy Markdown
Contributor

bleskes commented Mar 14, 2016

Thx @areek . Left two questions about hasCustomDataPath strictness and the dangling indices logic. O.w. looks good to me.

@areek areek force-pushed the enhancement/index_uuid_rename branch from d6a018f to 770cdd4 Compare March 15, 2016 03:04
@areek areek force-pushed the enhancement/index_uuid_rename branch from 770cdd4 to c3078f4 Compare March 15, 2016 03:24
@areek areek merged commit c3078f4 into elastic:master Mar 15, 2016
s1monw added a commit to s1monw/elasticsearch that referenced this pull request Mar 15, 2016
Since elastic#16442 is merged we should be able to reenable this test as a followup
of elastic#15853 - all issues blocking it have been resolved I guess.
s1monw added a commit that referenced this pull request Mar 15, 2016
Reenable CreateIndexIT#testCreateAndDeleteIndexConcurrently

Since #16442 is merged we should be able to reenable this test as a followup
of #15853 - all issues blocking it have been resolved I guess.
ywelsch added a commit that referenced this pull request Jun 1, 2016
…cknowledgment (#18602)

Index deletion requests currently use a custom acknowledgement mechanism that wait for the data nodes to actually delete the data before acknowledging the request to the client. This was initially put into place as a new index with same name could only be created if the old index was wiped as we used the index name as data folder on the data nodes. With PR #16442, we now use the index uuid as folder name which avoids collision between indices that are named the same (deleted and recreated). This allows us to get rid of the custom acknowledgment mechanism altogether and rely on the standard cluster state-based acknowledgment instead.

Closes #18558
@clintongormley clintongormley added :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Distributed/Store Issues around managing unopened Lucene indices. If it touches Store.java, this is a likely label. labels Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement resiliency v5.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants