Rename index folder to index_uuid#16442
Conversation
There was a problem hiding this comment.
can we always use the Index class to refer to an index? we should get in this habit so we always have both name and uuid available.
|
@areek you work in amazing speeds :) I scanned through this and I think this is the right direction. My only feedback is that we want to have the index folder be "INDEXNAME-UUID", for example if the index name is |
Agreed - see 2.x's MultiDataPathUpgrader.upgradeMultiDataPath for an example (which I think you alread saw :) ).
Correct, but Simon pointed out that this is easy to reverse by renaming the folders back. I do think we should double check how the dangling index logic in 2.x will behave with these renamed folders (to 2.x, it's weird that a folder name is different than the what the index state file in it says). |
|
maybe bikeshedding but should it be |
Yes please, I was going to leave the same feedback, |
sure :) |
There was a problem hiding this comment.
this should really be hasIndex(Index index) no?
|
Thanks for the feedback @bleskes, @s1monw and @dakrone! I have changed the index folder name to Would like to get some thoughts on the state of
Thanks for the pointer @bleskes. I will start on the bwc after we are happy with the rename. |
|
I thought we were going to deal with some of the issues around using null bytes, control chars, and non-ascii chars in index names with this PR as well? (eg the pathIdentifier could replace any chars in the name that don't match Happy to leave that for a separate PR though. |
seems like an easy one to add here... I don't see why not. @areek any objections? |
Do you mean validating the index name upon index creation or replacing illegal characters on upgrade? We should do both, IMO tighten index name validation and rename pre-3x indices if they contain anything other then alphanumeric or hyphen characters in this pr. one problem is we might have collisions in index names for old indices, if we replace illegal characters on upgrade, index metadata is still keyed by index name rather than the index object. Thoughts? |
I think the idea is to make sure that index names are not bound by the limitation of the file system. This will make us free to decided what we want to enforce from ES on the higher levels, without having to take this OS/FS specific limitations. Since we always add a unique identifier to the folder name, it's OK to just clean up the index name part and replace potential troublesome chars. I think a simple replace in the getPathIdentifier() utility method. Nothing more. |
19ff090 to
d13dcb2
Compare
|
Just a note: users will no longer be able to import dangling indices under a different name, just by renaming the index folder. Not sure there is anything we can do about this short of providing a command line tool to rename the index. |
Correct. I think we're good with that. I believe that the current plan is to rename this folder to the right name on node start, which is how we plan to deal with BWC. Correct @areek ? Another dangling indices limitation is that we may not be able to import a (Correctly named) folder because the cluster already has another index of the same name. We should decide how to deal with it. Options are:
I'm inclined to go with (1) . Ideas? |
|
I'm good with option 1 |
There was a problem hiding this comment.
"[{}] can not be imported as a dangling index, as there is already another index with the same name but a different uuid. local index will be ignored (but not deleted)"?
|
Thx @areek . I made another round and it looks really good. Left some comments, most importantly around the the custom path case and validating state versions in advance. |
There was a problem hiding this comment.
can we also check that this location exists? I'm not sure we can guarantee that the hasCustomDataPath is created... @dakrone - any opinions on this?
There was a problem hiding this comment.
aah, so when we have the index state but there is no data yet?
There was a problem hiding this comment.
Yeah, I think if the settings are there but the index has not been opened yet (don't ask, it's a weird but valid use case), the location that the settings point to may not exist
There was a problem hiding this comment.
I'm confused by this danglingIndices.conainsKey() - isn't supposed to be filtered out by the excludeIndexPathIds?
There was a problem hiding this comment.
This was left over, we don't need the danglingIndices.conainsKey() check anymore
|
Thx @areek . Left two questions about hasCustomDataPath strictness and the dangling indices logic. O.w. looks good to me. |
d6a018f to
770cdd4
Compare
770cdd4 to
c3078f4
Compare
Since elastic#16442 is merged we should be able to reenable this test as a followup of elastic#15853 - all issues blocking it have been resolved I guess.
…cknowledgment (#18602) Index deletion requests currently use a custom acknowledgement mechanism that wait for the data nodes to actually delete the data before acknowledging the request to the client. This was initially put into place as a new index with same name could only be created if the old index was wiped as we used the index name as data folder on the data nodes. With PR #16442, we now use the index uuid as folder name which avoids collision between indices that are named the same (deleted and recreated). This allows us to get rid of the custom acknowledgment mechanism altogether and rely on the standard cluster state-based acknowledgment instead. Closes #18558
Following up #16217 , This PR uses
${data.paths}/nodes/{node.id}/indices/{index.uuid}instead of
${data.paths}/nodes/{node.id}/indices/{index.name}pattern to store indexfolder on disk.
This way we avoid collision between indices that are named the same (deleted and recreated).
Closes #13265
Closes #13264
Closes #14932
Closes #15853
Closes #14512