Do not update+pickup docs if SO types mappings have not changed#179937
Do not update+pickup docs if SO types mappings have not changed#179937gsoldevila merged 2 commits intoelastic:mainfrom
Conversation
|
Pinging @elastic/kibana-core (Team:Core) |
a337502 to
b672ecb
Compare
jloleysens
left a comment
There was a problem hiding this comment.
Overall these changes make sense to me, approving to unblock progress. Added a comment making sure I understand the reasoning correctly.
...es/core/saved-objects/core-saved-objects-migration-server-internal/src/core/diff_mappings.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Maybe we can check left.updatedFields.length to adjust the message so that we can indicate in the logs when root fields have changed. WDYT?
There was a problem hiding this comment.
Sounds good to me, having that info logged could be interesting!
There was a problem hiding this comment.
Just to make sure I'm following this correctly: this is because we can skip running transform functions (we already know what the new mappings will be).
There was a problem hiding this comment.
this is primarily because we have dynamic: strict in the base mappings for the root fields https://github.com/elastic/kibana/blob/main/packages/core/saved-objects/core-saved-objects-migration-server-internal/src/core/build_active_mappings.ts#L36-L49
So we can't ever have documents with fields in them that haven't previously been mapped.
So if we ever add a new root field mapping it means we're guaranteed that we don't have to "pick up" the fields in existing docs.
There was a problem hiding this comment.
nit: It's not so clean that we have do null / array length checks because type: 'mappings_changed' basically has two optional fields.
I think it was cleaner with the type: 'root_fields_changed' and explicitly handling this case, even if we say "well the root fields changed but there's nothing to do".
Alternatively we can return a right when the root_fields_changed to say "actually no mappings changed", since in both cases we go to CHECK_VERSION_INDEX_READY_ACTIONS. This perhaps makes the model slightly simpler even if it can be a little confusing "why does the action say no mappings changed if actually the root fields did change?". I don't really have a preference between these two options.
There was a problem hiding this comment.
The problem is that we have 4 combinations:
root_fields_changed: true, types_changed: true
root_fields_changed: true, types_changed: false
root_fields_changed: false, types_changed: true
root_fields_changed: false, types_changed: false
In previous implementation root fields had a bigger impact in the flow, so whenever root_fields_changed we did not care about types_changed (cause we picked up everything).
Now it's almost the other way round, in fact we don't care that much whether root fields change.
So having a return type saying types_changed feels inaccurate, as we might still have root_fields_changed as well. On top of that, it seems that the diffMappings has knowledge about what's more important for the flow (model.ts), by returning one prior to the other.
There was a problem hiding this comment.
I can either make that knowledge explicit in diffMappings by having something like:
- { type: 'mappings_changed_must_update_types', updatedTypes: [...] }
- { type: 'mappings_changed_only_root_fields_changed', udpatedFields: [...] }
Or we could add booleans to prevent the null / array length checks, and let model.ts decide what to do.
I'd prefer this one (in fact that was the reason behind this change).
EDIT: A third option would be wrapping the diffMappings in some other action, (checkPickupDocs? naming is hard) and returning the appropriate result, so that this knowledge stays outside of model.ts.
- This one could return
no_pickup_neededvspickup_needed(or similar), which in the end is what drives the 2 paths of the flow.
There was a problem hiding this comment.
Ah I see. Yes I do agree that generally having more logic in the model makes sense, and actions are generally more "dumb" and just deal with Elasticsearch.
Part of me wants to keep root field change detection because we might need it. But we probably would never need it and it does make everything simpler if we just have types_changed. We won't be able to log "root fields changed" but I'm also not sure that's really necessary or relevant as it doesn't have any impact or migration performance. Do we loose anything if we completely remove root fields changed from checkTargetMappings?
There was a problem hiding this comment.
I think we don't lose anything besides logging, so I will simplify and rename:
checkTargetMappings => checkTargetTypesMappings
It will only contemplate the case where types_changed.
b672ecb to
189d9b0
Compare
💚 Build Succeeded
Metrics [docs]
History
To update your PR or re-run it, just comment with: |
pgayvallet
left a comment
There was a problem hiding this comment.
Changes looks fine to me
Summary
Align V2 behavior with ZDT after #179595
Under the assumption that whenever we want to add new root fields (e.g. created_by), we will systematically add mappings for them, we can skip the
updateAndPickupMappingsoperation iif ONLY root fields' mappings have changed during an upgrade (aka if none of the SO types have updated their mappings).Up until now, the logic was updating ALL SO types whenever a
rootfield was added / updated.This is expensive and unnecessary, and can cause a noticeable impact to large customers during migrations.