Skip to content

Do not update+pickup docs if SO types mappings have not changed#179937

Merged
gsoldevila merged 2 commits intoelastic:mainfrom
gsoldevila:improve-v2-pickup
Apr 9, 2024
Merged

Do not update+pickup docs if SO types mappings have not changed#179937
gsoldevila merged 2 commits intoelastic:mainfrom
gsoldevila:improve-v2-pickup

Conversation

@gsoldevila
Copy link
Copy Markdown
Member

Summary

Align V2 behavior with ZDT after #179595

Under the assumption that whenever we want to add new root fields (e.g. created_by), we will systematically add mappings for them, we can skip the updateAndPickupMappings operation iif ONLY root fields' mappings have changed during an upgrade (aka if none of the SO types have updated their mappings).

Up until now, the logic was updating ALL SO types whenever a root field was added / updated.
This is expensive and unnecessary, and can cause a noticeable impact to large customers during migrations.

@gsoldevila gsoldevila added Team:Core Platform Core services: plugins, logging, config, saved objects, http, ES client, i18n, etc t// enhancement New value added to drive a business result release_note:skip Skip the PR/issue when compiling release notes backport:skip This PR does not require backporting Feature:Migrations labels Apr 3, 2024
@gsoldevila gsoldevila requested a review from a team as a code owner April 3, 2024 14:11
@elasticmachine
Copy link
Copy Markdown
Contributor

Pinging @elastic/kibana-core (Team:Core)

Copy link
Copy Markdown
Contributor

@jloleysens jloleysens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall these changes make sense to me, approving to unblock progress. Added a comment making sure I understand the reasoning correctly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can check left.updatedFields.length to adjust the message so that we can indicate in the logs when root fields have changed. WDYT?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me, having that info logged could be interesting!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I'm following this correctly: this is because we can skip running transform functions (we already know what the new mappings will be).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is primarily because we have dynamic: strict in the base mappings for the root fields https://github.com/elastic/kibana/blob/main/packages/core/saved-objects/core-saved-objects-migration-server-internal/src/core/build_active_mappings.ts#L36-L49

So we can't ever have documents with fields in them that haven't previously been mapped.

So if we ever add a new root field mapping it means we're guaranteed that we don't have to "pick up" the fields in existing docs.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It's not so clean that we have do null / array length checks because type: 'mappings_changed' basically has two optional fields.

I think it was cleaner with the type: 'root_fields_changed' and explicitly handling this case, even if we say "well the root fields changed but there's nothing to do".

Alternatively we can return a right when the root_fields_changed to say "actually no mappings changed", since in both cases we go to CHECK_VERSION_INDEX_READY_ACTIONS. This perhaps makes the model slightly simpler even if it can be a little confusing "why does the action say no mappings changed if actually the root fields did change?". I don't really have a preference between these two options.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we have 4 combinations:

root_fields_changed: true,  types_changed: true
root_fields_changed: true,  types_changed: false
root_fields_changed: false, types_changed: true
root_fields_changed: false, types_changed: false

In previous implementation root fields had a bigger impact in the flow, so whenever root_fields_changed we did not care about types_changed (cause we picked up everything).
Now it's almost the other way round, in fact we don't care that much whether root fields change.

So having a return type saying types_changed feels inaccurate, as we might still have root_fields_changed as well. On top of that, it seems that the diffMappings has knowledge about what's more important for the flow (model.ts), by returning one prior to the other.

Copy link
Copy Markdown
Member Author

@gsoldevila gsoldevila Apr 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can either make that knowledge explicit in diffMappings by having something like:

- { type: 'mappings_changed_must_update_types', updatedTypes: [...] }
- { type: 'mappings_changed_only_root_fields_changed', udpatedFields: [...] }

Or we could add booleans to prevent the null / array length checks, and let model.ts decide what to do.
I'd prefer this one (in fact that was the reason behind this change).

EDIT: A third option would be wrapping the diffMappings in some other action, (checkPickupDocs? naming is hard) and returning the appropriate result, so that this knowledge stays outside of model.ts.

  • This one could return no_pickup_needed vs pickup_needed (or similar), which in the end is what drives the 2 paths of the flow.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Yes I do agree that generally having more logic in the model makes sense, and actions are generally more "dumb" and just deal with Elasticsearch.

Part of me wants to keep root field change detection because we might need it. But we probably would never need it and it does make everything simpler if we just have types_changed. We won't be able to log "root fields changed" but I'm also not sure that's really necessary or relevant as it doesn't have any impact or migration performance. Do we loose anything if we completely remove root fields changed from checkTargetMappings?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't lose anything besides logging, so I will simplify and rename:

checkTargetMappings => checkTargetTypesMappings

It will only contemplate the case where types_changed.

@gsoldevila gsoldevila requested a review from rudolf April 8, 2024 08:50
@kibana-ci
Copy link
Copy Markdown

💚 Build Succeeded

Metrics [docs]

✅ unchanged

History

  • 💚 Build #200916 succeeded b672ecbef54e44b4f78de98c7fa70bdfc0439960
  • 💔 Build #200854 failed a33750272d88f9bf54df792a1930f3df88d2008b

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Copy Markdown
Contributor

@pgayvallet pgayvallet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes looks fine to me

@gsoldevila gsoldevila merged commit 9046abb into elastic:main Apr 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport:skip This PR does not require backporting enhancement New value added to drive a business result Feature:Migrations release_note:skip Skip the PR/issue when compiling release notes Team:Core Platform Core services: plugins, logging, config, saved objects, http, ES client, i18n, etc t// v8.14.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants