Reasons for not using saved objects for storing kibana data #80912

@kobelb

🚩 Note: I intend to use the main issue description to reflect our growing understanding of the situation, so I will periodically update it to reflect what we discuss. I'll add a comment each time this happens, so it's not silently changing.

A majority of Kibana's entities are persisted as saved-objects. However, a growing number of non-saved-object Elasticsearch indices are being used to store Kibana-specific entities. The following are the ones that I'm currently aware of:

  1. Alerting's event log - .kibana-event-log-*
  2. APM agent configuration - .apm-agent-configuration
  3. APM custom link - .apm-custom-link
  4. Detection engine signals - .siem-signals-*
  5. Security solution lists - .lists and .values
  6. Reporting - .reporting-*

I've started this discuss issue to determine what other Elasticsearch indices are being used to store Kibana-specific entities, and to enumerate the reasons why they aren't being stored as saved-objects. Saved-objects provide a number of features, including migrations, authorization, audit logging, export/import, space awareness, and encrypted attributes, that developers forgo when using non-saved-object ES indices.

I'd like to perform this exercise for two reasons: to find limitations in saved-objects that we should address so that they become applicable to more use-cases, and to figure out which saved-object specific features should be made available to non-saved-object ES indices.

Reasons we haven't used saved-objects

End-users should be able to query the indices directly

Saved-objects are stored in a "system index", and as such, end-users will not be able to query these indices directly starting in 8.0. Even if end-users could theoretically query system indices, we treat the ES document format as an implementation detail of saved-objects. It's prone to change between minor versions in a non-backward-compatible manner, so end-users shouldn't be querying these documents directly.

Applies to: Alerting's event log, Detection engine signals

There are too many saved-objects

The SIEM team has outlined a few of the issues that they experienced when trying to model their lists using saved-objects in #64715. Notably, SavedObjectsClient#find's paging implementation doesn't function properly when there are more than 10k results, which is being tracked by #77961.

Applies to: Security solution lists
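
To make the 10k limit concrete, here is a minimal sketch. It assumes that find translates its page and perPage options into Elasticsearch's from/size paging, and it uses Elasticsearch's default index.max_result_window of 10,000. The helper name toFromSize is hypothetical and not part of the saved-objects API.

```typescript
// Elasticsearch's default index.max_result_window setting.
const MAX_RESULT_WINDOW = 10000;

interface PageRequest {
  page: number; // 1-based page number
  perPage: number; // results per page
}

// Hypothetical helper: translate page/perPage into from/size and report
// whether the request stays within the result window. Elasticsearch
// rejects searches where from + size exceeds the window.
function toFromSize({ page, perPage }: PageRequest) {
  const from = (page - 1) * perPage;
  const size = perPage;
  return { from, size, withinWindow: from + size <= MAX_RESULT_WINDOW };
}

// Page 10 of 1000 ends exactly at the window; page 11 goes past it.
console.log(toFromSize({ page: 10, perPage: 1000 }));
console.log(toFromSize({ page: 11, perPage: 1000 }));
```

Under these assumptions, any list with more than 10k entries has pages that can't be reached with plain from/size paging.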

Documents are too large

Reporting is using its own dedicated .reporting-* indices because the documents include base64 encoded data for the generated CSVs, PDFs and PNGs. Since these documents are generally so large, they can't be migrated using saved-object migrations. The indices themselves are created on a weekly basis.

Applies to: Reporting
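
As a rough illustration of why these documents get large: padded base64 encodes every 3 bytes of input as 4 characters, so the stored payload is about a third bigger than the generated file. The sketch below only does that arithmetic. The function name and the 10 MiB example size are assumptions for illustration, not figures from Reporting.

```typescript
// Length in characters of the padded base64 encoding of `byteCount` bytes.
// Every group of up to 3 input bytes becomes 4 output characters.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Hypothetical example: a 10 MiB PDF.
const pdfBytes = 10 * 1024 * 1024;
console.log(base64Length(pdfBytes)); // 13981016 characters
```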

Aggregations

Plugins wanting to run aggregations cannot use the saved objects client (we have made good progress in #64002 but it might take some time for plugins to adopt it).

In addition, it will not be possible to use a query to limit the documents to aggregate over. One workaround is to use a KQL filter, but this impacts performance and is discouraged by the ES team (#69172).

Applies to: APM Agent Configuration
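
For comparison, this is the kind of request a plugin can send when it owns the index: a query that narrows the documents, plus an aggregation over what remains. It's a sketch only. The field names service.name and service.environment and the helper buildAggRequest are assumptions for illustration; the actual mapping of .apm-agent-configuration isn't described in this issue.

```typescript
interface AggSearchRequest {
  index: string;
  body: {
    size: number;
    query: { bool: { filter: object[] } };
    aggs: Record<string, object>;
  };
}

// Hypothetical helper: build a search that aggregates only over the
// documents matching a filter. Field names are assumed, not taken from
// the real index mapping.
function buildAggRequest(serviceName: string): AggSearchRequest {
  return {
    index: '.apm-agent-configuration',
    body: {
      size: 0, // return only aggregation results, no hits
      query: {
        bool: { filter: [{ term: { 'service.name': serviceName } }] },
      },
      aggs: {
        environments: { terms: { field: 'service.environment' } },
      },
    },
  };
}

console.log(JSON.stringify(buildAggRequest('opbeans-node'), null, 2));
```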

Filtering on update / delete queries

With saved-objects, it's not possible to efficiently update or delete a filtered subset of documents. The only efficient bulk operations apply to all documents of a given saved object type.
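
With a dedicated index, a filtered delete can be expressed as a single request to Elasticsearch's _delete_by_query API. The sketch below builds such a request as a plain object. The helper name, the index, and the field used in the filter are assumptions for illustration.

```typescript
interface DeleteByQueryRequest {
  index: string;
  body: { query: { term: Record<string, string> } };
}

// Hypothetical helper: delete only the documents whose `field` equals
// `value`, rather than every document in the index.
function buildDeleteByQuery(
  index: string,
  field: string,
  value: string
): DeleteByQueryRequest {
  return {
    index,
    body: { query: { term: { [field]: value } } },
  };
}

// Assumed field name; the real mapping may differ.
console.log(
  JSON.stringify(buildDeleteByQuery('.apm-custom-link', 'service.name', 'opbeans-node'))
);
```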

Filtering on nested fields

Filter validation fails when writing a KQL query against nested field types (#81009).
