[Security Solution] Historical rules packages PoC#145851
Closed
xcrzx wants to merge 1 commit into elastic:main from xcrzx:rules-package-poc
xcrzx pushed a commit that referenced this pull request on Jan 3, 2023:
…ects (#148141)

**Resolves: #147695, #148174**
**Related to: #145851, #137420**

## Summary

This PR improves the stability of the Fleet packages installation process with many saved objects.

1. Changed mappings of the `installed_kibana` and `package_assets` fields from `nested` to `object` with `enabled: false`. Values of those fields were retrieved from `_source`, and no queries or aggregations were performed against them. So the mappings were unused, while during the installation of packages containing more than 10,000 saved objects, an error was thrown due to the nested field limitations:
```
Error installing security_detection_engine 8.4.1: The number of nested documents has exceeded the allowed limit of [10000]. This limit can be set by changing the [index.mapping.nested_objects.limit] index level setting.
```
2. Improved the deletion of previous package assets by switching from sending multiple `savedObjectsClient.delete` requests in parallel to a single `savedObjectsClient.bulkDelete` request. Multiple parallel requests were causing the Elasticsearch cluster to stop responding for some time; see [this ticket](#147695) for more info.

[Screenshots: **Before**, **After**]
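The second change in the commit message above can be illustrated with a minimal sketch. The `MinimalSavedObjectsClient` interface and the recording client below are hypothetical stand-ins written for this example, not the real Kibana types; only the method names `delete` and `bulkDelete` come from the commit message. The sketch only counts how many requests each approach issues for the same set of assets.

```typescript
// Hypothetical, simplified stand-in for the saved objects client.
// Only the method names `delete` and `bulkDelete` come from the commit message.
interface AssetRef {
  type: string;
  id: string;
}

interface DeleteStatus extends AssetRef {
  success: boolean;
}

interface MinimalSavedObjectsClient {
  delete(type: string, id: string): Promise<unknown>;
  bulkDelete(objects: AssetRef[]): Promise<{ statuses: DeleteStatus[] }>;
}

// Previous approach: one request per asset, all started at once.
function deleteAssetsInParallel(
  client: MinimalSavedObjectsClient,
  assets: AssetRef[]
): Promise<unknown[]> {
  return Promise.all(assets.map(({ type, id }) => client.delete(type, id)));
}

// New approach: a single request covering every asset.
function deleteAssetsInBulk(
  client: MinimalSavedObjectsClient,
  assets: AssetRef[]
): Promise<{ statuses: DeleteStatus[] }> {
  return client.bulkDelete(assets);
}

// In-memory client that records each request instead of calling Elasticsearch.
function createRecordingClient(): {
  client: MinimalSavedObjectsClient;
  requests: string[];
} {
  const requests: string[] = [];
  const client: MinimalSavedObjectsClient = {
    delete(type, id) {
      requests.push(`delete ${type}:${id}`);
      return Promise.resolve({});
    },
    bulkDelete(objects) {
      requests.push(`bulkDelete x${objects.length}`);
      return Promise.resolve({
        statuses: objects.map((o) => ({ ...o, success: true })),
      });
    },
  };
  return { client, requests };
}
```

With N assets, the parallel variant records N requests and the bulk variant records one, which is the reduction in load the commit message describes.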
Closing this PR as both data structures were thoroughly tested.
Summary
Related to: #137420
I've run a number of tests measuring the performance of the two versions of historical rules packages, flat and composite.
On a relatively small total number of historical versions (< 10000 total versions, or 10 versions per rule for 1000 rules), the composite structure outperforms the flat one when installing the rules package:
The difference is visible but not that significant at those numbers. However, things get much worse for the flat structure when we increase the total number of versions to 15-20k.
Maximum number of items in a nested field
The first problem that becomes visible is related to the maximum number of items in a nested field. It has already been discussed here and could be easily overcome by adding `enabled: false` to the mappings for the `installed_kibana` field:
kibana/x-pack/plugins/fleet/server/saved_objects/index.ts, lines 261 to 267 in ab8dd04
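As a rough illustration of the mapping change and of why the limit is hit, here is a hedged sketch. The "before" mapping shape is an assumption about what the field looked like, the "after" shape follows the referenced commit (`object` with `enabled: false`), and the limit of 10000 is taken from the error message quoted in that commit. The helper assumes each rule version is stored as one saved object, and therefore as one entry in `installed_kibana`; that is a simplification for this example.

```typescript
// Assumed "before" shape: each array item is indexed as a separate nested document.
const nestedMapping = {
  installed_kibana: {
    type: 'nested',
    properties: {
      id: { type: 'keyword' },
      type: { type: 'keyword' },
    },
  },
};

// "After" shape per the referenced commit: the value is kept in _source
// but is not parsed or indexed, so no nested documents are created.
const disabledMapping = {
  installed_kibana: {
    type: 'object',
    enabled: false,
  },
};

// Limit reported in the error message (index.mapping.nested_objects.limit).
const NESTED_OBJECTS_LIMIT = 10000;

// Assumption: one saved object (one nested entry) per rule version.
function totalRuleVersions(ruleCount: number, versionsPerRule: number): number {
  return ruleCount * versionsPerRule;
}

// True when the package would create more nested entries than the limit allows.
function exceedsNestedLimit(
  ruleCount: number,
  versionsPerRule: number,
  limit: number = NESTED_OBJECTS_LIMIT
): boolean {
  return totalRuleVersions(ruleCount, versionsPerRule) > limit;
}
```

Under these assumptions, 1000 rules with 10 versions each sit exactly at the limit, while 15 or 20 versions per rule (the 15-20k range mentioned above) exceed it.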
Refresh ran out of slots and forced a refresh
After fixing the above error, the rules package becomes installable, but its installation starts to fail randomly, making Kibana unresponsive for some time. The problem seems to come from the Elasticsearch level. Console logs show dozens of warnings similar to this:
During that time, all requests to Kibana fail with:
{"statusCode":503,"error":"Service Unavailable","message":"connect EADDRNOTAVAIL 127.0.0.1:9200 - Local (0.0.0.0:0)"}
But after some time, the Elasticsearch cluster recovers by itself.
Created a ticket for the Fleet team: #147695
All shards failed
Another common error associated with the flat package installation is the following shard failure. It occurs randomly and is not always easily reproduced:
{"statusCode":503,"error":"Service Unavailable","message":"all shards failed: search_phase_execution_exception: [no_shard_available_action_exception] Reason: null"}
After this failure, Elasticsearch doesn't recover by itself, and Kibana responds with:
{"statusCode":503,"error":"Service Unavailable","message":"[No shard available for [get [.kibana_8.7.0][space:default]: routing [null]]: no_shard_available_action_exception: [no_shard_available_action_exception] Reason: No shard available for [get [.kibana_8.7.0][space:default]: routing [null]]]: No shard available for [get [.kibana_8.7.0][space:default]: routing [null]]"}
Response timeout
Sometimes the flat package installation just fails with a timeout:
{"statusCode":503,"error":"Service Unavailable","message":"Request timed out"}
Testing instructions
For reference: https://www.elastic.co/guide/en/integrations-developer/current/build-a-new-integration.html
curl -XPOST http://elastic:changeme@localhost:5601/kbn/internal/detection_engine/rules/prebuilt/_install_test_assets -d '{"num_versions_per_rule":10}' -H 'kbn-xsrf: true' -H 'Content-Type: application/json'
cd fleet-packages/detection-rules-flat && elastic-package build --skip-validation
cd fleet-packages/detection-rules-composite && elastic-package build --skip-validation
elastic-package stack up --services package-registry
docker cp <container id>:/etc/ssl/package-registry/ca-cert.pem fleet-packages
kibana.dev.yml: xpack.fleet.registryUrl: https://localhost:8080
NODE_EXTRA_CA_CERTS=./fleet-packages/ca-cert.pem yarn start
http://localhost:5601/kbn/app/integrations/browse. You should find two detection rules packages there: Prebuilt detection rules (composite) and Prebuilt detection rules (flat)
curl http://elastic:changeme@localhost:5601/kbn/api/fleet/epm/packages/security_rules_flat/8.3.2 -d '{"force":true}' -H 'kbn-xsrf: true' -H 'Content-Type: application/json'
curl http://elastic:changeme@localhost:5601/kbn/api/fleet/epm/packages/security_rules_composite/8.3.2 -d '{"force":true}' -H 'kbn-xsrf: true' -H 'Content-Type: application/json'
Conclusion
According to this PoC, the composite rule structure looks more stable. However, as outlined in another PoC, the flat structure provides more benefits when it comes to business logic implementation and overall looks more future-proof. My suggestion would be to fix the current performance issues that are associated with the flat structure and use it as a foundation for the rule customization work.