[Detection Engine] Investigation: Testing whether es_archiver can cause duplicate documents #223043

Closed
rylnd wants to merge 16 commits into elastic:main from rylnd:rylnd/break_es_archiver

Conversation

@rylnd
Contributor

@rylnd rylnd commented Jun 6, 2025

This integration test is executed 100 times, with the single-document archive being loaded and unloaded before/after each test. If this test fails, it will eliminate the detection engine as a culprit.
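
The load/check/unload cycle described above can be sketched as follows. This is only an illustration under stated assumptions: `InMemoryIndex`, `loadArchive`, `unloadArchive`, and `runRepeated` are hypothetical stand-ins, not the real esArchiver or FTR APIs, and the in-memory index is deterministic, so it never reproduces the flakiness being investigated.

```typescript
// Hypothetical stand-ins: these are NOT the real esArchiver or FTR APIs.
type Source = Record<string, unknown>;

class InMemoryIndex {
  private docs: Source[] = [];

  // Store one document.
  index(source: Source): void {
    this.docs.push(source);
  }

  // Remove every document, as an archive unload would.
  clear(): void {
    this.docs = [];
  }

  count(): number {
    return this.docs.length;
  }
}

// The archive under test contains exactly one document.
const ARCHIVE: Source[] = [{ message: 'single document' }];

function loadArchive(index: InMemoryIndex): void {
  for (const source of ARCHIVE) {
    index.index(source);
  }
}

function unloadArchive(index: InMemoryIndex): void {
  index.clear();
}

// Repeat the cycle and count iterations where the document count is wrong.
function runRepeated(iterations: number, index: InMemoryIndex): number {
  let failures = 0;
  for (let i = 0; i < iterations; i++) {
    loadArchive(index); // "before each test"
    if (index.count() !== ARCHIVE.length) {
      failures++; // a duplicate (or missing) document was observed
    }
    unloadArchive(index); // "after each test"
  }
  return failures;
}

const failures = runRepeated(100, new InMemoryIndex());
console.log(`${failures} of 100 iterations saw an unexpected document count`);
```

Because no rule execution is involved in such a loop, any unexpected count can only come from the load/unload path.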

@elasticmachine
Contributor

🤖 Jobs for this PR can be triggered through checkboxes. 🚧

ℹ️ To trigger the CI, please tick the checkbox below 👇

  • Click to trigger kibana-pull-request for this PR!
  • Click to trigger kibana-deploy-project-from-pr for this PR!
  • Click to trigger kibana-deploy-cloud-from-pr for this PR!

@rylnd
Contributor Author

rylnd commented Jun 6, 2025

200 runs (with 100 executions each, for a total of 20_000 executions): https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/8349

@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8349

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 160/200 tests passed.

see run history

rylnd added 2 commits July 8, 2025 17:11
 Conflicts:
	x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/index.ts
I don't know if APM is disabled by default in my scenario, but this will
ensure it is.
@rylnd
Contributor Author

rylnd commented Jul 8, 2025

Attempting to trigger another build with APM manually enabled: https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/8561

@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8561

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 71/100 tests passed.

see run history

rylnd added 2 commits July 9, 2025 23:01
I think the values in this config will be respected, as per this thread:
https://elastic.slack.com/archives/C05UT5PP1EF/p1741788732650719, but
we'll find out.
@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8577

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 0/10 tests passed.

see run history

At this point I'm just seeing what sticks.
@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8597

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 0/10 tests passed.

see run history

After looking more closely at how these env variables interact, I think
that the context propagation var needs to explicitly be false in order
to allow reporting; I had assumed that this wasn't needed originally.

This also removes the duplicated definitions in a substep of the
pipeline.
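
The interaction described in this commit message can be modeled roughly as below. This is a simplified sketch, not the agent's real logic: it assumes the variables in question are the Elastic APM Node.js agent's `ELASTIC_APM_ACTIVE` and `ELASTIC_APM_CONTEXT_PROPAGATION_ONLY` (the thread does not name them), and it assumes that context-propagation-only mode is on when the variable is unset, which is what the commit message implies. `wouldReport` and `parseBool` are hypothetical helpers.

```typescript
// Simplified model, NOT the real agent logic: decide whether APM data would be
// reported to a server for a given set of environment variables.
type Env = Record<string, string | undefined>;

// Read a boolean-like env var, falling back to a default when it is unset.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined) {
    return fallback;
  }
  return value.trim().toLowerCase() === 'true';
}

// ASSUMPTION: both defaults below are illustrative. In particular,
// contextPropagationOnlyDefault = true reflects the commit message's finding
// that the variable must be set to false explicitly.
function wouldReport(
  env: Env,
  activeDefault = false,
  contextPropagationOnlyDefault = true
): boolean {
  const active = parseBool(env['ELASTIC_APM_ACTIVE'], activeDefault);
  const contextPropagationOnly = parseBool(
    env['ELASTIC_APM_CONTEXT_PROPAGATION_ONLY'],
    contextPropagationOnlyDefault
  );
  // Reporting requires an active agent that is not in propagation-only mode.
  return active && !contextPropagationOnly;
}

// Enabling the agent alone is not enough under these assumptions.
const onlyActive = wouldReport({ ELASTIC_APM_ACTIVE: 'true' });

// Explicitly disabling propagation-only mode allows reporting.
const activeAndReporting = wouldReport({
  ELASTIC_APM_ACTIVE: 'true',
  ELASTIC_APM_CONTEXT_PROPAGATION_ONLY: 'false',
});

console.log({ onlyActive, activeAndReporting });
```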
@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8600

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 1/10 tests passed.

see run history

@kibanamachine
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#8601

[✅] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 10/10 tests passed.

see run history

@rylnd
Contributor Author

rylnd commented Jul 10, 2025

/ci

Now that we're running an actual build, we can't get to the tests until
it lints 🙄
@rylnd
Contributor Author

rylnd commented Jul 10, 2025

/ci

@elasticmachine
Contributor

elasticmachine commented Jul 11, 2025

💔 Build Failed

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #68 / Search solution tests Search onboarding API keys Elasticsearch Start [Onboarding Empty State] should create a new api key when the existing one is invalidated

Metrics [docs]

✅ unchanged

History

rylnd added 2 commits July 15, 2025 15:35
This is copied liberally from the APM performance FTRs.
So that this all happens at the base config, instead.
@rylnd
Contributor Author

rylnd commented Jul 15, 2025

@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8661

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 9/10 tests passed.

see run history

@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8662

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 7/10 tests passed.

see run history

* Adds a delay at the end of the run to allow APM to report
* Adds server args for telemetryOptIn and labels

I assumed telemetry was opt out, but perhaps this was the missing piece.
@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8665

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 5/10 tests passed.

see run history

@kibanamachine
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#8690

[✅] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 10/10 tests passed.

see run history

We set a default key from vault, but my guess is that this is
conflicting with the server/token being used to send to the performance
APM server.
@kibanamachine
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#8691

[✅] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 10/10 tests passed.

see run history

I suspect that setting the pipeline env var to `undefined` might not
override an existing value there, so I'm trying to remove any setting
of that API key at all (in addition to unsetting the server URL, which
_seems_ to be getting overridden elsewhere, to the perf server), but at
this point I have no confidence in anything.
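
One plausible way an `undefined` value could fail to override an existing one is serialization: `JSON.stringify` drops keys whose value is `undefined`. The sketch below illustrates that general JavaScript behavior only. Whether the pipeline actually serializes its env this way is an assumption, and every name and value here (`APM_API_KEY`, `existingEnv`, `stepEnv`) is hypothetical.

```typescript
// Hypothetical names and values: nothing here is taken from the real pipeline.
type EnvMap = Record<string, string | undefined>;

// Values assumed to be set elsewhere, e.g. a default key loaded from a vault.
const existingEnv: EnvMap = {
  APM_API_KEY: 'default-key-from-vault',
  APM_SERVER_URL: 'https://default-apm.example',
};

// An attempt to unset the key by assigning undefined.
const stepEnv: EnvMap = { APM_API_KEY: undefined };

// Merged in memory, the explicit undefined does replace the existing value.
const merged: EnvMap = { ...existingEnv, ...stepEnv };

// Once serialized, the undefined-valued key disappears entirely...
const serialized = JSON.stringify({ env: stepEnv });

// ...so a consumer that merges the deserialized env keeps the old value.
const received = JSON.parse(serialized).env as EnvMap;
const effective: EnvMap = { ...existingEnv, ...received };

console.log(merged['APM_API_KEY']); // undefined
console.log(serialized); // {"env":{}}
console.log(effective['APM_API_KEY']); // default-key-from-vault
```

Under that assumption, the value has to be removed where it is first defined rather than overridden later with `undefined`.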
@kibanamachine
Contributor

Flaky Test Runner Stats

🎉 All tests passed! - kibana-flaky-test-suite-runner#8692

[✅] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 10/10 tests passed.

see run history

@rylnd
Contributor Author

rylnd commented Jul 16, 2025

Okay, with this change I was able to unset the default APM API key, which was causing our reporting to the perf server to be rejected. I'm now able to see my data there, but:

  1. I'm not sure if it's the data that was asked for
  2. The data doesn't yet include a failure

I'm addressing the second point with a new 50x build, and will attempt to clarify the first point with the others involved.

@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8695

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 25/50 tests passed.

see run history

@rylnd
Contributor Author

rylnd commented Jul 16, 2025

labels.ciBuildJobId: "01981533-3610-44fa-b238-a3a3cf351589" is a failing job from the previous build.

This adds APM around our functional test runner, which should allow us
to collect data on what's happening with elasticsearch during our
(failing) test runs.
@kibanamachine
Contributor

Flaky Test Runner Stats

🟠 Some tests failed. - kibana-flaky-test-suite-runner#8738

[❌] x-pack/test/security_solution_api_integration/test_suites/detections_response/detection_engine/rule_execution_logic/general_logic/basic_license_essentials_tier/configs/ess.config.ts: 0/20 tests passed.

see run history

@rylnd
Contributor Author

rylnd commented Jul 23, 2025

Closing this for now, in favor of the more productive (and more straightforward) #228850.

@rylnd rylnd closed this Jul 23, 2025
dmlemeshko added a commit that referenced this pull request Aug 6, 2025
## Summary

This PR fixes duplicate document creation in esArchiver by generating an
`_id` for index (non-data-stream, non-time-series) documents that don't
have an id already.

### Details

- Under some circumstances, the `es-helper-bulk` that is used by
esArchiver can ingest a duplicate document (just with different id), see
investigations [here](#228556) and
[here](#223043), also bug report
[here](elastic/elasticsearch-js#2924).
- With the id set explicitly, the flakiness no longer showed up,
which matches the expected behavior as of the [bulk
docs](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk)
`A create action fails if a document with the same ID already exists in
the target. An index action adds or replaces a document as necessary.`
- In order to unblock testing, this PR is actually working around the
underlying problem, which should still be investigated separately

---------

Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
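
Why an explicit `_id` prevents the duplicate can be illustrated with a small in-memory model. This is a sketch under assumptions, not the code from the fix: `FakeIndex`, `withGeneratedId`, and `ingestTwice` are hypothetical, the id scheme is invented for the example, and the duplicate is modeled as the same item being sent twice, which the thread does not confirm as the underlying cause.

```typescript
// Hypothetical model: NOT the esArchiver implementation or the Elasticsearch client.
type BulkDoc = { _id?: string; _source: Record<string, unknown> };

class FakeIndex {
  private docs = new Map<string, Record<string, unknown>>();
  private autoId = 0;

  // Models an `index` action: add or replace by id. Without an id, a fresh
  // one is generated, so the same source can be stored more than once.
  index(doc: BulkDoc): void {
    const id = doc._id ?? `auto-${this.autoId++}`;
    this.docs.set(id, doc._source);
  }

  get size(): number {
    return this.docs.size;
  }
}

let generatedCount = 0;

// Assign an id before sending if the document does not have one already.
// The id scheme here is illustrative only.
function withGeneratedId(doc: BulkDoc): BulkDoc {
  return doc._id === undefined ? { ...doc, _id: `generated-${generatedCount++}` } : doc;
}

// ASSUMPTION: the duplicate arises from the same item reaching the index twice.
function ingestTwice(doc: BulkDoc): number {
  const index = new FakeIndex();
  index.index(doc);
  index.index(doc);
  return index.size;
}

const original: BulkDoc = { _source: { message: 'hello' } };

// Without an id, each send creates a new document.
const withoutId = ingestTwice(original);

// With an id fixed up front, the second send replaces the first.
const withId = ingestTwice(withGeneratedId(original));

console.log({ withoutId, withId }); // { withoutId: 2, withId: 1 }
```

The second case matches the quoted bulk docs: an index action with a known id adds or replaces, so a repeated send cannot create a second document.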
kibanamachine pushed 4 commits to kibanamachine/kibana that referenced this pull request Aug 6, 2025, each carrying the same commit message as the one above, followed by:

(cherry picked from commit 42377e4)
kibanamachine added a commit that referenced this pull request Aug 6, 2025
# Backport

This will backport the following commits from `main` to `9.1`:
- [FTR - fix esArchiver duplicate doc ingestion
(#229457)](#229457)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Robert Oskamp <robert.oskamp@elastic.co>
Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
kibanamachine added a commit that referenced this pull request Aug 6, 2025
# Backport

This will backport the following commits from `main` to `8.19`:
- [FTR - fix esArchiver duplicate doc ingestion
(#229457)](#229457)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Robert Oskamp <robert.oskamp@elastic.co>
Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
kibanamachine added a commit that referenced this pull request Aug 6, 2025
# Backport

This will backport the following commits from `main` to `9.0`:
- [FTR - fix esArchiver duplicate doc ingestion
(#229457)](#229457)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Robert Oskamp <robert.oskamp@elastic.co>
Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
kibanamachine added a commit that referenced this pull request Aug 6, 2025
# Backport

This will backport the following commits from `main` to `8.18`:
- [FTR - fix esArchiver duplicate doc ingestion
(#229457)](#229457)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Robert Oskamp <robert.oskamp@elastic.co>
Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
gergoabraham pushed a commit to gergoabraham/kibana that referenced this pull request Aug 7, 2025
## Summary

This PR fixes duplicate document creation in esArchiver by generating an
`_id` for index (non-data-stream, non-time-series) documents that don't
have an id already.
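
A minimal sketch of the idea in the summary, not the code from the PR: assign an id to a document once, before it is handed to the bulk helper, and leave documents alone if they already have an id or target a data stream. The `ArchiveDoc` shape, the `dataStream` flag, and the `ensureId` helper are hypothetical names chosen for illustration; how esArchiver actually detects data-stream or time-series targets, and how it generates ids, is not shown here.

```typescript
import { randomUUID } from 'node:crypto';

// Hypothetical record shape; the real esArchiver types may differ.
interface ArchiveDoc {
  index: string;
  id?: string;
  source: Record<string, unknown>;
  // Assumption: a flag marking targets that must keep auto-generated ids.
  dataStream?: boolean;
}

// Returns a copy of the document with an id assigned, unless the document
// already has one or targets a data stream. The generator is injectable so
// the behavior can be checked deterministically.
function ensureId(doc: ArchiveDoc, generate: () => string = randomUUID): ArchiveDoc {
  if (doc.dataStream || doc.id !== undefined) {
    return doc;
  }
  return { ...doc, id: generate() };
}

const withId = ensureId({ index: 'my-index', source: { message: 'hello' } });
console.log(typeof withId.id); // the id is now a string generated client-side
```

Because the id is fixed before the request is sent, any later re-send of the same document carries the same id.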

### Details

- Under some circumstances, the `es-helper-bulk` that is used by
esArchiver can ingest the same document twice, with the two copies
differing only in their id. See the investigations
[here](elastic#228556) and [here](elastic#223043), and the bug report
[here](elastic/elasticsearch-js#2924).
- With the id set explicitly, the flakiness no longer showed up, which
matches the behavior described in the [bulk
docs](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk):
`A create action fails if a document with the same ID already exists in
the target. An index action adds or replaces a document as necessary.`
- To unblock testing, this PR works around the underlying problem, which
should still be investigated separately.
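
The quoted bulk semantics can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not Elasticsearch and not the bulk helper: `FakeIndex` and `sendTwice` are hypothetical, and the "same operation sent twice" scenario is assumed purely for illustration, since the root cause of the duplicates is still open.

```typescript
// In-memory stand-in for an index; it only models id handling.
class FakeIndex {
  private docs = new Map<string, unknown>();
  private autoId = 0;

  // Mimics an `index` action: adds or replaces when an id is given,
  // otherwise always adds under a fresh auto-generated id.
  index(source: unknown, id?: string): string {
    const key = id ?? `auto-${this.autoId++}`;
    this.docs.set(key, source);
    return key;
  }

  // Mimics a `create` action: fails if the id already exists.
  create(source: unknown, id: string): string {
    if (this.docs.has(id)) {
      throw new Error(`document ${id} already exists`);
    }
    this.docs.set(id, source);
    return id;
  }

  get count(): number {
    return this.docs.size;
  }
}

// Sends the same `index` operation twice and reports how many documents exist.
function sendTwice(target: FakeIndex, source: unknown, id?: string): number {
  target.index(source, id);
  target.index(source, id); // assumed duplicate delivery of the same operation
  return target.count;
}

console.log(sendTwice(new FakeIndex(), { message: 'hello' })); // 2
console.log(sendTwice(new FakeIndex(), { message: 'hello' }, 'doc-1')); // 1
```

In this model, a document without an id is stored twice, while one with an explicit id is replaced in place, which is consistent with the flakiness disappearing once ids were set.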

---------

Co-authored-by: Dzmitry Lemechko <dzmitry.lemechko@elastic.co>
denar50 pushed a commit to denar50/kibana that referenced this pull request Aug 8, 2025
NicholasPeretti pushed a commit to NicholasPeretti/kibana that referenced this pull request Aug 18, 2025