Skip to content

Disable deprecation log indexing until templates are loaded#80406

Merged
pgomulka merged 14 commits intoelastic:masterfrom
pgomulka:main/bulk_processor_flush_delay
Nov 9, 2021
Merged

Disable deprecation log indexing until templates are loaded#80406
pgomulka merged 14 commits intoelastic:masterfrom
pgomulka:main/bulk_processor_flush_delay

Conversation

@pgomulka
Copy link
Copy Markdown
Contributor

@pgomulka pgomulka commented Nov 5, 2021

Deprecation log indexing relies on index templates to be present in
cluster state in order to create a data stream.
If a very early deprecation is emitted after node is started up and
templates are not loaded yet this can result in regular index with
default settings and mappings to be created.
To prevent this a bulk processor used for indexing deprecation logs
delays flushing until templates are loaded.

closes #80265

Deprecation log indexing relies on index templates to be present in
cluster state in order to create a data stream.
If a very early deprecation is emitted after node is started up and
templates are not loaded yet this can result in regular index with
default settings and mappings to be created.
To prevent this a bulk processor used for indexing deprecation logs
delays flushing.

closes elastic#80265
@pgomulka pgomulka marked this pull request as ready for review November 5, 2021 14:56
@elasticmachine elasticmachine added Team:Core/Infra Meta label for core/infra team Team:Data Management (obsolete) DO NOT USE. This team no longer exists. labels Nov 5, 2021
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

apply plugin: 'elasticsearch.build'
tasks.named("test").configure { enabled = false }

dependencies {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mark-vieira would you be able to review this PR from gradle point of view?
I wasn't too sure if I should use api or implementation for dependency? do we have a readme for this?
also do I have to copy license files here as well?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Licenses are only required once, you don't need them every time you use a dependency. We favor implementation as those dependencies are not transitively inherited by the compile classpath of consuming projects. Since this is a QA project it's mostly irrelevant (there are no consumers) but I'd still leave it all implementation here.

Copy link
Copy Markdown
Contributor Author

@pgomulka pgomulka Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if I remove the licence folder I get

Execution failed for task ':x-pack:plugin:deprecation:qa:common:dependencyLicenses'.
> Licences dir /Users/przemek/workspace/pgomulka/elasticsearch/x-pack/plugin/deprecation/qa/common/licenses does not exist, but there are dependencies: jackson-annotations-2.10.4.jar
  jackson-databind-2.10.4.jar

I wonder if I miss any plugin or import?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You shouldn't be adding the elasticsearch.build plugin to QA projects. That's specifically for production code. What is this "common" project for anyhow? I don't see any actual code belonging to it?

Copy link
Copy Markdown
Contributor Author

@pgomulka pgomulka Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this "common" project for anyhow?

it has only one class DeprecationTestUtils - I felt that copying that code would be a smell.
The problem boils down to the fact that EarlyDeprecationIndexingIT has to be first running against a fresh test cluster - and hence it was easiest to move it to its own project

I removed the elasticsearch.build but I had to add elasticsearch.java otherwise gradle would complain about implementation

Could not find method implementation() for arguments

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, elasticsearch.java is the correct choice here.

@@ -0,0 +1,41 @@
import org.elasticsearch.gradle.util.GradleUtils
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is basically a copy of x-pack/plugin/deprecation/qa/rest/build.gradle
can I move shared bits to common module?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to create that module, so that I could make sure that EarlyDeprecationIndexingIT is the test that runs first thing after test cluster starts

Copy link
Copy Markdown
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left 2 questions. Other than that this looks good to me.

private String globalIndex;
private String globalRouting;
private String globalPipeline;
private Supplier<Boolean> flushCondition = () -> true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


package org.elasticsearch.xpack.deprecation;

import com.fasterxml.jackson.databind.JsonNode;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is databind being used here? I believe in mast of the rest integration test we use XContentMapValues, could that be used here as well?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, XContentMapValues will make it much cleaner

@pgomulka
Copy link
Copy Markdown
Contributor Author

pgomulka commented Nov 8, 2021

@elasticmachine run elasticsearch-ci/bwc

@pgomulka
Copy link
Copy Markdown
Contributor Author

pgomulka commented Nov 8, 2021

@elasticmachine update branch

@pgomulka pgomulka requested a review from martijnvg November 9, 2021 10:27

@Override
public void clusterChanged(ClusterChangedEvent event) {
if (flushEnabled.get() == false) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of wrapping multiple if statements, maybe do this:

if (flushEnabled.get()) {
   return;
}

if (event.metadataChanged() == false) {
   return;
}

// rest of the code

I think this is more readable.

Copy link
Copy Markdown
Member

@martijnvg martijnvg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @pgomulka, LGTM 👍

Response response = getIndexSettings();
ObjectMapper mapper = new ObjectMapper();

final JsonNode jsonNode = mapper.readTree(response.getEntity().getContent());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the JsonNode usage here as well? And use XContentMapValues ?

@pgomulka pgomulka added auto-backport Automatically create backport pull requests when merged and removed auto-backport Automatically create backport pull requests when merged labels Nov 9, 2021
@pgomulka pgomulka merged commit b48c078 into elastic:master Nov 9, 2021
pgomulka added a commit to pgomulka/elasticsearch that referenced this pull request Nov 9, 2021
…80406)

Deprecation log indexing relies on index templates to be present in
cluster state in order to create a data stream.
If a very early deprecation is emitted after node is started up and
templates are not loaded yet this can result in regular index with
default settings and mappings to be created.
To prevent this a bulk processor used for indexing deprecation logs
delays flushing until templates are loaded.

closes elastic#80265
pgomulka added a commit to pgomulka/elasticsearch that referenced this pull request Nov 9, 2021
…80406)

Deprecation log indexing relies on index templates to be present in
cluster state in order to create a data stream.
If a very early deprecation is emitted after node is started up and
templates are not loaded yet this can result in regular index with
default settings and mappings to be created.
To prevent this a bulk processor used for indexing deprecation logs
delays flushing until templates are loaded.

closes elastic#80265
pgomulka added a commit that referenced this pull request Nov 10, 2021
…80406) (#80529)

Deprecation log indexing relies on index templates to be present in
cluster state in order to create a data stream.
If a very early deprecation is emitted after node is started up and
templates are not loaded yet this can result in regular index with
default settings and mappings to be created.
To prevent this a bulk processor used for indexing deprecation logs
delays flushing until templates are loaded.

closes #80265
backports #80406
pgomulka added a commit that referenced this pull request Nov 10, 2021
…ckport(#80406) (#80345)

Deprecation log indexing relies on index templates to be present in
cluster state in order to create a data stream.
If a very early deprecation is emitted after node is started up and
templates are not loaded yet this can result in regular index with
default settings and mappings to be created.
To prevent this a bulk processor used for indexing deprecation logs
delays flushing until templates are loaded.

closes #80265
backport #80406
@pgomulka pgomulka added the >bug label Nov 10, 2021
@pgomulka pgomulka deleted the main/bulk_processor_flush_delay branch July 25, 2022 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>bug :Core/Infra/Logging Log management and logging utilities Team:Core/Infra Meta label for core/infra team Team:Data Management (obsolete) DO NOT USE. This team no longer exists. v7.16.0 v8.0.0-rc2 v8.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deprecation log datastream is missing settings after upgrade

5 participants