Log exceptions in TcpTransport at DEBUG level by DaveCTurner · Pull Request #51612 · elastic/elasticsearch

DaveCTurner · 2020-01-29T13:06:23Z

When running Elasticsearch on a flaky network, we may see nodes leaving the
cluster with reason `disconnected`. It may be useful to the cluster
administrator to see the full exception that caused the disconnection, but this
is only available with `TRACE`-level logging, which commingles the details of
the problem with other messages that are not useful to end users.

This commit promotes logging of exceptions in `TcpTransport` from `TRACE` to
`DEBUG` to separate them from the truly `TRACE`-level messages.

elasticmachine · 2020-01-29T13:06:26Z

Pinging @elastic/es-distributed (:Distributed/Network)

original-brownbear

Makes sense IMO => LGTM

original-brownbear · 2020-01-29T13:09:27Z

server/src/test/java/org/elasticsearch/transport/TcpTransportTests.java

+            Loggers.removeAppender(LogManager.getLogger(TcpTransport.class), appender);
+            appender.stop();
+            IOUtils.close(tcpTransport);
+            testThreadPool.shutdown();


NIT: Maybe use `org.elasticsearch.threadpool.ThreadPool#terminate(java.util.concurrent.ExecutorService, long, java.util.concurrent.TimeUnit)` here so we don't get the leaked thread warning when running this one?

TIL. Yes, done in f0e813b.
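
For readers unfamiliar with the suggestion, here is a minimal sketch of the shutdown-then-wait pattern that a `terminate`-style helper provides. It uses only `java.util.concurrent`; the class `TerminateSketch` and its `terminate` method are hypothetical stand-ins, not the Elasticsearch implementation, and the assumption is that waiting for the pool's threads to exit is what avoids the leaked thread warning.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not the Elasticsearch ThreadPool class.
class TerminateSketch {

    // Stops accepting new tasks, then waits up to the timeout for running tasks to finish.
    // Returns true if the executor terminated within the timeout, false otherwise.
    static boolean terminate(ExecutorService service, long timeout, TimeUnit unit) {
        if (service == null) {
            return true; // nothing to shut down
        }
        service.shutdown();
        try {
            if (service.awaitTermination(timeout, unit)) {
                return true;
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        }
        // Tasks did not finish in time (or we were interrupted): cancel what remains.
        service.shutdownNow();
        return false;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(() -> {});
        boolean clean = terminate(pool, 10, TimeUnit.SECONDS);
        System.out.println("terminated cleanly: " + clean);
    }
}
```

Calling a plain `shutdown()` returns immediately, so worker threads may still be alive when the test ends; waiting for termination closes that gap.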

Tim-Brooks

If we are going to implement these tests can we extract the exception handling to a separate class and test in a manner that does not require overriding the transport? See SecurityHttpExceptionHandler as a place where we extracted exception handling from the http transport.

DaveCTurner · 2020-01-29T17:20:23Z

Not sure of the value of a whole nother class for this, how about a static method? See a62032c.
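
As a rough illustration of the static-method idea, the sketch below pulls the level decision and the logging call into a static method that a test can call directly, without subclassing or mocking a transport. Everything here is an assumption for illustration: the class `ExceptionHandlingSketch`, the `handleException` signature, the use of `java.util.logging` (with `FINE` standing in for `DEBUG`), and the rule that an `IOException` counts as a network-level problem. It is not the code from a62032c.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; names and classification rule are assumptions, not the PR's code.
class ExceptionHandlingSketch {

    // Chooses a log level for the exception, logs it with the full stack trace,
    // and returns the chosen level so a test can assert on it directly.
    static Level handleException(Logger logger, String channelDescription, Exception e) {
        // Assumption: I/O failures are treated as expected network trouble and
        // logged at FINE (the DEBUG stand-in); anything else is logged at WARNING.
        Level level = (e instanceof IOException) ? Level.FINE : Level.WARNING;
        logger.log(level,
            "exception caught on transport layer [" + channelDescription + "], closing connection",
            e);
        return level;
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("transport-sketch");
        Level chosen = handleException(logger, "channel-1", new IOException("Connection reset"));
        System.out.println("logged at: " + chosen);
    }
}
```

Because the method is static and takes its collaborators as arguments, a unit test only needs a logger and an exception instance.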

Tim-Brooks

LGTM

DaveCTurner · 2020-01-30T17:23:32Z

@elasticmachine please run elasticsearch-ci/default-distro

In elastic#51612 we promoted certain log messages regarding unexpected network exceptions from `TRACE` to `DEBUG`. In fact it's often useful to see these exceptions by default, so with this commit we show the message (but not the stack trace) at `INFO` level. This commit also adds some commentary about what each of the exceptions means. Closes elastic#66473

DaveCTurner added the >non-issue, :Distributed/Network (Http and internode communication implementations), v8.0.0, and v7.7.0 labels on Jan 29, 2020

DaveCTurner requested review from Tim-Brooks and original-brownbear January 29, 2020 13:06

original-brownbear approved these changes Jan 29, 2020

ThreadPool.terminate

f0e813b

Tim-Brooks requested changes Jan 29, 2020

Use a static method to reduce the mockery

a62032c

DaveCTurner requested a review from Tim-Brooks January 29, 2020 17:20

Tim-Brooks approved these changes Jan 30, 2020

DaveCTurner merged commit 336a395 into elastic:master Jan 31, 2020

DaveCTurner deleted the 2020-01-29-less-extreme-logging-in-TcpTransport branch January 31, 2020 01:12

DaveCTurner mentioned this pull request Dec 16, 2020

Increase log levels of definitely-broken network disconnects #66473

Closed

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021

DaveCTurner mentioned this pull request Dec 15, 2021

Report close connection exceptions at INFO #81768

Merged

DaveCTurner merged 3 commits into elastic:master from DaveCTurner:2020-01-29-less-extreme-logging-in-TcpTransport


5 participants
