feat: improve error handling by NguyenHoangSon96 · Pull Request #140 · InfluxCommunity/influxdb3-python

NguyenHoangSon96 · 2025-05-21T02:49:25Z

Closes #134

Proposed Changes

The query API now raises InfluxdbClientQueryError when it catches ArrowException exceptions from gRPC servers.
I wanted to create InfluxdbClientWriteError as well, but that would break backward compatibility for the write API for users who already rely on the InfluxDBError class.
Some try/except blocks seem quite useless to me when they only catch a generic exception such as Exception and then rethrow it.
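
The first point above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `ArrowException` is a local stand-in for pyarrow's exception class so the snippet runs without pyarrow, and `run_query`, its `do_get` parameter, and the error constructor are hypothetical.

```python
class ArrowException(Exception):
    """Stand-in for pyarrow's ArrowException (assumption: avoids a pyarrow dependency)."""


class InfluxdbClientQueryError(Exception):
    """Query-side error named in this PR; this constructor is an assumption."""

    def __init__(self, message: str) -> None:
        super().__init__(message)
        self.message = message


def run_query(do_get):
    """Call a hypothetical Flight read function and translate Arrow failures."""
    try:
        return do_get()
    except ArrowException as exc:
        # `raise ... from` keeps the original Arrow error reachable as __cause__,
        # so callers see a client-level error without losing the details.
        raise InfluxdbClientQueryError(f"Error while executing query: {exc}") from exc
```

Catching only the Arrow exception type, rather than a bare `Exception`, also avoids the catch-and-rethrow blocks criticised in the third point.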

Checklist

codecov · 2025-05-21T03:26:57Z

Codecov Report

Attention: Patch coverage is 96.15385% with 1 line in your changes missing coverage. Please review.

Project coverage is 63.43%. Comparing base (cd55bcd) to head (e9b7ce4).
Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
...write_client/client/util/multiprocessing_helper.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #140      +/-   ##
==========================================
- Coverage   64.70%   63.43%   -1.27%     
==========================================
  Files          33       34       +1     
  Lines        2207     2210       +3     
==========================================
- Hits         1428     1402      -26     
- Misses        779      808      +29


karel-rehor

Looking at #134 I think the new classes InfluxdDBClientError and InfluxdbClientQueryError are a good start, but oclyke is looking for a taxonomy of more specific error definitions. Also class names could include the version 3 value to make sure they are distinct from other influxdb libraries.

Compare this to the Arrow Flight taxonomy of error classes. (https://arrow.apache.org/docs/python/generated/pyarrow.flight.FlightError.html).

First, from oclyke's comment, I think the user does not want to see these Arrow Flight errors or have to deal with them.

Second, also from oclyke's comment, I think the user wants an error class that is self-documenting, so that it is readily understood.

Third, we could implement a taxonomy similar to Arrow's, which would wrap specific Arrow exceptions and their contents.

e.g.

class InfluxDB3ClientQueryServerError(InfluxDB3ClientError):
    ...
class InfluxDB3ClientQueryServerUnknownDBError(InfluxDB3ClientQueryServerError):
    ...
class InfluxDB3ClientQueryUnauthenticatedError(InfluxDB3ClientError):
    ...
class InfluxDB3ClientQueryUnavailableError(InfluxDB3ClientError):
    ...
etc.

We could then maybe have an ArrowErrorHandler class with a static method to map caught Arrow errors to classes in our error taxonomy
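
One possible shape for such a handler is sketched below. It is only an illustration of the idea: the `Flight*` classes are local stand-ins that mirror the names of pyarrow.flight error classes so the snippet runs without pyarrow, and the `map_error` method name and the lookup table are assumptions, not part of this PR.

```python
# Stand-ins mirroring pyarrow.flight error class names (assumption: no pyarrow needed).
class FlightError(Exception): ...
class FlightServerError(FlightError): ...
class FlightUnauthenticatedError(FlightError): ...
class FlightUnavailableError(FlightError): ...


# Client-side taxonomy, using the class names proposed in the comment above.
class InfluxDB3ClientError(Exception): ...
class InfluxDB3ClientQueryServerError(InfluxDB3ClientError): ...
class InfluxDB3ClientQueryUnauthenticatedError(InfluxDB3ClientError): ...
class InfluxDB3ClientQueryUnavailableError(InfluxDB3ClientError): ...


class ArrowErrorHandler:
    """Maps caught Arrow errors to classes in the client taxonomy (hypothetical)."""

    _MAPPING = {
        FlightServerError: InfluxDB3ClientQueryServerError,
        FlightUnauthenticatedError: InfluxDB3ClientQueryUnauthenticatedError,
        FlightUnavailableError: InfluxDB3ClientQueryUnavailableError,
    }

    @staticmethod
    def map_error(error: Exception) -> InfluxDB3ClientError:
        # Walk the MRO so subclasses of a mapped Arrow error are covered too.
        for klass in type(error).__mro__:
            target = ArrowErrorHandler._MAPPING.get(klass)
            if target is not None:
                break
        else:
            target = InfluxDB3ClientError  # fallback for unmapped errors
        wrapped = target(str(error))
        wrapped.__cause__ = error  # keep the Arrow error for debugging
        return wrapped
```

A caller would then write `raise ArrowErrorHandler.map_error(exc)` inside an `except` block, so users only ever need to catch `InfluxDB3ClientError` or one of its subclasses.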

karel-rehor

An item for a more finely defined error taxonomy across all client libraries has been opened in the backlog, so this PR represents only a partial solution to #134.
One older unit test was failing locally, so I've updated it. You may want to review or revert these changes.
Some refactoring comments:
1. An InfluxDBError class already exists in the write_client package, so I think this needs to be taken into account.
2. Since InfluxdbClientQueryError only deals with query errors, it could be declared in the query package.
3. Another alternative might be to declare all exceptions/errors in a new exceptions package.

BTW, when running unit tests locally, all tests now pass, but after they complete I keep running into a threading issue associated with urllib3. I don't recall seeing this in the past. It needs further investigation.

...
RuntimeError: can't create new thread at interpreter shutdown
The retriable error occurred during request. Reason: '<urllib3.connection.HTTPSConnection object at 0x71b678a54b90>: Failed to resolve 'none' ([Errno -3] Temporary failure in name resolution)'.
The retriable error occurred during request. Reason: '<urllib3.connection.HTTPSConnection object at 0x71b678a57e00>: Failed to resolve 'none' ([Errno -3] Temporary failure in name resolution)'.
...

influxdb_client_3/influxdb_client_error.py

NguyenHoangSon96 · 2025-05-27T03:21:22Z

Hi @karel-rehor,
I have moved all exceptions to the exceptions package.
Regarding the InfluxDBError class, I don't want to touch it because of backward compatibility.
I have also changed all exception names to include the number 3.

karel-rehor

The exceptions package could be improved with an import statement in its __init__.py.

Whenever possible, it's best to avoid imports that repeat the same token twice, like...

from influxdb_client_3.exceptions.exceptions import InfluxDB3ClientQueryError
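
The effect of such a re-export can be demonstrated with a throwaway package. The sketch below is self-contained and makes several assumptions: the package name `demo_client_3` and both helper functions are hypothetical, and the class bodies are placeholders. Only the single relative import written into `exceptions/__init__.py` reflects the suggestion above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "demo_client_3"  # hypothetical package name, used only for this demo


def build_demo_package(root: Path) -> None:
    """Write a tiny package whose exceptions subpackage re-exports its classes."""
    exc_dir = root / PKG / "exceptions"
    exc_dir.mkdir(parents=True)
    (root / PKG / "__init__.py").write_text("")
    (exc_dir / "exceptions.py").write_text(
        "class InfluxDB3ClientError(Exception):\n"
        "    pass\n"
        "\n"
        "class InfluxDB3ClientQueryError(InfluxDB3ClientError):\n"
        "    pass\n"
    )
    # The re-export: this one line lets callers drop the doubled token.
    (exc_dir / "__init__.py").write_text(
        "from .exceptions import InfluxDB3ClientError, InfluxDB3ClientQueryError\n"
    )


def load_via_short_path() -> type:
    """Import the class through `<package>.exceptions` instead of `.exceptions.exceptions`."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(f"{PKG}.exceptions")
            return module.InfluxDB3ClientQueryError
        finally:
            sys.path.remove(tmp)
```

With the re-export in place, user code can use the shorter form `from <package>.exceptions import InfluxDB3ClientQueryError`, while the longer path keeps working for anyone who already depends on it.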

influxdb_client_3/exceptions/__init__.py

karel-rehor

I located the source of the problem with urllib3 connection attempts cycling in _write_batching, so I updated this test with a forced disposal.

RuntimeError: can't create new thread at interpreter shutdown
The retriable error occurred during request. Reason: '<urllib3.connection.HTTPSConnection object at 0x79d5a779a780>: Failed to resolve 'none' ([Errno -3] Temporary failure in name resolution)'.
The retriable error occurred during request. Reason: '<urllib3.connection.HTTPSConnection object at 0x79d5a7799520>: Failed to resolve 'none' ([Errno -3] Temporary failure in name resolution)'.
The retriable error occurred during request. Reason: '<urllib3.connection.HTTPSConnection object at 0x79d5a779a630>: Failed to resolve 'none' ([Errno -3] Temporary failure in name resolution)'.
The retriable error occurred during request. Reason: '<urllib3.connection.HTTPSConnection object at 0x79d5a77986b0>: Failed to resolve 'none' ([Errno -3] Temporary failure in name resolution)'.
The batch item wasn't processed successfully because: HTTPSConnectionPool(host='none', port=443): Max retries exceeded with url: /api/v2/write?org=org&bucket=bucket&precision=ns (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x79d5a77aa7e0>: Failed to resolve 'none' ([Errno -3] Temporary failure in name resolution)"))

Test pipeline now takes 2 minutes instead of 5 minutes.

I see that code coverage is now missing some lines in write/retry.py. This is likely a side effect of this change: retry was being called after the test_write_api_custom_options_no_error() test had completed. Execution of these code blocks in retry.py was inadvertent, because the test is supposed to cover only options. A dedicated retry test for the dropped blocks could be added, but that goes beyond the scope of this PR.
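
The general pattern behind the forced disposal can be sketched as below. This does not reproduce the actual test or the real client: `StandInBatchingWriter` is a hypothetical stand-in for a client that flushes batches on a background thread, and `run_options_test` is an invented name. The point is only that the test closes the writer in a `finally` block so no worker outlives the test.

```python
import queue
import threading


class StandInBatchingWriter:
    """Hypothetical stand-in for a client that flushes batches on a worker thread."""

    def __init__(self) -> None:
        self._queue = queue.Queue()
        self.flushed = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def write(self, record: str) -> None:
        self._queue.put(record)

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:  # sentinel pushed by close()
                break
            self.flushed.append(item)

    def close(self) -> None:
        """Process what is already queued, then stop the worker thread."""
        self._queue.put(None)
        self._worker.join(timeout=5)

    @property
    def worker_alive(self) -> bool:
        return self._worker.is_alive()


def run_options_test() -> StandInBatchingWriter:
    writer = StandInBatchingWriter()
    try:
        writer.write("cpu,host=a usage=1.0")
    finally:
        # Forced disposal: without this, background work could still be
        # running (or retrying) when the interpreter starts shutting down.
        writer.close()
    return writer
```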

NguyenHoangSon96 · 2025-05-27T13:59:52Z

@karel-rehor Thank you 👨‍🔬

NguyenHoangSon96 added 3 commits May 20, 2025 16:02

feat: improve error handling

e3f8f9d

feat: improve error handling

b0794da

feat: improve error handling

b0ab8d8

NguyenHoangSon96 self-assigned this May 21, 2025

NguyenHoangSon96 linked an issue May 21, 2025 that may be closed by this pull request

Client exception handling improvements #134

Closed

NguyenHoangSon96 added 2 commits May 21, 2025 10:02

feat: improve error handling

3acc8af

feat: improve error handling

71c56eb

NguyenHoangSon96 requested review from jansimonb and karel-rehor May 21, 2025 03:34

karel-rehor reviewed May 21, 2025

karel-rehor added 2 commits May 26, 2025 16:14

docs: minor edits in CHANGELOG.md

c107de0

test: fix locally failing test with envars.

88d07b6

karel-rehor requested changes May 26, 2025

influxdb_client_3/influxdb_client_error.py (outdated, resolved)

influxdb_client_3/influxdb_client_error.py (outdated, resolved)

feat: improve error handling

cf4f6ba

NguyenHoangSon96 requested a review from karel-rehor May 27, 2025 03:21

karel-rehor requested changes May 27, 2025

influxdb_client_3/exceptions/__init__.py (resolved)

feat: improve error handling

a5f0a77

NguyenHoangSon96 requested a review from karel-rehor May 27, 2025 09:12

test: fix retry thread cycling in _write_batching unit test

e9b7ce4

karel-rehor approved these changes May 27, 2025

NguyenHoangSon96 merged commit 39bbea9 into main May 27, 2025
13 of 14 checks passed

NguyenHoangSon96 deleted the feat/improve-error-handling branch May 27, 2025 14:00

NguyenHoangSon96 merged 10 commits into main from feat/improve-error-handling

2 participants
