
[BUG] Fix io_context lifecycle management issues (#400) #410

Merged: kcenon merged 4 commits into main from feature/issue-400-io-context-lifecycle on Jan 9, 2026
kcenon commented Jan 8, 2026

Summary

Changes

The intentional leak pattern (a no-op deleter) already applied to messaging_client's io_context prevents heap corruption during static destruction. With that in place, this PR removes the CI-only test skips that were previously needed to avoid the crash.

Test Changes

Removed CI skips from:

  • ErrorHandlingTest.ConnectToInvalidHost
  • ErrorHandlingTest.ConnectToInvalidPort
  • ErrorHandlingTest.ConnectionRefused
  • ErrorHandlingTest.RecoveryAfterConnectionFailure
  • ConnectionLifecycleTest.ClientConnectionToNonExistentServer
  • MultiConnectionLifecycleTest.SequentialConnections
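
For context, a CI-only skip of the kind removed from the tests above could be guarded as in the sketch below. This is an illustration under assumptions, not the project's code: `is_ci_environment` is a hypothetical helper name, and the only facts taken from this PR are that the tests used `GTEST_SKIP()` and that CI runs set `CI=true`.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not from the project): returns true when the CI
// environment variable is set to a non-empty value, e.g. CI=true.
inline bool is_ci_environment() {
    const char* value = std::getenv("CI");
    return value != nullptr && !std::string(value).empty();
}

// A guard like the one this PR removes would sit at the top of a test body.
// GTEST_SKIP() is GoogleTest's skip macro; the message text is illustrative.
//
//   if (is_ci_environment()) {
//       GTEST_SKIP() << "io_context lifecycle issue in CI";
//   }
```

With the guard gone, the six tests run on every machine, so a regression in the lifecycle fix shows up in CI instead of being masked.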

Documentation

  • Updated CHANGELOG.md with an entry under the Fixed section
  • Updated CHANGELOG_KO.md with Korean translation

Test Results

All tests pass with CI=true environment variable:

[  PASSED  ] 16 tests. (error_handling_test)
[  PASSED  ] 17 tests. (connection_lifecycle_test)

Root Cause Analysis

The io_context lifecycle issues were caused by:

  1. Static destruction order during process exit
  2. GlobalLoggerRegistry (common_system) being destroyed before thread pool tasks complete
  3. Callbacks attempting to log after logging system was destroyed

The intentional leak pattern applied in messaging_client ensures:

  • io_context survives until process termination
  • No heap corruption when pending async handlers reference io_context internals
  • Safe callback execution during shutdown
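
The intentional leak pattern can be sketched as follows. This is a minimal illustration, not the code in messaging_client: `FakeIoContext` is a hypothetical stand-in for the real io_context type, and both factory functions are invented names. The stand-in counts destructor calls so the effect of the no-op deleter is observable.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an io_context; counts destructor calls.
struct FakeIoContext {
    static int& destroyed() {
        static int count = 0;
        return count;
    }
    ~FakeIoContext() { ++destroyed(); }
};

// Ordinary ownership: the object is destroyed when the last shared_ptr
// goes away, which may happen during static destruction.
inline std::shared_ptr<FakeIoContext> make_owned_context() {
    return std::make_shared<FakeIoContext>();
}

// Intentional leak: the custom deleter does nothing, so the object is
// never destroyed and stays valid for any handler that still refers to
// it during shutdown. The OS reclaims the memory at process exit.
inline std::shared_ptr<FakeIoContext> make_leaked_context() {
    return std::shared_ptr<FakeIoContext>(
        new FakeIoContext, [](FakeIoContext*) { /* no-op deleter */ });
}
```

The trade-off is deliberate: a fixed-size allocation is never freed, in exchange for removing any dependence on destruction order at exit. Leak detectors may report the allocation and need a suppression.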

Test plan

  • Build passes locally
  • All integration tests pass without CI skip
  • Tests pass with CI=true environment variable
  • No new warnings introduced

Closes #400

kcenon added 2 commits January 8, 2026 16:44
The intentional leak pattern applied in messaging_client now prevents
heap corruption during static destruction. This allows the previously
skipped tests to run safely in CI environments.

Changes:
- Remove TODO comments referencing Issues #315 and #348
- Remove GTEST_SKIP() for io_context lifecycle issues in CI
- Update comments to reference Issue #400 and the applied fix
- Tests now pass with CI=true environment variable

Affected tests:
- ErrorHandlingTest.ConnectToInvalidHost
- ErrorHandlingTest.ConnectToInvalidPort
- ErrorHandlingTest.ConnectionRefused
- ErrorHandlingTest.RecoveryAfterConnectionFailure
- ConnectionLifecycleTest.ClientConnectionToNonExistentServer
- MultiConnectionLifecycleTest.SequentialConnections

Document the re-enabled integration tests in both English and Korean
CHANGELOG files under the Fixed section.

github-actions Bot commented Jan 8, 2026

Performance Comparison

Base Branch Results

No base results

PR Branch Results

No PR results

kcenon added 2 commits January 9, 2026 23:57
Add wait_for_connection_attempt helper to test_helpers.h to ensure
async connection operations (resolve/connect) complete before test
cleanup. This prevents heap corruption that occurs when:

1. Test initiates async connection to non-existent server
2. Test exits immediately without waiting
3. TearDown calls stop_client() while async operations are in progress
4. Pending handlers access invalidated memory during cleanup

Modified tests:
- ErrorHandlingTest.ConnectToInvalidHost
- ErrorHandlingTest.ConnectToInvalidPort
- ErrorHandlingTest.ConnectionRefused
- ErrorHandlingTest.RecoveryAfterConnectionFailure
- ConnectionLifecycleTest.ClientConnectionToNonExistentServer
- MultiConnectionLifecycleTest.SequentialConnections

The fix uses error_callback to detect connection failures and waits
for either successful connection or error callback before allowing
test cleanup to proceed.

Fixes heap corruption in CI (Issue #400)

Document the test helper function and test modifications that prevent
heap corruption from incomplete async connection operations.
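
A wait helper of the kind described in the commit above could look roughly like the sketch below. The actual signature in test_helpers.h is not shown in this PR, so the parameters, the polling interval, and the timeout default are all assumptions. The sketch polls two flags, which a test would set from its connection and error callbacks, until one is set or a deadline passes.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical sketch; the real helper in test_helpers.h may differ.
// Returns true once the connection attempt has finished, either by
// succeeding or by reporting an error, and false if the timeout expires.
inline bool wait_for_connection_attempt(
    const std::atomic<bool>& connected,
    const std::atomic<bool>& error_reported,
    std::chrono::milliseconds timeout = std::chrono::seconds(5)) {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (std::chrono::steady_clock::now() < deadline) {
        if (connected.load() || error_reported.load()) {
            return true;
        }
        // Short sleep so the loop does not spin at full CPU.
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    // Final check in case a flag was set just before the deadline.
    return connected.load() || error_reported.load();
}
```

In a test, the error callback would set `error_reported`, and the test would call the helper before returning, so that TearDown only calls stop_client() once no resolve or connect handler is still pending.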

github-actions Bot commented Jan 9, 2026

Performance Comparison

Base Branch Results

No base results

PR Branch Results

No PR results

kcenon merged commit a4d8129 into main on Jan 9, 2026
41 checks passed
kcenon deleted the feature/issue-400-io-context-lifecycle branch on January 9, 2026 15:22
kcenon added a commit that referenced this pull request Apr 13, 2026
* [TEST] Remove io_context lifecycle CI skips (Issue #400)

* docs: update CHANGELOG for io_context lifecycle fix (#400)

* fix(test): wait for async connection attempts to complete before cleanup

* docs: update CHANGELOG with wait_for_connection_attempt helper
