Implement Fast Forward Grace Close Feature to Prevent Data Loss #5203

Merged
renecannao merged 9 commits into v3.0 from v3.0_FastForwardGracefulClose
Nov 20, 2025

@renecannao
Contributor

Summary

This pull request implements the fast forward grace close feature in ProxySQL. The feature prevents data loss by allowing pending client output buffers to drain before closing sessions when the backend closes unexpectedly during fast forward mode.

Problem

In fast forward mode, ProxySQL buffers packets to minimize latency. However, if the backend connection closes unexpectedly while client data is still being transmitted, the remaining data in the pipeline can be lost because the session terminates immediately. This can lead to incomplete data transmission and potential data loss in high-throughput scenarios.

Solution

Introduce a configurable grace period where, upon detecting an unexpected backend closure in fast forward mode, ProxySQL defers session closure to allow all pending client output buffers to drain. If the buffers empty within the grace timeout, the session closes cleanly; otherwise, it closes after the timeout to prevent indefinite hanging.

Implementation Details

New Configuration Variable

  • mysql_thread___fast_forward_grace_close_ms: Controls the grace period timeout (0-3600000 ms, default 5000 ms). A value of 0 disables the feature.
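
As a rough illustration of the documented range and the "0 disables" rule, the sketch below models the variable with hypothetical helper names (none of them come from the ProxySQL source). The millisecond-to-microsecond conversion is an assumption based on the reviewed snippet, which multiplies the variable by 1000 before comparing it with thread->curtime.

```cpp
#include <cstdint>

// Hypothetical constants mirroring the documented range (not ProxySQL identifiers).
constexpr long long kGraceMinMs = 0;          // 0 disables the feature
constexpr long long kGraceMaxMs = 3600000;    // documented upper bound (one hour)
constexpr long long kGraceDefaultMs = 5000;   // documented default

// True if the requested value lies inside the documented range.
inline bool is_valid_grace_ms(long long requested_ms) {
    return requested_ms >= kGraceMinMs && requested_ms <= kGraceMaxMs;
}

// A value of 0 means the grace close feature is off.
inline bool grace_enabled(long long grace_ms) {
    return grace_ms > 0;
}

// Convert to microseconds, assuming the session clock counts microseconds.
inline uint64_t grace_ms_to_us(long long grace_ms) {
    return static_cast<uint64_t>(grace_ms) * 1000ULL;
}
```

How ProxySQL itself reacts to an out-of-range value (reject or clamp) is not stated in this pull request, so the sketch only reports validity.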

New Flags and State

  • Session flags: backend_closed_in_fast_forward and fast_forward_grace_start_time to track grace state and timer.
  • Data stream flag: defer_close_due_to_fast_forward to prevent immediate closure.

Core Logic Changes

  • MySQL_Data_Stream::read_from_net(): Detects backend EOF/POLLHUP in fast forward mode. If client output buffers are pending, sets grace flags and defers closure.
  • MySQL_Session::handler() (FAST_FORWARD state): Checks for backend closure, starts grace timer, and waits for buffers to drain or timeout.
  • MySQL_Data_Stream::set_pollout(): Adjusts polling during grace to avoid busy-waiting on closed sockets.

Code Flow

  1. Backend closes unexpectedly → EOF detected.
  2. Check if in fast forward and buffers pending → Initiate grace.
  3. Start timer, defer close.
  4. Wait for buffers to empty or timeout → Close session.
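
The four steps above can be read as a small state machine. The following is a minimal sketch under stated assumptions: the type and function names are hypothetical, all times are in microseconds, and the real logic is spread across MySQL_Data_Stream and MySQL_Session rather than living in one function.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical outcome of one evaluation of the flow.
enum class GraceAction {
    KeepForwarding,  // backend still open
    CloseNow,        // backend closed, nothing pending or feature disabled
    WaitForDrain,    // grace period active, buffers not yet empty
    CloseDrained,    // buffers emptied within the grace period
    CloseTimedOut    // grace period elapsed with data still pending
};

// Hypothetical per-session grace state.
struct GraceState {
    bool backend_closed = false;
    uint64_t grace_start_us = 0;
};

inline GraceAction grace_step(GraceState& st, bool backend_eof,
                              std::size_t pending_client_bytes,
                              uint64_t now_us, uint64_t grace_us) {
    if (!st.backend_closed) {
        if (!backend_eof) return GraceAction::KeepForwarding;      // step 1 not triggered
        if (grace_us == 0 || pending_client_bytes == 0)
            return GraceAction::CloseNow;                          // step 2: nothing to protect
        st.backend_closed = true;                                  // step 3: start timer, defer close
        st.grace_start_us = now_us;
        return GraceAction::WaitForDrain;
    }
    // Step 4: close once drained, or once the timeout has passed.
    if (pending_client_bytes == 0) return GraceAction::CloseDrained;
    if (now_us - st.grace_start_us > grace_us) return GraceAction::CloseTimedOut;
    return GraceAction::WaitForDrain;
}
```

Calling grace_step once per event-loop pass with the current clock and the remaining byte count yields WaitForDrain until one of the two closing conditions holds.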

Testing

  • TAP Test: fast_forward_grace_close.cpp generates large binlog data on the backend, connects via ProxySQL, and reads it with throttling to trigger fast forward mode closure.
  • Validation: Ensures binlog reading completes without errors. Full validation of grace close requires simulating a very slow client.
  • Integration: Added test to test/tap/groups/groups.json for inclusion in the TAP suite.
  • Compilation: Updated test/tap/tests/Makefile to link with MySQL client library for binlog APIs.

Files Changed

  • include/MySQL_Data_Stream.h: Added defer_close_due_to_fast_forward flag.
  • include/MySQL_Session.h: Added session flags.
  • include/MySQL_Thread.h: Added variable declaration.
  • include/proxysql_structs.h: Added variable extern.
  • lib/MySQL_Session.cpp: Grace logic in FAST_FORWARD handler.
  • lib/MySQL_Thread.cpp: Variable initialization and refresh.
  • lib/mysql_data_stream.cpp: EOF detection and polling management.
  • test/tap/tests/fast_forward_grace_close.cpp: New TAP test with extensive documentation.
  • test/tap/groups/groups.json: Added test entry.
  • test/tap/tests/Makefile: Updated for test compilation.

Backwards Compatibility

  • Fully backwards compatible: Feature defaults to enabled with 5000ms timeout.
  • Only affects fast forward mode; no impact on other modes.
  • Disabling (set to 0) restores original behavior.

Related Commits

  • 44aa606: Implement fast forward grace close feature to prevent data loss
  • ae93966: Add TAP test for fast forward grace close feature
  • dd60d26: Add fast_forward_grace_close test to groups.json
  • 846c4de: Replace spaces with tabs in fast_forward_grace_close.cpp
  • 3329a67: Add extensive documentation for fast forward grace close feature

This implementation is ready for review and addresses the data loss issue in fast forward mode while maintaining performance and compatibility.

Problem: In fast forward mode, ProxySQL forwards packets directly from client
to backend without buffering them. If the backend connection closes
unexpectedly (e.g., due to server crash, network failure, or other issues),
ProxySQL immediately closes the client session. This can result in data loss
because data destined for the client may not have been fully transmitted yet,
as ProxySQL does not wait for the client output buffers to drain.

Solution: Implement a configurable grace period for session closure in fast
forward mode. When the backend closes unexpectedly, instead of closing the
session immediately, ProxySQL waits for a configurable timeout
(fast_forward_grace_close_ms, default 5000ms) to allow any pending client
output data to be sent. During this grace period:
- If the client output buffers become empty, the session closes gracefully.
- If the timeout expires, the session closes anyway to prevent indefinite
  hanging.

Changes:
- Added global variable mysql_thread___fast_forward_grace_close_ms (0-3600000ms)
- Added session flags: backend_closed_in_fast_forward, fast_forward_grace_start_time
- Added data stream flag: defer_close_due_to_fast_forward
- Modified MySQL_Data_Stream::read_from_net() to detect backend EOF and initiate
  grace close if client buffers are not empty
- Modified MySQL_Session::handler() FAST_FORWARD case to implement grace close
  logic with timeout and buffer checks
- Added extensive inline documentation explaining the feature and its mechanics

This prevents data loss in fast forward scenarios while maintaining bounded
session lifetime.

- Rename and modify test to use MySQL C API mysql_binlog_* functions
- Implement throttled binlog reading with 5 iterations (no limit, 2s, 5s, 20s, 60s targets)
- Add diagnostics for debugging binlog fetch issues
- Set RPL options for file, position, server_id, and non-blocking flag
- Update Makefile to compile with MySQL client library
@gemini-code-assist

Summary of Changes

Hello @renecannao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical data loss scenario in ProxySQL's fast forward mode. Previously, if a backend connection closed unexpectedly while client data was still being transmitted, the session would terminate immediately, leading to incomplete data transmission. The new "Fast Forward Grace Close" feature introduces a configurable grace period. During this period, ProxySQL defers the session closure, allowing any pending client output buffers to fully drain. The session will close either when all buffers are empty or after the configured grace timeout, ensuring data integrity in high-throughput environments.

Highlights

  • Data Loss Prevention: Introduces a "Fast Forward Grace Close" mechanism to prevent data loss when a backend connection unexpectedly closes during fast forward mode.
  • Configurable Grace Period: A new configuration variable, "mysql_thread___fast_forward_grace_close_ms", allows defining a grace period (0-3600000 ms, default 5000 ms) for sessions to drain pending client output buffers.
  • Core Logic Integration: Modifies "MySQL_Data_Stream::read_from_net()", "MySQL_Session::handler()", and "MySQL_Data_Stream::set_pollout()" to detect backend closure, initiate the grace timer, and manage polling to allow buffers to drain.
  • New Session State Flags: Adds "backend_closed_in_fast_forward" and "fast_forward_grace_start_time" to "MySQL_Session", and "defer_close_due_to_fast_forward" to "MySQL_Data_Stream" to manage the grace period state.
  • Comprehensive Testing: Includes a new TAP test ("fast_forward_grace_close.cpp") that simulates binlog data transmission with throttling to validate the grace close functionality.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable feature to prevent data loss in fast-forward mode by implementing a grace period for session closure. The implementation is mostly sound, with new configuration variables, session state, and logic to handle the grace period. The inclusion of a TAP test is also a great addition. My review focuses on improving maintainability by reducing code duplication and removing leftover debug code to ensure the implementation is robust and clean for production use.

if (mybe->server_myds->status == MYSQL_SERVER_STATUS_OFFLINE_HARD || mybe->server_myds->fd == -1) {
	if (!backend_closed_in_fast_forward) {
		backend_closed_in_fast_forward = true;
		cerr << __FILE__ << ":" << __LINE__ << " grace_start_time from " << fast_forward_grace_start_time << " to " << thread->curtime << endl;

high

This debug print to cerr should be removed. Such statements can negatively impact performance and clutter logs in production environments, and should not be present in release code.

 

Comment on lines +3805 to +3817
if (backend_closed_in_fast_forward) {
	if (
		( mybe->server_myds == nullptr || ( mybe->server_myds && mybe->server_myds->PSarrayIN->len == 0 ) )
		&&
		(client_myds->PSarrayOUT->len == 0 && (client_myds->queueOUT.head - client_myds->queueOUT.tail) == 0)
	) {
		// buffers empty, close
		handler_ret = -1;
	} else if (thread->curtime - fast_forward_grace_start_time > (unsigned long long)mysql_thread___fast_forward_grace_close_ms * 1000) {
		// timeout, close
		handler_ret = -1;
	}
}

medium

This grace period logic for checking if buffers are empty or if a timeout has occurred is very similar to the logic in MySQL_Thread::process_data_on_data_stream. Duplicating this complex logic in multiple places increases the risk of bugs and makes maintenance harder. Consider creating a single helper function within the MySQL_Session class to encapsulate this check and call it from both locations.
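
One possible shape for such a helper is sketched below. The struct and method names are hypothetical stand-ins, not the real MySQL_Session or MySQL_Data_Stream definitions; the fields only model the buffer lengths and timer referenced in the snippet above, and the clock is assumed to count microseconds.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for a data stream; only the fields the check reads.
struct StreamSketch {
    std::size_t ps_array_in_len = 0;   // models PSarrayIN->len
    std::size_t ps_array_out_len = 0;  // models PSarrayOUT->len
    std::size_t queue_out_bytes = 0;   // models queueOUT.head - queueOUT.tail
};

// Simplified stand-in for a session in fast forward mode.
struct SessionSketch {
    StreamSketch* backend = nullptr;   // may be null once the backend is gone
    StreamSketch client;
    bool backend_closed_in_fast_forward = false;
    uint64_t fast_forward_grace_start_time = 0;  // microseconds

    // True when neither side holds data that still has to reach the client.
    bool buffers_drained() const {
        const bool backend_empty = (backend == nullptr) || backend->ps_array_in_len == 0;
        const bool client_empty = client.ps_array_out_len == 0 && client.queue_out_bytes == 0;
        return backend_empty && client_empty;
    }

    // Hypothetical shared helper: should the session close now?
    bool grace_close_due(uint64_t now_us, int grace_ms) const {
        if (!backend_closed_in_fast_forward) return false;  // no grace period active
        if (buffers_drained()) return true;                 // drained: close cleanly
        return now_us - fast_forward_grace_start_time >
               static_cast<uint64_t>(grace_ms) * 1000ULL;   // timed out: close anyway
    }
};
```

With a helper of this kind, each call site would reduce to a single condition followed by its own way of signalling closure, such as setting handler_ret = -1 in the session handler.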

Comment on lines +589 to +599
if (myds_type == MYDS_BACKEND && sess && sess->session_fast_forward && ssl_recv_bytes==0) {
	if (PSarrayIN->len > 0 || sess->client_myds->PSarrayOUT->len > 0 || queue_data(sess->client_myds->queueOUT) > 0) {
		if (sess->backend_closed_in_fast_forward == false) {
			sess->backend_closed_in_fast_forward = true;
			//cerr << __FILE__ << ":" << __LINE__ << " grace_start_time from " << sess->fast_forward_grace_start_time << " to " << sess->thread->curtime << endl;
			sess->fast_forward_grace_start_time = sess->thread->curtime;
			sess->client_myds->defer_close_due_to_fast_forward = true;
		}
		//return 0;
	}
}

medium

The logic to initiate the fast forward grace period is duplicated in three places within this function (for SSL EOF, non-SSL EOF, and POLLHUP). This makes the code harder to maintain. Consider refactoring this logic into a private helper function to avoid repetition.

Additionally, the commented-out //return 0; on line 597 (and in the other duplicated blocks) is confusing. If it's a remnant of a previous implementation, it should be removed to improve clarity.
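
A minimal sketch of the suggested refactoring follows, assuming simplified stand-in types; maybe_start_fast_forward_grace and both struct names are hypothetical. The caller is assumed to have already established that the stream is a backend stream and that EOF or POLLHUP was observed.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the client data stream; only the fields the helper touches.
struct ClientStreamSketch {
    std::size_t ps_array_out_len = 0;   // models PSarrayOUT->len
    std::size_t queue_out_bytes = 0;    // models queue_data(queueOUT)
    bool defer_close_due_to_fast_forward = false;
};

// Stand-in for the session state used by the grace logic.
struct FFSessionSketch {
    bool session_fast_forward = false;
    bool backend_closed_in_fast_forward = false;
    uint64_t fast_forward_grace_start_time = 0;
    ClientStreamSketch client;
};

// Hypothetical helper shared by the SSL EOF, non-SSL EOF, and POLLHUP paths.
// Returns true only when this call started the grace period.
inline bool maybe_start_fast_forward_grace(FFSessionSketch& sess,
                                           std::size_t backend_in_len,
                                           uint64_t now_us) {
    if (!sess.session_fast_forward) return false;
    const bool pending = backend_in_len > 0 ||
                         sess.client.ps_array_out_len > 0 ||
                         sess.client.queue_out_bytes > 0;
    if (!pending) return false;                              // nothing to drain
    if (sess.backend_closed_in_fast_forward) return false;   // keep the original start time
    sess.backend_closed_in_fast_forward = true;
    sess.fast_forward_grace_start_time = now_us;
    sess.client.defer_close_due_to_fast_forward = true;
    return true;
}
```

Because the helper is idempotent, calling it from all three paths cannot restart the timer once the grace period is running.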

		//return 0;
	}
}
proxy_debug(PROXY_DEBUG_NET, 5, "Received EOF, shutting down soft socket -- Session=%p, Datastream=%p", sess, this);

medium

The format string for this proxy_debug call is missing a newline character (\n) at the end. This can cause log entries to be improperly formatted or interleaved with subsequent log messages. Please add a \n to the end of the format string. A similar issue exists on line 676.

proxy_debug(PROXY_DEBUG_NET, 5, "Received EOF, shutting down soft socket -- Session=%p, Datastream=%p\n", sess, this);

@renecannao
Contributor Author

Retest this please

@sonarqubecloud

Quality Gate failed

Failed conditions
2 Security Hotspots

See analysis details on SonarQube Cloud

@renecannao renecannao merged commit 29a6b85 into v3.0 Nov 20, 2025
93 of 99 checks passed