[improve][pip] PIP-433: Optimize the conflicts of the replication and automatic creation mechanisms, including the automatic creation of topics and schemas #24485
The PIP also solves the issue #24417
nodece left a comment
-1
This PIP attempts to work around user misconfigurations rather than enforce proper system boundaries. As a result, it introduces hidden complexity and encourages fragile, non-transparent behavior. The main issues are:
- Replicator relies on implicit topic auto-creation on the remote cluster, assuming the auto-creation policies (like number of partitions and topic type) match. But this does not solve the root problem. Remote cluster policies may change at any time, and relying on them does not eliminate the risk of topic-type mismatches.
- Bypassing schema auto-update restrictions for replication is a workaround, not a proper solution.
  If a remote cluster disables schema auto-updates, it reflects an explicit user decision to reject schema changes. Replicator should respect this setting. Injecting schemas automatically breaks the data contract and risks corrupting consumers.
  A safer and more transparent approach would be to allow users to manually reset the cursor to a compatible schema point if they wish to resume replication.
- Automatically modifying remote cluster policies is a dangerous design.
  It violates cluster isolation and can introduce unintended side effects. In large-scale, multi-tenant deployments, such implicit adjustments are risky and potentially disruptive.
- Instead of using automatic adjustments or bypass logic, replication should require explicit provisioning of topics and schemas on both clusters.
  This keeps behavior predictable, under user control, and consistent with system boundaries.
In short, this PIP prioritizes tolerating misconfigurations over enforcing correct and safe behavior. That may offer short-term convenience, but it makes the system harder to reason about and less reliable over time.
A better direction would be to improve observability of the replicator (e.g., metrics, failure visibility), and help users fix configuration issues explicitly and safely.
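To make the "explicit provisioning plus visibility" idea concrete, here is a minimal sketch of a pre-flight check that reports mismatches instead of silently fixing them. It is only an illustration: `TopicSpec`, `preflight_check`, and the `schema_hash` field are hypothetical names, not part of Pulsar's API, and the real replicator would need to fetch this metadata from each cluster.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TopicSpec:
    """Hypothetical view of a topic as provisioned on one cluster."""
    partitions: int             # 0 means a non-partitioned topic
    schema_hash: Optional[str]  # None means no schema is registered


def preflight_check(topic: str,
                    local: TopicSpec,
                    remote: Optional[TopicSpec]) -> List[str]:
    """Return human-readable problems; an empty list means nothing was found.

    The check never modifies either cluster. It only reports what the
    operator would have to fix before replication is enabled.
    """
    if remote is None:
        return [f"{topic}: not provisioned on the remote cluster"]

    problems: List[str] = []
    if local.partitions != remote.partitions:
        problems.append(
            f"{topic}: partition count differs "
            f"(local={local.partitions}, remote={remote.partitions})"
        )
    if local.schema_hash != remote.schema_hash:
        problems.append(f"{topic}: schema differs between clusters")
    return problems
```

A caller could surface the returned list through metrics or logs and refuse to start replication while it is non-empty, which keeps the decision with the operator.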
Co-authored-by: Penghui Li <penghui@apache.org>
Could you rebase to master so the checkstyle for tests will be applied?
It seems this comment was meant to be left under another PR?
… automatic creation mechanisms, including the automatic creation of topics and schemas (apache#24485) Co-authored-by: Penghui Li <penghui@apache.org>