Contributor

@efcasado efcasado commented Sep 6, 2025

Fixes #24333

Motivation

The version of Debezium supported by the project is very old. Debezium 1.9.7 was released on October 25th, 2022, and it lags two major versions behind the latest stable version, i.e. Debezium 3.2.2, which was released on September 4th, 2025.

The main motivation behind this change is to bridge this gap and offer the Pulsar community a more recent version of Debezium that can run natively on Pulsar without resorting to external dependencies such as debezium-server.

Modifications

  • Debezium has been updated to version 3.2.2 and all its dependencies, such as database drivers, have been updated to match the versions mentioned here.
  • In 2.x, Debezium renamed database.history to schema.history (see here)
  • Debezium 2.x renamed many connector properties (see here). Tests have been updated to reflect this while preserving the previous semantics as much as possible.
  • MongoDB 3.x and Oplog are no longer supported
  • The mongo command line is no longer available in newer MongoDB versions and has been replaced by mongosh. Tests have been updated accordingly.
  • SQL Server connector’s configuration option database.dbname has been replaced with a new option called database.names (see here). Tests have been updated accordingly.
  • By default, JDBC connections to Microsoft SQL Server are protected by SSL encryption. If SSL is not enabled for a SQL Server database, or if you want to connect to the database without using SSL, you can disable SSL by setting the value of the database.encrypt property in connector configuration to false. (see here).
  • Removed wildfly override and OWASP vulnerability suppression that are no longer needed since we are running the latest stable version of Debezium, which is also more secure.
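
For anyone carrying an existing 1.9.x source configuration forward, the renames listed above can be sketched as a small key-rewriting helper. This is only an illustration, not code from this PR: the class `DebeziumConfigMigration` and its `migrate` method are hypothetical names, the `database.history` to `schema.history.internal` prefix and the `database.server.name` to `topic.prefix` rename are taken from the upstream Debezium 2.x documentation rather than from this change, and values (such as history class names) are left untouched and may still need manual updates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Pulsar or Debezium) that rewrites
 * Debezium 1.x property keys to their 2.x/3.x names. Values are copied as-is.
 */
public final class DebeziumConfigMigration {

    // Assumption: upstream Debezium 2.x documents schema history settings
    // under the "schema.history.internal" prefix.
    private static final String OLD_HISTORY_PREFIX = "database.history";
    private static final String NEW_HISTORY_PREFIX = "schema.history.internal";

    private DebeziumConfigMigration() {
    }

    public static Map<String, Object> migrate(Map<String, Object> oldConfig) {
        // LinkedHashMap keeps the original ordering of the properties.
        Map<String, Object> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : oldConfig.entrySet()) {
            String key = entry.getKey();
            if (key.equals(OLD_HISTORY_PREFIX) || key.startsWith(OLD_HISTORY_PREFIX + ".")) {
                // database.history[.x.y] -> schema.history.internal[.x.y]
                key = NEW_HISTORY_PREFIX + key.substring(OLD_HISTORY_PREFIX.length());
            } else if (key.equals("database.dbname")) {
                // SQL Server connector: single database name -> list of names
                key = "database.names";
            } else if (key.equals("database.server.name")) {
                // Logical server name became the topic prefix in Debezium 2.x
                key = "topic.prefix";
            }
            migrated.put(key, entry.getValue());
        }
        return migrated;
    }
}
```

Keys that are not covered by these rules pass through unchanged. For a SQL Server database without SSL, the migrated configuration would additionally need `database.encrypt` set to `false`, as described in the list above.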

Verifying this change

  • Make sure that the change passes the CI checks.

This change is already covered by existing tests, such as:

  • Update the integration tests for all connectors to reflect the changes required by the Debezium 3.2.2 upgrade

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: bluelabs-eu#15


github-actions bot commented Sep 6, 2025

@efcasado Please add the following content to your PR description and select a checkbox:

- [ ] `doc` <!-- Your PR contains doc changes -->
- [ ] `doc-required` <!-- Your PR changes impact docs and you will update later -->
- [ ] `doc-not-needed` <!-- Your PR changes do not impact docs -->
- [ ] `doc-complete` <!-- Docs have been already added -->

@github-actions github-actions bot added doc-not-needed Your PR changes do not impact docs and removed doc-label-missing labels Sep 6, 2025
@efcasado efcasado changed the title [improve][io] upgrade to debezium 3.2 [improve][io] upgrade to debezium 3.2.2 Sep 6, 2025
@efcasado efcasado marked this pull request as ready for review September 7, 2025 07:50
@github-actions github-actions bot added doc-required Your PR changes impact docs and you will update later. and removed doc-not-needed Your PR changes do not impact docs labels Sep 7, 2025
Contributor Author

efcasado commented Sep 7, 2025

@AlvaroStream, @lhotari - I was able to spend some time on this and it seems all tests are now passing! When you get a chance, I would appreciate your review to make sure I didn’t miss anything.

The upgrade from 1.9.7 to 3.2.2 turned out to be fairly straightforward in terms of changes needed in pulsar-io and its components (e.g. debezium/core, kafka-connect-adaptor). The main challenge was verifying that all connectors still work correctly after the upgrade, since some tests required significant adjustments due to deprecations and related changes. Since we were lagging behind by a large number of versions, it wasn't easy to see what settings had changed. Things should be easier in the future 🤞

If we move forward with merging this, we will likely need to update some documentation as well (for example, this page). I believe the updated configuration in the integration tests contains most of the information needed to update the docs / examples.

Contributor Author

efcasado commented Sep 7, 2025

Also, I believe this pull request renders this other pull request redundant?

Member

lhotari commented Sep 7, 2025

Great work @efcasado ! I really appreciate the effort you have put into solving this. Thank you!


codecov-commenter commented Sep 7, 2025

Codecov Report

❌ Patch coverage is 27.58621% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.21%. Comparing base (77b34cc) to head (68efe3d).
⚠️ Report is 16 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...apache/pulsar/io/debezium/PulsarSchemaHistory.java | 53.33% | 7 Missing ⚠️ |
| .../org/apache/pulsar/io/debezium/DebeziumSource.java | 0.00% | 4 Missing ⚠️ |
| ...sar/io/debezium/mongodb/DebeziumMongoDbSource.java | 0.00% | 2 Missing ⚠️ |
| .../pulsar/io/debezium/mssql/DebeziumMsSqlSource.java | 0.00% | 2 Missing ⚠️ |
| .../pulsar/io/debezium/mysql/DebeziumMysqlSource.java | 0.00% | 2 Missing ⚠️ |
| ...ulsar/io/debezium/oracle/DebeziumOracleSource.java | 0.00% | 2 Missing ⚠️ |
| ...r/io/debezium/postgres/DebeziumPostgresSource.java | 0.00% | 2 Missing ⚠️ |

@@             Coverage Diff              @@
##             master   #24712      +/-   ##
============================================
+ Coverage     74.12%   74.21%   +0.09%     
- Complexity    33068    33108      +40     
============================================
  Files          1895     1895              
  Lines        147979   147989      +10     
  Branches      17137    17137              
============================================
+ Hits         109693   109837     +144     
+ Misses        29524    29386     -138     
- Partials       8762     8766       +4     
| Flag | Coverage Δ |
|---|---|
| inttests | 26.43% <ø> (-0.16%) ⬇️ |
| systests | 22.68% <0.00%> (+<0.01%) ⬆️ |
| unittests | 73.73% <42.10%> (+0.09%) ⬆️ |

| Files with missing lines | Coverage Δ |
|---|---|
| ...java/org/apache/pulsar/io/debezium/SerDeUtils.java | 42.85% <ø> (ø) |
| ...sar/io/debezium/mongodb/DebeziumMongoDbSource.java | 0.00% <0.00%> (ø) |
| .../pulsar/io/debezium/mssql/DebeziumMsSqlSource.java | 0.00% <0.00%> (ø) |
| .../pulsar/io/debezium/mysql/DebeziumMysqlSource.java | 0.00% <0.00%> (ø) |
| ...ulsar/io/debezium/oracle/DebeziumOracleSource.java | 0.00% <0.00%> (ø) |
| ...r/io/debezium/postgres/DebeziumPostgresSource.java | 0.00% <0.00%> (ø) |
| .../org/apache/pulsar/io/debezium/DebeziumSource.java | 0.00% <0.00%> (ø) |
| ...apache/pulsar/io/debezium/PulsarSchemaHistory.java | 74.83% <53.33%> (ø) |

... and 78 files with indirect coverage changes

Member

@lhotari lhotari left a comment

A few minor comments about dependency versions, other than that LGTM. Outstanding work!

efcasado and others added 4 commits September 7, 2025 18:57
Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
Member

@lhotari lhotari left a comment

LGTM

@lhotari lhotari changed the title [improve][io] upgrade to debezium 3.2.2 [improve][io] Upgrade to Debezium 3.2.2 Sep 8, 2025
@lhotari lhotari merged commit c7dff63 into apache:master Sep 8, 2025
144 of 151 checks passed
Member

lhotari commented Sep 8, 2025

@efcasado I've suggested cherry-picking this to branch-4.1 so that we can include it in the 4.1.1 release. I made the proposal on the dev mailing list: https://lists.apache.org/thread/jrb8n25k266x3o2y4bjfhom6btc21tr9 .
Do you have plans to cover the upgrade path in some docs that we could add to the 4.1.1 release notes or to the Debezium plugin docs?
Current docs:
https://github.com/apache/pulsar-site/blob/main/docs/io-cdc-debezium.md -> https://pulsar.apache.org/docs/next/io-cdc-debezium/
https://github.com/apache/pulsar-site/blob/main/docs/io-debezium-source.md -> https://pulsar.apache.org/docs/next/io-debezium-source/

Contributor Author

efcasado commented Sep 9, 2025

@lhotari, thank you for the review and for helping us push this over the finish line! 🙏

I can certainly help with updating the docs / examples. I will try to put together at least a first draft in the coming days.

Technoboy- pushed a commit that referenced this pull request Sep 10, 2025
Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
KannarFr pushed a commit to CleverCloud/pulsar that referenced this pull request Sep 22, 2025
Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
walkinggo pushed a commit to walkinggo/pulsar that referenced this pull request Oct 8, 2025
Co-authored-by: Lari Hotari <lhotari@users.noreply.github.com>
Labels

  • cherry-picked/branch-4.1
  • doc-required (Your PR changes impact docs and you will update later.)
  • ready-to-test
  • release/4.1.1

Development

Successfully merging this pull request may close these issues.

Upgrade to later version of Debezium Postgres

4 participants