KAFKA-13748: Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default#11908
Conversation
da7568e to
380c91f
Compare
|
Here's a run of all the Connect system tests with the proposed changes in the PR. @rhauch the 9 tests that have failed have done so after loading the classes successfully. The failures are not relevant to the changes here and the issues with the |
mimaison
left a comment
There was a problem hiding this comment.
Thanks @kkonstantine. I've not taken a look at the Python changes yet, but I left a couple of questions.
There was a problem hiding this comment.
Should we add the same to the Standalone section above?
Also we should update quickstart.html too.
There was a problem hiding this comment.
Thanks for the comment @mimaison
I've moved this bullet further up where we include properties for both standalone and distributed.
Regarding the quickstart, the reason I didn't change this file is that it's not shown anymore. Which might be something to fix, but possibly not in this PR.
There was a problem hiding this comment.
Thanks @kkonstantine.
Regarding the quickstart, it's currently not shown on the website, but this commit added it back. So once we sync the docs for 3.2, it should be back.
There was a problem hiding this comment.
Yeah, I agree that it'd be good for this PR to add this information to the Connect section of the quickstart, since that's been added back for 3.2.
There was a problem hiding this comment.
Thanks for the link @mimaison. In that case, it's worth updating the quick start here too. Let me do that
9fbefd3 to
aa443b4
Compare
|
I rebased to get the changes from #11933 and get a green run of system tests |
There was a problem hiding this comment.
Thanks, @kkonstantine. Looks pretty good, but I do have a few comments/suggestions below.
Also, it might be good for this PR to update the Connect section of the quickstart, since that's been added back for 3.2 (see thread above).
There was a problem hiding this comment.
Previous list item paragraphs are on a single line, and it's probably useful to keep that formatting style. Also, a few nits:
| <li><code>plugin.path</code> (default <code>empty</code>) - a list of paths that contain | |
| plugins (connectors, converters, transformations). For the purpose of quick starts users will have to add the path that contains the FileStreamSourceConnector and FileStreamSinkConnector packaged in <code>connect-file-"version".jar</code>, because these connectors are not included by default to the <code>CLASSPATH</code> or the <code>plugin.path</code> of the Connect worker.</li> | |
| <li><code>plugin.path</code> (default <code>empty</code>) - a list of paths that contain Connect plugins (connectors, converters, transformations). Before running quick starts, users must add the relative or absolute path that contains the example FileStreamSourceConnector and FileStreamSinkConnector packaged in <code>connect-file-"version".jar</code>, because these connectors are not included by default to the <code>CLASSPATH</code> or the <code>plugin.path</code> of the Connect worker.</li> |
I also have a few questions.
- Should the path include the JAR file or the directory that contains the JAR file?
- Would it be useful if we supplied an example
plugin.pathline to add?
I've built the distribution with this PR branch, and verified that the quickstart running the standalone worker fails if the file connectors are not in the plugin.path, but adding plugin.path=libs/connect-file-<version>.jar does work as long as the quickstart is run from the installation directory.
There was a problem hiding this comment.
Thanks for the comment @rhauch
One note that I have is that I wouldn't recommend adding the relative path even if it works, because it possibly doesn't lead to robust deployments. plugin.path works when following symlinks so that can help.
plugin.path works with uber-jars but I wouldn't necessarily consider connect-file-<version>.jar an uber jar. I'd recommend the mainstream way which is to add the parent directory of any directories containing connector jars. Let me try to see how an example looks like here, because that's already just a bullet point.
rhauch
left a comment
There was a problem hiding this comment.
Thanks, @kkonstantine. A long comment/suggestion below about how to reduce the barrier of running the quickstart.
There was a problem hiding this comment.
Will users that are trying this for the first time understand what this means, or will this be enough of a barrier to cause them to abandon the quickstart? I'm not sure, but it might be worth to be more explicit here.
First, since this is a quickstart I think it's okay to duplicate some information here just so that users don't have to go elsewhere to learn what they need to do to run the quickstart.
Second, the referenced plugin.path configuration section says (emphasis mine):
The list should consist of top level directories that include any combination of ...
And, to run the quickstarts we want them to add the path to that JAR file to the plugin.path configuration property, not add the path to the lib/ directory that contains the connect-file-<version>.JAR file, as the referenced documentation suggests to do.
So, I suggest we be much more explicit here, and say something like:
| First, make sure to add <code class="language-bash">connect-file-{{fullDotVersion}}.jar</code> to the <code>plugin.path</code> property in the Connect worker's configuration (see <a href="#connectconfigs_plugin.path">plugin.path</a> for examples). | |
| </p> | |
| First, instruct Connect to use the example file system connectors by editing the configuration for the Connect standalone worker to changing the following line from: | |
| </p> | |
| <pre class="brush: bash;"> | |
| #plugin.path=</pre> | |
| to: | |
| </p> | |
| <pre class="brush: bash;"> | |
| plugin.path=lib/connect-file-{{fullDotVersion}}.jar</pre> |
We could make this a little easier if we changed the connect-standalone.properties and connect-distributed.properties files to add a few lines to the existing commented out section, so that all a user has to do is uncomment one line. For example:
...
# Set to a list of filesystem paths separated by commas (,) to enable class loading isolation for plugins
# (connectors, converters, transformations). The list should consist of top level directories that include
# any combination of:
# a) directories immediately containing jars with plugins and their dependencies
# b) uber-jars with plugins and their dependencies
# c) directories immediately containing the package directory structure of classes of plugins and their dependencies
# Note: symlinks will be followed to discover dependencies or plugins.
# Examples:
# plugin.path=/usr/local/share/java,/usr/local/share/kafka/plugins,/opt/connectors,
#
# To run the quickstart that just uses the example file system connectors, uncomment this line:
# plugin.path=lib/connect-file-<version>.jar,
#
#plugin.path=
These are comments, and I think we can improve the comments without a KIP, especially since with this PR we're trying to limit the impact on the quickstarts.
There was a problem hiding this comment.
I should add that I don't think adding the comments to the config properties is really all that necessary -- it might simplify things a bit, but at the same time we don't want users to do that in production.
Not including it in the configs just means the quickstart needs a bit more content to explain how the user should modify the config. Plus, maybe a note that says not to use these in production.
There was a problem hiding this comment.
I agree. Better to add this example to the quick start itself rather than the config.
Again, my dilemma was regarding whether we should suggest a relative path + uber jar value which is not typical for production even on the quick start or direct users to the documentation of the property. I'm also a fan of quick starts that demonstrate a working example step-by-step. So, after your comment I've changed the quick start doc to contain an example. Let me know how it looks after that change when you get a chance.
rhauch
left a comment
There was a problem hiding this comment.
LGTM. Thanks, @kkonstantine!
…PATH and plugin.path by default
fdbe364 to
d1f8a10
Compare
|
Thanks @rhauch. Just rebased to get the latest changes and squash the last commit with a line edit. |
rhauch
left a comment
There was a problem hiding this comment.
Thanks, @kkonstantine. Still LGMT.
…PATH and plugin.path by default (#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
…PATH and plugin.path by default (#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
…PATH and plugin.path by default (#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
|
Thanks both. Merged to |
…PATH and plugin.path by default (apache#11908) With this change we stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector will still be shipped with the Apache Kafka distribution and will be available for explicit inclusion. The changes have been tested through the system tests and the existing unit and integration tests. Reviewers: Mickael Maison <mickael.maison@gmail.com>, Randall Hauch <rhauch@gmail.com>
…cs-11-may-2022 * apache-kafka/3.1: (51 commits) MINOR: reload4j build dependency fixes (apache#12144) KAFKA-13255: Use config.properties.exclude when mirroring topics (apache#11401) KAFKA-13794: Fix comparator of inflightBatchesBySequence in TransactionsManager (round 3) (apache#12096) KAFKA-13794: Follow up to fix producer batch comparator (apache#12006) fix: make sliding window works without grace period (#kafka-13739) (apache#11980) 3.1.1 release notes (apache#12001) KAFKA-13794; Fix comparator of `inflightBatchesBySequence` in `TransactionManager` (apache#11991) KAFKA-13782; Ensure correct partition added to txn after abort on full batch (apache#11995) KAFKA-13748: Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default (apache#11908) KAFKA-13775: CVE-2020-36518 - Upgrade jackson-databind to 2.12.6.1 (apache#11962) ...
…cs-12-may-2022 * apache-kafka/3.0: (14 commits) fix: make sliding window works without grace period (#kafka-13739) (apache#11980) KAFKA-13794: Follow up to fix producer batch comparator (apache#12006) KAFKA-13794; Fix comparator of `inflightBatchesBySequence` in `TransactionManager` (apache#11991) KAFKA-13748: Do not include file stream connectors in Connect's CLASSPATH and plugin.path by default (apache#11908) KAFKA-13418: Support key updates with TLS 1.3 (apache#11966) KAFKA-13770: Restore compatibility with KafkaBasedLog using older Kafka brokers (apache#11946) KAFKA-13761: KafkaLog4jAppender deadlocks when idempotence is enabled (apache#11939) KAFKA-13759: Disable idempotence by default in producers instantiated by Connect (apache#11933) MINOR: Fix `ConsumerConfig.ISOLATION_LEVEL_DOC` (apache#11915) KAFKA-13750; Client Compatability KafkaTest uses invalid idempotency configs (apache#11909) ...
The purpose of this patch is to stop including the non-production grade connectors that are meant to be used for demos and quick starts by default in the CLASSPATH and plugin.path of Connect deployments. The package of these connector still be shipped with the Apache Kafka distribution and will be available for explicit inclusion.
The changes have been tested through the system tests and the existing unit and integration tests
Committer Checklist (excluded from commit message)