Skip to content

[improve][fn] Support configure compression type#19470

Merged
nlu90 merged 6 commits into
apache:masterfrom
jiangpengcheng:support_compression
Mar 17, 2023
Merged

[improve][fn] Support configure compression type#19470
nlu90 merged 6 commits into
apache:masterfrom
jiangpengcheng:support_compression

Conversation

@jiangpengcheng

Copy link
Copy Markdown
Contributor

Fixes #19462

Motivation

Modifications

Verifying this change

  • Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: jiangpengcheng#6

@github-actions github-actions Bot added the doc-required Your PR changes impact docs and you will update later. label Feb 9, 2023
@jiangpengcheng jiangpengcheng changed the title Support compression [improve][fn] Support compression Feb 9, 2023
@jiangpengcheng jiangpengcheng changed the title [improve][fn] Support compression [improve][fn] Support configure compression Feb 9, 2023
@jiangpengcheng jiangpengcheng changed the title [improve][fn] Support configure compression [improve][fn] Support configure compression type Feb 9, 2023
@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

1 similar comment
@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@codecov-commenter

codecov-commenter commented Feb 10, 2023

Copy link
Copy Markdown

Codecov Report

Merging #19470 (10b22b4) into master (e286339) will decrease coverage by 28.96%.
The diff coverage is 9.49%.

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #19470       +/-   ##
=============================================
- Coverage     60.44%   31.49%   -28.96%     
+ Complexity     3494      293     -3201     
=============================================
  Files          1832     1672      -160     
  Lines        135153   132568     -2585     
  Branches      14871    15211      +340     
=============================================
- Hits          81693    41751    -39942     
- Misses        45869    84681    +38812     
+ Partials       7591     6136     -1455     
Flag Coverage Δ
inttests 25.81% <5.78%> (+1.14%) ⬆️
systests 25.13% <10.47%> (?)
unittests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...lance/extensions/ExtensibleLoadManagerWrapper.java 0.00% <0.00%> (ø)
...ns/channel/ServiceUnitStateCompactionStrategy.java 8.16% <0.00%> (-5.18%) ⬇️
...lance/extensions/channel/ServiceUnitStateData.java 0.00% <0.00%> (ø)
.../pulsar/compaction/StrategicTwoPhaseCompactor.java 0.00% <0.00%> (-76.31%) ⬇️
...java/org/apache/pulsar/admin/cli/CmdFunctions.java 0.00% <0.00%> (-47.21%) ⬇️
...ar/functions/source/MultiConsumerPulsarSource.java 86.84% <0.00%> (-7.61%) ⬇️
...r/functions/source/SingleConsumerPulsarSource.java 76.66% <0.00%> (-23.34%) ⬇️
...xtensions/channel/ServiceUnitStateChannelImpl.java 0.58% <0.34%> (-0.08%) ⬇️
.../apache/pulsar/functions/utils/FunctionCommon.java 33.46% <12.50%> (-22.42%) ⬇️
...a/org/apache/pulsar/functions/sink/PulsarSink.java 47.24% <33.33%> (-7.39%) ⬇️
... and 1355 more

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@nlu90

nlu90 commented Feb 24, 2023

Copy link
Copy Markdown
Member

@jiangpengcheng
Since this PR involves protobuf change, let's send an email in the mailing list to let folks in the community know about it.

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

3 similar comments
@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@tisonkun tisonkun left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @jiangpengcheng! Generally looks good.

Comments below:

  1. I don't find changes to Go Functions, do you update it later or it's irrelevant?
  2. @momo-jun maybe we should update the document for this new feature. Although it can be a light one, the author marks it as doc-required. But I'm unsure what page is suitable for such changes.

@momo-jun

momo-jun commented Mar 8, 2023

Copy link
Copy Markdown
Contributor

@tisonkun thanks for considering the doc development.
@jiangpengcheng To scope the incremental doc updates, I think at least we have a new config that needs to be added to this topic. Can you please confirm and advise? It would be great if you could analyze the user impact of this improvement and involve the doc updates within this PR.

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

@momo-jun a new field CompressionType should be added to ProducerConfig: https://pulsar.apache.org/docs/next/functions-cli/#producerconfig,

the type is String and available values are: None, LZ4, ZLIB, ZSTD and SNAPPY, and default to LZ4

@jiangpengcheng

jiangpengcheng commented Mar 9, 2023

Copy link
Copy Markdown
Contributor Author

Thanks for your contribution @jiangpengcheng! Generally looks good.

Comments below:

  1. I don't find changes to Go Functions, do you update it later or it's irrelevant?
  2. @momo-jun maybe we should update the document for this new feature. Although it can be a light one, the author marks it as doc-required. But I'm unsure what page is suitable for such changes.

added go support and suggestions to docs update

but the ci(Go function style check) complains about using the deprecated AutoAck, do you know how to ignore it?

@momo-jun

momo-jun commented Mar 9, 2023

Copy link
Copy Markdown
Contributor

the type is String and available values are: None, LZ4, ZLIB, ZSTD and SNAPPY, and default to LZ4

@jiangpengcheng thanks for the input. I've drafted apache/pulsar-site#461 to add the docs.
Only one concern - the default config of compression type on client side, e.g., Java producers, is None (no compression), which is not aligned with functions. Is there any risk here?

@nlu90

nlu90 commented Mar 9, 2023

Copy link
Copy Markdown
Member

/pulsarbot rerun-failure-checks

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

the type is String and available values are: None, LZ4, ZLIB, ZSTD and SNAPPY, and default to LZ4

@jiangpengcheng thanks for the input. I've drafted apache/pulsar-site#461 to add the docs. Only one concern - the default config of compression type on client side, e.g., Java producers, is None (no compression), which is not aligned with functions. Is there any risk here?

There is no risk, just for back compatibility since current pulsar functions are using LZ4 as compression type.

One thing I did wrong is that the None should be NONE instead

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@nlu90 nlu90 closed this Mar 10, 2023
@nlu90 nlu90 reopened this Mar 10, 2023
@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

/pulsarbot rerun-failure-checks

@nlu90 nlu90 merged commit e0098ee into apache:master Mar 17, 2023
@michaeljmarshall

Copy link
Copy Markdown
Member

This updates the function proto. Shouldn't we have had a PIP?

@Technoboy- Technoboy- added this to the 3.0.0 milestone Mar 19, 2023
@nlu90

nlu90 commented Mar 20, 2023

Copy link
Copy Markdown
Member

Hi @michaeljmarshall, a mail thread sent on February 28 discussed the change.
PTAL: https://lists.apache.org/thread/8z849bhp79ob23kf1p21f6kq7rd6bljq

@michaeljmarshall

Copy link
Copy Markdown
Member

Thanks for that link @nlu90! Looks good to me.

@momo-jun momo-jun added doc-complete Your PR changes impact docs and the related docs have been already added. and removed doc-required Your PR changes impact docs and you will update later. labels Mar 22, 2023
lhotari added a commit to lhotari/pulsar that referenced this pull request Mar 23, 2023
- change was made in apache#19470
- master branch build fails with "SA1019: instanceConf.funcDetails.AutoAck is deprecated: Do not use."
@lhotari

lhotari commented Mar 23, 2023

Copy link
Copy Markdown
Member

This PR broke the master branch. Please review and merge #19904 asap to fix master.
The reason why this PR was mergeable is due to a CI bug #19905 .

flowchartsman added a commit to flowchartsman/pfunc that referenced this pull request May 29, 2023
- `go generate ./...` from repo root or
- go run internal/update_proto.go <release tag>
- fixes #7

Pending sync of changes to apache/pulsar#19470
@lhotari

lhotari commented Jun 20, 2024

Copy link
Copy Markdown
Member

the type is String and available values are: None, LZ4, ZLIB, ZSTD and SNAPPY, and default to LZ4

@jiangpengcheng thanks for the input. I've drafted apache/pulsar-site#461 to add the docs. Only one concern - the default config of compression type on client side, e.g., Java producers, is None (no compression), which is not aligned with functions. Is there any risk here?

There is no risk, just for back compatibility since current pulsar functions are using LZ4 as compression type.

One thing I did wrong is that the None should be NONE instead

@jiangpengcheng @nlu90 This PR introduced a change to the compression type default. Only when the producer config is provided, it will use the LZ4 compression default. The producer config is optional and that's why compression will now be NONE by default when the producer config isn't provided.
Producers are created in 2 locations in the Pulsar Functions code. The other location still has LZ4 compression by default.
There's another issue #22948 which I'm currently working on. I'm not changing the behavior of the compression default in that change, but unifying the configuration so that compression type can be configured also for the producers created in ContextImpl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-complete Your PR changes impact docs and the related docs have been already added. ready-to-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support configure compression type for pulsar functions

10 participants