Skip to content

[fix][fn] Add missing version field back to querystate API#21966

Merged
Technoboy- merged 6 commits into
apache:masterfrom
jiangpengcheng:add_missing_version_field
Jan 27, 2024
Merged

[fix][fn] Add missing version field back to querystate API#21966
Technoboy- merged 6 commits into
apache:masterfrom
jiangpengcheng:add_missing_version_field

Conversation

@jiangpengcheng

@jiangpengcheng jiangpengcheng commented Jan 25, 2024

Copy link
Copy Markdown
Contributor

Fixes #xyz

Main Issue: #xyz

PIP: #xyz

Motivation

#21597 introduce a change that always uses null for the version field for the querystate API. This may break users' behavior, need to add it back

Modifications

  1. Add a new class org.apache.pulsar.functions.api.state.StateValue, which contains value, version, and isNumber fields.
  2. Add two new methods to the org.apache.pulsar.functions.api.state.ByteBufferStateStore interface, both of which return a StateValue:
    1. default StateValue getStateValue(String key)
    2. default StateValue getStateValueAsync(String key)
  3. Use DefaultStateStore#getStateValue method instead of DefaultStateStore#get in org.apache.pulsar.functions.worker.rest.api.ComponentImpl to get the version field from the backend state store and return it to users.

Verifying this change

  • Make sure that the change passes the CI checks.

  • This change added tests and can be verified as follows:

    • Add unit tests for getStateValue and getStateValueAsync
    • Update integration test of PulsarStateTest#testPythonWordCountFunction, now it will check the version field too.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: jiangpengcheng#24

@github-actions github-actions Bot added the doc-not-needed Your PR changes do not impact docs label Jan 25, 2024
@jiangpengcheng jiangpengcheng force-pushed the add_missing_version_field branch from 1b256d9 to 6c0810f Compare January 25, 2024 03:25
@codelipenghui codelipenghui added the release/blocker Indicate the PR or issue that should block the release until it gets resolved label Jan 25, 2024
@codelipenghui codelipenghui added this to the 3.2.0 milestone Jan 25, 2024
@lhotari

lhotari commented Jan 25, 2024

Copy link
Copy Markdown
Member

Regarding https://apache-pulsar.slack.com/archives/C5ZSVEN4E/p1706147824703029?thread_ts=1705826989.958419&cid=C5ZSVEN4E ,

now the display changed to

{
  "key": "hello",
  "stringValue": "\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n",
  "numberValue": 10
}

What caused the stringValue key to appear? Is there also a fix for that?

@jiangpengcheng

Copy link
Copy Markdown
Contributor Author

Regarding https://apache-pulsar.slack.com/archives/C5ZSVEN4E/p1706147824703029?thread_ts=1705826989.958419&cid=C5ZSVEN4E ,

now the display changed to

{
  "key": "hello",
  "stringValue": "\u0000\u0000\u0000\u0000\u0000\u0000\u0000\n",
  "numberValue": 10
}

What caused the stringValue key to appear? Is there also a fix for that?

It's partly resolved, if the backend state store can specify whether a state value is a number(BK supports this), it won't return stringValue and byteValue

But if a backend state store doesn't have such ability, we can't tell whether a state value is a number even its length is Long.BYTES, in this case, a stringValue or byteValue is returned with the numberValue

@lhotari

lhotari commented Jan 25, 2024

Copy link
Copy Markdown
Member

It's partly resolved, if the backend state store can specify whether a state value is a number(BK supports this), it won't return stringValue and byteValue

This happened with BK and 3.2.0 rc4. What does it mean that it's partly resolved?

@jiangpengcheng

jiangpengcheng commented Jan 26, 2024

Copy link
Copy Markdown
Contributor Author

It's partly resolved, if the backend state store can specify whether a state value is a number(BK supports this), it won't return stringValue and byteValue

This happened with BK and 3.2.0 rc4. What does it mean that it's partly resolved?

When using the BK state store, it won't return stringValue or byteValue if the value is a number, so it's fixed for BK.

But if users use another state store, this state store may not be able to specify whether a state value is a number or a string(such as the PulsarMetadataStateStoreImpl), then when the length of a state value's byte[] is equal to Long.BYTES, we can only return numberValue along with stringValue or byteValue because this state value can either be a number or string.

This is also the reason why the type of isNumber is Boolean, because not all state store has the ability to know whether a state value is a number or not.

I think this is fine, because before this PR and #21597, the querystate and putstate API were using the bookkeeper storage client directly, and users could not use other state stores, so it won't break user's behaviors since they are always using the BKStateStoreImpl.

@lhotari

lhotari commented Jan 26, 2024

Copy link
Copy Markdown
Member

/pulsarbot rerun-failure-checks

@Technoboy- Technoboy- merged commit 48e64b3 into apache:master Jan 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

doc-not-needed Your PR changes do not impact docs ready-to-test release/blocker Indicate the PR or issue that should block the release until it gets resolved

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants