[RFC] Define and document per-key ordering semantics for runners by pabloem · Pull Request #15378 · apache/beam

pabloem · 2021-08-24T18:25:28Z

There are various use cases that can be implemented on top of Beam and its runners, but that need certain guarantees regarding the ordered transport of data in between execution stages/steps of a Beam Pipeline.

This PR proposes the following:

Defining the concept of per-key ordered delivery.
ValidatesRunner tests that allow us to verify runner behavior for per-key ordered delivery.
Adding a row to the Beam Capability Matrix where runner support for this is documented.

With this, Beam users will be able to run various order-dependent workloads on Beam/Dataflow with the confidence they need.
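To make the proposed property concrete, here is a minimal plain-Java sketch of what a per-key ordering check compares. It is not the test code from this PR and does not use any Beam API; `PerKeyOrderCheck`, `Keyed`, and both method names are hypothetical. The assumption is that elements of different keys may interleave freely, while the sequence seen for each individual key must stay the same.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; not part of Beam or of this PR. */
final class PerKeyOrderCheck {

  /** A keyed element. The class name and fields are hypothetical. */
  static final class Keyed {
    final String key;
    final int value;

    Keyed(String key, int value) {
      this.key = key;
      this.value = value;
    }
  }

  /** Groups values by key, keeping the order in which they were observed. */
  static Map<String, List<Integer>> byKey(List<Keyed> observed) {
    Map<String, List<Integer>> grouped = new HashMap<>();
    for (Keyed e : observed) {
      grouped.computeIfAbsent(e.key, k -> new ArrayList<>()).add(e.value);
    }
    return grouped;
  }

  /** True when every key sees the same value sequence upstream and downstream. */
  static boolean preservesPerKeyOrder(List<Keyed> upstream, List<Keyed> downstream) {
    return byKey(upstream).equals(byKey(downstream));
  }

  public static void main(String[] args) {
    List<Keyed> upstream =
        Arrays.asList(new Keyed("a", 1), new Keyed("b", 1), new Keyed("a", 2), new Keyed("b", 2));
    // Keys interleave differently, but each key still sees 1 then 2.
    List<Keyed> interleaved =
        Arrays.asList(new Keyed("b", 1), new Keyed("a", 1), new Keyed("a", 2), new Keyed("b", 2));
    // Key "a" now sees 2 before 1.
    List<Keyed> reordered =
        Arrays.asList(new Keyed("a", 2), new Keyed("a", 1), new Keyed("b", 1), new Keyed("b", 2));
    System.out.println(preservesPerKeyOrder(upstream, interleaved));
    System.out.println(preservesPerKeyOrder(upstream, reordered));
  }
}
```

The check deliberately ignores cross-key order, since the proposal only concerns ordering within a key.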

Results from ValidatesRunner tests:

Flink streaming VR tests: Build Scan
Flink Batch VR tests: Jenkins output - not running any of the new tests, as it does not support key-ordered delivery.
Dataflow legacy VR tests: Build Scan
Direct VR tests: Build Scan
Samza tests: Build Scan

r: @apilloud
r: @kennknowles


pabloem · 2021-08-25T22:56:44Z

Run Dataflow ValidatesRunner

pabloem · 2021-08-26T05:20:56Z

Run Java Flink PortableValidatesRunner Streaming

pabloem · 2021-08-26T05:21:39Z

Run Java Samza PortableValidatesRunner

pabloem · 2021-08-26T19:40:48Z

Flink streaming VR tests: Build Scan
Flink Batch VR tests: Jenkins output - not running any of the new tests, as it does not support key-ordered delivery.
Dataflow legacy VR tests: Build Scan
Direct VR tests: Build Scan
Samza tests: Build Scan

apilloud · 2021-08-27T18:17:47Z

Generated website changes are here: https://apache-beam-website-pull-requests.storage.googleapis.com/15378/documentation/runners/capability-matrix/index.html

apilloud

This is documenting the existing state of the world, so LGTM.

I would lean towards making this a bit more explicit: have the direct runner break ordering by default and have some way for the user to indicate a pipeline requires it. We can then fail the pipeline on runners that don't support ordering if it is required.
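A rough sketch of what that opt-in could look like, in plain Java. Nothing here exists in Beam: `OrderingRequirementSketch`, the `Capability` enum, and `validate` are all hypothetical names, and the runner names are placeholders rather than claims about real runners.

```java
import java.util.EnumSet;
import java.util.Set;

/** Illustrative only; Beam has no such API in this PR. */
final class OrderingRequirementSketch {

  /** Hypothetical capability flags a runner could advertise. */
  enum Capability {
    PER_KEY_ORDERED_DELIVERY
  }

  /** Fails fast when the pipeline requires something the runner does not support. */
  static void validate(String runnerName, Set<Capability> supported, Set<Capability> required) {
    for (Capability c : required) {
      if (!supported.contains(c)) {
        throw new UnsupportedOperationException(
            runnerName + " does not support " + c + ", which this pipeline requires");
      }
    }
  }

  public static void main(String[] args) {
    Set<Capability> required = EnumSet.of(Capability.PER_KEY_ORDERED_DELIVERY);

    // Placeholder runner that advertises the capability: validation passes.
    validate("runner-with-ordering", EnumSet.of(Capability.PER_KEY_ORDERED_DELIVERY), required);

    // Placeholder runner that does not: the pipeline is rejected before it runs.
    try {
      validate("runner-without-ordering", EnumSet.noneOf(Capability.class), required);
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of the design is that a pipeline which never declares the requirement is unaffected, so a runner is free to reorder by default.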

apilloud · 2021-08-27T19:24:22Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PerKeyOrderingTest.java

+      matched = matched == null ? 0 : matched;
+      if (matched == -1) {
+        // When matched is set to -1, it means that we have met an error, and elements on this
+        // key are not matched anymore - thus we ignore all inputs.


nit: Possibly make this explicit with a return as well?

apilloud · 2021-08-27T19:52:38Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PerKeyOrderingTest.java

+      if (matched == -1) {
+        // When matched is set to -1, it means that we have met an error, and elements on this
+        // key are not matched anymore - thus we ignore all inputs.
+      } else if (matched < this.perKeyElements.size()


It appears matched >= this.perKeyElements.size() should be its own "Got more elements than expected" error case and throw an exception.
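To illustrate, the three cases could be separated roughly as follows. This is a plain-Java sketch of the state transition only, not the DoFn from the PR: `MatchStateSketch`, `next`, and `FAILED` are hypothetical names, and the Beam state and output calls are left out.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only; models the per-key "matched" counter as a pure function. */
final class MatchStateSketch {

  /** Sentinel meaning "this key already failed; ignore further input". */
  static final int FAILED = -1;

  /**
   * Returns the next value of the counter for one incoming element.
   * Throws when more elements arrive than were expected for the key.
   */
  static int next(List<Integer> expected, Integer matched, int value) {
    int current = matched == null ? 0 : matched;
    if (current == FAILED) {
      return FAILED; // explicit early return: the key is already in the error state
    }
    if (current >= expected.size()) {
      throw new IllegalStateException("Got more elements than expected: " + value);
    }
    if (!expected.get(current).equals(value)) {
      return FAILED; // out-of-order element: switch the key to the error state
    }
    return current + 1; // element matched the next expected position
  }

  public static void main(String[] args) {
    List<Integer> expected = Arrays.asList(1, 2, 3);
    Integer state = null;
    for (int v : new int[] {1, 2, 3}) {
      state = next(expected, state, v);
    }
    System.out.println(state);
  }
}
```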

apilloud · 2021-08-27T19:56:36Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PerKeyOrderingTest.java

+        matchedElements.write(-1);
+        receiver.output(KV.of(elm.getKey(), false));
+      } else {
+        assert this.perKeyElements.get(matched).equals(elm.getValue())


nit: I don't think this assert is going to work, it is the case above?

the case above is denied (!...?)

Sorry if my comment wasn't clear. Here is what I'm thinking: the case above is matched < this.perKeyElements.size() && !..., and the assert triggers on !..., so the assert actually triggers on matched >= this.perKeyElements.size().

After the latest update you should never be able to hit this assert, you can leave it as is.

Java assert is, I think, less preferable than a library that generates more elaborate error messages. Also, failures within a user DoFn may be swallowed by a runner. Can the tests be described in terms of a dead-letter output and PAssert?

apilloud · 2021-08-27T19:59:41Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PerKeyOrderingTest.java

+        Lists.newArrayList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).stream()
+            .map(elm -> String.format("k%s", elm))
+            .collect(Collectors.toList());
+    PCollection<KV<String, Integer>> kvSeeds =


Should we test types other than Integer?

hmmm ... I could try to parameterize the test with different types, though IDK if the Parameterized JUnit runner integrates well with the ValidatesRunner framework. Do you have a suggestion? Should I try adding a more complex type instead of just ints?

I don't think you'll be able to parameterize this, you'll need different test methods for each type.

I believe runners see these as byte[] today, so the behavior here actually depends on coders, so changes in the coder will change the ordering. (Future relational work will likely make some runners aware of the coder contents as well.) The byte coder is the trivial passthrough type, so Byte might actually be the best for a basic "does the runner produce deterministic ordering" test.

apilloud · 2021-08-30T19:23:10Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PerKeyOrderingTest.java

+  public void testSingleCallOrderingWithShuffle() {
+    // Here we test that the output of a single process call in a DoFn will be output in order
+    List<Integer> perKeyElements =
+        Lists.newArrayList(-8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8);


Sorry for the late addition: You might want to test some more extreme values. VarIntCoder is variable sized, so around those size boundaries might be good to test.

beam/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/VarIntCoder.java

Line 29 in 50fcf55

* A {@link Coder} that encodes {@link Integer Integers} using between 1 and 5 bytes. Negative

I tried that. All good.
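For reference, the size boundaries in question can be computed with a small plain-Java sketch. It assumes the common base-128 varint layout (7 payload bits per byte, the int treated as unsigned 32-bit), which matches the "between 1 and 5 bytes" description above; it is a standalone re-implementation and does not call `VarIntCoder`, so check the coder itself before relying on it. `VarIntBoundarySketch` and its methods are hypothetical names.

```java
/** Illustrative only; a standalone length calculation, not Beam's VarIntCoder. */
final class VarIntBoundarySketch {

  /** Number of bytes a base-128 varint needs for v, treating v as unsigned 32-bit. */
  static int encodedLength(int v) {
    int length = 1;
    while ((v & ~0x7F) != 0) {
      v >>>= 7; // unsigned shift, so negative values terminate
      length++;
    }
    return length;
  }

  /** Candidate test values on either side of each length boundary. */
  static int[] boundaryValues() {
    return new int[] {
      Integer.MIN_VALUE, -1, 0,
      127, 128, // 2^7 - 1 and 2^7
      16383, 16384, // 2^14 - 1 and 2^14
      2097151, 2097152, // 2^21 - 1 and 2^21
      268435455, 268435456, // 2^28 - 1 and 2^28
      Integer.MAX_VALUE
    };
  }

  public static void main(String[] args) {
    for (int v : boundaryValues()) {
      System.out.println(v + " -> " + encodedLength(v) + " byte(s)");
    }
  }
}
```

Under this assumption, negative values always land in the 5-byte case, so mixing them with small positives exercises both ends of the range.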

kennknowles

I think that per-key-path ordered delivery is subtle enough it requires a pseudo-mathy treatment in a design doc. I might be behind on dev@ threads but want to check that this has happened. Tests are a useful executable "spec" but when we find a bug that wasn't covered by tests, we go back to the math docs to build the new correct test.

kennknowles · 2021-08-30T21:45:24Z

sdks/java/core/src/main/java/org/apache/beam/sdk/testing/UsesPerKeyOrderInStage.java

+
+/**
+ * Category tag for validation tests which rely on a runner providing per-key ordering in between
+ * transforms in the same stage. Tests tagged with {@link UsesPerKeyOrderInStage} should be run for


"Stage" is not a concept in the Beam model, unless you also define it elsewhere in this PR. (commenting as I go)

renamed this to bundle. WDYT?

kennknowles · 2021-08-30T21:46:38Z

sdks/java/core/src/main/java/org/apache/beam/sdk/testing/UsesPerKeyOrderedDelivery.java

+/**
+ * Category tag for validation tests which rely on a runner providing per-key ordering. Tests tagged
+ * with {@link UsesPerKeyOrderedDelivery} should be run for runners which support key-to-key
+ * ordering of elements across shuffle / stage boundaries.


"shuffle" is also not a model concept. For example runners can choose to move shuffles around so then the ordered delivery might change. Unless you can demonstrate/require that allowable optimizations preserve the ordered delivery you are looking for.

renamed to bundle and across bundle boundaries. WDYT?

kennknowles · 2021-08-30T21:46:56Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java


    @Test
-    @Category(ValidatesRunner.class)
+    @Category({ValidatesRunner.class, UsesParDoLifecycle.class})


Seems unrelated.

kennknowles · 2021-08-30T21:49:08Z

sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/PerKeyOrderingTest.java

+        matchedElements.write(-1);
+        receiver.output(KV.of(elm.getKey(), false));
+      } else {
+        assert this.perKeyElements.get(matched).equals(elm.getValue())


Java assert is, I think, less preferable than a library that generates more elaborate error messages. Also, failures within a user DoFn may be swallowed by a runner. Can the tests be described in terms of a dead-letter output and PAssert?

pabloem · 2021-09-16T17:52:21Z

Run Website_Stage_GCS PreCommit

je-ik · 2021-09-29T08:20:17Z

website/www/site/content/en/documentation/runtime/model.md

+
+We say that the Beam runner supports **key-ordered delivery** if it guarantees
+that these two events will be observed downstream in the same order,
+independently of the kind of transmission.


Although it seems to be mentioned, I think it would be good to restate here explicitly that this holds if and only if the two PTransforms are directly connected, or at least there is no intermediate grouping transform in between. "A downstream PCollection" in the above paragraph might be interpreted as any downstream PCollection, which does not hold.

I was thinking about this comment and it occurs to me that you have to have key-limited parallelism in the producer of course. I don't think this is required anywhere in the model (state/timers require key+window limited parallelism) and also not mentioned here.

What I meant was that if we have a chain of transforms as follows:

A -> B -> C

if all these transforms are stateful transforms, then there is no guarantee for ordering of elements emitted from A arriving at C. The only exception would be when there is no change in key, because then we can prove that the ordering will be preserved due to transitivity. If key between A and B changes, then there is no guarantee for ordering at C (even if the key changes back to the same as emitted from A).
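A toy plain-Java model of the A -> B -> C case may help. It makes several simplifying assumptions that are mine, not the PR's: A emits 1, 2, 3, 4 under a single key, B re-keys odd values to one key and even values to another, and C maps both back to the original key. Every hop preserves per-key order, yet one legal arrival order at C already breaks the order A emitted. `RekeyOrderingSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only; a toy model, not Beam code. */
final class RekeyOrderingSketch {

  /** Values B routes to the "odd" key (wantOdd = true) or the "even" key, in arrival order. */
  static List<Integer> rekeyed(List<Integer> values, boolean wantOdd) {
    List<Integer> out = new ArrayList<>();
    for (Integer v : values) {
      if ((v % 2 != 0) == wantOdd) {
        out.add(v);
      }
    }
    return out;
  }

  /** True when the elements of sub appear in seq in the same relative order. */
  static boolean isOrderedSubsequence(List<Integer> sub, List<Integer> seq) {
    int i = 0;
    for (Integer x : seq) {
      if (i < sub.size() && sub.get(i).equals(x)) {
        i++;
      }
    }
    return i == sub.size();
  }

  public static void main(String[] args) {
    List<Integer> emittedByA = Arrays.asList(1, 2, 3, 4);
    List<Integer> oddKey = rekeyed(emittedByA, true);
    List<Integer> evenKey = rekeyed(emittedByA, false);

    // One arrival order at C that respects per-key order on the B -> C hop.
    List<Integer> arrivedAtC = Arrays.asList(2, 4, 1, 3);

    System.out.println(isOrderedSubsequence(oddKey, arrivedAtC));
    System.out.println(isOrderedSubsequence(evenKey, arrivedAtC));
    System.out.println(arrivedAtC.equals(emittedByA));
  }
}
```

If B keeps the key unchanged, there is only one per-key stream on both hops, which is the transitive case where the order does carry through.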

Agree with both the above points. Should it explicitly say the following?

Key-ordered delivery is only guaranteed between immediately connected elements.

To enforce key ordering both producer and consumer require key limited parallelism.

Or are these statements not accurate?

Additionally, should it refer to stages or elements? When building a pipeline you don't necessarily know what stages will result from any fusion that occurs, so would you specify a requirement for key-ordered delivery between elements rather than stages?

Hi all! Sorry about the delay, but I've finally tried to address these comments. Can you please take a look and LMK what you think?

aaltay · 2021-10-07T20:39:10Z

@pabloem - Could you respond to the open comments?

hughack · 2021-10-12T02:27:12Z

website/www/site/data/capability_matrix.yaml

+            - class: dataflow
+              l1: "Partially"
+              l2:
+              l3: Dataflow performs different shuffling algorithms for batch and streaming. Dataflow guarantees key-ordered delivery in streaming, though not in batch.


Is there any reference that this is true? It is something i am trying to figure out at the moment.

hm we have not stated this in Dataflow documentation (I'm working on that), but it's true for streaming.

Ok, i guess it makes sense to define the concept here first.

hughack · 2021-10-12T02:37:52Z

website/www/site/content/en/documentation/runtime/model.md

    This may allow the runner to avoid serializing elements; instead, the runner
    can just pass the elements in memory.

+Passing elements between transforms that are running on the same worker is


These definitions are slightly different to the glossary ones for fusion and stage. Maybe just have a link in the last bullet point above saying runners might use a strategy called fusion which combines elements into stages?

pabloem

I've finally gotten around to addressing your comments. Can you please take another look?

@kennknowles @je-ik @hughack

I also wrote this doc to try to formalize a little more: https://docs.google.com/document/d/1_7WRJznXlOtWuVaHl_dpy8OZcx_M8BUmeWVA4G0-wEc/edit#

hughack · 2021-11-13T02:47:49Z

@pabloem - Yeah looks good, the only comment would be around @kennknowles point. I don't think "key-limited parallelism" is mentioned anywhere else in the Beam model. My understanding is it does happen, despite not explicitly being documented anywhere i can see. I'd be interested how Dataflow handles scaling the number of workers and if the guarantees still hold for any key that starts getting handled by a different worker.

je-ik

LGTM, but from higher level perspective, do we have any plans to support ordering regardless of the direct connection between the two PTransforms?

pabloem · 2021-11-16T21:36:58Z

@kennknowles PTAL : )

pabloem · 2021-11-29T19:57:05Z

@kennknowles PTAL - if no other comments by the end of the week, I'll merge by lazy consensus.

pabloem · 2021-12-03T17:52:55Z

moving forward as-is, but this is an area of active discussion, so please feel free to engage on this, and we can work to keep this up to date.

lukecwik · 2022-01-01T00:25:32Z

#15378 broke spark PVR as well: https://ci-beam.apache.org/job/beam_PreCommit_Java_PVR_Spark_Batch_Cron/5140/ (build 5139 passes)

This Jenkins Job was renamed and is still failing.

PVR tests sickbayed in #16411

aaltay · 2022-01-05T04:47:34Z

@pabloem @ibzib - Should this be reverted? (For reference https://issues.apache.org/jira/browse/BEAM-13522 is the tracker.)

lukecwik · 2022-01-06T19:38:30Z

I don't think it should be reverted, but more care should be taken in the future to avoid adding new test categories without verifying them.

ibzib · 2022-01-06T19:46:33Z

I've proposed making Flink/Spark validates runner tests precommits for runners/flink and runners/spark file changes respectively, as well as sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ since often new VR tests are added there without checking if they pass on these runners. https://issues.apache.org/jira/browse/BEAM-13521

aaltay · 2022-01-06T22:50:43Z

I agree with @lukecwik, and @ibzib's proposal sounds like a good idea to enforce that.

@ibzib - How do we proceed with your proposal? Are you planning to do that? Are you waiting for input on the proposal?

ibzib · 2022-01-06T22:58:00Z

We've converted portable Flink VR to a precommit already, I'll follow up with other jobs later.

aaltay · 2022-01-14T18:00:06Z

> We've converted portable Flink VR to a precommit already, I'll follow up with other jobs later.

@ibzib, did the Spark VR also become a precommit?

ibzib · 2022-01-14T23:43:12Z

> > We've converted portable Flink VR to a precommit already, I'll follow up with other jobs later.
>
> @ibzib, did the Spark VR also become a precommit?

See my comments on #16345

aaltay · 2022-01-15T02:17:39Z

> > > We've converted portable Flink VR to a precommit already, I'll follow up with other jobs later.
> >
> > @ibzib, did the Spark VR also become a precommit?
>
> See my comments on #16345

Ack. Thank you.

pabloem force-pushed the ordering-semantics branch 2 times, most recently from 77198a9 to 23d0955 Compare August 24, 2021 18:37

Starting to define per-key ordering semantics for runners

68229bc

pabloem force-pushed the ordering-semantics branch from 23d0955 to 68229bc Compare August 25, 2021 03:52

Fixup to tests

779cba9

pabloem added 2 commits August 25, 2021 16:29

spotless

d7bd38e

Fixing up VR tests to run properly

cb053bf

pabloem force-pushed the ordering-semantics branch from 4a368c3 to cb053bf Compare August 26, 2021 04:59

Increase flink batch VR timeout

1b6437e

pabloem changed the title ~~[WIP] Starting to define per-key ordering semantics for runners~~ [WIP] Define and document per-key ordering semantics for runners Aug 26, 2021

pabloem marked this pull request as ready for review August 26, 2021 22:02

pabloem changed the title ~~[WIP] Define and document per-key ordering semantics for runners~~ [RFC] Define and document per-key ordering semantics for runners Aug 26, 2021

Update definition of key-ordered delivery

b846c68

apilloud approved these changes Aug 27, 2021

Addressing comments.

ed09e89

apilloud reviewed Aug 30, 2021

adding edge numbers

9b5315f

kennknowles requested changes Aug 30, 2021

je-ik reviewed Sep 29, 2021

hughack reviewed Oct 12, 2021

Addressing comments

f737710

pabloem commented Nov 12, 2021

je-ik approved these changes Nov 15, 2021

pabloem merged commit 862ece1 into apache:master Dec 3, 2021

pabloem deleted the ordering-semantics branch December 3, 2021 17:52

lostluck mentioned this pull request Aug 2, 2024

[prism] Java PerKeyOrderingTest - test failures (per key order not maintained) #32064

Closed
