[Bug]: Spark2 runner fails to deserialize PipelineOptions due to NoSuchMethodError #23568

@mosche

What happened?

Spark 2.4.8 ships a fairly old version of Jackson, 2.6.7, while Beam is far ahead at 2.13.3.
When a Spark worker attempts to deserialize the pipeline options, it fails with a NoSuchMethodError because Beam calls a newer Jackson API that doesn't exist in the older version bundled with Spark.

Note: The Spark 2 runner is already deprecated, so this is likely a NO-FIX and is filed mostly for reference.

java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.type.TypeBindings.emptyBindings()Lcom/fasterxml/jackson/databind/type/TypeBindings;
	at org.apache.beam.sdk.options.PipelineOptionsFactory.createBeanProperty(PipelineOptionsFactory.java:1708)
	at org.apache.beam.sdk.options.PipelineOptionsFactory.computeDeserializerForMethod(PipelineOptionsFactory.java:1732)
	at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1660)
	at org.apache.beam.sdk.options.PipelineOptionsFactory.getDeserializerForMethod(PipelineOptionsFactory.java:1782)
	at org.apache.beam.sdk.options.PipelineOptionsFactory.deserializeNode(PipelineOptionsFactory.java:1806)
	at org.apache.beam.sdk.options.ProxyInvocationHandler.getValueFromJson(ProxyInvocationHandler.java:584)
	at org.apache.beam.sdk.options.ProxyInvocationHandler.getValueFromJson(ProxyInvocationHandler.java:579)
	at org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:219)
	at com.sun.proxy.$Proxy42.getOptionsId(Unknown Source)
	at org.apache.beam.sdk.options.ValueProvider$RuntimeValueProvider.setRuntimeOptions(ValueProvider.java:247)
	at org.apache.beam.sdk.options.ProxyInvocationHandler$Deserializer.deserialize(ProxyInvocationHandler.java:885)
	at org.apache.beam.sdk.options.ProxyInvocationHandler$Deserializer.deserialize(ProxyInvocationHandler.java:866)
	at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3736)
	at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2726)
	at org.apache.beam.runners.core.construction.SerializablePipelineOptions.deserializeFromJson(SerializablePipelineOptions.java:76)
	at org.apache.beam.runners.core.construction.SerializablePipelineOptions.readObject(SerializablePipelineOptions.java:61)
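
For anyone hitting the same trace, here is a minimal diagnostic sketch (not part of Beam or Spark) for confirming which Jackson a worker actually resolves. It uses reflection to probe for the method named in the error, so it compiles without Jackson on the classpath. The class name `JacksonApiCheck` and the helper `check` are made up for illustration, and the sketch assumes the probe runs under the same classloader as the failing code.

```java
import java.security.CodeSource;

public class JacksonApiCheck {

    /**
     * Reports whether a public no-arg method exists on the named class,
     * as seen by the classloader that loaded this helper.
     */
    static String check(String className, String methodName) {
        try {
            // Load without initializing so the probe has no side effects.
            Class<?> cls = Class.forName(
                className, false, JacksonApiCheck.class.getClassLoader());
            cls.getMethod(methodName);
            // The code source is null for classes loaded by the bootstrap loader.
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String where = (src == null || src.getLocation() == null)
                ? "unknown location"
                : src.getLocation().toString();
            return "present (loaded from " + where + ")";
        } catch (ClassNotFoundException e) {
            return "class-missing";
        } catch (NoSuchMethodException e) {
            return "method-missing";
        }
    }

    public static void main(String[] args) {
        // The method the stack trace above reports as missing.
        System.out.println(check(
            "com.fasterxml.jackson.databind.type.TypeBindings", "emptyBindings"));
    }
}
```

If this prints `method-missing` on a worker, the Jackson version Spark provides is shadowing the one Beam was compiled against, which matches the failure above. When the method is present, the printed jar location shows which copy of Jackson was picked up.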

Issue Priority

Priority: 3

Issue Component

Component: runner-spark
