[Task] [Java SDK]: Add getSchema to PubsubClient and implementing classes #24206

@damondouglas

What work does this Task describe?

This task adds one instance method and two static methods to PubsubClient.

// org.apache.beam.sdk.io.gcp.pubsub.PubsubClient

import org.apache.beam.sdk.schemas.Schema;

public abstract Schema getSchema(SchemaPath schemaPath) throws IOException;

To support the aforementioned, this task additionally adds:

  1. SchemaPath
public static class SchemaPath {
    // Supports the projects/<project>/schemas/<schema> resource path
}
  2. fromPubsubSchema methods
// org.apache.beam.sdk.io.gcp.pubsub.PubsubClient

import org.apache.beam.sdk.schemas.Schema;

/** Converts a Pub/Sub model Schema to a Beam Schema; for use by PubsubJsonClient. */
static Schema fromPubsubSchema(com.google.api.services.pubsub.model.Schema pubsubSchema) { ... }

/** Converts a Pub/Sub proto Schema to a Beam Schema; for use by PubsubGrpcClient. */
static Schema fromPubsubSchema(com.google.pubsub.v1.Schema pubsubSchema) { ... }
  3. Override getSchema in PubsubClient subclasses
// org.apache.beam.sdk.io.gcp.pubsub.PubsubGrpcClient

import org.apache.beam.sdk.schemas.Schema;

@Override
public Schema getSchema(SchemaPath schemaPath) throws IOException { ... }
// org.apache.beam.sdk.io.gcp.pubsub.PubsubJsonClient

import org.apache.beam.sdk.schemas.Schema;

@Override
public Schema getSchema(SchemaPath schemaPath) throws IOException { ... }
// org.apache.beam.sdk.io.gcp.pubsub.PubsubTestClient

import org.apache.beam.sdk.schemas.Schema;

@Override
public Schema getSchema(SchemaPath schemaPath) throws IOException { ... }
  4. Add schema to PubsubTestClient.State
import org.apache.beam.sdk.schemas.Schema;

private static class State {
    /** Expected Pub/Sub mapped Beam Schema. */
    @Nullable Schema schema;
}
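
The resource-path handling that SchemaPath is meant to support could look roughly like the following. This is a minimal sketch under stated assumptions, not Beam's implementation: the class name SchemaPathSketch and the methods of, fromPath, getPath, getProjectId, and getSchemaId are hypothetical names chosen for illustration. Only the projects/<project>/schemas/<schema> format comes from the task description.

```java
/** Hypothetical stand-in for the proposed PubsubClient.SchemaPath. */
final class SchemaPathSketch {
  private final String projectId;
  private final String schemaId;

  private SchemaPathSketch(String projectId, String schemaId) {
    this.projectId = projectId;
    this.schemaId = schemaId;
  }

  /** Builds a path from its parts, rejecting empty ids and ids containing '/'. */
  static SchemaPathSketch of(String projectId, String schemaId) {
    if (projectId.isEmpty() || projectId.contains("/")) {
      throw new IllegalArgumentException("Invalid project id: " + projectId);
    }
    if (schemaId.isEmpty() || schemaId.contains("/")) {
      throw new IllegalArgumentException("Invalid schema id: " + schemaId);
    }
    return new SchemaPathSketch(projectId, schemaId);
  }

  /** Parses a full resource path of the form projects/<project>/schemas/<schema>. */
  static SchemaPathSketch fromPath(String path) {
    // A limit of -1 keeps trailing empty segments, so "projects/p/schemas/" is rejected.
    String[] parts = path.split("/", -1);
    if (parts.length != 4
        || !parts[0].equals("projects")
        || !parts[2].equals("schemas")
        || parts[1].isEmpty()
        || parts[3].isEmpty()) {
      throw new IllegalArgumentException(
          "Expected projects/<project>/schemas/<schema> but got: " + path);
    }
    return new SchemaPathSketch(parts[1], parts[3]);
  }

  String getProjectId() {
    return projectId;
  }

  String getSchemaId() {
    return schemaId;
  }

  /** Returns the full resource path. */
  String getPath() {
    return "projects/" + projectId + "/schemas/" + schemaId;
  }
}
```

Validating the path shape at construction time means a malformed path fails before any request is sent, which is the kind of error the first Done measure asks the tests to detect.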

What value may result from this Task's output?

Querying the Pub/Sub Schema [1] and converting it to a Beam Schema [2] would allow us to validate that the two schemas match and so catch mismatches before they cause errors. This supports the design goals of Pub/Sub schemas: to establish a contract between publisher and subscriber, and to provide a single source of truth for inter-team production and consumption.
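
The kind of check this enables can be sketched as follows. It is an illustration only: each schema is modeled as an ordered map from field name to a type name string rather than as a real Beam Schema, and the class SchemaMatchSketch and the method findMismatches are hypothetical names, not part of Beam or of this task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper comparing a converted Pub/Sub schema with a pipeline's schema. */
final class SchemaMatchSketch {

  /**
   * Returns one message per discrepancy between the fields declared by the
   * Pub/Sub schema and the fields the pipeline expects. An empty list means
   * the two simplified schemas match.
   */
  static List<String> findMismatches(
      Map<String, String> pubsubFields, Map<String, String> pipelineFields) {
    List<String> problems = new ArrayList<>();
    // Every field in the Pub/Sub schema must exist in the pipeline with the same type.
    for (Map.Entry<String, String> entry : pubsubFields.entrySet()) {
      String name = entry.getKey();
      String expectedType = entry.getValue();
      String actualType = pipelineFields.get(name);
      if (actualType == null) {
        problems.add("missing field: " + name);
      } else if (!actualType.equals(expectedType)) {
        problems.add(
            "type mismatch for field " + name
                + ": expected " + expectedType + " but found " + actualType);
      }
    }
    // Fields the pipeline declares but the Pub/Sub schema does not.
    for (String name : pipelineFields.keySet()) {
      if (!pubsubFields.containsKey(name)) {
        problems.add("unexpected field: " + name);
      }
    }
    return problems;
  }
}
```

Returning the full list of discrepancies, rather than failing on the first one, lets a caller report every mismatch between publisher and subscriber expectations in a single error.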

Ready and Done Measures

Ready

No blockers

Done

  • Tests detect errors in generating the Pub/Sub schema resource path
  • Tests detect errors in converting from Pub/Sub model Schema to Beam Schema
  • Tests detect errors in converting from Pub/Sub proto Schema to Beam Schema
  • Tests detect errors in querying source Pub/Sub schema
  • Integration tests detect errors in querying source Pub/Sub schema from provisioned resources

Issue Priority

Priority: 3

Issue Component

Component: io-java-gcp
