Support AWS Kinesis Data Streams as a Source #1082

@dlvenable

Is your feature request related to a problem? Please describe.

Some pipeline authors want to retrieve events from Amazon Kinesis Data Streams.

Describe the solution you'd like

Create a kinesis_data_streams source plugin. The Kinesis Client Library (KCL) can handle most of the client-side work, so I propose that the Data Prepper source use KCL to read from Kinesis.

KCL uses DynamoDB to coordinate consumers. Because KCL already depends on DynamoDB, and using Kinesis presumes an AWS account anyway, I propose that Data Prepper use DynamoDB for consumer coordination.

Data Prepper should support configuring the AWS resources that KCL needs, access to those resources, and the Kinesis stream name.

Example configuration:

source:
  kinesis_data_streams:
    stream_name: MyStream
    coordination_table_name: MyDynamoDbTable
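
As a rough illustration of how the plugin might hold these two settings, here is a minimal Java sketch. Everything in it is an assumption for discussion: the class name KinesisSourceConfig, the fromSettings method, and the choice to treat both keys as required are hypothetical and not part of Data Prepper or KCL. It uses a plain Map to stand in for the parsed plugin settings.

```java
import java.util.Map;

// Hypothetical holder for the two settings in the example configuration.
// The class and method names are illustrative, not an existing Data Prepper API.
final class KinesisSourceConfig {
    private final String streamName;
    private final String coordinationTableName;

    private KinesisSourceConfig(String streamName, String coordinationTableName) {
        this.streamName = streamName;
        this.coordinationTableName = coordinationTableName;
    }

    // Builds the config from the parsed settings under kinesis_data_streams.
    // Assumption: both keys are required.
    static KinesisSourceConfig fromSettings(Map<String, Object> settings) {
        return new KinesisSourceConfig(
                requireText(settings, "stream_name"),
                requireText(settings, "coordination_table_name"));
    }

    // Returns the trimmed value, or fails fast when the key is missing or blank.
    private static String requireText(Map<String, Object> settings, String key) {
        Object value = settings.get(key);
        if (!(value instanceof String) || ((String) value).trim().isEmpty()) {
            throw new IllegalArgumentException(
                    key + " is required and must be a non-empty string");
        }
        return ((String) value).trim();
    }

    String getStreamName() {
        return streamName;
    }

    String getCoordinationTableName() {
        return coordinationTableName;
    }

    public static void main(String[] args) {
        KinesisSourceConfig config = fromSettings(Map.of(
                "stream_name", "MyStream",
                "coordination_table_name", "MyDynamoDbTable"));
        System.out.println(config.getStreamName() + " / " + config.getCoordinationTableName());
    }
}
```

Failing fast on a missing stream or table name would surface configuration mistakes at pipeline startup rather than when KCL first tries to take a lease. Whether the coordination table name should be required or given a default is left open here.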

Additional context

https://javadoc.io/doc/software.amazon.kinesis/amazon-kinesis-client/latest/index.html
