redshift-alpha: Rationalization of IAM roles creation for Lambdas execution #32089

@matteo-pirola

Describe the feature

Overview

I suggest rationalizing the IAM roles that are created and used to execute the Lambda functions responsible for generating and executing the Redshift queries that create or update resources.

As of today, the framework generates many "single-scope" roles. This causes an explosion in the number of IAM roles in an account and makes it easy to hit the IAM quota on roles per account (1000 by default).

AS IS

Currently, when the Table construct is used to create Redshift tables, the generated CloudFormation stack contains the following:

  • A resource called Query Redshift Database XYZ, consisting of a Lambda function and an IAM role for its execution
  • For each table, a Lambda function and an "associated" IAM role for its execution; each role only has permission to execute the function it is "associated" with

This means that creating n tables yields a total of at least n+1 IAM roles (assuming they are all in the same stack).
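To put numbers on this, here is a rough sketch of the arithmetic. It assumes exactly one query role per stack plus one role per table, as described above, and uses the 1000-roles-per-account figure mentioned earlier. The function names are illustrative only and are not part of the CDK.

```python
IAM_ROLE_QUOTA = 1000  # roles per account, the figure cited in this issue


def roles_for_tables(n_tables: int) -> int:
    """Lower bound on IAM roles for n >= 1 tables in one stack: n + 1."""
    if n_tables < 1:
        raise ValueError("n_tables must be at least 1")
    # One role per table handler, plus one for the shared query resource.
    return n_tables + 1


def max_tables_within_quota(existing_roles: int = 0,
                            quota: int = IAM_ROLE_QUOTA) -> int:
    """Largest n such that existing_roles + (n + 1) <= quota."""
    return max(quota - existing_roles - 1, 0)


if __name__ == "__main__":
    print(roles_for_tables(10))          # roles needed for 10 tables
    print(max_tables_within_quota(400))  # tables that fit next to 400 existing roles
```

In an account that already holds a few hundred roles for other workloads, the remaining headroom for tables shrinks accordingly.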

Example code that I am using to create the tables:

import aws_cdk as cdk
from aws_cdk import Stack
from aws_cdk import aws_secretsmanager as secretsmanager
from aws_cdk.aws_redshift_alpha import Cluster, Column, Table, TableDistStyle, TableSortStyle
from constructs import Construct

class MyRedshiftTablesStack(Stack):
 
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
 
        secret = secretsmanager.Secret.from_secret_name_v2(
            self, "MySecretId", <secret_name>
        )
 

        cluster = Cluster.from_cluster_attributes(
            self,
            "MyCluster",
            cluster_name=<cluster_name>,
            cluster_endpoint_address=<cluster_endpoint_address>,
            cluster_endpoint_port=<cluster_endpoint_port>,
        )
 
        table_columns = [
            Column(
                name="column_1",
                data_type="integer",
                dist_key=False,
                comment="My first column"
            )
        ]

        dist_style = TableDistStyle.KEY
        removal_policy = cdk.RemovalPolicy.DESTROY
        sort_style = TableSortStyle.AUTO
 
        Table(
            self,
            "MyTableName",
            cluster=cluster,
            database_name="my_db",
            admin_user=secret,
            table_columns=table_columns,
            dist_style=dist_style,
            removal_policy=removal_policy,
            sort_style=sort_style,
            table_name="my_schema.my_table",
            table_comment="My first table"
        )

TO BE

I suggest allowing the developer to choose existing roles as defaults for the execution of the Lambda functions whenever possible.
This would make it possible to balance segregation of permissions against the number of roles on a case-by-case basis.

Use Case

This feature becomes crucial as the number of Redshift tables created with cdk-redshift-alpha grows. A number of IAM roles that scales linearly with the number of tables is not sustainable: it will eventually hit the IAM quota, while the per-table separation of roles provides very few benefits overall.

Proposed Solution

I suggest allowing the developer to use existing roles for the execution of the Lambda functions.
To achieve this, the creation of the underlying Lambda functions would need to accept an externally supplied role (e.g. passed in via the library's Table construct).
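A minimal sketch of the fallback behaviour this implies, written as plain Python so it runs without the CDK. Everything here is hypothetical: `TableSketch`, its `execution_role` parameter, and the `Role` stand-in are illustrations of the idea, not the actual `aws_redshift_alpha.Table` API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    """Stand-in for an IAM role; only the name matters in this sketch."""
    name: str


@dataclass
class TableSketch:
    """Hypothetical model of a table construct (NOT the real Table)."""
    table_name: str
    execution_role: Optional[Role] = None  # hypothetical optional parameter

    def resolve_role(self) -> Role:
        # Use the caller-supplied role when one is given; otherwise keep
        # the current behaviour of one dedicated role per table handler.
        if self.execution_role is not None:
            return self.execution_role
        return Role(name=f"{self.table_name}-handler-role")


if __name__ == "__main__":
    shared = Role("my-existing-role")
    with_shared = [TableSketch(f"t{i}", execution_role=shared) for i in range(5)]
    without = [TableSketch(f"t{i}") for i in range(5)]
    print(len({t.resolve_role() for t in with_shared}))  # distinct roles, shared case
    print(len({t.resolve_role() for t in without}))      # distinct roles, default case
```

Keeping the parameter optional means stacks that do not pass a role would behave as they do today.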

Other Information

If the proposed solution is not technically feasible, I suggest rethinking the number of roles generated by the library, e.g. by sharing a single IAM role among the Lambda functions created in the same stack.
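The shared-role alternative could follow a get-or-create pattern scoped to the stack. The sketch below models it in plain Python; `StackSketch`, `add_table`, and the role ID `SharedTableHandlerRole` are hypothetical names chosen for illustration and do not reflect how the library is implemented.

```python
from typing import Dict


class StackSketch:
    """Hypothetical stand-in for a CDK stack; tracks the roles it creates."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.roles: Dict[str, str] = {}

    def get_or_create_role(self, role_id: str) -> str:
        # Create the role the first time it is requested, then reuse it.
        if role_id not in self.roles:
            self.roles[role_id] = f"{self.name}-{role_id}"
        return self.roles[role_id]


def add_table(stack: StackSketch, table_name: str) -> str:
    """Return the role the table's Lambda would run under.

    Every table in the stack asks for the same role ID, so only one role
    is ever created regardless of table_name.
    """
    return stack.get_or_create_role("SharedTableHandlerRole")


if __name__ == "__main__":
    stack = StackSketch("MyRedshiftTablesStack")
    roles = {add_table(stack, f"my_schema.table_{i}") for i in range(50)}
    print(len(roles), len(stack.roles))  # number of distinct roles used / created
```

The trade-off is that the shared role must carry the union of the permissions each handler needs, which is the segregation-versus-quantity balance mentioned above.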

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.165.0

Environment details (OS name and version, etc.)

Linux EC2 - aws/codebuild/standard:7.0

Labels

  • @aws-cdk/aws-redshift: Related to Amazon Redshift
  • bug: This issue is a bug.
  • effort/medium: Medium work item – several days of effort
  • p1
