API to get all Column references in an Expr without cloning Columns #10505

@alamb


Is your feature request related to a problem or challenge?

This API is used in several optimizer passes to recursively find all column references in an expression:

pub fn Expr::to_columns(&self) -> Result<HashSet<Column>, DataFusionError>

However, it requires `clone()`ing every `Column` (which means allocating and copying a string), even if the caller only needs to check whether a reference exists. This is inefficient and slows down the optimizer.
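To make the cost concrete, here is a minimal sketch of the "only needs to check" case done without any allocation. The `Column` and `Expr` types below are toy stand-ins invented for illustration (they are not DataFusion's real definitions), and `has_column_ref` is a hypothetical helper name, not an existing API.

```rust
/// Toy stand-in for a column reference (assumption: not DataFusion's real type).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Column {
    pub name: String,
}

/// Toy expression tree with just enough variants to show the traversal.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Expr {
    Column(Column),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

/// Hypothetical helper: returns `true` as soon as any column reference is
/// found. Nothing is cloned and no `HashSet` is built.
pub fn has_column_ref(expr: &Expr) -> bool {
    match expr {
        Expr::Column(_) => true,
        Expr::Literal(_) => false,
        // `||` short-circuits, so the right side is skipped if the left matches.
        Expr::Add(left, right) => has_column_ref(left) || has_column_ref(right),
    }
}
```

By contrast, answering the same question with a `to_columns`-style API has to clone the name `String` of every distinct column before the caller can test whether the set is empty.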

Describe the solution you'd like

I would like an API that does not require copying.

Describe alternatives you've considered

One version might be this (the returned `HashSet` holds `&Expr`, not `Expr`):

impl Expr {
  /// Return all referenced `Column`s anywhere in this `Expr`
  ///
  /// For example
  /// * `a+b` would return `{a,b}`
  /// * `a + 1 + a` would return `{a}`
  fn column_refs(&self) -> HashSet<&Expr> {
    ...
  }
}
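One way the borrowed-set idea could look is sketched below. It is written against toy `Column` and `Expr` types defined here for illustration, not DataFusion's real ones, and the private `collect_column_refs` helper is a hypothetical name. Treat it as a sketch of the technique under those assumptions rather than the eventual implementation.

```rust
use std::collections::HashSet;

/// Toy stand-in for a column reference (assumption: not DataFusion's real type).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Column {
    pub name: String,
}

/// Toy expression tree with just enough variants to show the traversal.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Expr {
    Column(Column),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

impl Expr {
    /// Return borrowed references to every column expression in this tree.
    /// The set stores `&Expr`, so no `Column` (and no `String`) is cloned.
    pub fn column_refs(&self) -> HashSet<&Expr> {
        let mut refs = HashSet::new();
        self.collect_column_refs(&mut refs);
        refs
    }

    /// Hypothetical recursive helper; the lifetime `'a` ties every stored
    /// reference to the borrow of the root expression.
    fn collect_column_refs<'a>(&'a self, refs: &mut HashSet<&'a Expr>) {
        match self {
            Expr::Column(_) => {
                // `&Expr` hashes and compares by value, so duplicates collapse.
                refs.insert(self);
            }
            Expr::Literal(_) => {}
            Expr::Add(left, right) => {
                left.collect_column_refs(refs);
                right.collect_column_refs(refs);
            }
        }
    }
}
```

Because `&Expr` hashes and compares through the referenced value, `a + 1 + a` yields a single entry, matching the `{a}` example in the doc comment. A variant returning `HashSet<&Column>` would work the same way and might be more convenient for callers that only care about the column itself.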

Additional context

No response

Labels: enhancement (New feature or request)