Skip to content

sql: support lightweight composite types for in-flight values #24866

@knz

Description

@knz

Forked off #24832.
Needed for #16971.

The PostgreSQL type system supports "composite types" which are really just tuples of values where the arity is constant (same for every row) and each column in the tuple has a static label (same for every row).

Note that CockroachDB already supports tuples (and tuple types) in scalar expressions but so far their columns/elements were anonymous.

Composite types show up mainly as follows.

Reification of tables as composite values

If an identifier in a scalar expression refers to one of the relational expressions in the current context (e.g. a table name in the FROM clause), then the result of evaluating that identifier in the scalar context is a composite expression, with the relexp's columns as elements.

For example:

> SELECT kv FROM kv;
   kv
---------
 (37,2)
 (94,3)
 (30,3)

> SELECT (kv, kv) AS woo FROM kv;
          woo
-----------------------
 ("(37,2)","(37,2)")
 ("(94,3)","(94,3)")
 ("(30,3)","(30,3)")

Note: in these examples, the result tables have just 1 column. This is a radically different path/concept than e.g. SELECT kv.* FROM kv (star expansion) which produces as many columns as the FROM table.

Implementation wise, the identifier to the relexpr must be replaced by a suitable tuple constructor, e.g. ROW(...) with some extended syntax to define labels in the type.

Access to tuple columns in scalar contexts

If a scalar expression E has composite type (..., "a", ...) (tuple with at least some column label "a"), then the expression (E).a is valid and refers to the element labeled "a" in the tuple E.

For example:

> SELECT (kv).v FROM kv;
 v
---
 2
 3
 3

Note: This is a mechanism that is distinct from the common column name resolution on data sources, e.g. kv.v. The relational expression is first reified as a tuple in the scalar context, and then one of the tuple columns is accessed.

Implementation wise, the expression of the form (E).label must be handled by a dedicated scalar expression node type, for example ColumnAccessExpr introduced in #24832.

Propagation across relational expression boundaries

Composite types are regular types, and propagate naturally across sub-queries.

For example:

>  SELECT (a).v FROM (SELECT kv AS a FROM kv) AS woo;
 v
---
 2
 3
 3

Implementation wise, CockroachDB must recognize composite types during type checking. We can perhaps use the existing types.Table for this purpose, some more investigation needed to check this.

"Star expansion" in composite expressions

If an expression of the form (E).* is present at the top-level of a render expression (a.k.a. "Select target expression", "projection expression" or SelectExpr in the current source code), then a special rule for star expansion kicks in:

  • the expression E is checked to have composite type (according to the rules above)
  • the render expr is replaced by a sequence of expressions of the form (@n).lbl1, (@n).lbl2, (@n).lbl3, ... where:
    • @n refers to the single IndexedVar of an underlying projection that will compute the composite expression just once
    • lbl1, lbl2, ... refer to the labels of the composite.

Note: this seems to be a separate mechanism than the regular star expansion.

Fancy example

This is the original example that motivates this issue:

> SELECT (i.keys).n FROM (SELECT information_schema._pg_expandarray(ARRAY[3,2,1]) AS keys) AS i;

This query really means the following:

> WITH i AS SELECT (tmp_tbl) AS keys -- generates a composite expr, rename it to `keys`
              FROM information_schema._pg_expandarray(ARRAY[3,2,1])) AS tmp_tbl
    SELECT (tmp_tuple).n  -- uses a tuple element access sub-expression
      FROM (SELECT keys AS tmp_tuple FROM i)

In particular in the scalar expression (i.keys).n:

  • i identifies a relational expression in a FROM clause
  • i.keys is a regular column reference in one of the FROM relational expressions, and designates the (single) column in table i
  • the column i.keys has composite type, with arity 2 and labels n and x
  • the expression (i.keys).n accesses the column named n in the tuple column i.keys.

So in order to support this query CockroachDB must be able to:

  1. lift an SRF in render position into the FROM clause and replace it by a composite constructor
  2. properly type composite expressions and propagate their types
  3. support the tuple element access expression for composite types

We can start the work by introducing the proper typing and evaluation rules and use types.Table for composite expressions. No need to support storing of composite values yet.

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-sql-pgcompatSemantic compatibility with PostgreSQLC-enhancementSolution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions