Skip to content

CreateAggregateFunction skips the per-group callback for empty groups, diverging from SQLite's documented xFinal semantics #389

@pkit

Description

@pkit

Conn.CreateAggregateFunction registers an aggregate via the iter.Seq-based AggregateSeqFunction callback, but the callback is never invoked when a group has zero input rows. SQLite's C API guarantees the opposite: xFinal is always called exactly once per group, including the implicit single-group case of an ungrouped aggregate over an empty input.

As a result, custom aggregates registered through this API cannot produce the correct empty-input value for SQL-standard cases like COUNT(*) FROM <empty> (which must be 0, not NULL). The built-in count() and a manually-registered equivalent disagree, which is surprising for a library that otherwise tracks SQLite semantics faithfully.

Example:

package main

import (
      "fmt"
      "iter"

      "github.com/ncruces/go-sqlite3"
      _ "github.com/ncruces/go-sqlite3/embed"
)

func main() {
      db, err := sqlite3.Open(":memory:")
      if err != nil {
              panic(err)
      }
      defer db.Close()

      if err := db.CreateAggregateFunction("my_count", 0, 0,
              func(ctx *sqlite3.Context, seq iter.Seq[[]sqlite3.Value]) {
                      var n int64
                      for range seq {
                              n++
                      }
                      ctx.ResultInt64(n)
              }); err != nil {
              panic(err)
      }

      check := func(label, query string) {
              stmt, _, err := db.Prepare(query)
              if err != nil {
                      panic(err)
              }
              defer stmt.Close()
              stmt.Step()
              if stmt.ColumnType(0) == sqlite3.NULL {
                      fmt.Printf("%-20s -> NULL\n", label)
              } else {
                      fmt.Printf("%-20s -> %d\n", label, stmt.ColumnInt64(0))
              }
      }

      check("builtin count(*)", "SELECT count(*) FROM (SELECT 1) WHERE FALSE")
      check("custom my_count()", "SELECT my_count() FROM (SELECT 1) WHERE FALSE")
}

Output:

builtin count(*)     -> 0
custom my_count()    -> NULL

Results in weird bugs for the library users: goccy/bigquery-emulator#468

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions