fix(database_observability.postgres): Correctly handle table name casing when parsing postgres queries#5440

Merged
cristiangreco merged 4 commits into main from cristian/dbo11y-pg-parsed-table-name-case
Feb 10, 2026
@cristiangreco cristiangreco commented Feb 4, 2026

Brief description of Pull Request

When extracting table names from SQL queries for Postgres, ensure that identifier casing is handled according to Postgres rules: unquoted identifiers are folded to lowercase, while quoted identifiers preserve their case.

This change updates the normalizer in QueryDetails to retain quotation, and updates the TableRegistry validation logic to account for this behavior. The library's behavior seems inconsistent, though, hence the lowercasing fallback. If lowercasing is applied, the table is logged with its lowercase name.
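As a rough illustration of the folding rule described above, here is a minimal sketch. The helper name `foldIdentifier` is hypothetical and not taken from the PR; it only mirrors the stated rule (a double-quoted identifier keeps its case, anything else is lowercased) and does not try to handle every Postgres quoting edge case.

```go
package main

import (
	"fmt"
	"strings"
)

// foldIdentifier is a hypothetical helper (not part of the PR) that applies
// the rule from the description: an identifier wrapped in double quotes keeps
// its exact case once the quotes are stripped; any other identifier is folded
// to lowercase.
func foldIdentifier(id string) string {
	if len(id) >= 2 && strings.HasPrefix(id, `"`) && strings.HasSuffix(id, `"`) {
		return id[1 : len(id)-1]
	}
	return strings.ToLower(id)
}

func main() {
	// Print how a few sample identifiers are folded.
	for _, id := range []string{"USERS", "Users", `"Users"`, `"MyTable"`} {
		fmt.Printf("%s -> %s\n", id, foldIdentifier(id))
	}
}
```

Under this sketch, `USERS` and `Users` both resolve to `users`, while `"Users"` stays `Users`.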

Pull Request Details

Issue(s) fixed by this Pull Request

Notes to the Reviewer

PR Checklist

  • Documentation added
  • Tests updated
  • Config converters updated

 github.com/DataDog/datadog-api-client-go/v2 v2.51.0 // indirect
 github.com/DataDog/datadog-go/v5 v5.8.2 // indirect
-github.com/DataDog/go-sqllexer v0.1.10 // indirect
+github.com/DataDog/go-sqllexer v0.1.12 // indirect
Drive-by update of the library

@cristiangreco cristiangreco force-pushed the cristian/dbo11y-pg-parsed-table-name-case branch 2 times, most recently from fd03d08 to 6c1dc83 on February 4, 2026 16:54
Comment on lines +279 to +280
_, exists := tables[tableName]
return exists
Just to avoid shadowing

@cristiangreco cristiangreco changed the title from "fix: Database_observability: fix handling table name casing in Postgres" to "fix(database_observability): handle table name casing when parsing postgres queries" Feb 4, 2026
@cristiangreco cristiangreco changed the title from "fix(database_observability): handle table name casing when parsing postgres queries" to "fix(database_observability): Correctly handle table name casing when parsing postgres queries" Feb 4, 2026
@cristiangreco cristiangreco force-pushed the cristian/dbo11y-pg-parsed-table-name-case branch from 6c1dc83 to 965b26b on February 6, 2026 14:13
@cristiangreco cristiangreco marked this pull request as ready for review February 9, 2026 11:53
@cristiangreco cristiangreco requested review from a team as code owners February 9, 2026 11:53
github-actions bot commented Feb 9, 2026

🔍 Dependency Review

github.com/DataDog/go-sqllexer v0.1.10 → v0.1.12 — ⚠️ Needs Review

Summary:

  • No breaking API removals detected between v0.1.10 and v0.1.12.
  • New Normalizer option to preserve identifier quotation/casing was introduced, which can change the case/format of collected table names if enabled.
  • If your code relies on case-insensitive matching or unquoted identifiers, enabling the new option may require additional normalization logic (as done in this PR).

What changed and why it matters:

  • New Normalizer option: WithKeepIdentifierQuotation(true) keeps double-quoted identifiers intact (e.g., public."MyTable"), preserving exact case for schema-qualified names. Without it, identifiers are normalized and quotes may be lost.
  • When this option is enabled, the strings returned by the Normalizer’s "collect tables" feature can include quoted identifiers and preserved casing for schema-qualified names. Consumers doing registry validation or logging may need to normalize identifiers to maintain expected behavior (e.g., PostgreSQL’s folding of unquoted identifiers to lowercase).

Releases between the current and target versions:

  • v0.1.11: Added ability to preserve identifier quotation in Normalizer (option function on NewNormalizer). No method renames/removals observed.
  • v0.1.12: Follow-up fixes and improvements. No breaking API changes observed.

Evidence (in-project code updates needed due to the new behavior):

  • The upgrade adds WithKeepIdentifierQuotation(true) to sqllexer.NewNormalizer(...), which means downstream code now sees quoted identifiers for schema-qualified names and preserved case. To maintain consistent validation with a lowercased registry, additional normalization logic was introduced.

Required code updates (shown as concise diffs):

  1. Preserve identifier quotation during table parsing
- normalizer:       sqllexer.NewNormalizer(sqllexer.WithCollectTables(true), sqllexer.WithCollectComments(true)),
+ normalizer:       sqllexer.NewNormalizer(
+                       sqllexer.WithCollectTables(true),
+                       sqllexer.WithCollectComments(true),
+                       sqllexer.WithKeepIdentifierQuotation(true),
+                   ),
  2. Resolve and normalize table names when validating against the registry (handle PostgreSQL identifier folding and quoted/schema-qualified names)
-// IsValid returns whether or not a given database and parsed table name exists in the source-of-truth table registry
-func (tr *TableRegistry) IsValid(database database, parsedTableName string) bool {
+// IsValid returns whether a given database and parsed table name exists in the registry,
+// and also returns the resolved (possibly normalized) table name.
+func (tr *TableRegistry) IsValid(database database, parsedTableName string) (string, bool) {
     tr.mu.RLock()
     defer tr.mu.RUnlock()

     schemas, ok := tr.tables[database]
     if !ok {
-        return false
+        return parsedTableName, false
     }

     schemaName, tableName := parseSchemaQualifiedIfAny(parsedTableName)

     switch schemaName {
     case "":
         for _, tables := range schemas {
             if _, ok := tables[tableName]; ok {
-                return true
+                return string(tableName), true
             }
+            // Without schema, sqllexer doesn't preserve quotes; emulate PG folding
+            lowercaseName := table(strings.ToLower(string(tableName)))
+            if lowercaseName != tableName {
+                if _, ok := tables[lowercaseName]; ok {
+                    return string(lowercaseName), true
+                }
+            }
         }
     default:
         if tables, ok := schemas[schemaName]; ok {
-            _, ok := tables[tableName]
-            return ok
+            if _, exists := tables[tableName]; exists {
+                return string(schemaName) + "." + string(tableName), true
+            }
         }
     }
-    return false
+    return parsedTableName, false
 }
  3. Normalize schema-qualified identifiers according to PostgreSQL rules (strip quotes for quoted identifiers, lowercase unquoted)
-// parseSchemaQualifiedIfAny returns separated schema and table if the parsedTableName is schema-qualified, e.g. SELECT * FROM schema_name.table_name
+// parseSchemaQualifiedIfAny returns separated schema and table if the parsedTableName is schema-qualified.
+// It normalizes identifiers: quoted => preserve case; unquoted => lowercase (PG folding).
 func parseSchemaQualifiedIfAny(parsedTableName string) (schema, table) {
     parts := strings.SplitN(parsedTableName, ".", 2)
     if len(parts) == 2 {
-        return schema(parts[0]), table(parts[1])
+        return schema(formatPostgresIdentifier(parts[0])), table(formatPostgresIdentifier(parts[1]))
     }
     return "", table(parsedTableName)
 }

+func formatPostgresIdentifier(identifier string) string {
+    if len(identifier) >= 2 && identifier[0] == '"' && identifier[len(identifier)-1] == '"' {
+        return identifier[1 : len(identifier)-1]
+    }
+    return strings.ToLower(identifier)
+}
  4. Update call sites to handle the resolved name and to log the resolved name rather than the raw parsed one
 validated := false
+resolvedTable := table
 if c.tableRegistry != nil {
-    validated = c.tableRegistry.IsValid(databaseName, table)
+    resolvedTable, validated = c.tableRegistry.IsValid(databaseName, table)
 }

 c.entryHandler.Chan() <- database_observability.BuildLokiEntry(
     logging.LevelInfo,
     OP_QUERY_PARSED_TABLE_NAME,
-    fmt.Sprintf(`queryid="%s" datname="%s" table="%s" validated="%t"`, queryID, databaseName, table, validated),
+    fmt.Sprintf(`queryid="%s" datname="%s" table="%s" validated="%t"`, queryID, databaseName, resolvedTable, validated),
 )

Why these changes are necessary:

  • With WithKeepIdentifierQuotation(true), schema-qualified identifiers may include quotes and preserve case. To keep existing behavior (treat unquoted names case-insensitively per PostgreSQL and accurately match quoted names by exact case), the registry lookups need to normalize names and return a resolved, canonical table name for logging and downstream processing.
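To make the resolution behavior concrete, the following is a self-contained sketch of the lookup described above. It is a simplified stand-in, not the PR's `TableRegistry`: the `registry` type and the `resolve` and `unquoteOrLower` names are assumptions introduced for illustration, and locking and the per-database level are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// registry is a simplified, hypothetical stand-in for the table registry:
// schema name -> set of table names, stored with the casing the database uses.
type registry map[string]map[string]struct{}

// unquoteOrLower strips surrounding double quotes (preserving case) or,
// for unquoted identifiers, folds to lowercase.
func unquoteOrLower(id string) string {
	if len(id) >= 2 && id[0] == '"' && id[len(id)-1] == '"' {
		return id[1 : len(id)-1]
	}
	return strings.ToLower(id)
}

// resolve returns the resolved table name and whether it exists in reg.
// When nothing matches, the parsed name is returned unchanged.
func resolve(reg registry, parsed string) (string, bool) {
	parts := strings.SplitN(parsed, ".", 2)
	if len(parts) == 2 {
		// Schema-qualified: fold or unquote each part, then look it up exactly.
		schema, table := unquoteOrLower(parts[0]), unquoteOrLower(parts[1])
		if _, ok := reg[schema][table]; ok {
			return schema + "." + table, true
		}
		return parsed, false
	}
	// Unqualified: try the name as parsed, then its lowercase form.
	lower := strings.ToLower(parsed)
	for _, tables := range reg {
		if _, ok := tables[parsed]; ok {
			return parsed, true
		}
		if _, ok := tables[lower]; ok {
			return lower, true
		}
	}
	return parsed, false
}

func main() {
	reg := registry{"public": {"users": {}, "MyTable": {}}}
	for _, name := range []string{"USERS", "PUBLIC.USERS", `public."MyTable"`, "public.MyTable"} {
		resolved, ok := resolve(reg, name)
		fmt.Printf("%s -> %s (validated=%t)\n", name, resolved, ok)
	}
}
```

In this sketch the unquoted `public.MyTable` is folded to `public.mytable` for the lookup and therefore does not match the mixed-case entry, whereas the quoted form does.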

Test impact:

  • Add/adjust tests for:
    • Uppercase/mixed-case table names vs. lowercase registry.
    • Quoted schema/table names preserving case.
    • Schema-qualified inputs folding correctly when unquoted.

Example tests added in this PR:

  • Uppercase table matching lowercase registry:
    • SELECT * FROM SOME_TABLE → resolved/logged as some_table
  • Schema-qualified uppercase inputs:
    • SELECT * FROM PUBLIC.USERS → resolved/logged as public.users
  • Quoted identifiers:
    • public."MyTable" matches only exact-cased entry.

Notes from dependency releases (concise):

  • v0.1.11:
    • Added Normalizer option to keep identifier quotation.
    • No removals of existing APIs observed.
  • v0.1.12:
    • Minor fixes/improvements; no breaking API changes observed.

Action for maintainers:

  • If you enable WithKeepIdentifierQuotation(true) (recommended for accuracy with mixed/quoted identifiers), ensure you normalize identifiers as shown above. If you do not enable it, your existing behavior should remain compatible, but you will not have precise case/quote fidelity for schema-qualified names.
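One case worth keeping in mind when normalizing: `parseSchemaQualifiedIfAny` splits on the first dot, so a quoted identifier that itself contains a dot (for example `"my.schema"."Tbl"`) would be split inside the quotes. Whether the library ever emits such names is an assumption here, not something the PR states. If it does, a quote-aware split could look like the sketch below; `splitQualified` is a hypothetical name, not part of the PR.

```go
package main

import "fmt"

// splitQualified is a hypothetical helper that splits a parsed name at the
// first dot that is NOT inside double quotes. It returns the schema part
// ("" if the name is unqualified) and the table part, both still in their
// raw, possibly quoted, form.
func splitQualified(name string) (schema, table string) {
	inQuotes := false
	for i := 0; i < len(name); i++ {
		switch name[i] {
		case '"':
			// Toggle on every quote; an escaped quote ("") toggles twice
			// and so leaves the state unchanged.
			inQuotes = !inQuotes
		case '.':
			if !inQuotes {
				return name[:i], name[i+1:]
			}
		}
	}
	return "", name
}

func main() {
	for _, name := range []string{"public.users", `"my.schema"."Tbl"`, "users"} {
		schema, table := splitQualified(name)
		fmt.Printf("%s -> schema=%q table=%q\n", name, schema, table)
	}
}
```

The returned parts would still need the quote stripping and lowercase folding applied afterwards.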

Notes

  • Only one dependency changed in go.mod: github.com/DataDog/go-sqllexer v0.1.10 → v0.1.12 (direct in root module, indirect in submodules).
  • No other dependency changes were assessed.

@cristiangreco cristiangreco force-pushed the cristian/dbo11y-pg-parsed-table-name-case branch from 965b26b to 2c37693 on February 9, 2026 15:18
// normalizePostgresIdentifier handles PostgreSQL identifier case folding.
// Quoted identifiers (e.g., "MyTable") preserve their exact case after stripping quotes.
// Unquoted identifiers are folded to lowercase to match PostgreSQL's behavior.
func normalizePostgresIdentifier(identifier string) string {
nit: is this really normalisation?

@cristiangreco cristiangreco changed the title from "fix(database_observability): Correctly handle table name casing when parsing postgres queries" to "fix(database_observability.postgres): Correctly handle table name casing when parsing postgres queries" Feb 10, 2026
@cristiangreco cristiangreco merged commit 7cca2b9 into main Feb 10, 2026
52 of 53 checks passed
@cristiangreco cristiangreco deleted the cristian/dbo11y-pg-parsed-table-name-case branch February 10, 2026 14:38
@grafana-alloybot grafana-alloybot bot mentioned this pull request Feb 10, 2026
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 25, 2026