fix: escape delimiter characters in applyEscape to prevent SQL injection#2811
Conversation
…ent SQL injection The applyEscape() function wraps user-supplied template parameter values with delimiter characters (backticks, double quotes, single quotes, or square brackets) but does not escape those same delimiter characters within the value itself. This allows an attacker to break out of the quoted identifier and inject arbitrary SQL. For example, with backtick escaping configured: - Input: `users` OR 1=1--` - Before fix: `users` OR 1=1--`` (delimiter closed early, SQL injected) - After fix: `users`` OR 1=1--``` (backtick inside value is doubled) The fix applies standard SQL identifier escaping by doubling the delimiter character within the value before wrapping: - Backticks: ` → `` - Double quotes: " → "" - Single quotes: ' → '' - Square brackets: ] → ]] This matches the escaping conventions used by MySQL (backticks), PostgreSQL/ANSI SQL (double quotes), SQL Server (square brackets), and standard SQL (single quotes).
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical SQL injection vulnerability in the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request correctly addresses a critical SQL injection vulnerability by properly escaping delimiter characters in the applyEscape function. The method used, doubling the delimiter characters, is the standard approach for SQL. However, the changes lack corresponding unit tests to validate the fix for injection scenarios. It is highly recommended to add test cases that cover inputs containing the delimiter characters to prevent regressions and ensure the fix is robust.
Adds 4 test cases that verify delimiter characters within values are properly escaped (doubled) to prevent SQL injection: - Backtick in value: users` OR 1=1-- → `users`` OR 1=1--` - Double quote in value: col" OR 1=1-- → "col"" OR 1=1--" - Single quote in value: val' OR 1=1-- → 'val'' OR 1=1--' - Square bracket in value: col] OR 1=1-- → [col]] OR 1=1--]
|
Thank you for the PR. We will review shortly. |
dd58d8b to
18e01ee
Compare
…to prevent injection (#3219) ## Summary This PR fixes identifier-injection vulnerabilities in three built-in tools that interpolate caller-supplied parameters directly into SQL without sanitisation. --- ### 1. `clickhouse-list-tables` — database identifier injection (original fix) `Tool.Invoke` interpolates the caller-supplied `database` parameter directly into a `SHOW TABLES FROM %s` statement: ```go query := fmt.Sprintf("SHOW TABLES FROM %s", database) ``` The existing comment acknowledges the risk. The fix adds a `validIdentifier` regex (`^[A-Za-z_][A-Za-z0-9_]*$`) that must match before interpolation. Concrete payloads on a default ClickHouse deployment: | `database` parameter | rendered query | effect | |---|---|---| | `default LIKE '%secret%'` | `SHOW TABLES FROM default LIKE '%secret%'` | filters listing by attacker-chosen pattern | | `system FORMAT JSONEachRow` | `SHOW TABLES FROM system FORMAT JSONEachRow` | enumerates `system.*` (system.users, etc.) | | `default INTO OUTFILE '/tmp/x.tsv'` | `SHOW TABLES FROM default INTO OUTFILE '/tmp/x.tsv'` | writes to disk on ClickHouse host | --- ### 2. `bigquery-forecast` — backtick + column-name injection Two injection classes: **a. Backtick injection in `history_data` (table identifier path)** ```go historyDataSource = fmt.Sprintf("TABLE `%s`", historyData) ``` When `allowed_datasets` is not configured (the default), no validation runs before this line. An attacker-supplied `history_data` containing a backtick closes the opening delimiter and appends arbitrary SQL: ``` history_data = "ds.tbl` UNION ALL SELECT secret FROM private.pii --" → TABLE `ds.tbl` UNION ALL SELECT secret FROM private.pii --` ``` **b. Column-name injection in `data_col`, `timestamp_col`, `id_cols`** ```go sql := fmt.Sprintf(`... data_col => '%s', timestamp_col => '%s' ...`, dataCol, timestampCol, ...) ``` Column names are interpolated as single-quoted strings with no validation. A value containing `'` breaks out of the string literal context. **Fix:** `ValidTableID` (restricts `history_data` to `[a-zA-Z0-9_]+` components with 1–2 dots) applied in the table-ID else-branch; `ValidColumnName` (`^[a-zA-Z_][a-zA-Z0-9_]*$`) applied to `data_col`, `timestamp_col`, and each element of `id_cols`. --- ### 3. `bigquery-analyze-contribution` — backtick + OPTIONS injection **a. Backtick injection in `input_data` (table identifier path)** ```go inputDataSource = fmt.Sprintf("SELECT * FROM `%s`", inputData) ``` Same root cause as bigquery-forecast: when `allowed_datasets` is absent, no validation before interpolation. **b. Column-name and OPTIONS injection** ```go options = append(options, fmt.Sprintf("IS_TEST_COL = '%s'", paramsMap["is_test_col"])) // and each element of dimension_id_cols: strCols = append(strCols, fmt.Sprintf("'%s'", c)) ``` `is_test_col` and `dimension_id_cols` are column names interpolated into BigQuery ML `OPTIONS()` string literals without validation. `contribution_metric` (e.g. `SUM(col)/COUNT(DISTINCT col2)`) goes into `OPTIONS(CONTRIBUTION_METRIC = '%s')`. A single-quote in this value breaks the OPTIONS string literal. **Fix:** `ValidTableID` for `input_data` in the table-ID else-branch; `ValidColumnName` for `is_test_col` and each `dimension_id_cols` element; single-quote check for `contribution_metric`. --- ## Shared validators in `bigquerycommon` `ValidTableID` and `ValidColumnName` are exported from the `bigquerycommon` package so both tools share the same rules. Tests are in `validators_test.go`. ## Test plan - [x] `go build ./...` clean - [x] `go test ./internal/tools/bigquery/... -v` passes (all existing tests + new `TestValidTableID`, `TestValidColumnName`) - [x] `go test ./internal/tools/clickhouse/clickhouselisttables/... -v` passes ## Not a dupe of #2811 / #779 Those cover the `templateParameter` flow (`internal/util/parameters/parameters.go:applyEscape`). The three tools fixed here use hard-coded `fmt.Sprintf` interpolation in their own `Invoke` methods — none of the templateParameter mitigations apply. --------- Co-authored-by: Yuan Teoh <45984206+Yuan325@users.noreply.github.com>
|
/gcbrun |
averikitsch
left a comment
There was a problem hiding this comment.
We may need to make this database specific.
|
/gcbrun |
Summary
The
applyEscape()function ininternal/util/parameters/parameters.gowraps user-supplied template parameter values with delimiter characters (backticks, double quotes, single quotes, or square brackets) but does not escape those same delimiter characters within the value itself. This allows an attacker to break out of the quoted identifier and inject arbitrary SQL.Security Impact
applyEscapeis the documented mitigation for SQL injection in template parameters:However, this mitigation is broken because delimiter characters inside the value are not escaped.
Example (backtick escaping)
Attack input:
users` WHERE 1=1 UNION SELECT * FROM credentials--Before fix (VULNERABLE):
The backtick in the input closes the identifier early, injecting arbitrary SQL.
After fix (SAFE):
The backtick inside the value is doubled, keeping the identifier intact.
The Fix
Standard SQL identifier escaping — double the delimiter character within the value before wrapping:
`` →``"" → ""'' → '']] → ]]Affected Code
internal/util/parameters/parameters.go, lines 783-796