Skip to content

Fixed #36795 -- Enforced quoting of all database object names.#20587

Merged
jacobtylerwalls merged 3 commits intodjango:mainfrom
charettes:ticket-36795
Mar 19, 2026
Merged

Fixed #36795 -- Enforced quoting of all database object names.#20587
jacobtylerwalls merged 3 commits intodjango:mainfrom
charettes:ticket-36795

Conversation

@charettes
Copy link
Copy Markdown
Member

Trac ticket number

ticket-36795

Branch description

Currenly trying to learn more about why we started not quoting users probided aliases as 9c52d56 which introduced the change has very little details about its origin.

AI Assistance Disclosure (REQUIRED)

  • No AI tools were used in preparing this PR.
  • If AI tools were used, I have disclosed which ones, and fully reviewed and verified their output.

@charettes
Copy link
Copy Markdown
Member Author

This is promising.

So from my Git archeology session it seems that the behaviour was introduced to support backends that care about the case of aliases such as Postgres.

If we are to move forward with this patch it means that mixing RawSQL references with annotations will have to adjust their quoting adequately (e.g. annotate(bAr=...).filter(RawSQL("bar = %s", (...))) but I think that's a worthy compromise given the benefits its provides from an SQL injection protection perspective.

Running tests on Oracle now to see if this can help with ticket-20226

@charettes
Copy link
Copy Markdown
Member Author

charettes commented Jan 27, 2026

The Oracle failures were very telling and explain ticket-20226.

As Shai pointed out on the ticket we'd have to break backward compatiblity to change that so this ticket will continue to focus solely on the quoting aspect.

It's kind of weird that we decided to default to always upper casing on Oracle to deal with SQL92 delimited (quoted) names handling of case while we decided not to do the same thing on Postgres which also follows the same standard.

Copy link
Copy Markdown
Member Author

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacobtylerwalls let me know what you think of the approach so far. The commit history obvisouly needs some rework.

We'll want to keep disallowing some characters in users provided aliases until we adjust the quote_alias method of each backend to transform them (e.g. handle " as """, strip \x00) according to their respective docs) but the current patch should at least provide a fondation to ensure we don't have ever skip calling quote_name on aliases.

Copy link
Copy Markdown
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sparks a lot of joy. Makes total sense for a first step.

My understanding is that the rest of the plan is:

  • make quote_name_unless_alias() a no-op for connection.ops.quote_name() and deprecate it
  • document the need to possibly adjust casing of RawSQL expressions
  • [in follow ups:] begin undoing FORBIDDEN_ALIAS_PATTERN by auditing the quote_name() methods for each backend to see what needs special handling

Do you anticipate any issues with these .lower() comparisons?

def resolve_expression(
self, query=None, allow_joins=True, reuse=None, summarize=False, for_save=False
):
# Resolve parents fields used in raw SQL.
if query.model:
for parent in query.model._meta.all_parents:
for parent_field in parent._meta.local_fields:
if parent_field.column.lower() in self.sql.lower():
query.resolve_ref(
parent_field.name, allow_joins, reuse, summarize
)
break

@charettes
Copy link
Copy Markdown
Member Author

charettes commented Feb 1, 2026

@jacobtylerwalls

My understanding is that the rest of the plan is:

Yep that would be the plan, I'll spend some time re-organizing the commits to do the following

  1. Introduce SQLCompiler.quote_name, switch everything to it, document the impact to raw interfaces (extra, RawSQL) in the release notes.
  2. Deprecate quote_name_unless_alias and document it.

Adjusting quote_name to handle more chars should effectively be done in a follow up. I was thinking we could hash the tail of aliases the moment they include disallowed characters (per backend rules) and that would allow pretty much any user specified aliases to be provided and hopefully halt this torrent of reports.

Hashing the moment a disallowed character is provided is safe even if it results in a different name as long as it's deterministic. We don't have to worry about whether it's a user provided alias or not as the moment it includes disallowed characters it means an object could not have been created with this name in the first place and thus it must be a user provided alias.

Do you anticipate any issues with these .lower() comparisons?

def resolve_expression(
self, query=None, allow_joins=True, reuse=None, summarize=False, for_save=False
):
# Resolve parents fields used in raw SQL.
if query.model:
for parent in query.model._meta.all_parents:
for parent_field in parent._meta.local_fields:
if parent_field.column.lower() in self.sql.lower():
query.resolve_ref(
parent_field.name, allow_joins, reuse, summarize
)
break

Not that I can think of as even if the name is quoted it will stil be idenifiable through a containment check. In other words

'quoted_alias' in 'UPPER(quoted_alias) will still pass with 'quoted_alias' in 'UPPER("quoted_alias").

@charettes charettes changed the title Refs #36795 -- Always quote user-provided aliases Fixed #36795 -- Enforced quoting of all database object names. Feb 1, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 1, 2026

📊 Coverage Report for Changed Files

-------------
Diff Coverage
Diff: origin/main...HEAD, staged and unstaged changes
-------------
django/db/backends/mysql/compiler.py (0.0%): Missing lines 31
django/db/models/expressions.py (100%)
django/db/models/sql/compiler.py (91.7%): Missing lines 1489
django/db/models/sql/datastructures.py (100%)
-------------
Total:   17 lines
Missing: 2 lines
Coverage: 88%
-------------


Note: Missing lines are warnings only. Some lines may not be covered by SQLite tests as they are database-specific.

For more information about code coverage on pull requests, see the contributing documentation.

@jacobtylerwalls
Copy link
Copy Markdown
Member

Not that I can think of as even if the name is quoted it will stil be idenifiable through a containment check

To flesh out my thought, based on your hint I was anticipating that a user would need to change:

.annotate(bAr=...).filter(foo=RawSQL("bar = %s", (...)))

to:

.annotate(bAr=...).filter(foo=RawSQL("bAr = %s", (...)))

And then we would hit this line, where self.sql is now

(Pdb) self.sql
'bAr = %s'

But still maps to the same self.sql.lower() as before, meaning the containment check could return True when we don't want it to return True, given that we're introducing case sensitivity by quoting aliases:

# sortorder is a field on parent model
# Raw query on the proxy model using a uniquely cased alias SORTORDER should not cause parent field resolving?
>>> Proxy.objects.annotate(SORTORDER=F("pk")).annotate(foo=RawSQL("SORTORDER = %s", (1,))).query
# str()
SELECT ..., "tiles"."tileid" AS "SORTORDER", (SORTORDER = 1) AS "foo" FROM "tiles"

This still resolves sortorder field on the parent model even though SORTORDER is now case-distinct. So just a little unnecessary resolving? Don't know if this is a real problem, just hunting around for edge cases.

@charettes
Copy link
Copy Markdown
Member Author

charettes commented Feb 23, 2026

@jacobtylerwalls

to flesh out my thought, based on your hint I was anticipating that a user would need to change

Some Raw SQL references to annotations composed of upper cased characters might to be updated but I'll note that the SQL standard doesn't allow SELECT members to be referenced by follow up members of the same query.

Example in the SELECT clause (which would be follow up annotate call)s

and in the WHERE clause (which would be follow up filter call) except for SQLite which ignores the case anyway.

So I can't think of common cases where RawSQL would need to be adjusted in the way you describe

There are cases like a FilteredRelation annotation (which now has its JOIN name quoted) but these need to be referenced by a filter or annotate to be actually used (otherwise the join is not materialized) so I feel like the intersection of mixed case is very small.

As for the poor man implicit parent JOIN logic in RawSQL.as_sql I wouldn't worry too much about it as it can already be broken in multiple ways (e.g. imagine a parent field with a very generic name like rent colliding with the current_timestamp function usage in RawSQL) and the annotate.annotate example you provided is not realistic as a SELECT member cannot reference a previous alias by name unless I'm missing something.

@jacobtylerwalls
Copy link
Copy Markdown
Member

Thanks, that helps 👍

This still resolves sortorder field on the parent model even though SORTORDER is now case-distinct. So just a little unnecessary resolving? Don't know if this is a real problem, just hunting around for edge cases.

Looking again, this example was unnecessarily complicated. Reducing to just one .annotate() call that uses an uppercase variant (SORTORDER) of a parent model's column (sortorder), and then orders by it in an order_by() call with RawSQL: the unnecessary resolving I saw is already present on main, so we don't need to be concerned here.

Copy link
Copy Markdown
Member Author

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look into adding the missing release notes in the next few days.

@charettes charettes marked this pull request as ready for review March 13, 2026 05:00
@charettes
Copy link
Copy Markdown
Member Author

buildbot, test on oracle.

@shaib
Copy link
Copy Markdown
Member

shaib commented Mar 17, 2026

I share @jacobtylerwalls 's joy at the promise in this, but at first glance, I suspect there's potential for breaking backwards compatibility (mostly with the extra aliases). I still need to look deeper to say anything concrete.

@shaib
Copy link
Copy Markdown
Member

shaib commented Mar 17, 2026

Ok, it seems I was confused by the name extra_select in the deleted code portion. If I read the code correctly, it does not refer to fields added by extra(select="...", but to fields added to the query for order-and-distinct reasons, which weren't requested as result columns. Also if I get it right, the validity of such field names is already verified in other code paths, so it's not a problem.

I was worried that something like .extra(select={"Alpha": 16})[0].alpha might work (before the change) because the extra name would be placed, unquoted, in a case-insensitive context. Apparently that already doesn't happen.

@charettes
Copy link
Copy Markdown
Member Author

charettes commented Mar 17, 2026

I was worried that something like .extra(select={"Alpha": 16})[0].alpha might work (before the change) because the extra name would be placed, unquoted, in a case-insensitive context. Apparently that already doesn't happen.

yeah the actual annotation attribute name and the alias used at the SQL level are (thankfully) two distinct things.

if annotation_col_map:
for attr_name, col_pos in annotation_col_map.items():
setattr(obj, attr_name, row[col_pos])

Since the database adapters always return tuples we keep a annotation-index map and perform assignment so the Python code is not affected by the SQL level alias.

This means that once quoting is in place we could potentially lift the limitations we've enforced on annotate and friends to disallow some characters and hash them if they are too hard to quote without affecting the Python level interface. Obviously that would have an impact on RawSQL references but as discussed above most backends don't allow SELECT aliases to be referenced in WHERE so that would only leave order_by.

Copy link
Copy Markdown
Member

@jacobtylerwalls jacobtylerwalls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this!

Copy link
Copy Markdown
Member

@felixxm felixxm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@charettes Thanks 👍

This ensures all database identifiers are quoted independently of their orign
and most importantly that user provided aliases through annotate() and alias()
which paves the way for dropping the allow list of characters such aliases can
contain.

This will require adjustments to raw SQL interfaces such as RawSQL that might
make reference to ORM managed annotations as these will now be quoted.

The `SQLCompiler.quote_name_unless_alias` method is kept for now as an alias
for the newly introduced `.quote_name` method but will be duly deprecated in
a follow up commit.
…liases feature flag.

Now that user provided aliases are systematically quoted there is no need to
disallow the usage of the dollar sign on Postgres.
It has been superseded with .quote_name(), which ensures aliases are
always quoted.
@charettes
Copy link
Copy Markdown
Member Author

Thanks for reviews @shaib, @felixxm and the latest tweaks @jacobtylerwalls 🙇

@jacobtylerwalls jacobtylerwalls merged commit 1786cd8 into django:main Mar 19, 2026
42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants