88

What are the best practices for mitigating SQL injection attacks when using SQLAlchemy?

2
  • 45
    Using SQLAlchemy is the best practice. :-) Commented Aug 5, 2012 at 12:22
  • Naively using SQLAlchemy is not going to protect you per se. Use ORM Mapped Classes and do not use raw SQL! Commented Oct 8, 2025 at 14:29

5 Answers 5

114

tldr: Avoid raw SQL as much as possible.

The accepted answer is lazy and incorrect. The filter method accepts raw SQL, and if used in that way, is fully susceptible to SQL injection attacks. For instance, if you were to accept a value from a url and combine it with raw sql in the filter, you are open to attack:

session.query(MyClass).filter("foo={}".format(getArgs['val']))

using the above code and the below url, you would be injecting SQL in to your filter statement. The code above would return all rows in your database.

URL encoded:

https://example.com/?val=2%20or%201%20=%201

Easier to understand (URL decoded):

https://example.com/?val=2 or 1 = 1
Sign up to request clarification or add additional context in comments.

8 Comments

"unless you deliberately bypass SQLAlchemy's quoting mechanisms..." Yes entering raw sql is deliberately bypassing that quoting mechanism. So no, the above answer is not incorrect.
I disagree. That you can pass raw sql to the filter method is part of sqlalchemy, not some end-around hack...so its worth noting here as something to be aware of.
If I have to take in user input for a filter, what's the correct way to make sure the user isn't entering raw SQL to drop tables or any other unexpected behavior?
@divide_by_zero well use this session.query(MyClass).filter(MyClass.foo == "{}".format(getArgs['val'])) This will probably throw psycopg2.InternalError invalid syntax if you try to inject something
Why on earth would you do this instead of session.query(MyClass).filter(MyClass.foo==getArgs['val']) or session.query(MyClass).filter_by(foo=getArgs['val'])?
|
55

If you have any "special" characters (such as semicolons or apostrophes) in your data, they will be automatically quoted for you by the SQLEngine object, so you don't have to worry about quoting. This also means that unless you deliberately bypass SQLAlchemy's quoting mechanisms, SQL-injection attacks are basically impossible.

[per http://www.rmunn.com/sqlalchemy-tutorial/tutorial.html]

2 Comments

The answer says that the quote comes from "the" documentation, when it doesn't: it seems to come from a tutorial not associated with SQLAlchemy. Second, the quote is in context of part of the SQLAlchemy API that will correctly handle escaping, using an example that handles escaping. However, you can still use execute() or other literal data that will NOT be escaped by SQLAlchemy. Yes, in MOST cases SQLAlchemy will auto-escape, but if you are using literals or raw SQL, you can still shoot yourself in the foot.
I'm looking to find the specific lines of code in the SQLAlchemy code repo that substantiates what @carson ip references as SQLAlchemy's "quoting mechanisms" Any tips?
17

To add to @Tendrid answer. I did a little investigation using quiet naive approach. filter method has *criterion as its argument, several other ORM Query methods have similar argument.

In case of filter method *criterion argument ends up passed into _literal_as_text, which in case of string - marks it as safe sql (please correct me if I'm wrong). Therefore it makes it unsafe.

Here is outcome of ORM Query class method investigation with *criterion argument:

filter   - uses _literal_as_text (NOT SAFE)
having   - uses _literal_as_text (NOT SAFE)

distinct - uses _literal_as_label_reference (NOT SAFE)
group_by - uses _literal_as_label_reference (NOT SAFE)
order_by - uses _literal_as_label_reference (NOT SAFE)

join     - uses model attributes to resolve relation (SAFE)

Examples of possible method missuses (to keep it simple, string formatting is skipped):

db.session.query(User.login).group_by('login').having('count(id) > 4; select name from roles').all()
db.session.query(User.login).distinct('name) name from roles /*').order_by('*/').all()
db.session.query(User.login).order_by('users_login; select name from roles').all()
db.session.query(User.login).group_by('login union select name from roles').all()

Note that these methods are only unsafe if string literal is passed.

Comments

11

I'm inclined to agree with @Tendrid's answer.

If you write this:

session.query(MyClass).filter("foo={}".format(getArgs['val']))

... you are creating an injection vulnerability.

SqlAlchemy's approach is to use bound parameters to avoid these injection attacks. The way you're meant to use filter() is to write:

session.query(MyClass).filter(MyClass.foo == getArgs['va'])

As SqlAlchemy has overloaded python's operators like == to escape the SQL correctly (and avoid injection). See here

There is a warning about this buried in the documentation of SqlAlchemy here that says:

Always use bound parameters

As mentioned at the beginning of this section, textual SQL is not the usual way we work with SQLAlchemy. However, when using textual SQL, a Python literal value, even non-strings like integers or dates, should never be stringified into SQL string directly; a parameter should always be used. This is most famously known as how to avoid SQL injection attacks when the data is untrusted. However it also allows the SQLAlchemy dialects and/or DBAPI to correctly handle the incoming input for the backend. Outside of plain textual SQL use cases, SQLAlchemy’s Core Expression API otherwise ensures that Python literal values are passed as bound parameters where appropriate.

And there's a section in the glossary on bound parameters here

It says:

Bound parameters are the primary means in which data is passed to the DBAPI database driver. While the operation to be invoked is based on the SQL statement string, the data values themselves are passed separately, where the driver contains logic that will safely process these strings and pass them to the backend database server, which may either involve formatting the parameters into the SQL string itself, or passing them to the database using separate protocols.

The specific system by which the database driver does this should not matter to the caller; the point is that on the outside, data should always be passed separately and not as part of the SQL string itself. This is integral both to having adequate security against SQL injections as well as allowing the driver to have the best performance.

Basically that means this:

session.query(MyClass).filter("foo={}".format(getArgs['val']))

... is broken because you're passing the data to the filter() together with the SQL statement foo=<data>.

You're meant to always keep the statement and data separate, i.e.:

session.query(MyClass).filter(MyClass.foo == getArgs['va'])

or

session.query(MyClass).filter_by(foo=getArgs['va'])

As then SqlAlchemy can work it's magic and do the escaping with bound parameters.

2 Comments

"SqlAlchemy has overloaded python's operators like == to escape the SQL correctly" - no - SQLAlchemy delegates escaping of values to the underlying DB-API connection. SQLAlchemy constructs the SQL statement with placeholders for values, then passes the statement and values to the DB-API connections cursor.execute method. This is what the glossary quote is saying.
Erm, well. Yes, your comment is more complete. But you're proving the correctness of what I said. If SqlAlchemy hadn't overloaded the python operators like ==, then Python would evaluate the == before a method like filter() receives anything. And it would result in a boolean. The boolean won't result in a useful SQL query. So SqlAlchemy has overloaded the python operators like == to return Sql statements wrapped in python objects. That allows it to do escaping whenever it passes those statements down to the DB-API. We could edit my answer to include what you've said because it's informative.
0

The accepted answer is not clear enough. Here is what the official documentation states:

[...] data should always be passed separately and not as part of the SQL string itself. This is integral both to having adequate security against SQL injections as well as allowing the driver to have the best performance.

https://docs.sqlalchemy.org/en/20/glossary.html#term-bind-parameters

Meaning: SQLAlchemy queries are safe if you use ORM Mapped Classes instead of plain strings (raw SQL). You can find official documentation here.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.