- It is now possible to specify flow summaries in the format "MyPkg;Member[list_map];Argument[1].ListElement;Argument[0].Parameter[0];value"
- Deleted many models that used the old dataflow library, the new models can be found in the `python/ql/lib/semmle/python/frameworks` folder.
- More precise modeling of several container functions (such as `sorted`, `reversed`) and methods (such as `set.add`, `list.append`).
- Added modeling of taint flow through the template argument of `flask.render_template_string` and `flask.stream_template_string`.
- Deleted many deprecated predicates and classes with uppercase `API`, `HTTP`, `XSS`, `SQL`, etc. in their names. Use the PascalCased versions instead.
- Deleted the deprecated `getName()` predicate from the `Container` class, use `getAbsolutePath()` instead.
- Deleted many deprecated module names that started with a lowercase letter, use the versions that start with an uppercase letter instead.
- Deleted many deprecated predicates in `PointsTo.qll`.
- Deleted many deprecated files from the `semmle.python.security` package.
- Deleted the deprecated `BottleRoutePointToExtension` class from `Extensions.qll`.
- Type tracking is now aware of flow summaries. This leads to a richer API graph, and may lead to more results in some queries.
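
As a rough illustration of the container entries above, the following Python snippet shows the kind of analyzed code the more precise modeling is meant to cover: a value passing through `list.append`, `sorted`, `reversed`, and `set.add`. The function and variable names are hypothetical, and the comments describe one reading of the changelog entry, not guaranteed analysis behaviour.

```python
def collect(user_value):
    """Pass one value through the container operations named in the changelog."""
    items = []
    items.append(user_value)              # list.append stores the value as a list element
    ordered = sorted(items)               # sorted returns a new list holding the same elements
    backwards = list(reversed(ordered))   # reversed yields the same elements in reverse order
    seen = set()
    seen.add(backwards[0])                # set.add stores the value as a set element
    return ordered, backwards, seen


ordered, backwards, seen = collect("payload")
```

With element-level modeling, an analysis can attribute the value to the contents of `ordered`, `backwards`, and `seen` instead of treating each container as an opaque whole.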
No user-facing changes.
No user-facing changes.
- Type tracking is now aware of reads of captured variables (variables defined in an outer scope). This leads to a richer API graph, and may lead to more results in some queries.
- Added more content-flow/field-flow for dictionaries, by adding support for reads through `mydict.get("key")` and `mydict.setdefault("key", value)`, and store steps through `dict["key"] = value` and `mydict.setdefault("key", value)`.
- Added support for querying the contents of YAML files.
- The recently introduced new data flow and taint tracking APIs have had a number of module and predicate renamings. The old APIs remain in place for now.
- Added modeling of SQL execution in the packages `sqlite3.dbapi2`, `cassandra-driver`, `aiosqlite`, and the functions `sqlite3.Connection.executescript`/`sqlite3.Cursor.executescript` and `asyncpg.connection.connect()`.
- Fixed module resolution so we allow imports of definitions that have had an attribute assigned to it, such as `class Foo; Foo.bar = 42`.
- Fixed some accidental predicate visibility in the backwards-compatible wrapper for data flow configurations. In particular, `DataFlow::hasFlowPath`, `DataFlow::hasFlow`, `DataFlow::hasFlowTo`, and `DataFlow::hasFlowToExpr` were accidentally exposed in a single version.
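
The dictionary entry above names four code shapes. A minimal Python sketch of them follows; the function name `lookup` and the literal keys are hypothetical, and the comments reflect one reading of which operations count as store steps and which as read steps.

```python
def lookup(value):
    """Exercise the dictionary reads and stores named in the changelog."""
    mydict = {}
    mydict["key"] = value                         # store step: subscript assignment
    via_get = mydict.get("key")                   # read step: dict.get with a constant key
    via_setdefault = mydict.setdefault("key", 0)  # read step: key exists, stored value is returned
    stored = mydict.setdefault("other", value)    # store step: key is absent, so value is inserted
    return via_get, via_setdefault, stored, mydict


via_get, via_setdefault, stored, mydict = lookup("payload")
```

Note that `setdefault` acts as both a read and a store, which is presumably why it appears on both sides of the changelog entry.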
No user-facing changes.
- Added support for merging two `PathGraph`s via disjoint union to allow results from multiple data flow computations in a single `path-problem` query.
- The main data flow and taint tracking APIs have been changed. The old APIs remain in place for now and translate to the new through a backwards-compatible wrapper. If multiple configurations are in scope simultaneously, then this may affect results slightly. The new API is quite similar to the old, but makes use of a configuration module instead of a configuration class.
- Deleted the deprecated `getPath` and `getFolder` predicates from the `XmlFile` class.
- We use a new analysis for the call-graph (determining which function is called). This can lead to changed results. In most cases this is much more accurate than the old call-graph that was based on points-to, but we do lose a few valid edges in the call-graph, especially around methods that are not defined inside their class.
- Fixed module resolution so we properly recognize definitions made within if-then-else statements.
- Added modeling of cryptographic operations in the `hmac` library.
- Python 2 is no longer supported for extracting databases using the CodeQL CLI. As a consequence, the previously deprecated support for `pyxl` and `spitfire` templates has also been removed. When extracting Python 2 code, having Python 2 installed is still recommended, as this ensures the correct version of the Python standard library is extracted.
- Fixed module resolution so we properly recognize that in `from <pkg> import *`, where `<pkg>` is a package, the actual imports are made from the `<pkg>/__init__.py` file.
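
For readers unfamiliar with the `hmac` entry above, here is a small Python sketch of typical `hmac` usage in analyzed code. The helper names `sign` and `verify` are hypothetical, and the changelog does not say which specific `hmac` calls are modeled, so treat this only as an example of the library in use.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute a hex-encoded HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, signature: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), signature)


tag = sign(b"secret-key", b"hello")
```

A call such as `verify(b"secret-key", b"hello", tag)` recomputes the tag and compares it without leaking timing information.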
No user-facing changes.
No user-facing changes.
- The PAM authorization bypass due to incorrect usage (`py/pam-auth-bypass`) query has been converted to a taint-tracking query, resulting in significantly fewer false positives.
- Added `subprocess.getoutput` and `subprocess.getoutputstatus` as new command injection sinks for the StdLib.
- The data-flow library has been rewritten to no longer rely on the points-to analysis in order to resolve references to modules. Improvements in the module resolution can lead to more results.
- Deleted the deprecated `importNode` predicate from the `DataFlowUtil.qll` file.
- Deleted the deprecated features from `PEP249.qll` that were not inside the `PEP249` module.
- Deleted the deprecated `werkzeug` from the `Werkzeug` module in `Werkzeug.qll`.
- Deleted the deprecated `methodResult` predicate from `PEP249::Cursor`.
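
To see why `subprocess.getoutput` is a natural command injection sink, consider this minimal Python sketch. It assumes a POSIX shell is available, and the function names are hypothetical. `subprocess.getoutput` passes its argument to the shell as a single command string, so unquoted input can add extra commands.

```python
import shlex
import subprocess


def greet(name: str) -> str:
    """Unsafe shape: the argument is concatenated into a shell command string."""
    return subprocess.getoutput("echo hello " + name)


def greet_quoted(name: str) -> str:
    """Safer shape: shlex.quote makes the shell treat the argument as one word."""
    return subprocess.getoutput("echo hello " + shlex.quote(name))


unsafe_result = greet("world; echo injected")
quoted_result = greet_quoted("world; echo injected")
```

In the unquoted variant the shell runs a second `echo`, whereas the quoted variant prints the input verbatim.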
- `except*` is now supported.
- The result of `Try.getAHandler` and `Try.getHandler(<index>)` is no longer of type `ExceptStmt`, as handlers may also be `ExceptGroupStmt`s (after Python 3.11 introduced PEP 654). Instead, it is of the new type `ExceptionHandler`, of which `ExceptStmt` and `ExceptGroupStmt` are subtypes. To support selecting only one type of handler, `Try.getANormalHandler` and `Try.getAGroupHandler` have been added. Existing uses of `Try.getAHandler` for which it is important to select only normal handlers will need to be updated to `Try.getANormalHandler`.
No user-facing changes.
No user-facing changes.
- The ReDoS libraries in `semmle.code.python.security.regexp` have been moved to a shared pack inside the `shared/` folder, and the previous location has been deprecated.
No user-facing changes.
- Fixed labels in the API graph pertaining to definitions of subscripts. Previously, these were found by `getMember` rather than `getASubscript`.
- Added edges for indices of subscripts to the API graph. Now a subscripted API node will have an edge to the API node for the index expression. So if `foo` is matched by API node `A`, then `"key"` in `foo["key"]` will be matched by the API node `A.getIndex()`. This can be used to track the origin of the index.
- Added member predicate `getSubscriptAt(API::Node index)` to `API::Node`. Like `getASubscript()`, this will return an API node that matches a subscript of the node, but here it will be restricted to subscripts where the index matches the `index` parameter.
- Added convenience predicate `getSubscript("key")` to obtain a subscript at a specific index, when the index happens to be a statically known string.
- Added the ability to refer to subscript operations in the API graph. It is now possible to write `response().getMember("cookies").getASubscript()` to find code like `resp.cookies["key"]` (assuming `response` returns an API node for response objects).
- Added modeling of creating Flask responses with `flask.jsonify`.
- Some unused predicates in `SsaDefinitions.qll`, `TObject.qll`, `protocols.qll`, and the `pointsto/` folder have been deprecated.
- Some classes/modules with upper-case acronyms in their name have been renamed to follow our style-guide. The old name still exists as a deprecated alias.
- Changed `CallNode.getArgByName` such that it has results for keyword arguments given after a dictionary unpacking argument, as the `bar=2` argument in `func(foo=1, **kwargs, bar=2)`.
- `getStarArg` member-predicate on `Call` and `CallNode` has been changed for calls that have multiple `*args` arguments (for example `func(42, *my_args, *other_args)`): Instead of producing no results, it will always have a result for the first such `*args` argument.
- Reads of global/non-local variables (without annotations) inside functions defined on classes now work properly in the case where the class had an attribute defined with the same name as the non-local variable.
- Fixed an issue in the taint tracking analysis where implicit reads were not allowed by default in sinks or additional taint steps that used flow states.
- Many classes/predicates/modules with upper-case acronyms in their name have been renamed to follow our style-guide. The old name still exists as a deprecated alias.
- The utility files previously in the `semmle.python.security.performance` package have been moved to the `semmle.python.security.regexp` package. The previous files still exist as deprecated aliases.
- Most deprecated predicates/classes/modules that have been deprecated for over a year have been deleted.
- Changed `.getASubclass()` on `API::Node` so it can follow subclasses even if the class has a class decorator.
- The documentation of API graphs (the `API` module) has been expanded, and some of the member predicates of `API::Node` have been renamed as follows:
  - `getAnImmediateUse` -> `asSource`
  - `getARhs` -> `asSink`
  - `getAUse` -> `getAValueReachableFromSource`
  - `getAValueReachingRhs` -> `getAValueReachingSink`
- Improved modeling of sensitive data sources, so common words like `certain` and `secretary` are no longer considered a certificate and a secret (respectively).
- The `BarrierGuard` class has been deprecated. Such barriers and sanitizers can now instead be created using the new `BarrierGuard` parameterized module.
- `API::moduleImport` no longer has any results for dotted names, such as `API::moduleImport("foo.bar")`. Using `API::moduleImport("foo.bar").getMember("baz").getACall()` previously worked if the Python code was `from foo.bar import baz; baz()`, but not if the code was `import foo.bar; foo.bar.baz()` -- we are making this change to ensure the approach that can handle all cases is always used.
- The imports made available from `import python` are no longer exposed under `DataFlow::` after doing `import semmle.python.dataflow.new.DataFlow`, for example using `DataFlow::Add` will now cause a compile error.
- The modeling of `request.files` in Flask has been fixed, so we now properly handle assignments to local variables (such as `files = request.files; files['key'].filename`).
- Added taint propagation for `io.StringIO` and `io.BytesIO`. This addition was originally submitted as part of an experimental query by @jorgectf.
- The signature of `allowImplicitRead` on `DataFlow::Configuration` and `TaintTracking::Configuration` has changed from `allowImplicitRead(DataFlow::Node node, DataFlow::Content c)` to `allowImplicitRead(DataFlow::Node node, DataFlow::ContentSet c)`.
- The recently added flow-state versions of `isBarrierIn`, `isBarrierOut`, `isSanitizerIn`, and `isSanitizerOut` in the data flow and taint tracking libraries have been removed.
- Queries importing a data-flow configuration from `semmle.python.security.dataflow` should ensure that the imported file ends with `Query`, and only import its top-level module. For example, a query that used `CommandInjection::Configuration` from `semmle.python.security.dataflow.CommandInjection` should from now use `Configuration` from `semmle.python.security.dataflow.CommandInjectionQuery` instead.
- Added data-flow for Django ORM models that are saved in a database (no `models.ForeignKey` support).
- Improved modeling of Flask `Response` objects, so passing a response body with the keyword argument `response` is now recognized.
- The flow state variants of `isBarrier` and `isAdditionalFlowStep` are no longer exposed in the taint tracking library. The `isSanitizer` and `isAdditionalTaintStep` predicates should be used instead.
- Many classes/predicates/modules that had upper-case acronyms have been renamed to follow our style-guide. The old name still exists as a deprecated alias.
- Some modules that started with a lowercase letter have been renamed to follow our style-guide. The old name still exists as a deprecated alias.
- The data flow and taint tracking libraries have been extended with versions of `isBarrierIn`, `isBarrierOut`, and `isBarrierGuard`, respectively `isSanitizerIn`, `isSanitizerOut`, and `isSanitizerGuard`, that support flow states.
- All deprecated predicates/classes/modules that have been deprecated for over a year have been deleted.
- Added new SSRF sinks for `httpx`, `pycurl`, `urllib`, `urllib2`, `urllib3`, and `libtaxii`. This improvement was submitted by @haby0.
- The regular expression parser now groups sequences of normal characters. This reduces the number of instances of `RegExpNormalChar`.
- Fixed taint propagation for attribute assignment. In the assignment `x.foo = tainted` we no longer treat the entire object `x` as tainted, just because the attribute `foo` contains tainted data. This leads to slightly fewer false positives.
- Improved analysis of attributes for data-flow and taint tracking queries, so `getattr`/`setattr` are supported, and a write to an attribute properly stops flow for the old value in that attribute.
- Added post-update nodes (`DataFlow::PostUpdateNode`) for arguments in calls that can't be resolved.
- The old points-to based modeling has been deprecated. Use the new type-tracking/API-graphs based modeling instead.
- Moved the files defining regex injection configuration and customization, instead of `import semmle.python.security.injection.RegexInjection` please use `import semmle.python.security.dataflow.RegexInjection` (the same for `RegexInjectionCustomizations`).
- The `codeql/python-upgrades` CodeQL pack has been removed. All upgrades scripts have been merged into the `codeql/python-all` CodeQL pack.
- Added modeling of many functions from the `os` module that use file system paths, such as `os.stat`, `os.chdir`, `os.mkdir`, and so on.
- Added modeling of the `tempfile` module for creating temporary files and directories, such as the functions `tempfile.NamedTemporaryFile` and `tempfile.TemporaryDirectory`.
- Extended the modeling of FastAPI such that custom subclasses of `fastapi.APIRouter` are recognized.
- Extended the modeling of FastAPI such that `fastapi.responses.FileResponse` are considered `FileSystemAccess`.
- Added modeling of the `posixpath`, `ntpath`, and `genericpath` modules for path operations (although these are not supposed to be used), resulting in new sinks.
- Added modeling of `wsgiref.simple_server` applications, leading to new remote flow sources.
- Added modeling of `os.stat`, `os.lstat`, `os.statvfs`, `os.fstat`, and `os.fstatvfs`, which are new sinks for the Uncontrolled data used in path expression (`py/path-injection`) query.
- Added modeling of the `posixpath`, `ntpath`, and `genericpath` modules for path operations (although these are not supposed to be used), resulting in new sinks for the Uncontrolled data used in path expression (`py/path-injection`) query.
- Added modeling of `wsgiref.simple_server` applications, leading to new remote flow sources.
- Added modeling of `aiopg` for sinks executing SQL.
- Added modeling of HTTP requests and responses when using `flask_admin` (`Flask-Admin` PyPI package), which leads to additional remote flow sources.
- Added modeling of the PyPI package `toml`, which provides encoding/decoding of TOML documents, leading to new taint-tracking steps.
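
The file system entries above (`os.stat`, `os.mkdir`, `posixpath`, `tempfile`) describe calls whose path arguments become sinks for `py/path-injection`. The Python sketch below shows those calls together. The names `make_and_stat` and `demo` are hypothetical, and the comments only point out where a path argument is consumed.

```python
import os
import posixpath
import stat
import tempfile


def make_and_stat(base_dir: str, name: str) -> os.stat_result:
    """Build a path from a (potentially user-controlled) name and touch the file system."""
    target = posixpath.join(base_dir, name)  # path operation on the name
    os.mkdir(target)                         # path argument reaches os.mkdir
    return os.stat(target)                   # path argument reaches os.stat


def demo() -> bool:
    """Create a temporary directory and file, then inspect both with os.stat."""
    with tempfile.TemporaryDirectory() as base_dir:
        with tempfile.NamedTemporaryFile(dir=base_dir) as handle:
            file_info = os.stat(handle.name)
        dir_info = make_and_stat(base_dir, "uploads")
        return stat.S_ISREG(file_info.st_mode) and stat.S_ISDIR(dir_info.st_mode)
```

If `name` came from a remote flow source, each call that receives `target` would be a candidate sink under the modeling described above.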