Problem
There is an open sqlalchemy bug where the sqlalchemy.engine.url.URL class does not correctly url encode username and passwords when converting to a string.
This behavior can cause the xray patching of sqlalchemy Session and Connection execute functions to error when trying to urlparse the db url returned by sqlalchemy.
In my case, a closing square bracket ] in my password leads urlparse to interpret my db url as an invalid IPv6 address, but this could probably manifest as other errors depending on the unencoded characters.
Error parsing sql metadata.
Traceback (most recent call last):
File "path/to/aws_xray_sdk/ext/sqlalchemy_core/patch.py", line 22, in _sql_meta
url = urlparse(str(engine_instance.engine.url))
File "path/to/urllib/parse.py", line 384, in urlparse
splitresult = urlsplit(url, scheme, allow_fragments)
File "path/to/urllib/parse.py", line 486, in urlsplit
raise ValueError("Invalid IPv6 URL")
ValueError: Invalid IPv6 URL
Note: stack trace is for aws-xray-sdk 2.8.0, the linked code above is for current master head.
What I expected
No error logged from xray and full trace functionality.
My setup
aws-xray-sdk 2.8.0
Aurora Postgres 14 (but can affect any database)
pg8000 1.20.0 (but can affect any dialect)
Python 3.8
Steps to reproduce
CREATE DATABASE test_xray;
CREATE USER test_xray_user WITH ENCRYPTED PASSWORD 'test]password';
GRANT ALL PRIVILEGES ON DATABASE test_xray TO test_xray_user;
from sqlalchemy import create_engine
import urllib
from aws_xray_sdk.core import patch_all, xray_recorder
patch_all()
password = urllib.parse.quote_plus("test]password")
db_url = f"postgresql+pg8000://test_xray_user:{password}@myhost:5432/test_xray"
print(db_url)
# postgresql+pg8000://test_xray_user:test%5Dpassword@myhost:5432/test_xray
engine = create_engine(db_url)
# this is the upstream bug in sqlalchemy
print(engine.url)
# postgresql+pg8000://test_xray_user:test]password@myhost:5432/test_xray
with xray_recorder.in_segment("test_segment") as segment:
with engine.connect() as connection:
connection.execute("select 1")
This leads to the error in the Problem section being logged.
Current workaround
Avoid square brackets in passwords. There may be other characters that cause issues.
For RDS passwords managed by secrets manager, this can be configured in CDK using excludeCharacters for DatabaseCluster credentials, DatabaseSecrets, rotation definitions, etc.
Proposed fix -- starting point
The most correct fix is for the sqlalchemy bug to be fixed, but the maintainers stated on the linked bug that a fix will not be available for months.
This is a starting point for a fix that will be continue to work correctly even after the sqlalchemy bug is fixed upstream. It is NOT thoroughly tested.
$ git diff
diff --git a/aws_xray_sdk/ext/sqlalchemy_core/patch.py b/aws_xray_sdk/ext/sqlalchemy_core/patch.py
index 0551fe2..58df905 100644
--- a/aws_xray_sdk/ext/sqlalchemy_core/patch.py
+++ b/aws_xray_sdk/ext/sqlalchemy_core/patch.py
@@ -13,19 +13,21 @@ from aws_xray_sdk.ext.util import unwrap
def _sql_meta(engine_instance, args):
try:
- metadata = {}
- url = urlparse(str(engine_instance.engine.url))
+ # Workaround for https://github.com/sqlalchemy/sqlalchemy/issues/10662
+ # sqlalchemy.engine.url.URL's __repr__ does not url encode username nor password.
+ # This will continue to work once sqlalchemy fixes the bug.
+ sa_url = engine_instance.engine.url
+ username = sa_url.username
+ sa_url = sa_url._replace(username=None, password=None)
+ url = urlparse(str(sa_url))
+ name = url.netloc
+ if username:
+ # Restore url encoded username
+ quoted_username = urllib.parse.quote_plus(username)
+ url = url._replace(netloc='{}@{}'.format(quoted_username, url.netloc))
# Add Scheme to uses_netloc or // will be missing from url.
uses_netloc.append(url.scheme)
- if url.password is None:
- metadata['url'] = url.geturl()
- name = url.netloc
- else:
- # Strip password from URL
- host_info = url.netloc.rpartition('@')[-1]
- parts = url._replace(netloc='{}@{}'.format(url.username, host_info))
- metadata['url'] = parts.geturl()
- name = host_info
+ metadata['url'] = url.geturl()
metadata['user'] = url.username
metadata['database_type'] = engine_instance.engine.name
try:
Problem
There is an open sqlalchemy bug where the
sqlalchemy.engine.url.URLclass does not correctly url encode username and passwords when converting to a string.This behavior can cause the xray patching of sqlalchemy
SessionandConnectionexecutefunctions to error when trying to urlparse the db url returned by sqlalchemy.In my case, a closing square bracket
]in my password leads urlparse to interpret my db url as an invalid IPv6 address, but this could probably manifest as other errors depending on the unencoded characters.Note: stack trace is for aws-xray-sdk 2.8.0, the linked code above is for current master head.
What I expected
No error logged from xray and full trace functionality.
My setup
aws-xray-sdk 2.8.0
Aurora Postgres 14 (but can affect any database)
pg8000 1.20.0 (but can affect any dialect)
Python 3.8
Steps to reproduce
This leads to the error in the Problem section being logged.
Current workaround
Avoid square brackets in passwords. There may be other characters that cause issues.
For RDS passwords managed by secrets manager, this can be configured in CDK using
excludeCharactersforDatabaseClustercredentials, DatabaseSecrets, rotation definitions, etc.Proposed fix -- starting point
The most correct fix is for the sqlalchemy bug to be fixed, but the maintainers stated on the linked bug that a fix will not be available for months.
This is a starting point for a fix that will be continue to work correctly even after the sqlalchemy bug is fixed upstream. It is NOT thoroughly tested.