gh-125346: Fix decoding with non-standard Base64 alphabet#141128
gh-125346: Fix decoding with non-standard Base64 alphabet#141128serhiy-storchaka merged 11 commits intopython:mainfrom
Conversation
The "+" and "/" characters are no longer recognized as the part of the Base64 alphabet in base64.urlsafe_b64decode() and base64.b64decode() the altchars argument that does not contain them.
|
@serhiy-storchaka Thanks for this, can you link this PR to either the original issue or a new issue for tracking purposes. |
sethmlarson
left a comment
There was a problem hiding this comment.
I'm really concerned about the subtle breakages that this change could cause, especially because the default behavior is to throw away characters that aren't in the current alphabet. If the default behavior was to raise an error I would feel better about this change.
Makes me wonder if we should target validate=True with this behavior change (because IMO, the silent dropping of invalid characters is in itself a concerning behavior) and then long-term move to having validate be enabled by default?
serhiy-storchaka
left a comment
There was a problem hiding this comment.
This worries me too. We can keep the old behavior but emit a warning if characters + or / occur in Base64 data with the alternative alphabet.
But urlsafe_b64decode() does not have the validate parameter.
I wonder if for |
|
But is not passing |
|
@serhiy-storchaka Yes maybe you're right that using |
|
If this is only for 3.15, then making validate=True by default will make the code more reliable by default. |
…ternative alphabet is used (pythonGH-141128) Emit a warning in base64.urlsafe_b64decode() and base64.b64decode() when the "+" or "/" characters occur in the Base64 data with alternative alphabet if they are not the part of the alternative alphabet. It is a DeprecationWarning in the strict mode (will be error) and a FutureWarning in non-strict mode (will be ignored).
|
Sorry to comment on a merged pull request but I'd rather ask the question here before opening an issue: The altchars len validation moved from a simple assert to a ValueError. While I think it makes sense it's not being documented as being changed. Should it be ? Line 61 in 77c06f3 Does either of this concerns make sense ? If so I'll open a new issue. |
|
feel free to open a new PR referencing the same issue for little cleanup followups like that. We don't need to document ValueError or document that one as that is a normal error response to an API not being used quite right. |
The "+" and "/" characters are no longer recognized as the part of the Base64 alphabet in base64.urlsafe_b64decode() and base64.b64decode() the altchars argument that does not contain them.