What did you do?
python3 -c 'import ujson; ujson.dumps(["aaaa", "\x00" * 10921])'
What did you expect to happen?
No crash
What actually happened?
SIGSEGV
What versions are you using?
- OS: Debian Sid
- Python: 3.10.1
- UltraJSON: 316d384
Background
The input here is constructed to exactly hit the buffer boundary. To better see what's going on, I added some fprintf statements in the Buffer_Reserve macro:
diff --git a/lib/ultrajsonenc.c b/lib/ultrajsonenc.c
index a9f3ef1..874d332 100644
--- a/lib/ultrajsonenc.c
+++ b/lib/ultrajsonenc.c
@@ -488,8 +488,10 @@ static int Buffer_EscapeStringValidated (JSOBJ obj, JSONObjectEncoder *enc, cons
}
#define Buffer_Reserve(__enc, __len) \
+ fprintf(stderr, "reserve %zu, remaining %zu\n", (size_t) (__len), (size_t) ((__enc)->end - (__enc)->offset)); \
if ( (size_t) ((__enc)->end - (__enc)->offset) < (size_t) (__len)) \
{ \
+ fprintf(stderr, "realloc\n"); \
Buffer_Realloc((__enc), (__len));\
} \
With the command above, the output is this, with comments of what they correspond to:
- reserve 258, remaining 65536 – initial call for encoding the list; evidently, the initial buffer size is 64 KiB (coming from objToJSON)
- reserve 258, remaining 65535 – pre-name call on first list element; as there is no name, this call is useless
- reserve 26, remaining 65535 – "aaaa" reservation
- reserve 258, remaining 65528 – pre-name call on second list element
- reserve 65528, remaining 65528 – "\x00" * 10921 reservation; note that this exactly consumes the rest of the buffer. Following this, the ] gets written beyond the end of the buffer.
- reserve 1, remaining 18446744073709551615 – terminating NUL reservation, overflow in the remaining buffer size calculation
- Segmentation fault
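The numbers in this trace can be reproduced with a small Python model of the buffer accounting. This is only a sketch: the reservation formula (two quotes plus 6 bytes per character) is inferred from the logged values (26 for 4 characters, 65528 for 10921), and `string_reserve` and `remaining_as_size_t` are hypothetical helper names, not functions from ultrajson.

```python
BUFFER_SIZE = 65536  # initial buffer size, from the first log line


def string_reserve(n_chars):
    """Reservation inferred from the log: two quotes plus 6 bytes per character."""
    return 2 + 6 * n_chars


def remaining_as_size_t(end, offset):
    """Mimic the (size_t) cast of end - offset on a 64-bit platform."""
    return (end - offset) % 2**64


offset = 0
offset += len('[')         # list opener
offset += len('"aaaa",')   # first element and separator
remaining = remaining_as_size_t(BUFFER_SIZE, offset)  # 65528

reserve = string_reserve(10921)      # 65528
needs_realloc = remaining < reserve  # False: exact fit, so no realloc

offset += reserve    # each NUL is emitted as \u0000, so the worst case is reached
offset += len(']')   # no reserve line precedes this write in the log
wrapped = remaining_as_size_t(BUFFER_SIZE, offset)

print(remaining, reserve, needs_realloc, wrapped)
# 65528 65528 False 18446744073709551615
```

The last value matches the "remaining" figure logged for the terminating NUL reservation: the offset is one byte past the end, so the unsigned difference wraps around instead of going negative, and the reallocation check passes.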
As you might expect, everything is fine when the first string in the input list is one character shorter or longer because the reallocation condition is then triggered instead of overrunning the buffer. Of course, the list doesn't have to terminate after that long NUL string, and anything following that will overrun the buffer as well, e.g. python3 -c 'import ujson; ujson.dumps(["aaaa", "\x00" * 10921, 42])'.
The exact same thing is also possible with a dict, of course. For example, python3 -c 'import ujson; ujson.dumps({"a": None, "b": "\x00" * 10920})'. Just like before, the long NUL string exactly consumes the remaining buffer, and then the trailing } and NUL writes overrun it.
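To see why these particular lengths hit the boundary while neighbouring ones do not, here is a rough sketch that computes which NUL-string length exactly fills the buffer after a given output prefix. It assumes a 65536-byte initial buffer, compact output without whitespace, and a reservation of two quotes plus 6 bytes per character (all inferred from the trace rather than confirmed in the source); `boundary_nul_count` is a hypothetical helper.

```python
BUFFER_SIZE = 65536  # initial buffer size reported in the trace


def boundary_nul_count(prefix):
    """Return the NUL-string length whose reservation equals the space left
    after `prefix`, or None if no length fits exactly."""
    remaining = BUFFER_SIZE - len(prefix)
    count, leftover = divmod(remaining - 2, 6)
    return count if leftover == 0 else None


print(boundary_nul_count('["aaaa",'))        # 10921
print(boundary_nul_count('["aaa",'))         # None
print(boundary_nul_count('["aaaaa",'))       # None
print(boundary_nul_count('{"a":null,"b":'))  # 10920
```

Under these assumptions, only prefixes that leave a remainder of the form 2 + 6n can be consumed exactly, which is consistent with the list case failing at 10921 NULs and the dict case at 10920.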