Conversation
Use expand_fast_array and direct array assignment instead of JS_CreateDataPropertyUint32 for better performance when creating arrays from value arrays. Ref: bellard/quickjs@f4951ef
- Made JS_GetGlobalVar and JS_SetGlobalVar inline with fast paths - Removed OP_check_var and OP_put_var_strict opcodes - Simplified optimize_scope_make_global_ref to not use strict mode special code - Simplified bytecode optimizations that referenced removed opcodes - Updated microbench.js to use normal functions instead of eval - Regenerated bytecode for updated opcode indices Note: This removes full compliance with the spec for strict mode variable assignment so that they are as fast as in non strict mode (V8, SpiderMonkey and JavaScriptCore do the same). Ref: bellard/quickjs@2c90110
Add inline fast paths for common property access cases: - For OP_get_field and OP_get_field2: Walk prototype chain directly for non-exotic objects with normal data properties - For OP_put_field: Set property directly for writable data properties Falls back to slow path for exotic objects and special property types. Ref: bellard/quickjs@57f8ec0
Add inline fast paths for OP_post_inc and OP_post_dec when the operand is an integer. Fall back to slow path for overflow cases (INT32_MAX for increment, INT32_MIN for decrement) and non-integer values. Ref: bellard/quickjs@e5de89f
Make string_buffer_putc() an inline function with fast paths for common cases: - Direct write for characters < 0x10000 in wide mode - Surrogate pair handling for characters >= 0x10000 with buffer space - Direct write for 8-bit characters in narrow mode Rename string_buffer_putc_slow to string_buffer_putc16_slow and add new string_buffer_putc_slow for full Unicode handling. Ref: bellard/quickjs@79f3ae2
Replace expensive prototype chain traversal with flag checking. Instead of iterating through prototypes to verify no numeric properties exist, the code now: - Adds std_array_prototype field to JSContext to track whether Array.prototype is "normal" (no small index get/set properties) - Adds is_prototype flag to JSObject to identify prototype objects - Removes has_small_array_index from JSShape (now handled differently) - Sets std_array_prototype = false when Array.prototype or Object.prototype is modified in relevant ways - Uses the flag in JS_SetPropertyValue() and OP_put_array_el for fast path decisions This trades one boolean flag check for iterating multiple prototype objects during common array append operations. Ref: bellard/quickjs@c8a8cf5
Add fast path for OP_get_length that directly accesses the length property without calling JS_GetProperty. This mirrors the optimization already done for OP_get_field and OP_get_field2. When the object has a simple length property (not a getter/setter), the value is retrieved directly by walking the prototype chain. Ref: bellard/quickjs@3e5f2bb
Add fast path for push() on fast arrays that bypasses standard object handling and directly manipulates the array's internal value store. The optimization activates when: - The array is a fast array with standard prototype - The array is extensible - The length property is an integer and writable - The new length doesn't overflow When conditions are met, elements are bulk-inserted directly into the internal values array without property lookup overhead. Ref: bellard/quickjs@9a421b3
Instead of using goto to jump to slow path on int32 overflow, directly convert to float64 inline. This improves instruction cache locality and reduces branching overhead. The change affects: - OP_add: inline float conversion on overflow - OP_add_loc: inline float conversion on overflow - OP_sub: inline float conversion on overflow Ref: bellard/quickjs@3d0cc29
154516f to
8177bd9
Compare
- Rename dbuf_realloc to dbuf_claim with clearer semantics: allocate 'len' more bytes relative to current size instead of absolute size - Add overflow protection in dbuf_claim - Change allocation growth from (size * 3 / 2) to (size + size / 2) with overflow checks - Remove unused dbuf_write function - Update all call sites across quickjs.c, libregexp.c, libunicode.c Ref: bellard/quickjs@0d4cd2d
Optimize destructuring by avoiding the creation of reference objects when there are no 'with' statements in the scope chain (which is always the case in strict mode). This uses depth=0 for direct variable access instead of depth=2 with reference creation. Additional optimizations: - has_with_scope() now skips checking in strict mode (no 'with' allowed) - In non-strict mode, modifying a function name is now ignored (OP_scope_put_var with JS_VAR_FUNCTION_NAME emits OP_drop) Note: This removes full compliance with the spec for lvalue resolution when direct eval is present in compound assignments. V8 and other engines behave the same way. Ref: bellard/quickjs@e015918
39f0d99 to
3f6275f
Compare
|
@bnoordhuis This is now ready to review. Not sure how you want to go about it though, it grew quite a bit 😅 |
Replace is_local/is_arg boolean fields in JSClosureVar with a single closure_type enum (JSClosureTypeEnum) that supports 8 distinct types: - JS_CLOSURE_LOCAL: local variable in parent function - JS_CLOSURE_ARG: argument variable in parent function - JS_CLOSURE_REF: closure variable reference in parent function - JS_CLOSURE_GLOBAL_REF: global variable reference - JS_CLOSURE_GLOBAL_DECL: global variable declaration (eval) - JS_CLOSURE_GLOBAL: global variable (eval) - JS_CLOSURE_MODULE_DECL: module variable definition (eval) - JS_CLOSURE_MODULE_IMPORT: module import definition (eval) Ref: bellard/quickjs@a6816be
- Add pre-computed JSShape objects for arguments, mapped_arguments - Use fast_array mode with var_refs for JS_CLASS_MAPPED_ARGUMENTS - Arguments object elements now alias function parameters via JSVarRef - Add js_mapped_arguments_finalizer and js_mapped_arguments_mark - Modify JS_NewObjectFromShape to accept props parameter for initialization - Add js_create_var_ref function for creating detached var_refs - Add var_refs to GC immediately in get_var_ref (instead of at close time) The mapped arguments optimization allows arguments[i] to directly reference function parameters, enabling changes to propagate bidirectionally in non-strict mode functions. Ref: bellard/quickjs@9f11034
|
Gentle ping @bnoordhuis :-) |
|
Lo siento, señor Corretgé, no lo había olvidado pero no tengo mucho tiempo la semana pasada o hoy. ¡A lo mejor mañana! (Sí, he practicando español.) |
bnoordhuis
left a comment
There was a problem hiding this comment.
In general LGTM:
-
I'm somewhat skeptical that all that open-coding of property lookups is beneficial. Did you check how it affects benchmarks and release build size?
-
That last commit optimizing
argumentsis only useful in sloppy mode, right? Doesn't seem very useful, is it to game some benchmarks that don't run in strict mode?
| /* fast path */ | ||
| p = JS_VALUE_GET_OBJ(ctx->global_obj); | ||
| prs = find_own_property(&pr, p, prop); | ||
| if (prs) { | ||
| if (likely((prs->flags & JS_PROP_TMASK) == 0)) | ||
| return js_dup(pr->u.value); | ||
| } |
There was a problem hiding this comment.
This fast path is a deoptimization for things like getters on the global object, because they're looked up here but not acted on, then JS_GetPropertyInternal does the same lookup again.
(Also true for auto-init properties but that's a one-time cost so not so significant.)
Getting rid of two bytecode opcodes is nice though.
There was a problem hiding this comment.
I suppose this was more of an observation so I'm keeping it as-is.
There was a problem hiding this comment.
Well, a real-world example that's pessimized by this change is performance.now() as it's implemented in node or firefox, where performance is a getter on the global object.
The irony when a performance optimization slows down performance.now, right?
|
|
||
| static JSValue JS_GetGlobalVar(JSContext *ctx, JSAtom prop, | ||
| bool throw_ref_error) | ||
| static inline JSValue JS_GetGlobalVar(JSContext *ctx, JSAtom prop, |
There was a problem hiding this comment.
-
inlineis currently probably a no-op; there's only one call site so if the compiler thinks it's eligible for inlining, it's going to do it anyway -
if additional call sites arise, you probably don't want it inlined because of code bloat/duplication (not that a compiler is obliged to actually inline it, it's only a hint)
-
if the one call site for JS_GetGlobalVar were to get removed,
inlinestops the compiler from issuing an unused function warning
|
Thanks for the review Ben! I'll take a look! |
|
great job, looks like the same
|
|
This is useful, thanks! How are you taking those, so I can test locally? |
https://github.com/gengjiawen/js-engines-playground/blob/038a911a228b1b96fa73eb92aec82a4701d3251a/benchmark/benchmark.js#L5 replace first two with local binary. use this to get all mainstream engines |
|
LoL, I had forgotten I adapted your runner 2 years ago already :-) https://github.com/quickjs-ng/benchmarks?tab=readme-ov-file#v8 Results (best of 3) on my machine:
I'm going to address Ben's points now and check the results. |
Benchmarks are up (compounded as a whole) and the binary size if up by ~30KB, so I think we're good.
The commit did a few things, one is that, another is to make the creation of |
|
Benchmarks after:
Within the noise margin (my system is not necessarily iddle :-)). I'm merging after the CI is 💚 , assuming you agree here @bnoordhuis since you approved. |
30KB is actually a lot for embedded system, can this be a optional ? |
|
Not really. It's either we unroll or we don't. You are welcome to try and undo the parts that Ben mentioned to see if you get the same numbers. In the grand scheme of things, the engine will get bigger and new JS features are implemented. For such tight constraints microquickjs might be a better fit. |
See each individual commit for details.