Commit c9cfb3d
Bytecode parity (#7504)
* Match CPython LOAD_SPECIAL stack semantics for with/async-with

  LOAD_SPECIAL now pushes (callable, self_or_null) matching CPython's CALL convention, instead of a single bound method:
  - Function descriptors: push (func, self)
  - Plain attributes: push (bound, NULL)

  Updated all with-statement paths:
  - Entry: add SWAP 3 after SWAP 2, remove PUSH_NULL before CALL 0
  - Normal exit: remove PUSH_NULL before CALL 3
  - Exception handler (WITH_EXCEPT_START): read exit_func at TOS-4 and self_or_null at TOS-3
  - Suppress block: 3 POP_TOPs after POP_EXCEPT (was 2)
  - FBlock exit (preserve_tos): SWAP 3 + SWAP 2 rotation
  - UnwindAction::With: remove PUSH_NULL

  Stack effects updated: LoadSpecial (2,1), WithExceptStart (7,6)

* Normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT

  Add LOAD_FAST_CHECK → LOAD_FAST and JUMP_BACKWARD_NO_INTERRUPT → JUMP_BACKWARD to opname normalization in dis_dump.py. These are optimization variants with identical semantics.

* Add EXTENDED_ARG to SKIP_OPS, normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT

* Remove duplicate return-None when block already has return

  Skip duplicate_end_returns for blocks that already end with LOAD_CONST + RETURN_VALUE. Run DCE + unreachable elimination after duplication to remove the now-unreachable original return block.

* Improve __static_attributes__ collection accuracy
  - Support tuple/list unpacking targets: (self.x, self.y) = val
  - Skip @staticmethod and @classmethod decorated methods
  - Use scan_target_for_attrs helper for recursive target scanning

* Use method mode for function-local import attribute calls

  Function-local imports (scope is Local+IMPORTED) should use method-mode LOAD_ATTR like regular names, not plain mode. Only module/class scope imports use plain LOAD_ATTR + PUSH_NULL.
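The (callable, self_or_null) pair convention described above can be modeled in plain Python. This is an illustrative sketch only: resolve_special and call_special are hypothetical names invented here, not functions from the PR or from CPython.

```python
import types

def resolve_special(obj, name):
    """Model LOAD_SPECIAL: look the special method up on the type and
    return a (callable, self_or_null) pair matching the CALL convention."""
    attr = getattr(type(obj), name)
    if isinstance(attr, types.FunctionType):
        # Function descriptor: push the plain function plus the instance.
        return attr, obj
    # Plain attribute: push the (already bound) value with NULL/None.
    return getattr(obj, name), None

def call_special(pair, *args):
    """Model CALL: a non-null self slot is prepended to the arguments."""
    func, self_or_null = pair
    if self_or_null is not None:
        return func(self_or_null, *args)
    return func(*args)

class CM:
    def __enter__(self):
        return "entered"
    def __exit__(self, *exc):
        return False

cm = CM()
enter = resolve_special(cm, "__enter__")
assert enter == (CM.__enter__, cm)       # (func, self), not a bound method
assert call_special(enter) == "entered"  # caller re-binds via the self slot
```

The point of the pair convention is that CALL can invoke either shape uniformly, which is why the with-statement paths above drop their explicit PUSH_NULL instructions.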
* Optimize constant iterable before GET_ITER to LOAD_CONST tuple

  Convert BUILD_LIST/SET 0 + LOAD_CONST + LIST_EXTEND/SET_UPDATE + GET_ITER to just LOAD_CONST (tuple) + GET_ITER, matching CPython's optimization for constant list/set literals in for-loop iterables. Also fix is_name_imported to use method mode for function-local imports, and improve __static_attributes__ accuracy (skip @classmethod/@staticmethod, handle tuple/list unpacking targets).

* Fix cell variable ordering: parameters first, then alphabetical

  CPython orders cell variables with parameter cells first (in parameter definition order), then non-parameter cells sorted alphabetically. Previously all cells were sorted alphabetically. Also add for-loop iterable optimization: constant BUILD_LIST/SET before GET_ITER is folded to just LOAD_CONST tuple.

* Emit COPY_FREE_VARS before MAKE_CELL, matching CPython order

  CPython emits COPY_FREE_VARS first, then MAKE_CELL instructions. Previously RustPython emitted them in reverse order.

* Fix RESUME AfterYield encoding to match CPython 3.14 (value 5)

  CPython 3.14 uses RESUME arg=5 for after-yield, not 1. Also reorder COPY_FREE_VARS before MAKE_CELL and fix cell variable ordering (parameters first, then alphabetical).
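The GET_ITER optimization above mirrors behavior CPython has had since 3.9: a constant list literal used only as a for-loop iterable is stored as a tuple constant and never built at runtime. A quick check against CPython itself:

```python
import dis

code = compile("for x in [1, 2, 3]:\n    pass", "<test>", "exec")

# The list literal never materializes: its elements live in co_consts as a tuple...
assert (1, 2, 3) in code.co_consts

opnames = [ins.opname for ins in dis.get_instructions(code)]
# ...so no BUILD_LIST is emitted before GET_ITER.
assert "BUILD_LIST" not in opnames
assert "GET_ITER" in opnames
```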
* Address code review feedback from #7481
  - Set is_generator flag for generator expressions in scan_comprehension
  - Fix posonlyargs priority in collect_static_attributes first param
  - Add match statement support to scan_store_attrs
  - Fix stale decorator stack comment
  - Reorder NOP removal after fold_unary_negative for better collection folding

* Fold constant list/set/tuple literals in compiler

  When all elements of a list/set/tuple literal are constants and there are 3+ elements, fold them into a single constant:
  - list: BUILD_LIST 0 + LOAD_CONST (tuple) + LIST_EXTEND 1
  - set: BUILD_SET 0 + LOAD_CONST (tuple) + SET_UPDATE 1
  - tuple: LOAD_CONST (tuple)

  This matches CPython's compiler optimization and fixes the most common bytecode difference (92/200 sampled files). Also add bytecode comparison scripts (dis_dump.py, compare_bytecode.py) for systematic parity tracking.

* Use BUILD_MAP 0 + MAP_ADD for large dicts (>= 16 pairs)

  Match CPython's compiler behavior: dicts with 16+ key-value pairs use BUILD_MAP 0 followed by MAP_ADD for each pair, instead of pushing all keys/values onto the stack and calling BUILD_MAP N.

* Fix clippy warnings and cargo fmt

* Fix surrogate
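The list-literal folding pattern being matched can be observed directly in CPython's own output (3.9+), which is the parity target here:

```python
import dis

code = compile("x = [1, 2, 3]", "<test>", "exec")
instrs = list(dis.get_instructions(code))
opnames = [ins.opname for ins in instrs]

# Constant elements are folded into one tuple constant...
assert (1, 2, 3) in code.co_consts
# ...extended onto an empty list: BUILD_LIST 0 + LOAD_CONST + LIST_EXTEND 1.
assert "LIST_EXTEND" in opnames
build = next(ins for ins in instrs if ins.opname == "BUILD_LIST")
assert build.arg == 0
```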
1 parent e1ecb87 commit c9cfb3d

10 files changed (+371, -200 lines)

crates/codegen/src/compile.rs

211 additions, 106 deletions (large diff not rendered by default).

crates/codegen/src/ir.rs

Lines changed: 63 additions & 4 deletions
```diff
@@ -191,9 +191,12 @@ impl CodeInfo {
     ) -> crate::InternalResult<CodeObject> {
         // Constant folding passes
         self.fold_unary_negative();
+        self.remove_nops(); // remove NOPs from unary folding so tuple/list/set see contiguous LOADs
         self.fold_tuple_constants();
         self.fold_list_constants();
         self.fold_set_constants();
+        self.remove_nops(); // remove NOPs from collection folding
+        self.fold_const_iterable_for_iter();
         self.convert_to_load_small_int();
         self.remove_unused_consts();
         self.remove_nops();
```
```diff
@@ -214,6 +217,8 @@ impl CodeInfo {
         self.dce(); // re-run within-block DCE after normalize_jumps creates new instructions
         self.eliminate_unreachable_blocks();
         duplicate_end_returns(&mut self.blocks);
+        self.dce(); // truncate after terminal in blocks that got return duplicated
+        self.eliminate_unreachable_blocks(); // remove now-unreachable last block
         self.optimize_load_global_push_null();

         let max_stackdepth = self.max_stackdepth()?;
```
```diff
@@ -876,6 +881,49 @@ impl CodeInfo {
         }
     }

+    /// Convert constant list/set construction before GET_ITER to just LOAD_CONST tuple.
+    /// BUILD_LIST 0 + LOAD_CONST (tuple) + LIST_EXTEND 1 + GET_ITER
+    /// → LOAD_CONST (tuple) + GET_ITER
+    /// Also handles BUILD_SET 0 + LOAD_CONST + SET_UPDATE 1 + GET_ITER.
+    fn fold_const_iterable_for_iter(&mut self) {
+        for block in &mut self.blocks {
+            let mut i = 0;
+            while i + 3 < block.instructions.len() {
+                let is_build = matches!(
+                    block.instructions[i].instr.real(),
+                    Some(Instruction::BuildList { .. } | Instruction::BuildSet { .. })
+                ) && u32::from(block.instructions[i].arg) == 0;
+
+                let is_const = matches!(
+                    block.instructions[i + 1].instr.real(),
+                    Some(Instruction::LoadConst { .. })
+                );
+
+                let is_extend = matches!(
+                    block.instructions[i + 2].instr.real(),
+                    Some(Instruction::ListExtend { .. } | Instruction::SetUpdate { .. })
+                ) && u32::from(block.instructions[i + 2].arg) == 1;
+
+                let is_iter = matches!(
+                    block.instructions[i + 3].instr.real(),
+                    Some(Instruction::GetIter)
+                );
+
+                if is_build && is_const && is_extend && is_iter {
+                    // Replace: BUILD_X 0 → NOP, keep LOAD_CONST, LIST_EXTEND → NOP
+                    let loc = block.instructions[i].location;
+                    block.instructions[i].instr = Instruction::Nop.into();
+                    block.instructions[i].location = loc;
+                    block.instructions[i + 2].instr = Instruction::Nop.into();
+                    block.instructions[i + 2].location = loc;
+                    i += 4;
+                } else {
+                    i += 1;
+                }
+            }
+        }
+    }
+
     /// Fold constant set literals: LOAD_CONST* + BUILD_SET N →
     /// BUILD_SET 0 + LOAD_CONST (frozenset-as-tuple) + SET_UPDATE 1
     fn fold_set_constants(&mut self) {
```
```diff
@@ -1987,6 +2035,7 @@ fn duplicate_end_returns(blocks: &mut [Block]) {
     // Check if the last block ends with LOAD_CONST + RETURN_VALUE (the implicit return)
     let last_insts = &blocks[last_block.idx()].instructions;
     // Only apply when the last block is EXACTLY a return-None epilogue
+    // AND the return instructions have no explicit line number (lineno <= 0)
     let is_return_block = last_insts.len() == 2
         && matches!(
             last_insts[0].instr,
@@ -2010,12 +2059,22 @@ fn duplicate_end_returns(blocks: &mut [Block]) {
     let block = &blocks[current.idx()];
     if current != last_block && block.next == last_block && !block.cold && !block.except_handler
     {
-        let has_fallthrough = block
-            .instructions
-            .last()
+        let last_ins = block.instructions.last();
+        let has_fallthrough = last_ins
             .map(|ins| !ins.instr.is_scope_exit() && !ins.instr.is_unconditional_jump())
             .unwrap_or(true);
-        if has_fallthrough {
+        // Don't duplicate if block already ends with the same return pattern
+        let already_has_return = block.instructions.len() >= 2 && {
+            let n = block.instructions.len();
+            matches!(
+                block.instructions[n - 2].instr,
+                AnyInstruction::Real(Instruction::LoadConst { .. })
+            ) && matches!(
+                block.instructions[n - 1].instr,
+                AnyInstruction::Real(Instruction::ReturnValue)
+            )
+        };
+        if has_fallthrough && !already_has_return {
             blocks_to_fix.push(current);
         }
     }
```
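The fold_const_iterable_for_iter pass above is a four-instruction window match over a block. Its logic can be modeled on plain (opname, arg) pairs in Python; this is a sketch of the algorithm, not the actual IR types:

```python
def fold_const_iterable(instrs):
    """Model of the peephole: BUILD_LIST/SET 0 + LOAD_CONST +
    LIST_EXTEND/SET_UPDATE 1 + GET_ITER collapses to LOAD_CONST + GET_ITER
    (NOPs stand in for the removed instructions, as in the Rust pass)."""
    out = list(instrs)
    i = 0
    while i + 3 < len(out):
        op0, arg0 = out[i]
        op1, _ = out[i + 1]
        op2, arg2 = out[i + 2]
        op3, _ = out[i + 3]
        if (op0 in ("BUILD_LIST", "BUILD_SET") and arg0 == 0
                and op1 == "LOAD_CONST"
                and op2 in ("LIST_EXTEND", "SET_UPDATE") and arg2 == 1
                and op3 == "GET_ITER"):
            out[i] = ("NOP", 0)      # drop the empty-container build
            out[i + 2] = ("NOP", 0)  # drop the extend/update
            i += 4                   # skip past the matched window
        else:
            i += 1
    return out

seq = [("BUILD_LIST", 0), ("LOAD_CONST", 1), ("LIST_EXTEND", 1), ("GET_ITER", 0)]
assert fold_const_iterable(seq) == [
    ("NOP", 0), ("LOAD_CONST", 1), ("NOP", 0), ("GET_ITER", 0),
]
```

The NOPs are swept up by the remove_nops() call added right after this pass in the optimization pipeline.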

crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap

1 addition, 1 deletion (generated file, not rendered by default).

crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap

1 addition, 1 deletion (generated file, not rendered by default).

crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snap

1 addition, 1 deletion (generated file, not rendered by default).

crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap

66 additions, 65 deletions (generated file, not rendered by default).

crates/codegen/src/symboltable.rs

2 additions, 0 deletions:

```diff
@@ -2136,6 +2136,8 @@ impl SymbolTableBuilder {
             CompilerScope::Comprehension,
             self.line_index_start(range),
         );
+        // Generator expressions need the is_generator flag
+        self.tables.last_mut().unwrap().is_generator = is_generator;

         // PEP 709: Mark non-generator comprehensions for inlining,
         // but only inside function-like scopes (fastlocals).
```

crates/compiler-core/src/bytecode/instruction.rs

2 additions, 2 deletions:

```diff
@@ -1020,7 +1020,7 @@ impl InstructionMetadata for Instruction {
             Self::LoadLocals => (1, 0),
             Self::LoadName { .. } => (1, 0),
             Self::LoadSmallInt { .. } => (1, 0),
-            Self::LoadSpecial { .. } => (1, 1),
+            Self::LoadSpecial { .. } => (2, 1),
             Self::LoadSuperAttr { .. } => (1 + (oparg & 1), 3),
             Self::LoadSuperAttrAttr => (1, 3),
             Self::LoadSuperAttrMethod => (2, 3),
@@ -1085,7 +1085,7 @@ impl InstructionMetadata for Instruction {
             Self::UnpackSequenceList => (oparg, 1),
             Self::UnpackSequenceTuple => (oparg, 1),
             Self::UnpackSequenceTwoTuple => (2, 1),
-            Self::WithExceptStart => (6, 5),
+            Self::WithExceptStart => (7, 6),
             Self::YieldValue { .. } => (1, 1),
         };
```

crates/compiler-core/src/bytecode/oparg.rs

2 additions, 2 deletions:

```diff
@@ -290,7 +290,7 @@ impl From<u32> for ResumeType {
     fn from(value: u32) -> Self {
         match value {
             0 => Self::AtFuncStart,
-            1 => Self::AfterYield,
+            5 => Self::AfterYield,
             2 => Self::AfterYieldFrom,
             3 => Self::AfterAwait,
             _ => Self::Other(value),
@@ -302,7 +302,7 @@ impl From<ResumeType> for u32 {
     fn from(typ: ResumeType) -> Self {
         match typ {
             ResumeType::AtFuncStart => 0,
-            ResumeType::AfterYield => 1,
+            ResumeType::AfterYield => 5,
             ResumeType::AfterYieldFrom => 2,
             ResumeType::AfterAwait => 3,
             ResumeType::Other(v) => v,
```
