-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Bytecode pseudo opcodes #6715
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bytecode pseudo opcodes #6715
Conversation
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR refactors the instruction representation system by introducing a unified AnyInstruction wrapper that encompasses both RealInstruction and PseudoInstruction variants. The codegen pipeline now distinguishes between pseudo-ops and real instructions, converting pseudo-instructions to real ones during emission with corresponding argument encoding adjustments. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes The changes introduce a substantial type-system refactoring affecting instruction representation across the codebase. While individual file changes are often straightforward conversions, the cumulative scope spans multiple interconnected systems (bytecode core, codegen, stdlib, VM), requiring understanding of how pseudo/real instruction separation impacts emission pipelines and type conversions. The heterogeneity of changes—some mechanical (type updates in frame.rs), others logic-bearing (emit API generalization, predicate updates)—demands careful review across each affected subsystem. Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
scripts/generate_opcode_metadata.py (1)
40-49: Fix enum body extraction: current regex can truncate on struct-variant braces.On Line 41, the pattern can stop at the first
\n}it sees, which is common inside multi-line struct variants, producing incomplete opcode lists.Proposed fix (brace-matching instead of regex)
def extract_enum_body(contents: str, enum_name: str) -> str: - res = re.search(f"pub enum {enum_name} " + r"\{(.+?)\n\}", contents, re.DOTALL) - if not res: - raise ValueError(f"Could not find {enum_name} enum") - - return "\n".join( - line.split("//")[0].strip() # Remove any comment. i.e. "foo // some comment" - for line in res.group(1).splitlines() - if not line.strip().startswith("//") # Ignore comment lines - ) + # Find the enum header first, then extract the `{ ... }` body via brace matching. + header = re.search(rf"\bpub\s+enum\s+{re.escape(enum_name)}\b", contents) + if not header: + raise ValueError(f"Could not find {enum_name} enum") + + start = contents.find("{", header.end()) + if start == -1: + raise ValueError(f"Could not find opening '{{' for {enum_name} enum") + + depth = 0 + end = None + for i in range(start, len(contents)): + ch = contents[i] + if ch == "{": + depth += 1 + elif ch == "}": + depth -= 1 + if depth == 0: + end = i + break + if end is None: + raise ValueError(f"Could not find closing '}}' for {enum_name} enum") + + body = contents[start + 1 : end] + return "\n".join( + line.split("//")[0].strip() + for line in body.splitlines() + if not line.strip().startswith("//") + )
🤖 Fix all issues with AI agents
In @crates/jit/src/instructions.rs:
- Around line 644-650: The SWAP implementation in RealInstruction::Swap uses
zero-based math so `SWAP 1` incorrectly swaps TOS with second-from-top; treat
the operand as 1-based by changing the index arithmetic: compute j so that
SWAP(1) yields j == i (no-op), e.g. use j = len - index.get(arg) as usize
instead of len - 1 - index.get(arg) as usize, and keep i = len - 1; ensure you
reference RealInstruction::Swap and the index.get(arg) call when updating the
code.
In @crates/stdlib/src/opcode.rs:
- Around line 82-99: The predicates misclassify some opcodes: update has_name to
also treat RealInstruction::LoadSuperAttr { .. } as a name-indexing opcode (it
indexes co_names via packed oparg) by adding it to the match arm alongside other
RealInstruction::Load*/Store*/Delete*/Import* variants, and update has_jump to
include RealInstruction::JumpForward, RealInstruction::JumpBackward, and
RealInstruction::JumpBackwardNoInterrupt in its jump match arm; modify the match
expressions inside both functions (the pattern matching on
Self::try_from(opcode).map(|op| op.inner())) to include these missing
RealInstruction variants so those predicates accurately reflect all name and
jump opcodes.
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
873-884:fmt_disforPseudoInstructionwill panic if invoked.The
unimplemented!()macro will cause a runtime panic if any code path attempts to display a pseudo instruction for debugging/disassembly. SinceInstructionMetadatais implemented for bothRealInstructionandPseudoInstruction, andInstruction::fmt_disdelegates via theinst_either!macro, callingdisplay()on anInstruction::Pseudovariant will panic.Consider providing a basic implementation:
♻️ Suggested implementation
fn fmt_dis( &self, - _arg: OpArg, - _f: &mut fmt::Formatter<'_>, - _ctx: &impl InstrDisplayContext, + arg: OpArg, + f: &mut fmt::Formatter<'_>, + _ctx: &impl InstrDisplayContext, _expand_code_objects: bool, - _pad: usize, + pad: usize, _level: usize, ) -> fmt::Result { - unimplemented!() + match self { + Self::Jump { target } => write!(f, "{:pad$}({})", "JUMP", target.get(arg)), + Self::JumpNoInterrupt { target } => write!(f, "{:pad$}({})", "JUMP_NO_INTERRUPT", target.get(arg)), + Self::LoadAttrMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ATTR_METHOD", idx.get(arg)), + Self::LoadSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_SUPER_METHOD", idx.get(arg)), + Self::LoadZeroSuperAttr { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_ATTR", idx.get(arg)), + Self::LoadZeroSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_METHOD", idx.get(arg)), + Self::PopBlock => write!(f, "POP_BLOCK"), + Self::SetupCleanup => write!(f, "SETUP_CLEANUP"), + Self::SetupFinally => write!(f, "SETUP_FINALLY"), + Self::SetupWith => write!(f, "SETUP_WITH"), + Self::StoreFastMaybeNull => write!(f, "STORE_FAST_MAYBE_NULL"), + Self::Reserved258 => write!(f, "RESERVED_258"), + } }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
Lib/_opcode_metadata.pyis excluded by!Lib/**Lib/test/test__opcode.pyis excluded by!Lib/**
📒 Files selected for processing (9)
crates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/compiler-core/src/bytecode.rscrates/compiler-core/src/bytecode/instruction.rscrates/jit/src/instructions.rscrates/jit/tests/common.rscrates/stdlib/src/opcode.rscrates/vm/src/frame.rsscripts/generate_opcode_metadata.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/frame.rscrates/jit/src/instructions.rscrates/jit/tests/common.rscrates/compiler-core/src/bytecode.rscrates/codegen/src/ir.rscrates/compiler-core/src/bytecode/instruction.rscrates/stdlib/src/opcode.rs
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py: In most cases, Python code should not be edited; bug fixes should be made through Rust code modifications only
Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
scripts/generate_opcode_metadata.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`
Applied to files:
crates/jit/tests/common.rscrates/codegen/src/ir.rscrates/compiler-core/src/bytecode/instruction.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/codegen/src/ir.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: RustPython is a Python 3 interpreter written in Rust implementing Python 3.13.0+ compatibility with complete Python environment in Rust (not CPython bindings), clean implementation without compatibility hacks, and cross-platform support including WebAssembly compilation
Applied to files:
crates/codegen/src/ir.rs
🧬 Code graph analysis (2)
crates/vm/src/frame.rs (1)
crates/compiler-core/src/bytecode.rs (2)
get(472-478)get(575-577)
crates/codegen/src/ir.rs (1)
crates/compiler-core/src/bytecode.rs (7)
encode_exception_table(53-65)encode_load_attr_arg(95-97)encode_load_super_attr_arg(109-111)marker(557-559)new(40-48)new(562-564)new(704-706)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Check Rust code with clippy
🔇 Additional comments (20)
scripts/generate_opcode_metadata.py (1)
53-56: Multi-enum aggregation looks good.Pulling both
RealInstructionandPseudoInstructioninto a single stream matches the PR’s split, and keeps downstreamOpcode.from_str(...)unchanged.crates/vm/src/frame.rs (3)
394-401: Reraise detection updated correctly toRealInstruction.The
RAISE_VARARGS+RERAISEcheck mapping on Line 395-400 looks consistent and keeps the “don’t add traceback for reraises” behavior intact.
428-463: Yield-from/await detection migration toRealInstructionlooks consistent.The
Send/Resume/previousYieldValuechecks align with the intended “lasti points to RESUME (after YIELD_VALUE)” model.
552-560: Good:ExecutingFramenow only dispatches concrete opcodes (RealInstruction).This prevents pseudo opcodes from reaching the VM execution loop, which is exactly what the split is aiming for.
Also applies to: 587-1760
crates/jit/tests/common.rs (2)
87-124:RealInstructionmigration +BuildMapkey/value popping order looks right.Line 115-121 now consumes both value and key (string) per entry; this is the expected stack effect for map construction.
137-166:SET_FUNCTION_ATTRIBUTEhandling is clearer and safer for tests.Only applying
ANNOTATIONS(and otherwise preserving the function) keeps the harness minimal while matching the new bytecode flow.crates/jit/src/instructions.rs (1)
168-222: Unreachable-code tracking updated cleanly forRealInstructionjump/return ops.Marking code after unconditional jumps/returns unreachable is a good fit for the current lowering strategy.
crates/stdlib/src/opcode.rs (1)
23-34: Opcode decoding viau16 -> Instructionis a good adaptation to pseudo opcodes.This keeps
is_validand friends consistent with the new split and avoids ad-hoc range checks.Also applies to: 53-61
crates/compiler-core/src/bytecode.rs (3)
17-19: Good API surface update: re-exporting the split instruction types unblocks downstream users.
470-478: OpArgState now returnsRealInstructionas expected for CodeUnit decoding.The
ExtendedArgstate machine stays intact with the new enum.
694-707: CodeUnit storingRealInstructionmatches the “no pseudos at runtime” model.Keeps the 2-byte CodeUnit layout stable while making the type-level guarantee explicit.
crates/codegen/src/ir.rs (4)
4-14: LGTM!The import additions correctly bring in the new instruction types (
RealInstruction,PseudoInstruction,InstructionMetadata) needed for the refactored bytecode representation.
199-243: LGTM!The pseudo-to-real instruction conversion logic correctly handles all attribute loading variants with proper encoding:
LoadAttrMethodencodes with method flagLoadSuper*variants encode with appropriateload_methodandhas_classflagsPopBlockis converted toNopas expectedThe use of
Instruction::Pseudo(...)andInstruction::Real(...)pattern matching with.into()for conversions is clean.
304-308: LGTM!Correctly uses
RealInstruction::ExtendedArgfor emitting extended argument bytes in the final bytecode.
280-296: No action needed —JumpNoInterruptis never emitted during compilation.
JumpNoInterruptis defined as a placeholder in thePseudoInstructionenum but is never instantiated anywhere in the codebase. All code generation paths emit onlyPseudoInstruction::Jump, which is explicitly handled at line 280-296 to convert to eitherJumpForwardorJumpBackwardbased on offset direction. Other pseudo instructions (LoadAttrMethod,PopBlock,LoadSuperMethod, etc.) are converted in earlier code paths before reaching this conversion point. Theexpect_real()fallback is therefore safe in practice.crates/compiler-core/src/bytecode/instruction.rs (5)
17-281: LGTM!The
RealInstructionenum is well-structured with:
- Clear separation between no-argument (opcode < 44) and with-argument instructions
- CPython 3.13 opcode alignment
- RustPython-specific extensions in a designated range (119-135)
- Correct
repr(u8)for size efficiency, verified by the compile-time assertion
291-349: LGTM!The
TryFrom<u8>implementation correctly validates opcodes by:
- Checking the contiguous CPython range (0-118)
- Explicitly handling non-contiguous opcodes (
Resumeat 149,LoadClosureat 253)- Listing valid custom opcodes explicitly to handle gaps (125-127 unused)
The TODO at line 303-304 for moving
LoadClosureto pseudos is noted.
968-1019: LGTM!The
InstructionMetadatatrait is well-designed with:
- Clear method signatures for instruction analysis
- Helpful documentation with examples
- Clean default implementation for
display()using a localFmtFnhelperThe trait provides a unified interface for both real and pseudo instructions.
925-952: LGTM!The
inst_either!macro cleanly delegatesInstructionMetadatatrait methods to the underlyingRealorPseudovariant, reducing boilerplate while maintaining type safety.
954-966: LGTM!The
Instructionhelper methods are well-designed:
as_real()provides safe optional conversionexpect_real()provides convenient unwrapping for code paths that have already ensured the instruction is realThe
const fnqualifier onexpect_realrequiresOption::expectconst stability, which became available in Rust 1.83.
crates/jit/src/instructions.rs
Outdated
| RealInstruction::Swap { index } => { | ||
| let len = self.stack.len(); | ||
| let i = len - 1; | ||
| let j = len - 1 - index.get(arg) as usize; | ||
| self.stack.swap(i, j); | ||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix SWAP semantics: index math is off by one vs CPython/RustPython stack convention.
On Line 647, j = len - 1 - index makes SWAP 1 swap TOS with the second-from-top, but SWAP(1) should swap TOS with itself (no-op). This can miscompile valid code.
Proposed fix (treat SWAP index as 1-based)
RealInstruction::Swap { index } => {
let len = self.stack.len();
- let i = len - 1;
- let j = len - 1 - index.get(arg) as usize;
+ let i = len - 1;
+ let n = index.get(arg) as usize; // 1-based
+ if n == 0 || n > len {
+ return Err(JitCompileError::BadBytecode);
+ }
+ let j = len - n;
self.stack.swap(i, j);
Ok(())
}🤖 Prompt for AI Agents
In @crates/jit/src/instructions.rs around lines 644 - 650, The SWAP
implementation in RealInstruction::Swap uses zero-based math so `SWAP 1`
incorrectly swaps TOS with second-from-top; treat the operand as 1-based by
changing the index arithmetic: compute j so that SWAP(1) yields j == i (no-op),
e.g. use j = len - index.get(arg) as usize instead of len - 1 - index.get(arg)
as usize, and keep i = len - 1; ensure you reference RealInstruction::Swap and
the index.get(arg) call when updating the code.
| pub fn has_name(opcode: i32) -> bool { | ||
| Self::is_valid(opcode) | ||
| && matches!( | ||
| Instruction::try_from(opcode as u8), | ||
| Ok(Instruction::DeleteAttr { .. } | ||
| | Instruction::DeleteGlobal(_) | ||
| | Instruction::DeleteName(_) | ||
| | Instruction::ImportFrom { .. } | ||
| | Instruction::ImportName { .. } | ||
| | Instruction::LoadAttr { .. } | ||
| | Instruction::LoadGlobal(_) | ||
| | Instruction::LoadAttrMethod { .. } | ||
| | Instruction::LoadName(_) | ||
| | Instruction::StoreAttr { .. } | ||
| | Instruction::StoreGlobal(_) | ||
| | Instruction::StoreName(_)) | ||
| ) | ||
| matches!( | ||
| Self::try_from(opcode).map(|op| op.inner()), | ||
| Ok(Instruction::Real( | ||
| RealInstruction::DeleteAttr { .. } | ||
| | RealInstruction::DeleteGlobal(_) | ||
| | RealInstruction::DeleteName(_) | ||
| | RealInstruction::ImportFrom { .. } | ||
| | RealInstruction::ImportName { .. } | ||
| | RealInstruction::LoadAttr { .. } | ||
| | RealInstruction::LoadGlobal(_) | ||
| | RealInstruction::LoadName(_) | ||
| | RealInstruction::StoreAttr { .. } | ||
| | RealInstruction::StoreGlobal(_) | ||
| | RealInstruction::StoreName(_) | ||
| ) | Instruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. })) | ||
| ) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix opcode classification gaps: add missing real jump ops (and likely LoadSuperAttr) to predicates.
has_jump(Line 103-118) should includeRealInstruction::JumpForward/JumpBackward/JumpBackwardNoInterrupt.has_name(Line 82-99) likely needsRealInstruction::LoadSuperAttr { .. }since it indexes intoco_namesvia packed oparg.
Proposed fix (extend match arms)
pub fn has_name(opcode: i32) -> bool {
matches!(
Self::try_from(opcode).map(|op| op.inner()),
Ok(Instruction::Real(
RealInstruction::DeleteAttr { .. }
| RealInstruction::DeleteGlobal(_)
| RealInstruction::DeleteName(_)
| RealInstruction::ImportFrom { .. }
| RealInstruction::ImportName { .. }
| RealInstruction::LoadAttr { .. }
| RealInstruction::LoadGlobal(_)
| RealInstruction::LoadName(_)
+ | RealInstruction::LoadSuperAttr { .. }
| RealInstruction::StoreAttr { .. }
| RealInstruction::StoreGlobal(_)
| RealInstruction::StoreName(_)
) | Instruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. }))
)
}
pub fn has_jump(opcode: i32) -> bool {
matches!(
Self::try_from(opcode).map(|op| op.inner()),
Ok(Instruction::Real(
RealInstruction::Break { .. }
| RealInstruction::Continue { .. }
| RealInstruction::ForIter { .. }
+ | RealInstruction::JumpForward { .. }
+ | RealInstruction::JumpBackward { .. }
+ | RealInstruction::JumpBackwardNoInterrupt { .. }
| RealInstruction::JumpIfFalseOrPop { .. }
| RealInstruction::JumpIfNotExcMatch(_)
| RealInstruction::JumpIfTrueOrPop { .. }
| RealInstruction::PopJumpIfFalse { .. }
| RealInstruction::PopJumpIfTrue { .. }
| RealInstruction::Send { .. }
) | Instruction::Pseudo(PseudoInstruction::Jump { .. }))
)
}Also applies to: 103-118
🤖 Prompt for AI Agents
In @crates/stdlib/src/opcode.rs around lines 82 - 99, The predicates misclassify
some opcodes: update has_name to also treat RealInstruction::LoadSuperAttr { ..
} as a name-indexing opcode (it indexes co_names via packed oparg) by adding it
to the match arm alongside other RealInstruction::Load*/Store*/Delete*/Import*
variants, and update has_jump to include RealInstruction::JumpForward,
RealInstruction::JumpBackward, and RealInstruction::JumpBackwardNoInterrupt in
its jump match arm; modify the match expressions inside both functions (the
pattern matching on Self::try_from(opcode).map(|op| op.inner())) to include
these missing RealInstruction variants so those predicates accurately reflect
all name and jump opcodes.
youknowone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I am still looking in
| /// use rustpython_compiler_core::bytecode::{Arg, Instruction, Label}; | ||
| /// let (target, jump_arg) = Arg::new(Label(0xF)); | ||
| /// let jump_instruction = Instruction::Jump { target }; | ||
| /// assert_eq!(jump_instruction.stack_effect(jump_arg, true), 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs are lost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, I've put it in the InstructionMetadata trait, but I guess it makes more sense to put it under the impl
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, sorry. doc will be enough on traits
| const _: () = assert!(mem::size_of::<PseudoInstruction>() == 2); | ||
|
|
||
| #[derive(Debug, Copy, Clone, PartialEq, Eq)] | ||
| pub enum Instruction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this type fit in u16? Otherwise union could be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ain't sure what you mean, could you please explain?
And yes, this type fit in a u16 it can hold a RealInstruction (u8) or a PseudoInstruction (u16)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add #[repr(u16)] or static_assertions::assert_eq_size to ensure compiler actually recognize it or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to add #[repr(u16)], but core::mem::size_of::<Instruction>() gives 4.
Why do we care about the the size of Instruction? It's not intended to be marshaled directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you are right! I didn't catch that.
Still union looks more fit than Instruction because they are just separated types sharing a same space. Will that be too much painful to use union than enum for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, you are right! I didn't catch that. Still union looks more fit than Instruction because they are just separated types sharing a same space. Will that be too much painful to use union than enum for this?
IMO yes it will. And we wouldn't gain a lot from it tbh as the instruction loop will only have RealInstruction anyway (which is what we had until now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want me to remove the #[repr(u16)] so the compiler can have some further optimizations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, if this is only used by compiler anyway, probably that could be better idea.
I was originally thinking
union Instruction {
real: RealInstruction,
pseudo: PseudoInstruction,
raw: u16,
}
impl Instruction {
fn real(&self) -> Option<RealInstruction> {
if self.raw < ... {
Some(...)
} else {
...
}
}
}btw, if The pseudo instruction is only used for compiler, vm only will have RealInstruction. That looks a bit awkward. How about keeping the RealInstruction name as Instruction, and the wrapper as AnyInstruction?
enum AnyInstruction {
Real(Instruction),
Pseudo(PseudoInstruction),
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, if this is only used by compiler anyway, probably that could be better idea.
I was originally thinking
union Instruction { real: RealInstruction, pseudo: PseudoInstruction, raw: u16, } impl Instruction { fn real(&self) -> Option<RealInstruction> { if self.raw < ... { Some(...) } else { ... } } }
While it's cool (and doable) I don't think we will gain much from doing that.
btw, if The pseudo instruction is only used for compiler, vm only will have
RealInstruction. That looks a bit awkward. How about keeping the RealInstruction name as Instruction, and the wrapper as AnyInstruction?enum AnyInstruction { Real(Instruction), Pseudo(PseudoInstruction), }
Sure, I can do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
429-429:LoadAttrstack effect should account for method flag.The stack effect of
LOAD_ATTRdepends on theis_methodflag encoded in the oparg:
- Non-method: pops object, pushes attribute → net 0
- Method: pops object, pushes (method, self_or_null) → net +1
Currently returns unconditional
0, which is incorrect for the method case. This could cause incorrect stack depth calculations.Proposed fix
- Self::LoadAttr { .. } => 0, + Self::LoadAttr { idx } => { + let (_, is_method) = decode_load_attr_arg(idx.get(arg)); + if is_method { 1 } else { 0 } + }
🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 893-904: The fmt_dis implementation for PseudoInstruction
currently calls unimplemented!() and will panic if Instruction::Pseudo is ever
formatted via InstructionMetadata; replace the panic with a defensive
implementation (or at minimum a clearer todo!) by updating the fmt_dis method on
the PseudoInstruction/Instruction::Pseudo branch: either call
todo!("PseudoInstruction display not yet implemented") to provide a clearer
error or implement a minimal formatter that writes a placeholder string (e.g.,
the pseudo opcode name and any relevant fields) to the provided fmt::Formatter,
respecting the signature of fmt_dis and using the
_ctx/_expand_code_objects/_pad/_level params as no-ops if not needed. Ensure the
change is made in the fmt_dis function so formatting of Instruction::Pseudo no
longer panics.
🧹 Nitpick comments (2)
crates/compiler-core/src/bytecode/instruction.rs (2)
982-986:expect_realpanics in const context - message won't be visible.
expect_realusesOption::expectwhich panics with a message, but since this is aconst fn, the panic message may not be displayed in all contexts. Consider whetherunwrap()suffices or if a non-const version with better diagnostics is needed for runtime usage.
908-912: Actual memory footprint ofInstructionenum is unverified.
RealInstructionandPseudoInstructionare correctly sized at 1 and 2 bytes respectively, but there is no size assertion for theInstructionenum itself. Without verification of actual compiled size, the claim that it "will likely be 4 bytes" is speculative. More importantly, there's no evidence in the codebase that memory footprint is a performance concern for bytecode arrays—suggesting a union-based approach trades type safety for unverified space savings and would require clear performance metrics to justify.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crates/compiler-core/src/bytecode/instruction.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/compiler-core/src/bytecode/instruction.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests on wasm-wasi
🔇 Additional comments (6)
crates/compiler-core/src/bytecode/instruction.rs (6)
15-280: LGTM!The
RealInstructionenum is well-structured with explicit discriminants matching CPython 3.13 opcodes. The gaps are intentional and documented. The TODO forLoadClosureis noted.
292-350: LGTM!The
TryFrom<u8>validation correctly handles:
- Contiguous CPython range (0-118)
- Non-contiguous
Resume(149) andLoadClosure(253)- Explicit custom ops list that properly excludes gaps (125-127, 132-133)
The unsafe transmute is sound after range validation.
846-862: LGTM!The range validation is correct since all discriminants 256-267 are contiguous. The unsafe transmute is sound after this check.
876-891: LGTM!Stack effects are correctly calculated:
LoadAttrMethod: +1 (pop obj, push method + self_or_null)LoadSuperMethod/LoadZeroSuperMethod: -1 (pop 3, push 2)LoadZeroSuperAttr: -2 (pop 3, push 1)
945-1020: LGTM!The
inst_either!macro cleanly forwards trait method calls to the underlyingReal/Pseudovariants. TheInstructionMetadatatrait is well-defined with thedisplayhelper providing ergonomic formatting.
1077-1104: LGTM!The encode/decode helpers are symmetric and correctly implement the bit-packing scheme:
load_attr_arg: bit 0 for method flag, remaining bits for name indexload_super_attr_arg: bit 0 for load_method, bit 1 for has_class, remaining bits for name index
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| fn fmt_dis( | ||
| &self, | ||
| _arg: OpArg, | ||
| _f: &mut fmt::Formatter<'_>, | ||
| _ctx: &impl InstrDisplayContext, | ||
| _expand_code_objects: bool, | ||
| _pad: usize, | ||
| _level: usize, | ||
| ) -> fmt::Result { | ||
| unimplemented!() | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fmt_dis for PseudoInstruction panics unconditionally.
This unimplemented!() will panic if Instruction::Pseudo is ever formatted via the InstructionMetadata trait (e.g., during debug/disassembly output). Consider providing a minimal implementation or at least using todo!("PseudoInstruction display not yet implemented") for a clearer error message.
Minimal implementation suggestion
fn fmt_dis(
&self,
- _arg: OpArg,
- _f: &mut fmt::Formatter<'_>,
- _ctx: &impl InstrDisplayContext,
+ arg: OpArg,
+ f: &mut fmt::Formatter<'_>,
+ _ctx: &impl InstrDisplayContext,
_expand_code_objects: bool,
- _pad: usize,
+ pad: usize,
_level: usize,
) -> fmt::Result {
- unimplemented!()
+ match self {
+ Self::Jump { target } => write!(f, "{:pad$}({})", "JUMP", target.get(arg)),
+ Self::JumpNoInterrupt { target } => write!(f, "{:pad$}({})", "JUMP_NO_INTERRUPT", target.get(arg)),
+ Self::LoadAttrMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ATTR_METHOD", idx.get(arg)),
+ Self::LoadSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_SUPER_METHOD", idx.get(arg)),
+ Self::LoadZeroSuperAttr { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_ATTR", idx.get(arg)),
+ Self::LoadZeroSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_METHOD", idx.get(arg)),
+ Self::PopBlock => write!(f, "POP_BLOCK"),
+ _ => write!(f, "PSEUDO_PLACEHOLDER"),
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn fmt_dis( | |
| &self, | |
| _arg: OpArg, | |
| _f: &mut fmt::Formatter<'_>, | |
| _ctx: &impl InstrDisplayContext, | |
| _expand_code_objects: bool, | |
| _pad: usize, | |
| _level: usize, | |
| ) -> fmt::Result { | |
| unimplemented!() | |
| } | |
| } | |
| fn fmt_dis( | |
| &self, | |
| arg: OpArg, | |
| f: &mut fmt::Formatter<'_>, | |
| _ctx: &impl InstrDisplayContext, | |
| _expand_code_objects: bool, | |
| pad: usize, | |
| _level: usize, | |
| ) -> fmt::Result { | |
| match self { | |
| Self::Jump { target } => write!(f, "{:pad$}({})", "JUMP", target.get(arg)), | |
| Self::JumpNoInterrupt { target } => write!(f, "{:pad$}({})", "JUMP_NO_INTERRUPT", target.get(arg)), | |
| Self::LoadAttrMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ATTR_METHOD", idx.get(arg)), | |
| Self::LoadSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_SUPER_METHOD", idx.get(arg)), | |
| Self::LoadZeroSuperAttr { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_ATTR", idx.get(arg)), | |
| Self::LoadZeroSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_METHOD", idx.get(arg)), | |
| Self::PopBlock => write!(f, "POP_BLOCK"), | |
| _ => write!(f, "PSEUDO_PLACEHOLDER"), | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 893 - 904, The
fmt_dis implementation for PseudoInstruction currently calls unimplemented!()
and will panic if Instruction::Pseudo is ever formatted via InstructionMetadata;
replace the panic with a defensive implementation (or at minimum a clearer
todo!) by updating the fmt_dis method on the
PseudoInstruction/Instruction::Pseudo branch: either call
todo!("PseudoInstruction display not yet implemented") to provide a clearer
error or implement a minimal formatter that writes a placeholder string (e.g.,
the pseudo opcode name and any relevant fields) to the provided fmt::Formatter,
respecting the signature of fmt_dis and using the
_ctx/_expand_code_objects/_pad/_level params as no-ops if not needed. Ensure the
change is made in the fmt_dis function so formatting of Instruction::Pseudo no
longer panics.
| /// A Single bytecode instruction. | ||
| /// Instructions are ordered to match CPython 3.13 opcode numbers exactly. | ||
| /// HAVE_ARGUMENT = 44: opcodes 0-43 have no argument, 44+ have arguments. | ||
| #[derive(Debug, Copy, Clone, PartialEq, Eq)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed Eq & PartialEq as they are only used once in the code. It's a very tiny optimization for binary size. I can revert it if it's a concern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 847-857: PseudoInstruction currently ignores JumpNoInterrupt in
both label_arg and unconditional_branch; update the impl of InstructionMetadata
for PseudoInstruction so label_arg returns Some(*l) when matching
Self::JumpNoInterrupt { target: l } (similar to Jump) and update
unconditional_branch to include Self::JumpNoInterrupt in the matches! macro
(alongside Self::Jump { .. }) so JumpNoInterrupt is treated as an unconditional
branch for label resolution and control-flow analysis.
♻️ Duplicate comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
876-887:fmt_disforPseudoInstructionstill panics unconditionally.This concern was raised in a previous review. The
unimplemented!()will panic ifInstruction::Pseudois formatted viaInstructionMetadata. Consider providing a minimal implementation or usingtodo!("PseudoInstruction display not yet implemented")for a clearer error message.
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
829-845: Range-based transmute assumes contiguous discriminants.The
TryFrom<u16>implementation relies on all discriminants being contiguous (256-267). Currently this holds, but if future changes introduce gaps (e.g., removingReserved258), the transmute would accept invalid values.Consider either:
- Adding a comment documenting this invariant
- Using an explicit match or lookup similar to
RealInstruction'scustom_opsapproach📝 Suggested documentation
impl TryFrom<u16> for PseudoInstruction { type Error = MarshalError; #[inline] fn try_from(value: u16) -> Result<Self, MarshalError> { + // SAFETY: All discriminants from Jump (256) to StoreFastMaybeNull (267) + // must be contiguous for this range check to be valid. let start = u16::from(Self::Jump { target: Arg::marker(), });
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
crates/compiler-core/src/bytecode/instruction.rscrates/compiler-core/src/bytecode/oparg.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/compiler-core/src/bytecode/oparg.rscrates/compiler-core/src/bytecode/instruction.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check Rust code with clippy
🔇 Additional comments (7)
crates/compiler-core/src/bytecode/oparg.rs (2)
5-5: LGTM! Consistent migration toRealInstruction.The import and type changes correctly reflect that
OpArgState::getoperates on real (emitted) bytecode whereExtendedArghandling is relevant. The return type change aligns with the PR's architectural split.Also applies to: 81-86
367-370: Documentation example correctly updated.The doc example now properly references
RealInstructionand the import path, maintaining consistency with the new instruction hierarchy.crates/compiler-core/src/bytecode/instruction.rs (5)
18-20: LGTM! Well-structured opcode enum with correct size guarantee.The
#[repr(u8)]with size assertion ensures the enum fits in a single byte, which is correct sinceArg<T>is zero-sized (wrapsPhantomData). The explicit opcode numbering matches CPython 3.13.The
LoadClosureTODO at line 279 is noted - this appears to be planned technical debt.Also applies to: 279-282
292-350: Correct opcode validation with explicit gap handling.The implementation correctly validates opcode ranges before the unsafe transmute. The explicit
custom_opsarray properly excludes the undefined gap at 125-127, preventing invalid opcodes from being accepted.
927-954: Clean delegation pattern withinst_eithermacro.The macro correctly forwards
InstructionMetadatamethod calls to the underlyingRealInstructionorPseudoInstruction. This keeps the wrapper thin and avoids code duplication.
970-1021: Well-documentedInstructionMetadatatrait.The trait documentation includes clear examples demonstrating usage with
RealInstruction. Thedisplayhelper usingFmtFnis a clean pattern for on-demand formatting.
1078-1105: LGTM! Correct encode/decode helpers for attribute opcodes.The bit-packing schemes are correctly implemented:
load_attr_arg: bit 0 = method flag, bits 1+ = name indexload_super_attr_arg: bit 0 = load_method, bit 1 = has_class, bits 2+ = name indexBoth encode/decode pairs are symmetric and use
const fnappropriately.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| impl InstructionMetadata for PseudoInstruction { | ||
| fn label_arg(&self) -> Option<Arg<Label>> { | ||
| match self { | ||
| Self::Jump { target: l } => Some(*l), | ||
| _ => None, | ||
| } | ||
| } | ||
|
|
||
| fn unconditional_branch(&self) -> bool { | ||
| matches!(self, Self::Jump { .. },) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing JumpNoInterrupt handling in label_arg and unconditional_branch.
PseudoInstruction::JumpNoInterrupt has a target: Arg<Label> field, but:
label_argreturnsNonefor it (line 851 catches it in_ => None)unconditional_branchdoesn't match it
In contrast, RealInstruction::JumpBackwardNoInterrupt is handled in both methods (lines 357, 377). This inconsistency could cause label resolution or control-flow analysis to miss this instruction's jump target.
🐛 Proposed fix
fn label_arg(&self) -> Option<Arg<Label>> {
match self {
- Self::Jump { target: l } => Some(*l),
+ Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l),
_ => None,
}
}
fn unconditional_branch(&self) -> bool {
- matches!(self, Self::Jump { .. },)
+ matches!(self, Self::Jump { .. } | Self::JumpNoInterrupt { .. })
}🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 847 - 857,
PseudoInstruction currently ignores JumpNoInterrupt in both label_arg and
unconditional_branch; update the impl of InstructionMetadata for
PseudoInstruction so label_arg returns Some(*l) when matching
Self::JumpNoInterrupt { target: l } (similar to Jump) and update
unconditional_branch to include Self::JumpNoInterrupt in the matches! macro
(alongside Self::Jump { .. }) so JumpNoInterrupt is treated as an unconditional
branch for label resolution and control-flow analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/codegen/src/ir.rs`:
- Around line 260-267: The match arm that currently groups
PseudoInstruction::JumpNoInterrupt with placeholders and calls unimplemented!
must be fixed: add an explicit PseudoInstruction::JumpNoInterrupt { target } arm
and implement its conversion similar to PseudoInstruction::Jump (the conversion
block around the existing Jump handling at lines ~307-323), using the target
field to produce the equivalent bytecode/IR for a non-interrupting jump; update
the Jump conversion logic to handle both PseudoInstruction::Jump and
PseudoInstruction::JumpNoInterrupt (e.g., by branching on which enum variant and
mapping to the same jump emission code path but preserving the non-interrupt
semantics), or alternatively ensure elsewhere in the emitter that
JumpNoInterrupt is never produced — but prefer adding the explicit conversion
for JumpNoInterrupt in the match and conversion block so it no longer panics.
♻️ Duplicate comments (2)
crates/compiler-core/src/bytecode/instruction.rs (2)
847-857:JumpNoInterruptstill missing fromlabel_argandunconditional_branch.The previous review flagged that
PseudoInstruction::JumpNoInterrupthas atarget: Arg<Label>field but is not handled in either method. This is inconsistent with howRealInstruction::JumpBackwardNoInterruptis handled (lines 357, 377). Label resolution and control-flow analysis will miss this instruction's jump target.🐛 Proposed fix
fn label_arg(&self) -> Option<Arg<Label>> { match self { - Self::Jump { target: l } => Some(*l), + Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l), _ => None, } } fn unconditional_branch(&self) -> bool { - matches!(self, Self::Jump { .. },) + matches!(self, Self::Jump { .. } | Self::JumpNoInterrupt { .. }) }
876-887:fmt_disforPseudoInstructionstill panics unconditionally.The previous review flagged this
unimplemented!(). IfInstruction::Pseudois ever formatted viaInstructionMetadata(e.g., during debug output inmax_stackdepth), this will panic. Consider at minimum usingtodo!("PseudoInstruction display not yet implemented")for a clearer error message, or provide a basic implementation.💡 Minimal implementation suggestion
fn fmt_dis( &self, - _arg: OpArg, - _f: &mut fmt::Formatter<'_>, + arg: OpArg, + f: &mut fmt::Formatter<'_>, _ctx: &impl InstrDisplayContext, _expand_code_objects: bool, - _pad: usize, + pad: usize, _level: usize, ) -> fmt::Result { - unimplemented!() + match self { + Self::Jump { target } => write!(f, "{:pad$}({})", "JUMP", target.get(arg)), + Self::JumpNoInterrupt { target } => write!(f, "{:pad$}({})", "JUMP_NO_INTERRUPT", target.get(arg)), + Self::LoadAttrMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ATTR_METHOD", idx.get(arg)), + Self::LoadSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_SUPER_METHOD", idx.get(arg)), + Self::LoadZeroSuperAttr { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_ATTR", idx.get(arg)), + Self::LoadZeroSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_METHOD", idx.get(arg)), + Self::PopBlock => write!(f, "POP_BLOCK"), + _ => write!(f, "{:?}", self), + } }
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
279-279: Consider tracking the TODO forLoadClosuremigration.The TODO indicates
LoadClosureshould be moved toPseudoInstruction. This appears to be a deliberate deferral for a future refactoring pass. Consider creating an issue to track this.Would you like me to open an issue to track the
LoadClosuremigration to pseudo instructions?
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
crates/codegen/src/ir.rscrates/compiler-core/src/bytecode/instruction.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/codegen/src/ir.rscrates/compiler-core/src/bytecode/instruction.rs
🧠 Learnings (3)
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/codegen/src/ir.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: RustPython is a Python 3 interpreter written in Rust implementing Python 3.13.0+ compatibility with complete Python environment in Rust (not CPython bindings), clean implementation without compatibility hacks, and cross-platform support including WebAssembly compilation
Applied to files:
crates/codegen/src/ir.rs
📚 Learning: 2025-12-27T14:03:49.034Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.034Z
Learning: When modifying bytecode instructions, perform a full clean build by running `rm -r target/debug/build/rustpython-* && find . | grep -E '\.pyc$' | xargs rm -r`
Applied to files:
crates/codegen/src/ir.rs
🧬 Code graph analysis (1)
crates/codegen/src/ir.rs (2)
crates/compiler-core/src/bytecode.rs (3)
encode_exception_table(63-75)new(50-58)new(313-315)crates/compiler-core/src/bytecode/instruction.rs (4)
encode_load_attr_arg(1104-1106)encode_load_super_attr_arg(1118-1120)marker(1052-1054)new(1057-1059)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check the WASM package and demo
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run tests under miri
- GitHub Check: Run snippets and cpython tests (windows-2025)
🔇 Additional comments (11)
crates/compiler-core/src/bytecode/instruction.rs (6)
829-845: LGTM!The
TryFrom<u16>implementation correctly validates the contiguous range (256-267) before transmuting. All discriminant values within this range are defined in the enum.
916-925: LGTM!The
TryFrom<u16>implementation correctly dispatches: values fitting inu8are tried asRealInstructionfirst, while values > 255 are treated asPseudoInstruction. This properly handles the opcode space separation.
927-954: LGTM!The
inst_either!macro provides clean dispatch abstraction for theInstructionMetadatatrait implementation, forwarding calls to the appropriate variant.
994-1045: LGTM!The
InstructionMetadatatrait is well-designed with clear documentation and code examples. Thedisplayhelper method usingFmtFnis a clean pattern for lazy formatting.
1102-1114: LGTM!The
encode_load_attr_arganddecode_load_attr_argfunctions correctly implement symmetric bit-packing: bit 0 stores the method flag, remaining bits store the name index.
978-991: No action needed. The project's MSRV is 1.91.0 (found inCargo.toml), which is well beyond Rust 1.57.0 when const panic support was stabilized. Theconst fnwithexpect()pattern used inexpect_realandexpect_pseudois fully compatible with the project's minimum supported Rust version.Likely an incorrect or invalid review comment.
crates/codegen/src/ir.rs (5)
4-14: LGTM!The imports correctly include the new types:
Instruction,InstructionMetadata,PseudoInstruction,RealInstruction, and the encoding helpersencode_load_attr_arg,encode_load_super_attr_arg.
307-323: LGTM!The JUMP pseudo-to-real conversion correctly determines jump direction based on offset comparison: forward jumps when
target_offset > current_offset, backward otherwise.
330-334: LGTM!The extended argument emission correctly uses
RealInstruction::ExtendedArgfor multi-byte oparg encoding, and the finalCodeUnitis properly constructed with the real instruction and low byte.
204-224: LGTM!The
RealInstruction::LoadAttrandRealInstruction::LoadSuperAttrcases correctly encode their opargs:
LoadAttr: method flag = false (plain attribute load)LoadSuperAttr: method = false, class = true (2-argsuper(cls, self)call)This is consistent with the pseudo instruction conversions below where method flags are set appropriately.
228-256: LGTM!The pseudo instruction conversions correctly map flag combinations:
LoadAttrMethod→LoadAttrwithis_method=truePopBlock→Nop(effectively removed)LoadSuperMethod→LoadSuperAttrwith(method=true, class=true)LoadZeroSuperAttr→LoadSuperAttrwith(method=false, class=false)LoadZeroSuperMethod→LoadSuperAttrwith(method=true, class=false)This correctly implements the documented semantics where "Zero" variants represent 0-arg
super()calls.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| PseudoInstruction::JumpNoInterrupt { .. } | ||
| | PseudoInstruction::Reserved258 | ||
| | PseudoInstruction::SetupCleanup | ||
| | PseudoInstruction::SetupFinally | ||
| | PseudoInstruction::SetupWith | ||
| | PseudoInstruction::StoreFastMaybeNull => { | ||
| unimplemented!("Got a placeholder pseudo instruction ({instr:?})") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JumpNoInterrupt and other placeholders will panic at runtime.
If any of these placeholder pseudo instructions are emitted during compilation, the code will panic with unimplemented!(). Notably, JumpNoInterrupt is defined with a target field but is not handled for conversion.
Consider either:
- Implementing conversion for
JumpNoInterrupt(similar toJumpat lines 307-323) - Ensuring these instructions are never emitted by the compiler
🐛 Proposed fix for JumpNoInterrupt
Add handling in the match at lines 257-267 and in the JUMP conversion block:
PseudoInstruction::Jump { .. } => {
// PseudoInstruction::Jump instructions are handled later
}
- PseudoInstruction::JumpNoInterrupt { .. }
+ PseudoInstruction::JumpNoInterrupt { .. } => {
+ // PseudoInstruction::JumpNoInterrupt instructions are handled later
+ }
| PseudoInstruction::Reserved258And update the conversion block (lines 307-323) to handle JumpNoInterrupt:
let op = match info.instr {
Instruction::Pseudo(PseudoInstruction::Jump { .. })
+ | Instruction::Pseudo(PseudoInstruction::JumpNoInterrupt { .. })
if target != BlockIdx::NULL =>
{
let target_offset = block_to_offset[target.idx()].0;
if target_offset > current_offset {
- RealInstruction::JumpForward {
+ // For JumpNoInterrupt, use JumpBackwardNoInterrupt
+ if matches!(info.instr, Instruction::Pseudo(PseudoInstruction::JumpNoInterrupt { .. })) {
+ // JumpNoInterrupt should only jump backward
+ RealInstruction::JumpBackwardNoInterrupt { target: Arg::marker() }
+ } else {
+ RealInstruction::JumpForward { target: Arg::marker() }
+ }🤖 Prompt for AI Agents
In `@crates/codegen/src/ir.rs` around lines 260 - 267, The match arm that
currently groups PseudoInstruction::JumpNoInterrupt with placeholders and calls
unimplemented! must be fixed: add an explicit PseudoInstruction::JumpNoInterrupt
{ target } arm and implement its conversion similar to PseudoInstruction::Jump
(the conversion block around the existing Jump handling at lines ~307-323),
using the target field to produce the equivalent bytecode/IR for a
non-interrupting jump; update the Jump conversion logic to handle both
PseudoInstruction::Jump and PseudoInstruction::JumpNoInterrupt (e.g., by
branching on which enum variant and mapping to the same jump emission code path
but preserving the non-interrupt semantics), or alternatively ensure elsewhere
in the emitter that JumpNoInterrupt is never produced — but prefer adding the
explicit conversion for JumpNoInterrupt in the match and conversion block so it
no longer panics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Line 781: The LoadClosure display currently emits only the opcode name
(Self::LoadClosure(_) => w!(LOAD_CLOSURE)); update its formatter to print the
closure index and resolved name like the other indexed loads (e.g.,
LoadDeref/LoadFast). In the match arm for LoadClosure, extract the contained
index, call the same name-lookup helper used by other cases (the code that
formats names for e.g. LoadDeref/LoadFast), and pass both the opcode and the
resolved name/index into the w! formatter so the printed output shows which
closure index/name is being loaded.
♻️ Duplicate comments (2)
crates/compiler-core/src/bytecode/instruction.rs (2)
847-857:JumpNoInterruptstill missing fromlabel_argandunconditional_branch.
PseudoInstruction::JumpNoInterrupthas atarget: Arg<Label>field, but it's not handled inlabel_arg(caught by_ => None) and not matched inunconditional_branch. This could cause label resolution or control-flow analysis to miss this instruction's jump target.🐛 Proposed fix
fn label_arg(&self) -> Option<Arg<Label>> { match self { - Self::Jump { target: l } => Some(*l), + Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l), _ => None, } } fn unconditional_branch(&self) -> bool { - matches!(self, Self::Jump { .. },) + matches!(self, Self::Jump { .. } | Self::JumpNoInterrupt { .. }) }
876-886:fmt_disforPseudoInstructionpanics unconditionally.This
unimplemented!()will panic ifInstruction::Pseudois ever formatted via theInstructionMetadatatrait (e.g., during debug/disassembly output). At minimum, usetodo!()for a clearer message, or provide a basic implementation.Minimal implementation suggestion
fn fmt_dis( &self, - _arg: OpArg, - _f: &mut fmt::Formatter<'_>, + arg: OpArg, + f: &mut fmt::Formatter<'_>, _ctx: &impl InstrDisplayContext, _expand_code_objects: bool, - _pad: usize, + pad: usize, _level: usize, ) -> fmt::Result { - unimplemented!() + match self { + Self::Jump { target } => write!(f, "{:pad$}({})", "JUMP", target.get(arg)), + Self::JumpNoInterrupt { target } => write!(f, "{:pad$}({})", "JUMP_NO_INTERRUPT", target.get(arg)), + Self::LoadAttrMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ATTR_METHOD", idx.get(arg)), + Self::LoadSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_SUPER_METHOD", idx.get(arg)), + Self::LoadZeroSuperAttr { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_ATTR", idx.get(arg)), + Self::LoadZeroSuperMethod { idx } => write!(f, "{:pad$}({})", "LOAD_ZERO_SUPER_METHOD", idx.get(arg)), + Self::PopBlock => write!(f, "POP_BLOCK"), + _ => write!(f, "PSEUDO({:?})", self), + } }
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
crates/compiler-core/src/bytecode/instruction.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/compiler-core/src/bytecode/instruction.rs
🧬 Code graph analysis (1)
crates/compiler-core/src/bytecode/instruction.rs (1)
crates/compiler-core/src/bytecode/oparg.rs (7)
from(25-27)from(68-70)fmt(31-33)fmt(140-150)fmt(256-258)fmt(466-496)fmt(579-581)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run tests under miri
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
🔇 Additional comments (7)
crates/compiler-core/src/bytecode/instruction.rs (7)
18-20: LGTM on the enum structure.The
RealInstructionenum is well-organized with explicit discriminants matching CPython 3.13 opcodes. The removal ofEq/PartialEqderives is acknowledged per the author's comment about binary size optimization.
304-306: Correctly handles LoadClosure's non-contiguous opcode.The explicit check for
load_closureat opcode 253 is necessary since it falls outside both the CPython range (0-118) and the RustPython custom ops range (119-135).
787-845: LGTM onPseudoInstructionstructure.The enum correctly uses
#[repr(u16)]with discriminants starting at 256, cleanly separating pseudo opcodes from real opcodes. TheTryFrom<u16>validation correctly handles the contiguous range.
915-924: LGTM onTryFrom<u16>forInstruction.The logic correctly routes values that fit in
u8(0-255) toRealInstructionand larger values (256+) toPseudoInstruction, maintaining clean separation between the two instruction spaces.
926-953: LGTM on theinst_either!macro andInstructionMetadataimpl.The macro cleanly dispatches trait method calls to the appropriate variant, avoiding code duplication. The implementation correctly forwards all
InstructionMetadatamethods.
993-1044: LGTM onInstructionMetadatatrait definition.The trait is well-documented with examples that correctly reference
RealInstruction. Thedisplay()helper method provides good ergonomics for formatting instructions.
1101-1128: LGTM on encode/decode helpers.The
encode_load_attr_arg/decode_load_attr_argandencode_load_super_attr_arg/decode_load_super_attr_argfunctions are symmetric and correctly implement the bit-packing schemes for these opcodes.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/ir.rs (1)
305-323: Guard againstPseudoInstruction::Jump { .. }withtarget == NULLpanicking.If
targetis unexpectedlyNULL, the match falls through toother.expect_real()and panics. Even adebug_assert!(target != BlockIdx::NULL)(or a structured internal error) would make this easier to diagnose.
🤖 Fix all issues with AI agents
In `@crates/compiler-core/src/bytecode/instruction.rs`:
- Around line 15-21: The doc comment for the Instruction type has a
grammar/style issue: change "A Single bytecode instruction that are executed by
the VM." to a grammatically correct, stylistically consistent sentence such as
"A single bytecode instruction that is executed by the VM." Update the comment
above the #[derive(Clone, Copy, Debug)] (the Instruction documentation block) to
use lowercase "single" and singular verb "is executed" while leaving the rest of
the doc (CPython reference lines) intact.
- Around line 282-284: The comment above the LoadClosure variant has a
typo/unfinished parenthesis; update the comment to a clean, concise note (e.g.,
"Pseudos — should be moved to PseudoInstruction enum.") and remove the stray
parenthesis so it reads properly, and keep the existing TODO about moving
LoadClosure(Arg<NameIdx>) = 253 into the PseudoInstruction enum.
- Around line 977-995: The panic messages in expect_real and expect_pseudo
reference the wrong enum name; update the expect calls in the const fns
expect_real and expect_pseudo to mention AnyInstruction::Real and
AnyInstruction::Pseudo respectively (e.g., replace "Expected Instruction::Real,
found Instruction::Pseudo" with "Expected AnyInstruction::Real, found
AnyInstruction::Pseudo" and similarly for the other) so crash reports correctly
identify the type variant.
- Around line 1106-1118: Add a bounds check to encode_load_attr_arg to make
overflow loud: assert that name_idx fits in 31 bits (e.g. name_idx < (1 << 31)
or name_idx <= (u32::MAX >> 1)) before doing (name_idx << 1) and keep the
is_method bool casting; use debug_assert! for non-production overhead.
Optionally add a debug_assert in decode_load_attr_arg to validate that
re-encoding the returned (name_idx, is_method) yields the original oparg to
catch corruption during development.
In `@crates/vm/src/frame.rs`:
- Around line 345-346: Update the misleading comment above the Err(exception)
match arm to refer to the executed VM "Instruction" (or "the executed
instruction") instead of "AnyInstruction"; locate the comment immediately
preceding the Err(exception) branch in frame.rs (the match arm handling
exceptions) and change the text to something like "Instruction raised an
exception" or "The executed instruction raised an exception" so it accurately
describes the error path.
♻️ Duplicate comments (5)
crates/jit/src/instructions.rs (1)
644-650: FixSWAPsemantics (off-by-one).
SWAP 1should swap TOS with itself (no-op), butlen - 1 - indexswaps with the second-from-top.Proposed fix
Instruction::Swap { index } => { let len = self.stack.len(); let i = len - 1; - let j = len - 1 - index.get(arg) as usize; + let n = index.get(arg) as usize; // 1-based + if n == 0 || n > len { + return Err(JitCompileError::BadBytecode); + } + let j = len - n; self.stack.swap(i, j); Ok(()) }crates/compiler-core/src/bytecode/instruction.rs (2)
785-786:LOAD_CLOSUREdisassembly omits its index/name.
This makes traces/disassembly much harder to interpret.Proposed change
- Self::LoadClosure(_) => w!(LOAD_CLOSURE), + Self::LoadClosure(idx) => w!(LOAD_CLOSURE, cell_name = idx),
852-892: Pseudo jump metadata + formatting are incomplete (will miss labels / can panic).
JumpNoInterruptshould participate inlabel_arg()andunconditional_branch()likeJump.fmt_discurrently panics viaunimplemented!()if pseudos ever get formatted.Suggested fix
impl InstructionMetadata for PseudoInstruction { fn label_arg(&self) -> Option<Arg<Label>> { match self { - Self::Jump { target: l } => Some(*l), + Self::Jump { target: l } | Self::JumpNoInterrupt { target: l } => Some(*l), _ => None, } } fn unconditional_branch(&self) -> bool { - matches!(self, Self::Jump { .. },) + matches!(self, Self::Jump { .. } | Self::JumpNoInterrupt { .. }) } fn fmt_dis(/* ... */) -> fmt::Result { - unimplemented!() + // At minimum, avoid panicking during debug/disassembly paths. + todo!("PseudoInstruction disassembly not implemented yet") } }crates/stdlib/src/opcode.rs (1)
82-99: Predicate classification gaps: missing real jump ops +LoadSuperAttrname usage.
has_jumpshould includeInstruction::JumpForward/JumpBackward/JumpBackwardNoInterrupt.has_nameshould includeInstruction::LoadSuperAttr { .. }.Proposed fix
pub fn has_name(opcode: i32) -> bool { matches!( Self::try_from(opcode).map(|op| op.inner()), Ok(AnyInstruction::Real( Instruction::DeleteAttr { .. } | Instruction::DeleteGlobal(_) | Instruction::DeleteName(_) | Instruction::ImportFrom { .. } | Instruction::ImportName { .. } | Instruction::LoadAttr { .. } | Instruction::LoadGlobal(_) | Instruction::LoadName(_) + | Instruction::LoadSuperAttr { .. } | Instruction::StoreAttr { .. } | Instruction::StoreGlobal(_) | Instruction::StoreName(_) ) | AnyInstruction::Pseudo(PseudoInstruction::LoadAttrMethod { .. })) ) } pub fn has_jump(opcode: i32) -> bool { matches!( Self::try_from(opcode).map(|op| op.inner()), Ok(AnyInstruction::Real( Instruction::Break { .. } | Instruction::Continue { .. } | Instruction::ForIter { .. } + | Instruction::JumpForward { .. } + | Instruction::JumpBackward { .. } + | Instruction::JumpBackwardNoInterrupt { .. } | Instruction::JumpIfFalseOrPop { .. } | Instruction::JumpIfNotExcMatch(_) | Instruction::JumpIfTrueOrPop { .. } | Instruction::PopJumpIfFalse { .. } | Instruction::PopJumpIfTrue { .. } | Instruction::Send { .. } ) | AnyInstruction::Pseudo(PseudoInstruction::Jump { .. })) ) }Also applies to: 103-118
crates/codegen/src/ir.rs (1)
199-268:JumpNoInterruptis still a runtime panic + has no lowering path.Right now:
- It’s grouped into the
unimplemented!()placeholder bucket.- The later “convert JUMP pseudo to real” block only handles
PseudoInstruction::Jump.If codegen ever emits
JumpNoInterrupt, compilation will panic infinalize_code.Direction for a fix
- Treat
JumpNoInterruptlikeJumpin the early pseudo pass (do notunimplemented!()it).- Lower it alongside
Jump, likely toInstruction::JumpBackwardNoInterruptwhentarget_offset <= current_offset(and decide what to do for forward jumps: either forbid via assertion or map toJumpForward).- PseudoInstruction::JumpNoInterrupt { .. } - | PseudoInstruction::Reserved258 + PseudoInstruction::JumpNoInterrupt { .. } => { + // handled later like Jump (lowering step) + } + PseudoInstruction::Reserved258 | PseudoInstruction::SetupCleanup | PseudoInstruction::SetupFinally | PseudoInstruction::SetupWith | PseudoInstruction::StoreFastMaybeNull => { unimplemented!("Got a placeholder pseudo instruction ({instr:?})") }And in the lowering match:
- AnyInstruction::Pseudo(PseudoInstruction::Jump { .. }) + AnyInstruction::Pseudo(PseudoInstruction::Jump { .. } | PseudoInstruction::JumpNoInterrupt { .. }) if target != BlockIdx::NULL => { let target_offset = block_to_offset[target.idx()].0; if target_offset > current_offset { - Instruction::JumpForward { target: Arg::marker() } + Instruction::JumpForward { target: Arg::marker() } // or forbid for JumpNoInterrupt } else { - Instruction::JumpBackward { target: Arg::marker() } + if matches!(info.instr, AnyInstruction::Pseudo(PseudoInstruction::JumpNoInterrupt { .. })) { + Instruction::JumpBackwardNoInterrupt { target: Arg::marker() } + } else { + Instruction::JumpBackward { target: Arg::marker() } + } } }Also applies to: 305-323
🧹 Nitpick comments (6)
crates/compiler-core/src/bytecode/instruction.rs (2)
288-294: Unsafetransmuteis OK here, but keep the safety story airtight.Given
#[repr(u8)]+size_of::<Instruction>() == 1+ explicit value validation inTryFrom<u8>, this looks fine. Consider adding a short note that all payloads are ZST (Arg<T>) so there’s no invalid-data concern beyond discriminant validity.Also applies to: 349-350
296-354:TryFrom<u8>whitelist looks correct; consider reducing “special cases” drift.The explicit
custom_opslist avoids gaps (good). The lingeringLoadClosureexception is a maintenance footgun—if it must stay for now, I’d at least add a comment pointing to the tracking issue/PR that will move it to pseudos.crates/codegen/src/compile.rs (4)
277-297:emit!macro refactor is clean; consider adding one “label arg” example in comments/tests.The new macro shapes (struct/tuple/no-arg) map well onto
emit_arg/emit_no_arg. Since label emission now depends on theEmitArg<bytecode::Label> for BlockIdximpl, a tiny comment or targeted test snapshot for a label-bearing instruction would make regressions easier to spot.
6103-6166: Comprehension init_collection AnyInstruction plumbing works, but call sites got noisier (marker()+.into()).Right now every list/set/dict comp has to build an
AnyInstructionmanually withOpArgMarker::marker()and.into(). If feasible, consider makingcompile_comprehensiongeneric overI: Into<AnyInstruction>forinit_collectionto keep call sites closer to the pre-refactor style.Also applies to: 6522-6714
6744-6932: Inlined comprehension cleanup path: ensure it’s correct wheninit_collectionis None (generator case).The exception cleanup block currently assumes a “collection” exists when it does
Swap { index: 2 }+PopTop(comment says “Pop incomplete collection”). Ifinit_collectionis None, the stack shape differs and this cleanup may pop the iterator or otherwise mis-restore locals. Even if PEP 709 is currently disabled, tightening this now will prevent future footguns.
6955-6983: Generic emission plumbing (Into<AnyInstruction>) is a solid direction; watch for monomorphization bloat in hot paths.
_emit/emit_argnow monomorphize per instruction constructor closure type. It’s probably fine for codegen, but if compile times regress, consider moving some generic helpers behind a smaller set of concrete wrappers (or acceptAnyInstructionat specific choke points).Also applies to: 7454-7478
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
crates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/compiler-core/src/bytecode.rscrates/compiler-core/src/bytecode/instruction.rscrates/compiler-core/src/bytecode/oparg.rscrates/jit/src/instructions.rscrates/stdlib/src/opcode.rscrates/vm/src/frame.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/compiler-core/src/bytecode.rs
- crates/compiler-core/src/bytecode/oparg.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style usingcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/frame.rscrates/jit/src/instructions.rscrates/stdlib/src/opcode.rscrates/codegen/src/ir.rscrates/codegen/src/compile.rscrates/compiler-core/src/bytecode/instruction.rs
🧠 Learnings (1)
📚 Learning: 2026-01-14T14:52:10.778Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-01-14T14:52:10.778Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/codegen/src/ir.rs
🧬 Code graph analysis (4)
crates/vm/src/frame.rs (2)
crates/compiler-core/src/bytecode/instruction.rs (1)
get(1074-1076)crates/compiler-core/src/bytecode/oparg.rs (1)
get(81-87)
crates/stdlib/src/opcode.rs (1)
crates/compiler-core/src/bytecode/instruction.rs (4)
try_from(300-353)try_from(838-849)try_from(915-917)try_from(923-928)
crates/codegen/src/ir.rs (1)
crates/compiler-core/src/bytecode/instruction.rs (3)
encode_load_attr_arg(1108-1110)encode_load_super_attr_arg(1122-1124)marker(1056-1058)
crates/codegen/src/compile.rs (3)
crates/codegen/src/ir.rs (2)
idx(49-51)new(43-45)crates/compiler-core/src/bytecode/instruction.rs (2)
marker(1056-1058)new(1061-1063)crates/compiler-core/src/bytecode/oparg.rs (2)
null(19-21)null(42-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Run snippets and cpython tests (windows-2025)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run tests under miri
🔇 Additional comments (13)
crates/compiler-core/src/bytecode/instruction.rs (3)
356-390:label_arg/unconditional_branchonInstructionlook consistent.Nice to see
JumpBackwardNoInterrupttreated as both a label-bearing op and unconditional branch.
560-563: Stack effect forLoadSuperAttris now correctly derived from packed oparg.Decoding the flags to compute
-3 + (1|2)matches the runtime behavior inframe.rs.
706-738:fmt_disforLoadAttr/LoadSuperAttris much more informative.The
method=/class=flags should make debugging attribute/super lowering far easier.crates/vm/src/frame.rs (2)
10-11: Import change tobytecode::{self, Instruction}looks consistent.This matches the new “real instruction executed by VM” split.
390-401: Reraise detection is clearer and covers both RERAISE forms.Treating
RaiseVarargs(BareRaise|ReraiseFromStack)andInstruction::Reraiseas “no new traceback entry” matches the intent expressed in the comment.crates/jit/src/instructions.rs (1)
212-221: Jump handling simplification matches the real/pseudo split.Restricting control-flow to real jump ops in the JIT aligns with the new pipeline.
Also applies to: 563-569
crates/stdlib/src/opcode.rs (1)
12-34: Opcode wrapper aroundAnyInstructionis a clean API boundary.
TryFrom<i32>+try_from_pyint(u16)matches the new “real+pseudo opcode space”.crates/codegen/src/ir.rs (1)
88-96: IR carryingAnyInstructionis the right shape for pseudo-op staging.This makes the “lower to real before emission” boundary explicit.
crates/codegen/src/compile.rs (5)
37-45: HashSet-based duplicate param detection looks good; confirmstddependency is acceptable here.Using
HashSetavoids O(n²) duplicate checks and keeps allocations low (borrowing&str). Ifcrates/codegenever targetsno_std + alloc, this newstd::collections::HashSetwould be a blocker—worth double-checking the crate’s intended platform support.Also applies to: 189-201
1188-1219:PseudoInstruction::PopBlockinunwind_fblockneeds a quick semantic sanity-check.This is a subtle correctness area: unwinding
withduringreturn/break/continuemust ensure the exception-table state and stack effects match what the VM expects before calling__exit__. Please confirm the lowering ofPseudoInstruction::PopBlock(and its interaction with exception tables) matches the previous behavior.
1913-1945: BroadJump→PseudoInstruction::Jumpmigration is consistent; please ensure all back-edges and “skip blocks” are covered.The conversions look uniform across if/try/loops/pattern-match/comprehensions. Main risk is a missed “jump-like” emission (or one still emitting a real
Instruction::Jump) causing inconsistent label resolution/arg encoding in the later lowering stage.Also applies to: 2520-2595, 2795-2802, 2927-3146, 4021-4032, 4163-4168, 4321-4325, 4380-4397, 5049-5051, 5284-5287, 5397-5406, 5839-5841, 6867-6884
5904-5920: super()/method-call pseudo ops: verify stack protocol matches CALL expectations.The switch to
PseudoInstruction::LoadAttrMethod,LoadSuperMethod, and zero-arg super variants is plausible, but it’s easy to accidentally break the[callable, self_or_null, args…]contract (especially around “self vs NULL” placement). Would be good to validate via a focused disassembly snapshot test for:
super().m()vssuper(C, self).m()obj.m()vsobj.attr(non-call)Also applies to: 6358-6381
7002-7010:emit_return_valuerewrite: may accidentally move out ofinst.instrunlessAnyInstruction: Copy.If
AnyInstructionisn’tCopy,&& let AnyInstruction::Real(..) = inst.instrtypically won’t compile (move out of a borrowed field). Safer pattern is to match by reference and copy/cloneidxbefore assigning.Proposed fix
- if let Some(inst) = self.current_block().instructions.last_mut() - && let AnyInstruction::Real(Instruction::LoadConst { idx }) = inst.instr - { - inst.instr = Instruction::ReturnConst { idx }.into(); - return; - } + if let Some(inst) = self.current_block().instructions.last_mut() + && let AnyInstruction::Real(Instruction::LoadConst { idx }) = &inst.instr + { + let idx = *idx; + inst.instr = Instruction::ReturnConst { idx }.into(); + return; + }
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| /// A Single bytecode instruction that are executed by the VM. | ||
| /// | ||
| /// Currently aligned with CPython 3.13. | ||
| /// | ||
| /// ## See also | ||
| /// - [CPython opcode IDs](https://github.com/python/cpython/blob/627894459a84be3488a1789919679c997056a03c/Include/opcode_ids.h) | ||
| #[derive(Clone, Copy, Debug)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix docs grammar (“instruction … is executed”).
“A Single bytecode instruction that are executed…” should be singular (“is executed”). Also consider “single” vs “Single” for style consistency.
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 15 - 21, The
doc comment for the Instruction type has a grammar/style issue: change "A Single
bytecode instruction that are executed by the VM." to a grammatically correct,
stylistically consistent sentence such as "A single bytecode instruction that is
executed by the VM." Update the comment above the #[derive(Clone, Copy, Debug)]
(the Instruction documentation block) to use lowercase "single" and singular
verb "is executed" while leaving the rest of the doc (CPython reference lines)
intact.
| // Pseudos (needs to be moved to `PseudoInstruction` enum. | ||
| LoadClosure(Arg<NameIdx>) = 253, // TODO: Move to pseudos | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment typo / unfinished paren in “Pseudos … enum.”
Line 282 reads like it’s missing a closing ) and could be tightened since the enum name is already Instruction.
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 282 - 284, The
comment above the LoadClosure variant has a typo/unfinished parenthesis; update
the comment to a clean, concise note (e.g., "Pseudos — should be moved to
PseudoInstruction enum.") and remove the stray parenthesis so it reads properly,
and keep the existing TODO about moving LoadClosure(Arg<NameIdx>) = 253 into the
PseudoInstruction enum.
| /// Same as [`Self::real`] but panics if wasn't called on [`Self::Real`]. | ||
| /// | ||
| /// # Panics | ||
| /// | ||
| /// If was called on something else other than [`Self::Real`]. | ||
| pub const fn expect_real(self) -> Instruction { | ||
| self.real() | ||
| .expect("Expected Instruction::Real, found Instruction::Pseudo") | ||
| } | ||
|
|
||
| /// Same as [`Self::pseudo`] but panics if wasn't called on [`Self::Pseudo`]. | ||
| /// | ||
| /// # Panics | ||
| /// | ||
| /// If was called on something else other than [`Self::Pseudo`]. | ||
| pub const fn expect_pseudo(self) -> PseudoInstruction { | ||
| self.pseudo() | ||
| .expect("Expected Instruction::Pseudo, found Instruction::Real") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Panic messages refer to the wrong type (Instruction::Real/Pseudo).
These are AnyInstruction::Real/Pseudo; the current messages will confuse crash reports.
Suggested fix
pub const fn expect_real(self) -> Instruction {
self.real()
- .expect("Expected Instruction::Real, found Instruction::Pseudo")
+ .expect("Expected AnyInstruction::Real, found AnyInstruction::Pseudo")
}
pub const fn expect_pseudo(self) -> PseudoInstruction {
self.pseudo()
- .expect("Expected Instruction::Pseudo, found Instruction::Real")
+ .expect("Expected AnyInstruction::Pseudo, found AnyInstruction::Real")
}🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 977 - 995, The
panic messages in expect_real and expect_pseudo reference the wrong enum name;
update the expect calls in the const fns expect_real and expect_pseudo to
mention AnyInstruction::Real and AnyInstruction::Pseudo respectively (e.g.,
replace "Expected Instruction::Real, found Instruction::Pseudo" with "Expected
AnyInstruction::Real, found AnyInstruction::Pseudo" and similarly for the other)
so crash reports correctly identify the type variant.
| /// Encode LOAD_ATTR oparg: bit 0 = method flag, bits 1+ = name index. | ||
| #[inline] | ||
| pub const fn encode_load_attr_arg(name_idx: u32, is_method: bool) -> u32 { | ||
| (name_idx << 1) | (is_method as u32) | ||
| } | ||
|
|
||
| /// Decode LOAD_ATTR oparg: returns (name_idx, is_method). | ||
| #[inline] | ||
| pub const fn decode_load_attr_arg(oparg: u32) -> (u32, bool) { | ||
| let is_method = (oparg & 1) == 1; | ||
| let name_idx = oparg >> 1; | ||
| (name_idx, is_method) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add bounds assertions to prevent silent bit-pack truncation.
encode_load_attr_arg(name_idx << 1) will silently drop high bits if name_idx exceeds 31 bits. A debug_assert! (or fallible encode) would make failures loud.
Example
pub const fn encode_load_attr_arg(name_idx: u32, is_method: bool) -> u32 {
+ debug_assert!(name_idx < (1 << 31));
(name_idx << 1) | (is_method as u32)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// Encode LOAD_ATTR oparg: bit 0 = method flag, bits 1+ = name index. | |
| #[inline] | |
| pub const fn encode_load_attr_arg(name_idx: u32, is_method: bool) -> u32 { | |
| (name_idx << 1) | (is_method as u32) | |
| } | |
| /// Decode LOAD_ATTR oparg: returns (name_idx, is_method). | |
| #[inline] | |
| pub const fn decode_load_attr_arg(oparg: u32) -> (u32, bool) { | |
| let is_method = (oparg & 1) == 1; | |
| let name_idx = oparg >> 1; | |
| (name_idx, is_method) | |
| } | |
| /// Encode LOAD_ATTR oparg: bit 0 = method flag, bits 1+ = name index. | |
| #[inline] | |
| pub const fn encode_load_attr_arg(name_idx: u32, is_method: bool) -> u32 { | |
| debug_assert!(name_idx < (1 << 31)); | |
| (name_idx << 1) | (is_method as u32) | |
| } | |
| /// Decode LOAD_ATTR oparg: returns (name_idx, is_method). | |
| #[inline] | |
| pub const fn decode_load_attr_arg(oparg: u32) -> (u32, bool) { | |
| let is_method = (oparg & 1) == 1; | |
| let name_idx = oparg >> 1; | |
| (name_idx, is_method) | |
| } |
🤖 Prompt for AI Agents
In `@crates/compiler-core/src/bytecode/instruction.rs` around lines 1106 - 1118,
Add a bounds check to encode_load_attr_arg to make overflow loud: assert that
name_idx fits in 31 bits (e.g. name_idx < (1 << 31) or name_idx <= (u32::MAX >>
1)) before doing (name_idx << 1) and keep the is_method bool casting; use
debug_assert! for non-production overhead. Optionally add a debug_assert in
decode_load_attr_arg to validate that re-encoding the returned (name_idx,
is_method) yields the original oparg to catch corruption during development.
youknowone
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.