Add vectorcall (PEP 590) dispatch for function calls #7329
Add VectorCallFunc slot to PyTypeSlots and vectorcall dispatch path in the interpreter loop for Call and CallKw instructions. Implement vectorcall for PyFunction (with fast path for simple positional-only calls that fills fastlocals directly), PyBoundMethod (avoids prepend_arg O(n) shift), and PyNativeFunction. Add FuncArgs::from_vectorcall helper for fallback conversion. Vectorcall slot is inherited with call slot and cleared when __call__ is overridden in Python subclasses.
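The slot behaviour described above can be illustrated with a minimal sketch. Everything in it (`Obj`, `TypeSlots`, `dispatch`, `override_call`, and this simplified `from_vectorcall`) is a hypothetical stand-in written for illustration, not RustPython's actual types or signatures. It assumes the PEP 590 flat layout, where `args[..nargs]` are positional and the remaining entries are keyword values named in order by `kwnames`.

```rust
// Hypothetical stand-ins for illustration; not RustPython's real types.
type Obj = String;

struct FuncArgs {
    args: Vec<Obj>,
    kwargs: Vec<(String, Obj)>,
}

type CallFn = fn(FuncArgs) -> Obj;
// Flat layout: args[..nargs] positional, args[nargs..] keyword values.
type VectorCallFn = fn(Vec<Obj>, usize, Option<&[&str]>) -> Obj;

struct TypeSlots {
    call: Option<CallFn>,
    vectorcall: Option<VectorCallFn>,
}

impl TypeSlots {
    /// Models a subclass overriding `__call__`: the inherited fast path
    /// would bypass the override, so the vectorcall slot is cleared.
    fn override_call(&mut self, new_call: CallFn) {
        self.call = Some(new_call);
        self.vectorcall = None;
    }
}

/// Fallback conversion from the flat layout to positional + keyword args.
/// Assumes a well-formed layout (`nargs <= args.len()`); panics otherwise.
fn from_vectorcall(mut args: Vec<Obj>, nargs: usize, kwnames: Option<&[&str]>) -> FuncArgs {
    let kw_values = args.split_off(nargs);
    let kwargs = kwnames
        .unwrap_or(&[])
        .iter()
        .map(|name| name.to_string())
        .zip(kw_values)
        .collect();
    FuncArgs { args, kwargs }
}

/// Prefer the vectorcall slot; otherwise convert and use the call slot.
fn dispatch(
    slots: &TypeSlots,
    args: Vec<Obj>,
    nargs: usize,
    kwnames: Option<&[&str]>,
) -> Option<Obj> {
    if let Some(vectorcall) = slots.vectorcall {
        return Some(vectorcall(args, nargs, kwnames));
    }
    let call = slots.call?;
    Some(call(from_vectorcall(args, nargs, kwnames)))
}

// Two toy callables that report which path was taken.
fn fast_path(_args: Vec<Obj>, nargs: usize, _kwnames: Option<&[&str]>) -> Obj {
    format!("vectorcall:{}", nargs)
}

fn slow_path(fa: FuncArgs) -> Obj {
    format!("call:{}+{}", fa.args.len(), fa.kwargs.len())
}
```

With both slots populated, `dispatch` takes the fast path. After `override_call`, the same call goes through `from_vectorcall` and the call slot.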
📝 Walkthrough
This PR implements PEP 590 vectorcall support across the VM to enable fast-path calling for builtin functions, Python functions, and bound methods. It adds vectorcall slots to the type system, modifies frame dispatch to use vectorcall when available with fallback to existing paths, and enables owned-argument optimization to reduce allocations.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Code has been automatically formatted
The code in this PR has been formatted using:
git pull origin skip-class
…ecialized paths
- invoke_exact_args takes Vec by value and uses drain() to move args into fastlocals instead of cloning (eliminates refcount overhead)
- CallPyGeneral and CallBoundMethodGeneral now call vectorcall_function directly instead of going through FuncArgs + prepend_arg + invoke
- CallKwPy and CallKwBoundMethod use vectorcall_function with kwnames
- vectorcall_bound_method uses insert(0) on existing Vec instead of allocating a second Vec
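The move-versus-clone point in this commit can be sketched as follows. The names (`Obj`, `fill_fastlocals_by_move`, `fill_fastlocals_by_clone`, `prepend_self`) are hypothetical, and `Rc<String>` is only an assumed stand-in for a reference-counted object handle. The actual `invoke_exact_args` and `vectorcall_bound_method` are not shown in this page.

```rust
use std::rc::Rc;

// Assumed stand-in for a reference-counted object handle.
type Obj = Rc<String>;

/// Taking `args` by value lets `drain(..)` move each handle into its local
/// slot, so no reference count is incremented or decremented.
fn fill_fastlocals_by_move(fastlocals: &mut [Option<Obj>], mut args: Vec<Obj>) {
    for (slot, arg) in fastlocals.iter_mut().zip(args.drain(..)) {
        *slot = Some(arg);
    }
}

/// The borrowing variant has to clone every handle, which bumps each
/// reference count (and drops it again once the caller's Vec goes away).
fn fill_fastlocals_by_clone(fastlocals: &mut [Option<Obj>], args: &[Obj]) {
    for (slot, arg) in fastlocals.iter_mut().zip(args) {
        *slot = Some(Rc::clone(arg));
    }
}

/// Reuses the caller's Vec for the receiver instead of building a second
/// one. `insert(0, ..)` still shifts the existing elements and may
/// reallocate if the Vec has no spare capacity.
fn prepend_self(mut args: Vec<Obj>, receiver: Obj) -> Vec<Obj> {
    args.insert(0, receiver);
    args
}
```

In this sketch, `Rc::strong_count` stays unchanged across `fill_fastlocals_by_move` and rises by one per argument across `fill_fastlocals_by_clone`, which is the overhead the commit message refers to.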
When needs_self was true and kwargs were present, pos_args only contained positional args (self + original positionals) but from_vectorcall expected kwarg values to follow in the slice. Build the full args array (self + all original args including kwarg values) before passing to from_vectorcall.
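A minimal sketch of the failure mode and the fix described in this commit, under assumed names: `split`, `with_self_buggy`, and `with_self_fixed` are hypothetical helpers, and `Obj` is a toy type. The sketch uses checked slicing so the too-short array shows up as `None` rather than a panic.

```rust
// Toy object type for illustration only.
type Obj = &'static str;

/// Split a flat vectorcall array into positional and keyword values.
/// Returns None when the array is too short for the claimed layout.
fn split(args: &[Obj], nargs: usize, kw_count: usize) -> Option<(&[Obj], &[Obj])> {
    let pos = args.get(..nargs)?;
    let kw = args.get(nargs..nargs + kw_count)?;
    Some((pos, kw))
}

/// Models the bug: only self + positionals are copied, so the keyword
/// values that the consumer expects after the positionals are missing.
fn with_self_buggy(zelf: Obj, args: &[Obj], nargs: usize) -> (Vec<Obj>, usize) {
    let mut full = Vec::with_capacity(nargs + 1);
    full.push(zelf);
    full.extend_from_slice(&args[..nargs]);
    (full, nargs + 1)
}

/// Models the fix: self is followed by all original entries, keyword
/// values included, and the positional count grows by one.
fn with_self_fixed(zelf: Obj, args: &[Obj], nargs: usize) -> (Vec<Obj>, usize) {
    let mut full = Vec::with_capacity(args.len() + 1);
    full.push(zelf);
    full.extend_from_slice(args);
    (full, nargs + 1)
}
```

For two positionals and one keyword value, the buggy array has three entries while the consumer needs four. The fixed array carries all four.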
Super excited about this one ngl
Actionable comments posted: 2
🧹 Nitpick comments (1)
crates/vm/src/frame.rs (1)
5695-5816: Consider extracting shared vectorcall argument marshalling into one helper.
`execute_call_vectorcall` and `execute_call_kw_vectorcall` duplicate stack-index math and vector construction. A small internal helper for shared stack slicing/consumption would reduce drift risk and simplify future fixes. As per coding guidelines: "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".
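One possible shape for such a helper, as a sketch only. The stack layout `[.., callable, self_or_null, arg_0 .., kw_0 ..]` is an assumption inferred from the index names in this review, `Obj` is a toy type, and this `collect_vectorcall_args` is hypothetical rather than the code in `frame.rs`. It also assumes the stack holds at least `nargs + kw_count + 2` entries.

```rust
// Toy object type for illustration only.
type Obj = &'static str;

/// Assumed layout at the top of the stack:
/// [.., callable, self_or_null, arg_0 .. arg_{nargs-1}, kw_0 .. kw_{kw_count-1}]
///
/// Returns (callable, flat args, effective positional count) and leaves the
/// stack truncated at the callable's index. A plain call passes kw_count = 0;
/// a keyword call passes the number of keyword names.
fn collect_vectorcall_args(
    stack: &mut Vec<Option<Obj>>,
    nargs: usize,
    kw_count: usize,
) -> (Obj, Vec<Obj>, usize) {
    // The index math lives in exactly one place.
    let args_start = stack.len() - nargs - kw_count;
    let self_or_null_idx = args_start - 1;
    let callable_idx = self_or_null_idx - 1;

    let mut args = Vec::with_capacity(nargs + kw_count + 1);
    // A present receiver becomes the first positional argument.
    let effective_nargs = match stack[self_or_null_idx].take() {
        Some(zelf) => {
            args.push(zelf);
            nargs + 1
        }
        None => nargs,
    };
    // Move positional and keyword values off the stack in order.
    args.extend(
        stack
            .drain(args_start..)
            .map(|value| value.expect("argument slot must be filled")),
    );
    let callable = stack[callable_idx]
        .take()
        .expect("callable slot must be filled");
    stack.truncate(callable_idx);
    (callable, args, effective_nargs)
}
```

Both instruction handlers would then differ only in the `kw_count` they pass and in whether they forward keyword names to the callee.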
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/builtins/function.rs`:
- Around line 1280-1289: The vectorcall fast path currently eagerly allocates a
new dict for NEWLOCALS (via vm.ctx.new_dict() and ArgMapping::from_dict_exact)
before creating the Frame; change this to use a None-local mapping so Scope::new
receives None when code.flags contains bytecode::CodeFlags::NEWLOCALS, matching
invoke_with_locals/invoke_exact_args behavior and allowing Frame initialization
to use FrameLocals::lazy() (defer allocation). Update the locals binding used by
Frame::new (the locals variable) to be an Option that is None for NEWLOCALS
instead of constructing vm.ctx.new_dict().
In `@crates/vm/src/function/argument.rs`:
- Around line 146-151: In from_vectorcall validate the vectorcall layout before
any unchecked slicing: ensure nargs <= args.len() and if kwnames.is_some() then
nargs + kwnames.len() <= args.len(), returning a Python TypeError/ValueError
(Err) instead of panicking; replace the .expect() downcast on kwnames entries
with safe handling (use .get() or iterate with .zip and match on PyString
downcast) and return an Err TypeError when a keyword name is not a string;
update construction of pos_args and kwargs to use checked slices (or .get()
results) so no unchecked indexing or .expect() can panic the VM.
---
Nitpick comments:
In `@crates/vm/src/frame.rs`:
- Around line 5695-5816: Duplicate stack-index math and args construction in
execute_call_vectorcall and execute_call_kw_vectorcall; extract a small helper
to centralize that logic. Create an internal method (e.g.,
collect_vectorcall_args) used by both execute_call_vectorcall and
execute_call_kw_vectorcall that: computes
callable_idx/self_or_null_idx/args_start from nargs and optional kw_count,
consumes the stack values (including self_or_null if present and kw values) into
a Vec<PyObjectRef>, returns (args_vec, effective_nargs, optional_kwnames) and
leaves the stack truncated at callable_idx; then call
callable_obj.vectorcall(args_vec, effective_nargs, kwnames_opt, vm) and
push_value(result). Reuse existing helpers like
collect_positional_args/pop_multiple/pop_value_opt where appropriate and update
both functions to call this new helper and remove the duplicated index math and
loop logic.
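The layout validation requested for `from_vectorcall` in the inline comments above could look roughly like this. It is a sketch under assumptions: `Obj`, `FuncArgs`, `CallError`, and `from_vectorcall_checked` are hypothetical, and `CallError` stands in for the Python `TypeError` the review asks for. The real function in `argument.rs` is not shown in this page.

```rust
// Hypothetical object model: keyword names should be strings, but the
// caller could hand over anything.
#[derive(Debug, Clone, PartialEq)]
enum Obj {
    Str(String),
    Int(i64),
}

#[derive(Debug, PartialEq)]
struct FuncArgs {
    args: Vec<Obj>,
    kwargs: Vec<(String, Obj)>,
}

// Stand-in for raising a Python-level exception.
#[derive(Debug, PartialEq)]
enum CallError {
    BadLayout,
    NonStringKeyword,
}

/// Validates the flat layout before slicing, so malformed input yields an
/// error value instead of a panic.
fn from_vectorcall_checked(
    args: &[Obj],
    nargs: usize,
    kwnames: Option<&[Obj]>,
) -> Result<FuncArgs, CallError> {
    let names = kwnames.unwrap_or(&[]);
    // nargs + names.len() must fit inside args; guard the addition too.
    let end = nargs.checked_add(names.len()).ok_or(CallError::BadLayout)?;
    let pos = args.get(..nargs).ok_or(CallError::BadLayout)?;
    let kw_values = args.get(nargs..end).ok_or(CallError::BadLayout)?;

    let mut kwargs = Vec::with_capacity(names.len());
    for (name, value) in names.iter().zip(kw_values) {
        match name {
            Obj::Str(s) => kwargs.push((s.clone(), value.clone())),
            // Replaces an `.expect()` on the downcast with an error.
            _ => return Err(CallError::NonStringKeyword),
        }
    }
    Ok(FuncArgs { args: pos.to_vec(), kwargs })
}
```

Using `slice::get` with ranges keeps every bound check explicit, which is one way to satisfy the "no unchecked indexing" request without separate length comparisons.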
ℹ️ Review info
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- crates/vm/src/builtins/builtin_func.rs
- crates/vm/src/builtins/function.rs
- crates/vm/src/frame.rs
- crates/vm/src/function/argument.rs
- crates/vm/src/protocol/callable.rs
- crates/vm/src/types/slot.rs
- crates/vm/src/types/slot_defs.rs
Merged commit be0c3ca into RustPython:main on Mar 3, 2026
* Add vectorcall (PEP 590) dispatch for function calls

  Add VectorCallFunc slot to PyTypeSlots and vectorcall dispatch path in the interpreter loop for Call and CallKw instructions. Implement vectorcall for PyFunction (with fast path for simple positional-only calls that fills fastlocals directly), PyBoundMethod (avoids prepend_arg O(n) shift), and PyNativeFunction. Add FuncArgs::from_vectorcall helper for fallback conversion. Vectorcall slot is inherited with call slot and cleared when __call__ is overridden in Python subclasses.

* Optimize vectorcall: move args instead of clone, use vectorcall in specialized paths

  - invoke_exact_args takes Vec by value and uses drain() to move args into fastlocals instead of cloning (eliminates refcount overhead)
  - CallPyGeneral and CallBoundMethodGeneral now call vectorcall_function directly instead of going through FuncArgs + prepend_arg + invoke
  - CallKwPy and CallKwBoundMethod use vectorcall_function with kwnames
  - vectorcall_bound_method uses insert(0) on existing Vec instead of allocating a second Vec

* Auto-format: cargo fmt --all

* Fix vectorcall_native_function kwarg slice out-of-bounds

  When needs_self was true and kwargs were present, pos_args only contained positional args (self + original positionals) but from_vectorcall expected kwarg values to follow in the slice. Build the full args array (self + all original args including kwarg values) before passing to from_vectorcall.

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>