Description
Bug report
Bug description:
Python's ceval.c contains the function
PyObject *
_Py_CallBuiltinClass_StackRefSteal(
    _PyStackRef callable,
    _PyStackRef *arguments,
    int total_args)
{
    PyObject *res;
    STACKREFS_TO_PYOBJECTS(arguments, total_args, args_o);
    if (CONVERSION_FAILED(args_o)) {
        res = NULL;
        goto cleanup;
    }
    PyTypeObject *tp = (PyTypeObject *)PyStackRef_AsPyObjectBorrow(callable);
    res = tp->tp_vectorcall((PyObject *)tp, args_o, total_args, NULL); /* <----- relevant line */
    ....
}

The call to tp_vectorcall does not declare PY_VECTORCALL_ARGUMENTS_OFFSET despite STACKREFS_TO_PYOBJECTS (via _PyObjectArray_FromStackRefArray) always allocating an args_o that has an extra scratch space at position -1.
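Strictly speaking, this isn't incorrect. But it can force the callee to make a redundant copy of the arguments, which is inefficient: without the flag, a callee that needs to prepend an argument (such as self) may not touch args[-1] and has to allocate a new array instead. Best practice is to allocate the extra entry and then communicate its presence to the vector-callee.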
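To make the cost concrete, here is a minimal sketch of the mechanism in plain C. It is a simplified model and not CPython code: `OFFSET_FLAG`, `obj_t`, `target`, `call_with_self` and `copies_for_call` are hypothetical names, with `OFFSET_FLAG` standing in for PY_VECTORCALL_ARGUMENTS_OFFSET and `obj_t` for `PyObject *`. The caller always reserves the scratch slot, as args_o does, and the only difference between the two cases is whether it says so.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified model, NOT CPython code. All names here are hypothetical.
 * OFFSET_FLAG plays the role of PY_VECTORCALL_ARGUMENTS_OFFSET: when set in
 * nargsf, the callee may temporarily overwrite args[-1]. */
#define OFFSET_FLAG ((size_t)1 << (8 * sizeof(size_t) - 1))

typedef const char *obj_t;            /* stand-in for PyObject * */

static int copies_made = 0;           /* counts fallback allocations */
static obj_t first_arg_seen = NULL;   /* what the final target received */

/* Final target: records its first argument and returns the argument count. */
static size_t target(obj_t const *args, size_t nargs)
{
    first_arg_seen = nargs ? args[0] : NULL;
    return nargs;
}

/* A callee that must prepend `self` before forwarding the call. */
static size_t call_with_self(obj_t self, obj_t *args, size_t nargsf)
{
    size_t nargs = nargsf & ~OFFSET_FLAG;
    if (nargsf & OFFSET_FLAG) {
        /* Fast path: borrow the scratch slot, then restore it. */
        obj_t saved = args[-1];
        args[-1] = self;
        size_t n = target(args - 1, nargs + 1);
        args[-1] = saved;
        return n;
    }
    /* Slow path: the slot may not exist, so build a fresh array. */
    obj_t *tmp = malloc((nargs + 1) * sizeof(obj_t));
    if (tmp == NULL) {
        return 0;
    }
    tmp[0] = self;
    memcpy(tmp + 1, args, nargs * sizeof(obj_t));
    copies_made++;
    size_t n = target(tmp, nargs + 1);
    free(tmp);
    return n;
}

/* Caller that always allocates the scratch slot, like args_o in the report.
 * Returns how many copies the callee had to make. */
static int copies_for_call(int declare_offset)
{
    obj_t storage[3] = { NULL, "a", "b" };   /* storage[0] is the scratch slot */
    size_t nargsf = 2 | (declare_offset ? OFFSET_FLAG : 0);
    copies_made = 0;
    size_t n = call_with_self("self", storage + 1, nargsf);
    assert(n == 3);
    assert(strcmp(first_arg_seen, "self") == 0);
    return copies_made;
}
```

In this model, `copies_for_call(1)` reaches the target with no allocation, while `copies_for_call(0)` pays for one malloc and memcpy per call even though the scratch slot was there all along.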
This inefficiency is not restricted to Python-internal types. It also affects custom heap types that are themselves vector-callable. In this case, Python's adaptive optimizing interpreter replaces CALL_NON_PY_GENERAL with CALL_BUILTIN_CLASS, which then goes through this code path.
CPython versions tested on:
3.14
Operating systems tested on:
macOS