Commit d0af539

Add PEP 523: Adding a frame evaluation API to CPython
1 parent 92b5b19 commit d0af539

1 file changed: +391 -0 lines changed

pep-0523.txt

@@ -0,0 +1,391 @@
PEP: 523
Title: Adding a frame evaluation API to CPython
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <[email protected]>,
        Dino Viehland <[email protected]>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-May-2016
Post-History: 16-May-2016


Abstract
========

This PEP proposes to expand CPython's C API [#c-api]_ to allow for
the specification of a per-interpreter function pointer to handle the
evaluation of frames [#pyeval_evalframeex]_. This proposal also
suggests adding a new field to code objects [#pycodeobject]_ to store
arbitrary data for use by the frame evaluation function.


Rationale
=========

One place where flexibility has been lacking in Python is in the direct
execution of Python code. While CPython's C API [#c-api]_ allows for
constructing the data going into a frame object and then evaluating it
via ``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_, control over the
execution of Python code comes down to individual objects instead of a
holistic control of execution at the frame level.

While wanting to have influence over frame evaluation may seem a bit
too low-level, it does open the possibility for things such as a
method-level JIT to be introduced into CPython without CPython itself
having to provide one. By allowing external C code to control frame
evaluation, a JIT can participate in the execution of Python code at
the key point where evaluation occurs. This then allows for a JIT to
conditionally recompile Python bytecode to machine code as desired
while still allowing for executing regular CPython bytecode when
running the JIT is not desired. This can be accomplished by allowing
interpreters to specify what function to call to evaluate a frame. And
by placing the API at the frame evaluation level it allows for a
complete view of the execution environment of the code for the JIT.

This ability to specify a frame evaluation function also allows for
other use-cases beyond just opening CPython up to a JIT. For instance,
it would not be difficult to implement a tracing or profiling function
at the call level with this API. While CPython does provide the
ability to set a tracing or profiling function at the Python level,
this API would be able to match the data collection of the profiler
and quite possibly be faster for tracing by simply skipping per-line
tracing support.

It also opens up the possibility of debugging where the frame
evaluation function only performs special debugging work when it
detects it is about to execute a specific code object. In that
instance the bytecode could be theoretically rewritten in-place to
inject a breakpoint function call at the proper point for help in
debugging, without resorting to the heavy-handed approach required
by ``sys.settrace()``.

To help facilitate these use-cases, we are also proposing adding a
"scratch space" on code objects via a new field. This will allow
per-code object data to be stored with the code object itself for easy
retrieval by the frame evaluation function as necessary. The field
itself will simply be a ``PyObject *`` type so that any data stored in
the field will participate in normal object memory management.


Proposal
========

All proposed C API changes below will not be part of the stable ABI.


Expanding ``PyCodeObject``
--------------------------

One field is to be added to the ``PyCodeObject`` struct
[#pycodeobject]_::

  typedef struct {
      ...
      PyObject *co_extra;  /* "Scratch space" for the code object. */
  } PyCodeObject;

The ``co_extra`` field will be ``NULL`` by default and will not be
used by CPython itself. Third-party code is free to use the field as
desired. Values stored in the field are expected to not be required in
order for the code object to function, allowing the loss of the data
of the field to be acceptable (this keeps the code object as immutable
from a functionality point-of-view; this is slightly contentious and
so is listed as an open issue in `Is co_extra needed?`_). The field
will be freed like all other fields on ``PyCodeObject`` during
deallocation using ``Py_XDECREF()``.

It is not recommended that multiple users attempt to use the
``co_extra`` field simultaneously. While a dictionary could
theoretically be set to the field and various users could use a key
specific to the project, there is still the issue of key collisions as
well as performance degradation from using a dictionary lookup on
every frame evaluation. Users are expected to do a type check to make
sure that the field has not been previously set by someone else.
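
As a rough illustration of that type check, the following Python model
stands in for the C structures. ``CodeObject``, ``MyToolData`` and
``get_or_claim_extra()`` are hypothetical names invented for this
sketch and are not part of the proposal; real code would perform the
equivalent check in C against the ``co_extra`` pointer::

  class CodeObject:
      """Stand-in for PyCodeObject; only the proposed field is modelled."""
      def __init__(self, name):
          self.name = name
          self.co_extra = None  # NULL by default in the proposal


  class MyToolData:
      """Hypothetical per-code-object data owned by a single tool."""
      def __init__(self):
          self.exec_count = 0


  def get_or_claim_extra(code):
      """Return our data for ``code``, or None if another user owns it."""
      extra = code.co_extra
      if extra is None:
          extra = code.co_extra = MyToolData()  # field unused: claim it
      elif not isinstance(extra, MyToolData):
          return None  # set by someone else: leave it alone
      return extra


  ours = CodeObject("f")
  get_or_claim_extra(ours).exec_count += 1

  theirs = CodeObject("g")
  theirs.co_extra = {"owner": "another tool"}
  refused = get_or_claim_extra(theirs)

In this model a tool that gets ``None`` back would simply skip its
per-code-object bookkeeping for that code object.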


Expanding ``PyInterpreterState``
--------------------------------

The entrypoint for the frame evaluation function is per-interpreter::

  // Same type signature as PyEval_EvalFrameEx().
  typedef PyObject* (__stdcall *PyFrameEvalFunction)(PyFrameObject*, int);

  typedef struct {
      ...
      PyFrameEvalFunction eval_frame;
  } PyInterpreterState;

By default, the ``eval_frame`` field will be initialized to a function
pointer that represents what ``PyEval_EvalFrameEx()`` currently is
(called ``PyEval_EvalFrameDefault()``, discussed later in this PEP).
Third-party code may then set its own frame evaluation function
instead to control the execution of Python code. A pointer comparison
can be used to detect if the field is set to
``PyEval_EvalFrameDefault()`` and thus has not been mutated yet.
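
The pointer comparison can be modelled in Python with an identity
test. This is only a sketch: ``InterpreterState``,
``eval_frame_default()`` and ``install()`` are hypothetical stand-ins
for the C struct, ``PyEval_EvalFrameDefault()`` and whatever setup
code a third-party extension would run::

  def eval_frame_default(frame, throwflag):
      """Stand-in for PyEval_EvalFrameDefault()."""
      return ("default", frame)


  class InterpreterState:
      """Stand-in for PyInterpreterState; only eval_frame is modelled."""
      def __init__(self):
          self.eval_frame = eval_frame_default


  def install(interp, func):
      """Install ``func`` unless another evaluator is already in place."""
      if interp.eval_frame is not eval_frame_default:  # pointer comparison
          return False
      interp.eval_frame = func
      return True


  def my_eval_frame(frame, throwflag):
      return ("custom", frame)


  interp = InterpreterState()
  first = install(interp, my_eval_frame)
  second = install(interp, lambda frame, throwflag: None)

Refusing to overwrite a non-default evaluator is one possible policy;
the PEP itself does not mandate how competing users should behave.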


Changes to ``Python/ceval.c``
-----------------------------

``PyEval_EvalFrameEx()`` [#pyeval_evalframeex]_ as it currently stands
will be renamed to ``PyEval_EvalFrameDefault()``. The new
``PyEval_EvalFrameEx()`` will then become::

  PyObject *
  PyEval_EvalFrameEx(PyFrameObject *frame, int throwflag)
  {
      PyThreadState *tstate = PyThreadState_GET();
      return tstate->interp->eval_frame(frame, throwflag);
  }

This allows third-party code to place itself directly in the path
of Python code execution while being backwards-compatible with code
already using the pre-existing C API.
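
To make the dispatch concrete, here is a small Python model of it,
together with a call-level counting hook of the kind mentioned in the
Rationale_. Frames are represented by plain dictionaries, and every
name in the sketch (``eval_frame_ex()``, ``counting_eval_frame()``,
``call_counts``) is an assumption made for illustration rather than
part of the proposed C API::

  from collections import Counter


  def eval_frame_default(frame, throwflag):
      """Stand-in for PyEval_EvalFrameDefault(): "runs" the frame."""
      return frame["code"]()


  class InterpreterState:
      def __init__(self):
          self.eval_frame = eval_frame_default


  interp = InterpreterState()


  def eval_frame_ex(frame, throwflag=0):
      """Stand-in for the new PyEval_EvalFrameEx(): pure dispatch."""
      return interp.eval_frame(frame, throwflag)


  call_counts = Counter()


  def counting_eval_frame(frame, throwflag):
      """Hypothetical hook: record the call, then defer to the default."""
      call_counts[frame["name"]] += 1
      return eval_frame_default(frame, throwflag)


  frame = {"name": "answer", "code": lambda: 42}
  before = eval_frame_ex(frame)           # handled by the default
  interp.eval_frame = counting_eval_frame
  after = eval_frame_ex(frame)            # same entry point, via the hook

Callers of ``eval_frame_ex()`` are unchanged in both cases, which is
the backwards-compatibility property described above.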


Updating ``python-gdb.py``
--------------------------

The generated ``python-gdb.py`` file used for Python support in GDB
makes some hard-coded assumptions about ``PyEval_EvalFrameEx()``, e.g.
the names of local variables. It will need to be updated to work with
the proposed changes.


Performance impact
==================

As this PEP is proposing an API to add pluggability, performance
impact is considered only in the case where no third-party code has
made any changes.

Several runs of pybench [#pybench]_ consistently showed no performance
cost from the API change alone.

A run of the Python benchmark suite [#py-benchmarks]_ showed no
measurable cost in performance.

In terms of memory impact, since there are typically not many CPython
interpreters executing in a single process, the impact of
``co_extra`` being added to ``PyCodeObject`` is the only worry.
According to [#code-object-count]_, a run of the Python test suite
results in about 72,395 code objects being created. On a 64-bit
CPU that would result in 579,160 bytes of extra memory being used if
all code objects were alive at once and had nothing set in their
``co_extra`` fields.
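
The arithmetic behind that figure is one pointer per code object. A
quick check in Python, assuming 8-byte pointers (the variable names
are just for this sketch)::

  CODE_OBJECTS = 72_395  # count quoted above for a test suite run
  POINTER_SIZE = 8       # bytes per ``PyObject *`` on a 64-bit CPU

  extra_bytes = CODE_OBJECTS * POINTER_SIZE
  extra_kib = extra_bytes / 1024

That works out to 579,160 bytes, or roughly 566 KiB, in the worst case
where every code object is alive at the same time.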


Example Usage
=============

A JIT for CPython
-----------------

Pyjion
''''''

The Pyjion project [#pyjion]_ has used this proposed API to implement
a JIT for CPython using the CoreCLR's JIT [#coreclr]_. Each code
object has its ``co_extra`` field set to a ``PyjionJittedCode`` object
which stores four pieces of information:

1. Execution count
2. A boolean representing whether a previous attempt to JIT failed
3. A function pointer to a trampoline (which can be type tracing or not)
4. A void pointer to any JIT-compiled machine code

The frame evaluation function has (roughly) the following algorithm::

  def eval_frame(frame, throw_flag):
      pyjion_code = frame.code.co_extra
      if not pyjion_code:
          # First execution: attach fresh bookkeeping to the code object.
          pyjion_code = frame.code.co_extra = PyjionJittedCode()
      elif not pyjion_code.jit_failed:
          if pyjion_code.jit_code:
              # Already compiled: run the machine code via the trampoline.
              return pyjion_code.eval(pyjion_code.jit_code, frame)
          elif pyjion_code.exec_count > 20_000:
              # Hot enough to be worth compiling.
              if jit_compile(frame):
                  return pyjion_code.eval(pyjion_code.jit_code, frame)
              else:
                  pyjion_code.jit_failed = True
      # Not JIT-compiled: count the execution and use the interpreter.
      pyjion_code.exec_count += 1
      return PyEval_EvalFrameDefault(frame, throw_flag)

The key point, though, is that all of this work and logic is separate
from CPython and yet with the proposed API changes it is able to
provide a JIT that is compliant with Python semantics (as of this
writing, performance is almost equivalent to CPython without the new
API). This means there's nothing technically preventing others from
implementing their own JITs for CPython by utilizing the proposed API.


Other JITs
''''''''''

It should be mentioned that the Pyston team [#pyston]_ was consulted
on an earlier version of this PEP that was more JIT-specific and they
were not interested in utilizing the changes proposed because they
want control over memory layout and had no interest in directly
supporting CPython itself. An informal discussion with a developer on
the PyPy team [#pypy]_ led to a similar comment.

Numba [#numba]_, on the other hand, suggested that they would be
interested in the proposed change in a post-1.0 future for
themselves [#numba-interest]_.

The experimental Coconut JIT [#coconut]_ could have benefitted from
this PEP. In private conversations with Coconut's creator we were told
that our API was probably superior to the one they developed for
Coconut to add JIT support to CPython.


Debugging
---------

In conversations with the Python Tools for Visual Studio team (PTVS)
[#ptvs]_, they thought they would find these API changes useful for
implementing more performant debugging. As mentioned in the Rationale_
section, this API would allow for switching on debugging functionality
only in frames where it is needed. This could allow for either
skipping information that ``sys.settrace()`` normally provides or
even going as far as dynamically rewriting bytecode prior to
execution to inject e.g. breakpoints in the bytecode.

It also turns out that Google has provided a very similar API
internally for years. It has been used for performant debugging
purposes.


Implementation
==============

A set of patches implementing the proposed API is available through
the Pyjion project [#pyjion]_. In its current form it has more
changes to CPython than just this proposed API, but that is for ease
of development instead of strict requirements to accomplish its goals.


Open Issues
===========

Allow ``eval_frame`` to be ``NULL``
-----------------------------------

Currently the frame evaluation function is expected to always be set.
It could easily default to ``NULL`` instead, which would signal to use
``PyEval_EvalFrameDefault()``. The current proposal of not
special-casing the field seemed the most straightforward, but it
does require that the field not accidentally be cleared, else a crash
may occur.
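
For comparison, the ``NULL``-defaulting alternative would add one
branch to the dispatch. A minimal Python model, where ``None`` plays
the role of ``NULL`` and all names are hypothetical stand-ins for the
C code::

  def eval_frame_default(frame, throwflag):
      """Stand-in for PyEval_EvalFrameDefault()."""
      return ("default", frame)


  class InterpreterState:
      def __init__(self):
          self.eval_frame = None  # NULL would mean "use the default"


  def eval_frame_ex(interp, frame, throwflag=0):
      func = interp.eval_frame
      if func is None:  # the special case this open issue weighs
          func = eval_frame_default
      return func(frame, throwflag)


  interp = InterpreterState()
  cleared = eval_frame_ex(interp, "frame")
  interp.eval_frame = lambda frame, throwflag: ("custom", frame)
  custom = eval_frame_ex(interp, "frame")

With this variant an accidentally cleared field falls back to the
default evaluator instead of crashing, at the cost of an extra check
on every frame evaluation.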


Is co_extra needed?
-------------------

While discussing this PEP at PyCon US 2016, some core developers
expressed their worry of the ``co_extra`` field making code objects
mutable. The thinking seemed to be that having a field that was
mutated after the creation of the code object made the object seem
mutable, even though no other aspect of code objects changed.

The view of this PEP is that the ``co_extra`` field doesn't change the
fact that code objects are immutable. The field is specified in this
PEP so as not to contain information required to make the code object
usable, making it more of a caching field. It could be viewed as
similar to the UTF-8 cache that string objects have internally;
strings are still considered immutable even though they have a field
that is conditionally set.

The field is also not strictly necessary. While the field greatly
simplifies attaching extra information to code objects, other options
are possible, such as keeping a mapping of code object memory
addresses to what would have been kept in ``co_extra``, or perhaps
using a weak reference of the data on the code object and then
iterating through the weak references until the attached data is
found. But obviously none of these solutions is as simple or
performant as adding the ``co_extra`` field.
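
The address-mapping alternative can be sketched in pure Python, which
also shows the extra bookkeeping it needs. The sketch assumes code
objects support weak references (as they do in current CPython) and
uses ``id()`` as the memory address; ``set_extra()``, ``get_extra()``
and ``_side_table`` are hypothetical names::

  import weakref

  _side_table = {}  # id(code) -> data, standing in for co_extra


  def set_extra(code, data):
      key = id(code)
      if key not in _side_table:
          # Drop the entry when the code object dies so that a reused
          # address cannot pick up stale data.
          weakref.finalize(code, _side_table.pop, key, None)
      _side_table[key] = data


  def get_extra(code):
      return _side_table.get(id(code))


  def f():
      return 1


  set_extra(f.__code__, {"exec_count": 0})
  get_extra(f.__code__)["exec_count"] += 1

Every lookup here costs a dictionary access plus the finalizer
machinery, which is the overhead a direct ``co_extra`` field avoids.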


Rejected Ideas
==============

A JIT-specific C API
--------------------

Originally this PEP was going to propose a much larger API change
which was more JIT-specific. After soliciting feedback from the Numba
team [#numba]_, though, it became clear that the API was unnecessarily
large. The realization was made that all that was truly needed was the
opportunity to provide a trampoline function to handle execution of
Python code that had been JIT-compiled and a way to attach that
compiled machine code along with other critical data to the
corresponding Python code object. Once it was shown that there was no
loss in functionality or in performance while minimizing the API
changes required, the proposal was changed to its current form.


References
==========

.. [#pyjion] Pyjion project
   (https://github.com/microsoft/pyjion)

.. [#c-api] CPython's C API
   (https://docs.python.org/3/c-api/index.html)

.. [#pycodeobject] ``PyCodeObject``
   (https://docs.python.org/3/c-api/code.html#c.PyCodeObject)

.. [#coreclr] .NET Core Runtime (CoreCLR)
   (https://github.com/dotnet/coreclr)

.. [#pyeval_evalframeex] ``PyEval_EvalFrameEx()``
   (https://docs.python.org/3/c-api/veryhigh.html?highlight=pyframeobject#c.PyEval_EvalFrameEx)

.. [#numba] Numba
   (http://numba.pydata.org/)

.. [#numba-interest] numba-users mailing list:
   "Would the C API for a JIT entrypoint being proposed by Pyjion help out Numba?"
   (https://groups.google.com/a/continuum.io/forum/#!topic/numba-users/yRl_0t8-m1g)

.. [#code-object-count] [Python-Dev] Opcode cache in ceval loop
   (https://mail.python.org/pipermail/python-dev/2016-February/143025.html)

.. [#pybench] pybench
   (https://hg.python.org/cpython/file/default/Tools/pybench)

.. [#py-benchmarks] Python benchmark suite
   (https://hg.python.org/benchmarks)

.. [#pyston] Pyston
   (http://pyston.org)

.. [#pypy] PyPy
   (http://pypy.org/)

.. [#ptvs] Python Tools for Visual Studio
   (http://microsoft.github.io/PTVS/)

.. [#coconut] Coconut
   (https://github.com/davidmalcolm/coconut)


Copyright
=========

This document has been placed in the public domain.



..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:
