Document how to add a bytecode specialization in Interpreter.md file #130831

Open
wants to merge 9 commits into
base: main
from
62 changes: 62 additions & 0 deletions InternalDocs/interpreter.md
@@ -506,6 +506,68 @@ After the last `DEOPT_IF` has passed, a hit should be recorded with
After an optimization has been deferred in the adaptive instruction,
that should be recorded with `STAT_INC(BASE_INSTRUCTION, deferred)`.

## How to add a new bytecode specialization
Member

Needs a blank line after the heading.

Suggested change
## How to add a new bytecode specialization
## How to add a new bytecode specialization


Assuming you found an instruction that serves as a good specialization candidate.
Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
Comment on lines +511 to +512
Member

This sentence is a dependent clause; you either need a comma followed by some more information, or you should remove the word "assuming." How about this?

Suggested change
Assuming you found an instruction that serves as a good specialization candidate.
Let's use the example of [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):
Let's say you found an instruction that serves as a good specialization candidate, such as [`CONTAINS_OP`](../Doc/library/dis.rst#contains_op):


1. Update below in [Python/bytecodes.c](../Python/bytecodes.c)
Member

I don't really like the phrase "Update below in"; it's kind of wordy and an incomplete sentence. You could try something like this:

Suggested change
1. Update below in [Python/bytecodes.c](../Python/bytecodes.c)
1. Make necessary changes to the instruction in [Python/bytecodes.c](../Python/bytecodes.c)


- Convert `CONTAINS_OP` to a micro-operation (uop) by renaming
it to `_CONTAINS_OP` and changing the instruction definition
from `inst` to `op`.
Comment on lines +516 to +518
Member

  • CONTAINS_OP is still an example here, we should try to be more general.
  • For clarity, we could explain why it's a "u."
  • Reduce indentation.
Suggested change
- Convert `CONTAINS_OP` to a micro-operation (uop) by renaming
it to `_CONTAINS_OP` and changing the instruction definition
from `inst` to `op`.
- Convert the instruction (`CONTAINS_OP`, in our example) to a micro-operation (uop, formally μop) by renaming it to `_INSTRUCTION_NAME` (e.g., `_CONTAINS_OP`) and changing the instruction definition
from `inst` to `op`.


```c
// Before
inst(CONTAINS_OP, ...);

// After
op(_CONTAINS_OP, ...);
```

- Add a uop that calls the specializing function:

```c
specializing op(_SPECIALIZE_CONTAINS_OP, (counter/1, left, right -- left, right)) {
    #if ENABLE_SPECIALIZATION
    if (ADAPTIVE_COUNTER_IS_ZERO(counter)) {
        next_instr = this_instr;
        _Py_Specialize_ContainsOp(right, next_instr);
Member

Nit, presumably this would use both operands:

Suggested change
_Py_Specialize_ContainsOp(right, next_instr);
_Py_Specialize_ContainsOp(left, right, next_instr);

        DISPATCH_SAME_OPARG();
    }
    STAT_INC(CONTAINS_OP, deferred);
    DECREMENT_ADAPTIVE_COUNTER(this_instr[1].cache);
    #endif  /* ENABLE_SPECIALIZATION */
}
```

- Create a macro for the original bytecode name:
Member

"instruction" reads better to me here.

Suggested change
- Create a macro for the original bytecode name:
- Create a macro for the original instruction name:

Member

It also might be worth mentioning here what should go in that macro, but it also seems pretty clear based on the example.


```c
macro(CONTAINS_OP) = _SPECIALIZE_CONTAINS_OP + _CONTAINS_OP;
```
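
To make the control flow of `macro(CONTAINS_OP) = _SPECIALIZE_CONTAINS_OP + _CONTAINS_OP` more concrete, here is a standalone toy model in plain C. It is only a sketch under stated assumptions: `toy_instr`, `toy_specialize_uop`, `toy_generic_uop`, `toy_macro` and `toy_run` are hypothetical names that do not exist in CPython, and the counter is modelled as a plain countdown, which may not match how the real adaptive counter is encoded.

```c
#include <stdint.h>

/* Toy model only: these names are hypothetical, not CPython's API. */
typedef struct {
    uint16_t counter;   /* stands in for the adaptive counter cache entry */
    int specialized;    /* set once the specializing step has fired */
    int deferred;       /* stands in for STAT_INC(CONTAINS_OP, deferred) */
    int generic_runs;   /* how many times the generic body executed */
} toy_instr;

/* Plays the role of _SPECIALIZE_CONTAINS_OP.
   Returns 1 when it specializes; the real uop would re-dispatch here. */
int toy_specialize_uop(toy_instr *in) {
    if (in->counter == 0) {
        in->specialized = 1;
        return 1;
    }
    in->deferred++;
    in->counter--;
    return 0;
}

/* Plays the role of _CONTAINS_OP: the unspecialized behaviour. */
void toy_generic_uop(toy_instr *in) {
    in->generic_runs++;
}

/* Plays the role of macro(CONTAINS_OP): run the two uops in order. */
void toy_macro(toy_instr *in) {
    if (toy_specialize_uop(in)) {
        return;  /* generic body skipped: control went to the specialized form */
    }
    toy_generic_uop(in);
}

/* Execute the toy instruction `times` times. */
void toy_run(toy_instr *in, int times) {
    for (int i = 0; i < times; i++) {
        toy_macro(in);
    }
}
```

Starting from a counter of 2 and calling `toy_run` with three executions, the first two take the generic path and are counted as deferred, and the third finds the counter at zero and triggers specialization.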

2. Define the cache structure in [Include/internal/pycore_code.h](../Include/internal/pycore_code.h),
at the very least, a 16-bit counter is needed.
Comment on lines +550 to +551
Member

The clause after the first comma is independent, so it can't be used there. Let's just turn it into its own sentence.

Suggested change
2. Define the cache structure in [Include/internal/pycore_code.h](../Include/internal/pycore_code.h),
at the very least, a 16-bit counter is needed.
2. Define the cache structure in [Include/internal/pycore_code.h](../Include/internal/pycore_code.h). It needs to have at least a 16-bit counter field.


```c
typedef struct {
uint16_t counter;
} _PyContainsOpCache;
```
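
As a rough illustration of why the cache is described in 16-bit terms, the sketch below computes how many code-unit-sized slots a cache struct occupies. It assumes a code unit is 16 bits wide; `toy_codeunit`, `toy_contains_op_cache`, `toy_versioned_cache`, `TOY_CACHE_ENTRIES` and `toy_entries_contains_op` are hypothetical names for this example only. The real header may pair each cache struct with an entry-count definition of its own, so check the neighbouring structs in `pycore_code.h` rather than relying on this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: one code unit is 16 bits wide. */
typedef uint16_t toy_codeunit;

/* Counter-only cache, mirroring the struct shown above. */
typedef struct {
    uint16_t counter;
} toy_contains_op_cache;

/* Hypothetical larger cache: a counter plus a 32-bit value stored as
   two 16-bit halves, so the struct contains no padding. */
typedef struct {
    uint16_t counter;
    uint16_t version[2];
} toy_versioned_cache;

/* Number of code units a cache struct occupies. */
#define TOY_CACHE_ENTRIES(cache) (sizeof(cache) / sizeof(toy_codeunit))

/* Compile-time checks: the layout is fixed by the struct definition. */
static_assert(TOY_CACHE_ENTRIES(toy_contains_op_cache) == 1,
              "a counter-only cache is one entry");
static_assert(TOY_CACHE_ENTRIES(toy_versioned_cache) == 3,
              "a counter plus a 32-bit field is three entries");

/* Runtime accessor, handy when the count has to be passed around. */
size_t toy_entries_contains_op(void) {
    return TOY_CACHE_ENTRIES(toy_contains_op_cache);
}
```

Building the struct only from `uint16_t` fields keeps the entry count equal to the number of 16-bit slots, which is the design choice this sketch is meant to show.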

3. Write the specializing function itself (`_Py_Specialize_ContainsOp`) in [Python/specialize.c ](../Python/specialize.c).
Member

_Py_Specialize_ContainsOp is an example.

Suggested change
3. Write the specializing function itself (`_Py_Specialize_ContainsOp`) in [Python/specialize.c ](../Python/specialize.c).
3. Write the specializing function itself (e.g., `_Py_Specialize_ContainsOp`) in [Python/specialize.c ](../Python/specialize.c).

Refer to other functions in that file for the pattern.
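
For readers who want a feel for the general shape before reading `specialize.c`, the following standalone C sketch models one plausible pattern: inspect an operand, rewrite the opcode in place on success, and otherwise keep the generic opcode and wait before retrying. Everything here is an assumption made for illustration: `mini_instr`, `mini_specialize_contains_op`, the `MINI_*` enums, and the `MINI_COOLDOWN` and `MINI_BACKOFF` values are hypothetical and are not taken from CPython.

```c
#include <stdint.h>

/* Toy model only: hypothetical names and values, not CPython's API. */
enum mini_opcode { MINI_CONTAINS_OP, MINI_CONTAINS_OP_STR };
enum mini_kind { MINI_KIND_STR, MINI_KIND_OTHER };

typedef struct {
    enum mini_opcode opcode;  /* which form of the instruction will run */
    uint16_t counter;         /* executions to wait before re-specializing */
} mini_instr;

#define MINI_COOLDOWN 50  /* arbitrary: wait after a successful specialization */
#define MINI_BACKOFF 8    /* arbitrary: wait after a failed attempt */

/* Decide, from the kind of the container operand, which opcode to install. */
void mini_specialize_contains_op(enum mini_kind container, mini_instr *instr) {
    if (container == MINI_KIND_STR) {
        /* Success: rewrite the instruction to its specialized form. */
        instr->opcode = MINI_CONTAINS_OP_STR;
        instr->counter = MINI_COOLDOWN;
        return;
    }
    /* Failure: keep the generic form and try again later. */
    instr->opcode = MINI_CONTAINS_OP;
    instr->counter = MINI_BACKOFF;
}
```

The point of the sketch is only that the specializing function mutates the instruction and its counter; the conditions and statistics recorded by real specializing functions are richer than this.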

4. Add a call to `add_stat_dict` in `_Py_GetSpecializationStats` which is in [Python/specialize.c ](../Python/specialize.c).
Member

Extra space.

Suggested change
4. Add a call to `add_stat_dict` in `_Py_GetSpecializationStats` which is in [Python/specialize.c ](../Python/specialize.c).
4. Add a call to `add_stat_dict` in `_Py_GetSpecializationStats` which is in [Python/specialize.c ](../Python/specialize.c).


5. Add the cache layout in [Lib/opcode.py](../Lib/opcode.py) so that Python's
`dis` module will know how to represent it properly.
Comment on lines +564 to +565
Member

  • "to," not "in."
  • I think the context already implies that it's Python's dis module :)
Suggested change
5. Add the cache layout in [Lib/opcode.py](../Lib/opcode.py) so that Python's
`dis` module will know how to represent it properly.
5. Add the cache layout to [Lib/opcode.py](../Lib/opcode.py) so that the
`dis` module will know how to represent it properly.


6. Bump magic number in [Include/internal/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).
Member

Extra space.

Suggested change
6. Bump magic number in [Include/internal/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).
6. Bump magic number in [Include/internal/pycore_magic_number.h](../Include/internal/pycore_magic_number.h).


7. Run ``make regen-all`` on `*nix` or `build.bat --regen` on Windows.

Member

Seems like the only thing missing is adding an actual specialized variant. Maybe that's implied/obvious, but it wouldn't hurt to provide a dumb example of _Py_Specialize_ContainsOp and _CONTAINS_OP_UNICODE_UNICODE that just guards and calls PyUnicode_Contains or something.
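
A minimal sketch of what such a guarded variant might look like is given below. It is a standalone C model, not `bytecodes.c` DSL: `demo_value`, `demo_contains_generic` and `demo_contains_str_str` are hypothetical names, `strstr` stands in for the string containment check, and the `deopted` flag only imitates what `DEOPT_IF` does in the real interpreter.

```c
#include <stdbool.h>
#include <string.h>

/* Toy tagged value standing in for a Python object (hypothetical). */
typedef enum { DEMO_STR, DEMO_INT } demo_tag;

typedef struct {
    demo_tag tag;
    const char *str;  /* valid when tag == DEMO_STR */
    long num;         /* valid when tag == DEMO_INT */
} demo_value;

/* Generic path for `left in right`: handles every operand combination
   the toy supports. Only string-in-string can be true here. */
bool demo_contains_generic(demo_value left, demo_value right) {
    if (left.tag == DEMO_STR && right.tag == DEMO_STR) {
        return strstr(right.str, left.str) != NULL;
    }
    return false;
}

/* Specialized path: guard that both operands are strings, then take the
   fast path. If the guard fails, report a deopt and fall back to the
   generic path instead of giving a wrong answer. */
bool demo_contains_str_str(demo_value left, demo_value right, bool *deopted) {
    if (left.tag != DEMO_STR || right.tag != DEMO_STR) {
        *deopted = true;
        return demo_contains_generic(left, right);
    }
    *deopted = false;
    return strstr(right.str, left.str) != NULL;
}
```

With the needle `"ell"` and the haystack `"hello"`, the guard passes and the fast path answers directly; with an integer needle, the guard fails and the generic path produces the result.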


Additional resources
--------------------