Branching design for Tier 2 (uops) interpreter

This issue is part of the larger epic of gh-104584. In PR gh-106393 I tried to implement branching, but it was premature. Here's a better design, following @markshannon's [guidance](https://github.com/python/cpython/pull/106393#discussion_r1252794047).

We have the following jump instructions (not counting the instrumented versions):

Unconditional jumps:

- [x] JUMP_BACKWARD
- [ ] JUMP_BACKWARD_NO_INTERRUPT
- [x] JUMP_FORWARD

Branches, a.k.a. conditional jumps:

- [x] POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE, ~POP_JUMP_IF_NONE, POP_JUMP_IF_NOT_NONE~
- [x] FOR_ITER's specializations:
  - [x] FOR_ITER_LIST
  - [x] FOR_ITER_TUPLE
  - [x] FOR_ITER_RANGE
- [ ] FOR_ITER_GEN
- [ ] SEND

- [ ] Add counters to to POP_JUMP_IF_{FALSE,TRUE} to determine likeliness

The translation strategies could be as follows:

## Unconditional jumps

### **JUMP_BACKWARD**

- If this jumps to exactly the top of the current trace, emit a Tier 2 JUMP_TO_TOP uop, and stop projecting (i.e., exit the trace generation loop). The JUMP_TO_TOP uop implementation should include a CHECK_EVAL_BREAKER call.
- If this jumps anywhere else, emit a SAVE_IP uop with the destination of the jump, followed by an EXIT_TRACE uop, and stop projecting.

### **JUMP_BACKWARD_NO_INTERRUPT**

- Since this is typically only used in special circumstances, just emit a SAVE_IP instruction with the destination and an EXIT_TRACE uop, and stop projecting.
- Alternatively, we could make CHECK_EVAL_BREAKER a separate UOP that is inserted for JUMP_BACKWARD but not for JUMP_BACKWARD_NO_INTERRUPT, and otherwise treat the two backward jumps the same.

### **JUMP_FORWARD**

- Emit a SAVE_IP uop with the destination of the jump, and continue projecting from there (i.e. set `instr` to the destination of the jump).

## Conditional jumps (branches)

### **POP_JUMP_IF_FALSE** and friends

Consider the following Python code:
```py
if cond:
    block
rest
```
This translates roughly to the following Tier 1 bytecode (using B1, B2, ... to label Tier 1 instructions, and `<cond>`, `<block>` etc. to represent code blocks that evaluate or execute the corresponding Python fragments):
```
B1: <cond>
B2: POP_JUMP_IF_FALSE B4
B3: <block>
B4: <rest>
B5:
```
I propose the following translation into Tier 2 uops, assuming the branch is "unlikely":
```
    SAVE_IP B1
    <cond>
    SAVE_IP B2
    JUMP_IF_FALSE stub
    POP_TOP
    SAVE_IP B3
    <block>
    SAVE_IP B4
    EXIT_TRACE

stub:
    POP_TOP
    SAVE_IP B4
    EXIT_TRACE
```
Where JUMP_IF_FALSE inspects the top of stack but doesn't pop it, and has an argument that executes a jump in the Tier 2 uop instruction sequence.

If the branch is "likely", we do this instead:
```
    SAVE_IP B1
    <cond>
    SAVE_IP B2
    JUMP_IF_TRUE stub
    POP_TOP
    SAVE_IP B4
    <rest>
    SAVE_IP B5
    EXIT_TRACE

stub:
    POP_TOP
    SAVE_IP B3
    EXIT_TRACE
```
Note how in this case, `<rest>` is projected as part of the trace, while `<block>` is not, since the likely case is that we jump over `<block>` to `<rest>`.

For the other simple conditional jumps (POP_JUMP_IF_TRUE, ~POP_JUMP_IF_NONE, POP_JUMP_IF_NOT_NONE~) we do the same: if the jump is unlikely, emit a JUMP_IF_XXX uop and a stub; if the jump is likely, emit the inverse JUMP_IF_NOT_XXX uop and a different stub, and continue projecting at the destination of the original jump bytecode.

I propose to have hand-written cases both in the superblock generator and in the Tier 2 interpreter for these, since the translations are too irregular to fit easily in the macro expansion data structure. The stub generation will require additional logic and data structures in `translate_bytecode_to_trace()` to keep track of the stubs required so far, the available space for those, and the back-patching required to set the operands for the JUMP_IF_XXX uops.

### **FOR_ITER** and (especially) its specializations

The common case for these is not to jump. The bytecode definitions are too complex to duplicate in hand-written Tier 2 uops. My proposal is to change these in bytecodes.c so that, instead of using the `JUMPBY(n)` macro, they use `JUMPBY_POP_DISPATCH(n)`, which in Tier 1 translates into just `JUMPBY(n)`, but in Tier 2 translates into roughly
```cc
frame->prev_instr += (x);
PY_DECREF(stack_pointer[-1]);
stack_pointer -= 1;
goto exit;
```
thereby exiting the trace when the corresponding for-loop terminates.

I am assuming here that most loops have several iterations. I don't think it's worth special-casing the occasional for-loop that almost always immediately terminates.

### **SEND**

Possibly we could treat this the same as FOR_ITER. But right now I propose to just punt here, and when we encounter it, stop projecting, as we do with any other unsupported bytecode instruction.


### Linked PRs
* gh-106542
* gh-106543
* gh-106551
* gh-106599
* gh-106613
* gh-106638
* gh-106651
* gh-106696
* gh-106756
* gh-106796
* gh-112134
* gh-112214

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

GitHub Sponsors

Uh oh!

Branching design for Tier 2 (uops) interpreter #106529

Unconditional jumps

JUMP_BACKWARD

JUMP_BACKWARD_NO_INTERRUPT

JUMP_FORWARD

Conditional jumps (branches)

POP_JUMP_IF_FALSE and friends

FOR_ITER and (especially) its specializations

SEND

Linked PRs

26 remaining items

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Participants

Branching design for Tier 2 (uops) interpreter #106529

Description

Unconditional jumps

JUMP_BACKWARD

JUMP_BACKWARD_NO_INTERRUPT

JUMP_FORWARD

Conditional jumps (branches)

POP_JUMP_IF_FALSE and friends

FOR_ITER and (especially) its specializations

SEND

Linked PRs

Activity

gvanrossum commented on Jul 7, 2023

gvanrossum commented on Jul 9, 2023

gvanrossum commented on Jul 9, 2023

markshannon commented on Jul 10, 2023

mdboom commented on Jul 10, 2023

markshannon commented on Jul 10, 2023

gvanrossum commented on Jul 10, 2023

gvanrossum commented on Jul 10, 2023

26 remaining items

gvanrossum commented on Nov 15, 2023

gvanrossum commented on Nov 16, 2023

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Participants

Issue actions