You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
However, we now expect any further performance improvements to be provided by the JIT, not the interpreter.
This means that specializations other role, that of gathering type and branching information for the JIT, is at least as important as pure interpreter performance.
We should therefore seek to broaden specialization to gather more information, as long as it does not make interpreter performance worse, or at least no significantly so.
Using some old stats, by fraction of unspecialized bytecode executed, the top 10 were:
BINARY_OP 31.3%
FOR_ITER 19.4%
LOAD_ATTR 10.9%
STORE_SUBSCR 9.2%
BINARY_SLICE 7.3%
COMPARE_OP 7.0%
TO_BOOL 5.8%
CALL 2.5%
CONTAINS_OP 2.4%
SEND 1.7%
We should fully specialize most, if not all, of these.
In general, the above instructions have a matching __dunder__ method which determines the behavior of the operation. Recording the type of the operand(s) allows us to know what __dunder__ method is to be called.
We cannot specialize for all possible types, but we can ensure we have good inputs and type information for the JIT by adding the following two specializations for all families of instructions:
__dunder__ implemented in Python. Most of the above instructions have a matching __dunder__ method. These specializations should jump directly into the method. LOAD_ATTR_GETATTRIBUTE_OVERRIDDEN already does this for LOAD_ATTR. Other families should follow this template.
__dunder__ implemented in C. In practice, this is just the generic instruction with a bit more information recorded.
BINARY_SLICE. This is supposed to avoid creating temporary slice objects for expressions like a[b:c] but has yet to be implemented properly. There is no corresponding __dunder__ method, so we would need to expose slicing methods to use.
SEND. There is no __send__ method. For iterators, __next__ is called if the value is None, otherwise .send() is called. Rather than try to replicate the specializations of FOR_ITER we should maybe look to combine SEND and FOR_ITER much like we did for CALL and CALL_METHOD
First step
Add two specializations for __dunder__ in Python and the fallback __dunder__ in C for:
FOR_ITER
LOAD_ATTR
STORE_SUBSCR
COMPARE_OP
TO_BOOL
CALL
CONTAINS_OP
For a total of 12 new instructions as LOAD_ATTR already has the specialization for the Python __getattribute__ and CALL already has the generic fallback.
Until now, our choice of specialization in the SAI has been driven by performance of the interpreter alone https://github.com/python/cpython/blob/main/InternalDocs/interpreter.md#performance-analysis.
However, we now expect any further performance improvements to be provided by the JIT, not the interpreter.
This means that specializations other role, that of gathering type and branching information for the JIT, is at least as important as pure interpreter performance.
We should therefore seek to broaden specialization to gather more information, as long as it does not make interpreter performance worse, or at least no significantly so.
Using some old stats, by fraction of unspecialized bytecode executed, the top 10 were:
BINARY_OP 31.3%
FOR_ITER 19.4%
LOAD_ATTR 10.9%
STORE_SUBSCR 9.2%
BINARY_SLICE 7.3%
COMPARE_OP 7.0%
TO_BOOL 5.8%
CALL 2.5%
CONTAINS_OP 2.4%
SEND 1.7%
We should fully specialize most, if not all, of these.
In general, the above instructions have a matching
__dunder__method which determines the behavior of the operation. Recording the type of the operand(s) allows us to know what__dunder__method is to be called.We cannot specialize for all possible types, but we can ensure we have good inputs and type information for the JIT by adding the following two specializations for all families of instructions:
__dunder__implemented in Python. Most of the above instructions have a matching__dunder__method. These specializations should jump directly into the method.LOAD_ATTR_GETATTRIBUTE_OVERRIDDENalready does this forLOAD_ATTR. Other families should follow this template.__dunder__implemented in C. In practice, this is just the generic instruction with a bit more information recorded.Three instructions need special casing:
a[b:c]but has yet to be implemented properly. There is no corresponding__dunder__method, so we would need to expose slicing methods to use.__send__method. For iterators,__next__is called if the value isNone, otherwise.send()is called. Rather than try to replicate the specializations ofFOR_ITERwe should maybe look to combineSENDandFOR_ITERmuch like we did forCALLandCALL_METHODFirst step
Add two specializations for
__dunder__in Python and the fallback__dunder__in C for:For a total of 12 new instructions as
LOAD_ATTRalready has the specialization for the Python__getattribute__andCALLalready has the generic fallback.Second step
Implement #100239
Third step
Handle
BINARY_SLICEandSENDLinked PRs