Add support for dumping interpreter IR for debugging purposes #126370
BrzVlad wants to merge 2 commits into dotnet:main
Interpreter IR was logged through functionality inside the InterpCompiler class via PrintCompiledCode (only during compilation, via DOTNET_InterpDump). This commit extracts this logic outside of the class, passing all necessary information as arguments (dataItems, pointer to name map, etc.). This also adds a code size field on the InterpMethod, since this information wasn't easily available. Some data will not be pretty printed, because we don't have access to the jitinterface methods.

Example for a crash in InterpExecMethod:

```
(lldb) frame sel 4
frame #4: 0x00007ffff72f8b19 libcoreclr.so`InterpExecMethod(pInterpreterFrame=0x00007ffb899eaa20, pFrame=0x00007ffb899e92b0, pThreadContext=0x00007ffb78009000, pExceptionClauseArgs=0x0000000000000000) at interpexec.cpp:2566:42 [opt]
   2563                 case INTOP_STIND_O:
   2564                 {
   2565                     char *dst = LOCAL_VAR(ip[1], char*);
-> 2566                     OBJECTREF storeObj = LOCAL_VAR(ip[2], OBJECTREF);
   2567                     NULL_CHECK(dst);
   2568                     SetObjectReferenceUnchecked((OBJECTREF*)(dst + ip[3]), storeObj);
   2569                     ip += 4;
(lldb) call InterpDumpIR(pFrame->startIp)
Dumping interpreter IR at 0x7fff7b514460 (method 0x7fff7a8c9230))
0008: IR_0000: initlocals [nil <- nil], 16,16
0014: IR_0003: safepoint [nil <- nil],
0018: IR_0004: mov.8 [48 <- 0],
0024: IR_0007: ldc.i4 [32 <- nil], 0
0030: IR_000a: ldc.i4 [64 <- nil], 0
003c: IR_000d: ldloca [40 <- nil], 16
0048: IR_0010: zeroblk.imm [nil <- 40], 8
0054: IR_0013: mov.vt [72 <- 16], 8
0064: IR_0017: conv.u1.i4 [56 <- 32],
0070: IR_001a: call [32 <- 48], 0x7fff79ea2030
0080: IR_001e: mov.8 [32 <- 0],
008c: IR_0021: mov.8 [40 <- 8],
0098: IR_0024: stind.o [nil <- 32 40], 56
00a8: IR_0028: ret.void [nil <- nil],
End of method: 00ac: IR_0029
(lldb) dumpmd 0x7fff7a8c9230
Method Name: System.Threading.Tasks.Task`1[[System.__Canon, System.Private.CoreLib]]..ctor(System.__Canon)
...
(lldb) dumpmd 0x7fff79ea2030
Method Name: System.Threading.Tasks.Task..ctor(Boolean, System.Threading.Tasks.TaskCreationOptions, System.Threading.CancellationToken)
...
```
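To illustrate the shape of the refactoring described above, here is a minimal sketch of a dump routine that works purely from explicit arguments (a code pointer and a size in `int32_t` slots) rather than from compiler state. The opcode table, the `DemoOp` enum, and `DemoDumpIR` are hypothetical stand-ins, not the interpreter's real `INTOP_*` definitions or the PR's actual helpers, and the output omits the byte-offset column shown in the log above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical opcode set for illustration only; not the real INTOP_* table.
enum DemoOp : int32_t { DEMO_LDC_I4 = 0, DEMO_MOV_8 = 1, DEMO_RET_VOID = 2, DEMO_OP_COUNT = 3 };

struct DemoOpInfo
{
    const char* name;
    int32_t length; // instruction length in int32_t slots, opcode included
};

static const DemoOpInfo g_demoOps[DEMO_OP_COUNT] = {
    { "ldc.i4", 3 },   // opcode, dst offset, immediate
    { "mov.8", 3 },    // opcode, dst offset, src offset
    { "ret.void", 1 }, // opcode only
};

// Formats codeSize slots starting at code. Everything the dump needs is passed
// in, so it can run long after compilation (for example, from a debugger).
std::string DemoDumpIR(const int32_t* code, int32_t codeSize)
{
    assert(code != nullptr && codeSize >= 0);
    std::string out;
    char line[96];
    int32_t ip = 0;
    while (ip < codeSize)
    {
        int32_t op = code[ip];
        if (op < 0 || op >= DEMO_OP_COUNT)
        {
            // Stop rather than guess the length of an unknown instruction.
            snprintf(line, sizeof(line), "IR_%04x: <unknown opcode %d>\n", (unsigned)ip, (int)op);
            out += line;
            break;
        }
        const DemoOpInfo& info = g_demoOps[op];
        if (info.length > codeSize - ip)
        {
            // The operands would run past the end of the method.
            snprintf(line, sizeof(line), "IR_%04x: <truncated %s>\n", (unsigned)ip, info.name);
            out += line;
            break;
        }
        switch (op)
        {
            case DEMO_LDC_I4:
                snprintf(line, sizeof(line), "IR_%04x: %s [%d <- nil], %d\n",
                         (unsigned)ip, info.name, (int)code[ip + 1], (int)code[ip + 2]);
                break;
            case DEMO_MOV_8:
                snprintf(line, sizeof(line), "IR_%04x: %s [%d <- %d],\n",
                         (unsigned)ip, info.name, (int)code[ip + 1], (int)code[ip + 2]);
                break;
            default: // DEMO_RET_VOID
                snprintf(line, sizeof(line), "IR_%04x: %s [nil <- nil],\n", (unsigned)ip, info.name);
                break;
        }
        out += line;
        ip += info.length;
    }
    snprintf(line, sizeof(line), "End of method: IR_%04x\n", (unsigned)ip);
    out += line;
    return out;
}
```

Knowing the code size up front is what lets the loop terminate without compiler state, which is presumably why the PR stores it on the method.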
Tagging subscribers to this area: @BrzVlad, @janvorli, @kg
@janvorli @davidwrighton I assume you have spent some time in the debugger while iterating over interpreter failures. It seems weird to investigate without an easy way to dump the code of various methods. Using only the interp dump is awkward and noisy. Any thoughts on this? Did you use a different approach?
Pull request overview
Refactors interpreter IR dumping so it can be invoked outside of InterpCompiler (e.g., from a debugger using a bytecode start pointer), and plumbs additional metadata needed to support that.
Changes:
- Added `InterpMethod::codeSize` to preserve the emitted IR size (in `int32_t` slots) for later dumping.
- Extracted/rehomed IR dump helpers into standalone `Dump*` routines that can operate with explicit inputs (`code`, `pDataItems`, optional name map / `ICorJitInfo`).
- Added `InterpDumpIR(const InterpByteCodeStart*)` entrypoint to dump IR starting from an interpreted frame's `startIp`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/coreclr/interpreter/inc/interpretershared.h | Adds InterpMethod::codeSize and wires it into construction so IR dumping can know method bytecode length post-compilation. |
| src/coreclr/interpreter/compiler.h | Moves pointer-name sentinel constants to file scope for use by extracted dump helpers. |
| src/coreclr/interpreter/compiler.cpp | Refactors print/dump implementation into standalone helpers and introduces InterpDumpIR debugger entrypoint. |

    void *helperAddr = GetDataItemAtIndex(data.addressDataItemIndex);
    PrintPointer(helperAddr);
    void *helperAddr = pDataItems[data.addressDataItemIndex];
DumpHelperFtn (and several DumpInsData cases) now index directly into pDataItems without any bounds checking. Previously GetDataItemAtIndex/GetAddrOfDataItemAtIndex asserted and returned null on invalid indices; with the new code, a corrupted/incorrect operand can cause an out-of-bounds read and crash the dump helper itself. Consider passing the data-item count into DumpHelperFtn/DumpInsData (or providing a small safe accessor) and gracefully handling invalid indices (e.g., print "").

    - void *helperAddr = pDataItems[data.addressDataItemIndex];
    + if (pDataItems == nullptr || data.addressDataItemIndex < 0)
    + {
    +     printf("<bad data item index> ");
    +     return;
    + }
    + void* helperAddr = pDataItems[data.addressDataItemIndex];
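The suggestion above guards against a null table and a negative index, but not against an index past the end. A small safe accessor that also takes the data-item count, as the comment proposes, could look roughly like the sketch below. `DemoTryGetDataItem`, `DemoFormatHelperAddr`, and the `dataItemCount` parameter are hypothetical names used for illustration; the PR itself does not define them.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: stores the item and returns true only when the index is
// inside [0, dataItemCount). On failure *pItem is left as nullptr.
static bool DemoTryGetDataItem(void* const* pDataItems, int32_t dataItemCount,
                               int32_t index, void** pItem)
{
    assert(pItem != nullptr);
    *pItem = nullptr;
    if (pDataItems == nullptr || index < 0 || index >= dataItemCount)
        return false;
    *pItem = pDataItems[index];
    return true;
}

// Hypothetical formatter: prints the helper address, or a placeholder instead
// of reading out of bounds when the operand is corrupted.
static std::string DemoFormatHelperAddr(void* const* pDataItems, int32_t dataItemCount,
                                        int32_t index)
{
    char buf[64];
    void* addr = nullptr;
    if (!DemoTryGetDataItem(pDataItems, dataItemCount, index, &addr))
        snprintf(buf, sizeof(buf), "<bad data item index %d> ", (int)index);
    else
        snprintf(buf, sizeof(buf), "%p ", addr);
    return std::string(buf);
}
```

Returning a placeholder keeps the dump going, which matters most when the dump is being used to investigate a method whose IR may already be corrupted.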

    extern "C" void InterpDumpIR(const InterpByteCodeStart *startIp)
    {
        InterpMethod *pMethod = startIp->Method;
        const int32_t *code = startIp->GetByteCodes();
InterpDumpIR is intended to be called from a debugger, but clrinterpreter is linked with an explicit exports list (clrinterpreter.exports / clrinterpreter_unixexports.src) that currently only exports getJit/jitStartup. As-is, InterpDumpIR may not be callable on Unix/stripped builds due to hidden visibility and/or version-script exports. If you need this to work reliably, consider marking it INTERP_API and adding it to the interpreter export lists (or otherwise ensuring the symbol is retained/visible).
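As a rough sketch of the visibility concern, a debugger-callable entry point is typically given C linkage and explicit default visibility so that it survives builds compiled with hidden visibility. `DEMO_EXPORT` and `DemoDumpEntry` below are hypothetical; the actual definition of `INTERP_API` is not shown in this conversation and may differ.

```cpp
#include <cstdint>

// Hypothetical export macro; stands in for whatever the project really uses.
#if defined(_WIN32)
#define DEMO_EXPORT extern "C" __declspec(dllexport)
#else
#define DEMO_EXPORT extern "C" __attribute__((visibility("default")))
#endif

// Hypothetical debugger-callable entry point. C linkage keeps the name
// unmangled, so it can be typed directly into a debugger expression.
// Returns the number of IR slots it would dump, or -1 for invalid input.
DEMO_EXPORT int32_t DemoDumpEntry(const int32_t* code, int32_t codeSize)
{
    if (code == nullptr || codeSize < 0)
        return -1;
    return codeSize;
}
```

The attribute alone is not sufficient when the library is linked with an explicit export list or version script, because a symbol that is not listed there can still end up local. That is why the comment also asks for the symbol to be added to the export lists.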
I've always used the InterpDump. An ideal way would be to allow the SOS plugin to disassemble the IR bytecode, so that running !u (clru on Unix) would work for interpreter bytecode the same way it works for native code. Obviously, that would mean having a copy of the disassembling code in SOS (or exposing functions to do that via mscordaccore, but then we would generate debt for the CDAC effort). I was hesitant to add the copy to SOS before, since the bytecode was evolving a lot, but now that it has become stable, it might make sense to consider that.