Add support for dumping interpreter IR for debugging purposes #126370

Open
BrzVlad wants to merge 2 commits into dotnet:main from BrzVlad:feature-clrinterp-ir-dump
@BrzVlad
Member

@BrzVlad BrzVlad commented Mar 31, 2026

Interpreter IR was previously logged only from inside the InterpCompiler class, via PrintCompiledCode, and only during compilation (enabled with DOTNET_InterpDump). This commit moves that logic out of the class and passes everything it needs as arguments (dataItems, a pointer to the name map, etc.). It also adds a code size field to InterpMethod, since this information wasn't otherwise easily available. Some data is not pretty-printed, because we don't have access to the jitinterface methods in that case.

Example for a crash in InterpExecMethod:

(lldb) frame sel 4
frame #4: 0x00007ffff72f8b19 libcoreclr.so`InterpExecMethod(pInterpreterFrame=0x00007ffb899eaa20, pFrame=0x00007ffb899e92b0, pThreadContext=0x00007ffb78009000, pExceptionClauseArgs=0x0000000000000000) at interpexec.cpp:2566:42 [opt]
   2563	                case INTOP_STIND_O:
   2564	                {
   2565	                    char *dst = LOCAL_VAR(ip[1], char*);
-> 2566	                    OBJECTREF storeObj = LOCAL_VAR(ip[2], OBJECTREF);
   2567	                    NULL_CHECK(dst);
   2568	                    SetObjectReferenceUnchecked((OBJECTREF*)(dst + ip[3]), storeObj);
   2569	                    ip += 4;
(lldb) call InterpDumpIR(pFrame->startIp)
Dumping interpreter IR at 0x7fff7b514460 (method 0x7fff7a8c9230))
0008: IR_0000: initlocals     [nil <- nil], 16,16
0014: IR_0003: safepoint      [nil <- nil],
0018: IR_0004: mov.8          [48 <- 0],
0024: IR_0007: ldc.i4         [32 <- nil], 0
0030: IR_000a: ldc.i4         [64 <- nil], 0
003c: IR_000d: ldloca         [40 <- nil], 16
0048: IR_0010: zeroblk.imm    [nil <- 40], 8
0054: IR_0013: mov.vt         [72 <- 16], 8
0064: IR_0017: conv.u1.i4     [56 <- 32],
0070: IR_001a: call           [32 <- 48], 0x7fff79ea2030
0080: IR_001e: mov.8          [32 <- 0],
008c: IR_0021: mov.8          [40 <- 8],
0098: IR_0024: stind.o        [nil <- 32 40], 56
00a8: IR_0028: ret.void       [nil <- nil],
End of method: 00ac: IR_0029
(lldb) dumpmd 0x7fff7a8c9230
Method Name:          System.Threading.Tasks.Task`1[[System.__Canon, System.Private.CoreLib]]..ctor(System.__Canon)
...
(lldb) dumpmd 0x7fff79ea2030
Method Name:          System.Threading.Tasks.Task..ctor(Boolean, System.Threading.Tasks.TaskCreationOptions, System.Threading.CancellationToken)
...
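
A note on reading the listing above: its two leading columns appear to be related by simple arithmetic. IR_0000 is printed at 0008 and IR_0029 at 00ac, which is consistent with 4-byte slots preceded by 8 bytes. The sketch below encodes that relation. The 8-byte figure is inferred from this one listing (a 64-bit build), not from the source, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption read off the sample listing: the first instruction (IR_0000)
// is printed at byte offset 0x0008, so 8 bytes precede the instruction stream.
constexpr std::size_t kHeaderBytes = 8;
// Each IR slot is one int32_t.
constexpr std::size_t kSlotBytes = sizeof(std::int32_t);

// Maps an IR slot index (the "IR_xxxx" column) to the byte offset column.
std::size_t ByteOffsetOfSlot(std::size_t slotIndex)
{
    return kHeaderBytes + slotIndex * kSlotBytes;
}

// Inverse mapping: byte offset column back to the IR slot index.
std::size_t SlotOfByteOffset(std::size_t byteOffset)
{
    assert(byteOffset >= kHeaderBytes);
    assert((byteOffset - kHeaderBytes) % kSlotBytes == 0);
    return (byteOffset - kHeaderBytes) / kSlotBytes;
}

// Formats the two leading columns the way the sample listing shows them.
std::string FormatPrefix(std::size_t slotIndex)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%04zx: IR_%04zx:", ByteOffsetOfSlot(slotIndex), slotIndex);
    return buf;
}
```

For the faulting stind.o at IR_0024, `ByteOffsetOfSlot(0x24)` gives 0x98, which matches the listing, and the handler's `ip += 4` then lands on IR_0028, the ret.void.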

BrzVlad added 2 commits March 31, 2026 21:48
Copilot AI review requested due to automatic review settings March 31, 2026 19:07
@BrzVlad BrzVlad requested a review from janvorli as a code owner March 31, 2026 19:07
@BrzVlad BrzVlad requested a review from kg as a code owner March 31, 2026 19:07
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @BrzVlad, @janvorli, @kg
See info in area-owners.md if you want to be subscribed.

@BrzVlad
Member Author

BrzVlad commented Mar 31, 2026

@janvorli @davidwrighton I assume you have spent some time in the debugger while iterating on interpreter failures. It seems weird to investigate without an easy way to dump the code of various methods, and using only the interp dump is awkward and noisy. Any thoughts on this? Did you use a different approach?

Contributor

Copilot AI left a comment

Pull request overview

Refactors interpreter IR dumping so it can be invoked outside of InterpCompiler (e.g., from a debugger using a bytecode start pointer), and plumbs additional metadata needed to support that.

Changes:

  • Added InterpMethod::codeSize to preserve the emitted IR size (in int32_t slots) for later dumping.
  • Extracted/rehomed IR dump helpers into standalone Dump* routines that can operate with explicit inputs (code, pDataItems, optional name map / ICorJitInfo).
  • Added InterpDumpIR(const InterpByteCodeStart*) entrypoint to dump IR starting from an interpreted frame’s startIp.
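
The three changes fit together roughly as follows: the debugger hands over a bytecode start pointer, the method is recovered from it, and the stored code size bounds the walk over the instruction stream. This is a simplified sketch, not the PR's code. ToyMethod, ToyByteCodeStart, ToyDumpIR and the three-entry opcode table are hypothetical stand-ins, and the real InterpByteCodeStart may lay out its bytecode differently from the explicit pointer used here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical stand-in for InterpMethod: only the new size field matters here.
struct ToyMethod
{
    int32_t codeSize; // emitted IR size, in int32_t slots
};

// Hypothetical stand-in for InterpByteCodeStart.
struct ToyByteCodeStart
{
    const ToyMethod* Method;
    const int32_t* code; // assumption: an explicit pointer keeps the sketch simple
    const int32_t* GetByteCodes() const { return code; }
};

// Toy opcode table: name and total length in slots (opcode + operands).
struct ToyOpInfo { const char* name; int32_t length; };
static const ToyOpInfo kToyOps[] = { {"nop", 1}, {"ldc.i4", 3}, {"ret.void", 1} };
constexpr int32_t kToyOpCount = sizeof(kToyOps) / sizeof(kToyOps[0]);

// Walks the instruction stream using only what the start pointer gives us.
std::string ToyDumpIR(const ToyByteCodeStart* startIp)
{
    assert(startIp != nullptr && startIp->Method != nullptr);
    const ToyMethod* method = startIp->Method;
    const int32_t* code = startIp->GetByteCodes();

    std::string out;
    int32_t ip = 0;
    while (ip < method->codeSize)
    {
        int32_t opcode = code[ip];
        if (opcode < 0 || opcode >= kToyOpCount) { out += "<unknown opcode>\n"; break; }
        const ToyOpInfo& op = kToyOps[opcode];
        if (op.length > method->codeSize - ip) { out += "<truncated>\n"; break; }

        char prefix[32];
        std::snprintf(prefix, sizeof(prefix), "IR_%04x: ", static_cast<unsigned>(ip));
        out += prefix;
        out += op.name;
        for (int32_t i = 1; i < op.length; i++)
        {
            out += ' ';
            out += std::to_string(code[ip + i]);
        }
        out += '\n';
        ip += op.length; // variable-length instructions, as in the real listing
    }
    return out;
}
```

Without a size stored on the method, a walker like this would have no stopping condition once it is detached from the compiler, which is presumably why the field was added.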

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/coreclr/interpreter/inc/interpretershared.h | Adds InterpMethod::codeSize and wires it into construction so IR dumping can know method bytecode length post-compilation. |
| src/coreclr/interpreter/compiler.h | Moves pointer-name sentinel constants to file scope for use by extracted dump helpers. |
| src/coreclr/interpreter/compiler.cpp | Refactors print/dump implementation into standalone helpers and introduces InterpDumpIR debugger entrypoint. |


- void *helperAddr = GetDataItemAtIndex(data.addressDataItemIndex);
- PrintPointer(helperAddr);
+ void *helperAddr = pDataItems[data.addressDataItemIndex];

Copilot AI Mar 31, 2026

DumpHelperFtn (and several DumpInsData cases) now index directly into pDataItems without any bounds checking. Previously GetDataItemAtIndex/GetAddrOfDataItemAtIndex asserted and returned null on invalid indices; with the new code, a corrupted/incorrect operand can cause an out-of-bounds read and crash the dump helper itself. Consider passing the data-item count into DumpHelperFtn/DumpInsData (or providing a small safe accessor) and gracefully handling invalid indices (e.g., print a placeholder string).

Suggested change
- void *helperAddr = pDataItems[data.addressDataItemIndex];
+ if (pDataItems == nullptr || data.addressDataItemIndex < 0)
+ {
+     printf("<bad data item index> ");
+     return;
+ }
+ void* helperAddr = pDataItems[data.addressDataItemIndex];
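
One way to realize the "small safe accessor" idea is to pass the data-item count alongside the table and check both bounds before indexing. The sketch below assumes a hypothetical dataItemCount parameter and hypothetical helper names (TryGetDataItem, FormatHelperAddress). None of them appear in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical accessor: succeeds only when index is within [0, dataItemCount).
bool TryGetDataItem(void* const* pDataItems, int32_t dataItemCount, int32_t index, void** ppItem)
{
    assert(ppItem != nullptr);
    if (pDataItems == nullptr || index < 0 || index >= dataItemCount)
    {
        *ppItem = nullptr;
        return false;
    }
    *ppItem = pDataItems[index];
    return true;
}

// Hypothetical formatter: returns a placeholder instead of reading out of bounds.
std::string FormatHelperAddress(void* const* pDataItems, int32_t dataItemCount, int32_t index)
{
    void* helperAddr = nullptr;
    if (!TryGetDataItem(pDataItems, dataItemCount, index, &helperAddr))
    {
        return "<bad data item index>";
    }
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%p", helperAddr);
    return buf;
}
```

Compared with the suggested change, this also rejects indices past the end of the table, at the cost of threading one more argument through the dump helpers.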

Comment on lines +11693 to +11697
extern "C" void InterpDumpIR(const InterpByteCodeStart *startIp)
{
    InterpMethod *pMethod = startIp->Method;
    const int32_t *code = startIp->GetByteCodes();


Copilot AI Mar 31, 2026

InterpDumpIR is intended to be called from a debugger, but clrinterpreter is linked with an explicit exports list (clrinterpreter.exports / clrinterpreter_unixexports.src) that currently only exports getJit/jitStartup. As-is, InterpDumpIR may not be callable on Unix/stripped builds due to hidden visibility and/or version-script exports. If you need this to work reliably, consider marking it INTERP_API and adding it to the interpreter export lists (or otherwise ensuring the symbol is retained/visible).
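
For illustration, keeping a debugger-only entrypoint visible typically takes a visibility annotation on the definition, in addition to any export-list entry. The macro below is a hypothetical stand-in for INTERP_API, whose actual definition is not shown in this thread, and ToyDebugDumpSlots is likewise made up. The attributes themselves are the standard GCC/Clang and MSVC ones.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical export macro (a stand-in for INTERP_API, definition assumed).
#if defined(_MSC_VER)
#define TOY_DEBUG_API extern "C" __declspec(dllexport)
#else
// "default" overrides -fvisibility=hidden; "used" keeps the function emitted
// even though nothing in the program references it.
#define TOY_DEBUG_API extern "C" __attribute__((visibility("default"), used))
#endif

// Hypothetical debugger-callable helper: prints raw slots, returns how many it printed.
TOY_DEBUG_API int32_t ToyDebugDumpSlots(const int32_t* code, int32_t slotCount)
{
    assert(slotCount >= 0);
    if (code == nullptr)
    {
        return 0;
    }
    for (int32_t i = 0; i < slotCount; i++)
    {
        std::printf("IR_%04x: %d\n", static_cast<unsigned>(i), static_cast<int>(code[i]));
    }
    return slotCount;
}
```

With a linker version script or exports file in play, as the comment describes for clrinterpreter, the symbol would still need to be listed there. The attribute alone only changes the compiler's default visibility.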

@janvorli
Member

> Using only interp dump is awkward and noisy. Any thoughts on this, did you use a different approach?

I've always used the InterpDump. Ideally, the SOS plugin would be able to disassemble the IR bytecode, so that running !u (clru on Unix) would work for interpreter bytecode the same way it works for native code. That would mean keeping a copy of the disassembling code in SOS (or exposing functions for it via mscordaccore, but that would create debt for the CDAC effort). I was hesitant to add the copy to SOS before, since the bytecode was evolving a lot, but now that it has become stable, it might make sense to consider that.
