[WIP] [cDAC] Adding symbols and using heap dumps #126385
rcj1 wants to merge 5 commits into dotnet:main
Pull request overview
This PR updates the cDAC DumpTests infrastructure to prefer heap dumps (instead of full dumps) and makes dump analysis more reliable by providing ClrMD with local symbol/module search paths (including a self-contained symbol layout for Helix runs).
Changes:
- Switch DumpTests to default to heap dumps (removing per-test `DumpType => "full"` overrides) and update debuggee projects to generate heap dumps.
- Add symbol/module path discovery in `DumpTestBase` and plumb these paths into `ClrMdDumpHost.Open`.
- Update Helix dump generation to copy `System.Private.CoreLib.dll` and debuggee DLLs into a `symbols/` folder alongside dumps so analysis can resolve modules without the original testhost layout.
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/native/managed/cdac/tests/DumpTests/SyncBlockDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/StackWalkDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/RCWDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/RCWCleanupListDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/LoaderDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/IXCLRDataValueDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/IXCLRDataFrameDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/IXCLRDataAppDomainDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/ISOSDacInterface13Tests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/ExceptionHandlingInfoDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/EcmaMetadataDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/DumpTestBase.cs | Passes additional symbol paths to ClrMD and adds logic to discover symbol/module directories (Helix symbols/ and local artifacts fallback). |
| src/native/managed/cdac/tests/DumpTests/ClrMdDumpHost.cs | Extends dump opening to accept additional symbol paths and configures ClrMD symbol search paths. |
| src/native/managed/cdac/tests/DumpTests/cdac-dump-helix.proj | Copies CoreLib + debuggee DLLs into a symbols/ tree next to dumps to enable symbol/module resolution on Helix machines. |
| src/native/managed/cdac/tests/DumpTests/ComWrappersDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/CCWDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/AsyncContinuationDumpTests.cs | Removes full-dump override so the test uses default heap dump behavior. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/SyncBlock/SyncBlock.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/StackWalk/StackWalk.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/RCWCleanupList/RCWCleanupList.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/RCW/RCW.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/MultiModule/MultiModule.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/LocalVariables/LocalVariables.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/ExceptionHandlingInfo/ExceptionHandlingInfo.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/ComWrappers/ComWrappers.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/CCW/CCW.csproj | Switches debuggee dump generation to heap dumps. |
| src/native/managed/cdac/tests/DumpTests/Debuggees/AsyncContinuation/AsyncContinuation.csproj | Switches debuggee dump generation to heap dumps (and removes prior “full dump needed” comment). |

Tagging subscribers to this area: @steveisok, @tommcdon, @dotnet/dotnet-diag
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

    // Local paths first so they resolve before the network fallback.
    string localPaths = additionalSymbolPaths is not null
        ? string.Join(";", additionalSymbolPaths.Where(p => !string.IsNullOrEmpty(p)))
        : string.Empty;

    string symbolPaths = localPaths.Length > 0
        ? localPaths + ";" + SymbolPath.MicrosoftSymbolServerPath

ClrMdDumpHost.Open always appends MicrosoftSymbolServerPath, which can introduce network dependency and latency/flakiness in Helix/offline environments even when local symbol/module paths are provided. Consider using only the provided local paths by default and gating the symbol server fallback behind an opt-in (e.g., env var) or only enabling it when no local paths are available.

    - // Local paths first so they resolve before the network fallback.
    - string localPaths = additionalSymbolPaths is not null
    -     ? string.Join(";", additionalSymbolPaths.Where(p => !string.IsNullOrEmpty(p)))
    -     : string.Empty;
    - string symbolPaths = localPaths.Length > 0
    -     ? localPaths + ";" + SymbolPath.MicrosoftSymbolServerPath
    + // Local paths first; only fall back to the Microsoft symbol server when no local paths are provided.
    + string localPaths = additionalSymbolPaths is not null
    +     ? string.Join(";", additionalSymbolPaths.Where(p => !string.IsNullOrEmpty(p)))
    +     : string.Empty;
    + string symbolPaths = localPaths.Length > 0
    +     ? localPaths


    foreach (string versionPath in Directory.GetDirectories(sharedFxDir))
        paths.Add(versionPath);

GetSymbolPaths adds every version directory under artifacts/bin/testhost/**/shared/Microsoft.NETCore.App/* to the symbol search path. If multiple runtime versions are present locally, ClrMD may resolve System.Private.CoreLib.dll from the wrong version (same filename), leading to metadata mismatches. Prefer selecting a single version directory that matches the dump/runtime under test (or otherwise disambiguate), rather than adding all versions.

    - foreach (string versionPath in Directory.GetDirectories(sharedFxDir))
    -     paths.Add(versionPath);
    + string[] versionPaths = Directory.GetDirectories(sharedFxDir);
    + if (versionPaths.Length > 0)
    + {
    +     string? preferredVersion = Path.GetFileName(versionDir);
    +     string? selectedPath = null;
    +     if (!string.IsNullOrEmpty(preferredVersion))
    +     {
    +         foreach (string candidate in versionPaths)
    +         {
    +             string candidateVersion = Path.GetFileName(candidate);
    +             if (string.Equals(candidateVersion, preferredVersion, StringComparison.OrdinalIgnoreCase))
    +             {
    +                 selectedPath = candidate;
    +                 break;
    +             }
    +         }
    +     }
    +     if (selectedPath is null)
    +     {
    +         Version? bestVersion = null;
    +         foreach (string candidate in versionPaths)
    +         {
    +             string candidateVersion = Path.GetFileName(candidate);
    +             if (Version.TryParse(candidateVersion, out Version? parsed))
    +             {
    +                 if (bestVersion is null || parsed > bestVersion)
    +                 {
    +                     bestVersion = parsed;
    +                     selectedPath = candidate;
    +                 }
    +             }
    +         }
    +         // If no directory name parsed as a Version, fall back to the first entry for deterministic behavior.
    +         selectedPath ??= versionPaths[0];
    +     }
    +     paths.Add(selectedPath);
    + }


        Include="mkdir %25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\runtime" />
    <_HelixCommandLines Condition="'$(TargetOS)' == 'windows'"
        Include="for /D %25%25v in (%25HELIX_CORRELATION_PAYLOAD%25\shared\Microsoft.NETCore.App\*) do copy &quot;%25%25v\System.Private.CoreLib.dll&quot; &quot;%25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\runtime\&quot; >nul" />

The Windows symbol copy loop copies `System.Private.CoreLib.dll` from every directory under `shared/Microsoft.NETCore.App` into a single destination folder. If multiple framework versions exist, the file is overwritten on each iteration, and the copy that remains may not match the dump. Consider copying the versioned directory structure (or selecting the single runtime version used for dump generation) to avoid ambiguity.

    <!-- Unix: copy SPC from testhost shared framework -->
    <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
        Include="echo '=== Copying symbols ==='" />
    <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
        Include="mkdir -p $HELIX_WORKITEM_PAYLOAD/dumps/local/symbols/runtime" />
    <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
        Include="cp $HELIX_CORRELATION_PAYLOAD/shared/Microsoft.NETCore.App/*/System.Private.CoreLib.dll $HELIX_WORKITEM_PAYLOAD/dumps/local/symbols/runtime/" />

On Unix, the wildcard copy (.../Microsoft.NETCore.App/*/System.Private.CoreLib.dll) will copy from all matching versions into a single folder; if multiple versions exist this results in nondeterministic overwrites and potential version mismatch. Consider preserving version subfolders under symbols/runtime or resolving the single framework version used for dump generation.

    - <!-- Unix: copy SPC from testhost shared framework -->
    - <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
    -     Include="echo '=== Copying symbols ==='" />
    - <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
    -     Include="mkdir -p $HELIX_WORKITEM_PAYLOAD/dumps/local/symbols/runtime" />
    - <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
    -     Include="cp $HELIX_CORRELATION_PAYLOAD/shared/Microsoft.NETCore.App/*/System.Private.CoreLib.dll $HELIX_WORKITEM_PAYLOAD/dumps/local/symbols/runtime/" />
    + <!-- Unix: copy SPC from testhost shared framework (preserve version subfolders) -->
    + <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
    +     Include="echo '=== Copying symbols ==='" />
    + <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
    +     Include="mkdir -p $HELIX_WORKITEM_PAYLOAD/dumps/local/symbols/runtime" />
    + <_HelixCommandLines Condition="'$(TargetOS)' != 'windows'"
    +     Include="for dir in $HELIX_CORRELATION_PAYLOAD/shared/Microsoft.NETCore.App/*; do if [ -f &quot;$dir/System.Private.CoreLib.dll&quot; ]; then ver=$(basename &quot;$dir&quot;); mkdir -p &quot;$HELIX_WORKITEM_PAYLOAD/dumps/local/symbols/runtime/$ver&quot;; cp &quot;$dir/System.Private.CoreLib.dll&quot; &quot;$HELIX_WORKITEM_PAYLOAD/dumps/local/symbols/runtime/$ver/&quot;; fi; done" />

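
The version-preserving copy suggested above can be tried outside Helix with a small standalone sketch. Everything here is an assumption for illustration: the temporary directories stand in for `$HELIX_CORRELATION_PAYLOAD` and `$HELIX_WORKITEM_PAYLOAD`, the version names `10.0.0` and `11.0.0` are invented, and `copy_corelib_per_version` is a hypothetical helper name, not something from the PR.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-ins for the Helix correlation and work item payloads.
work=$(mktemp -d)
correlation="$work/correlation"
workitem="$work/workitem"

# Fake two shared-framework versions, each with its own placeholder CoreLib.
for ver in 10.0.0 11.0.0; do
    mkdir -p "$correlation/shared/Microsoft.NETCore.App/$ver"
    printf 'corelib %s\n' "$ver" \
        > "$correlation/shared/Microsoft.NETCore.App/$ver/System.Private.CoreLib.dll"
done

# Copy System.Private.CoreLib.dll from each version directory under $1
# into a subfolder of $2 named after that version, so copies never collide.
copy_corelib_per_version() {
    src_root=$1
    dest_root=$2
    for dir in "$src_root"/*; do
        if [ -f "$dir/System.Private.CoreLib.dll" ]; then
            ver=$(basename "$dir")
            mkdir -p "$dest_root/$ver"
            cp "$dir/System.Private.CoreLib.dll" "$dest_root/$ver/"
        fi
    done
}

copy_corelib_per_version \
    "$correlation/shared/Microsoft.NETCore.App" \
    "$workitem/dumps/local/symbols/runtime"
```

After the run, each version's file sits under its own `symbols/runtime/<version>/` folder instead of one overwriting the other. One caveat when moving such a loop back into an MSBuild attribute: `$(...)` is also MSBuild's property-reference syntax, so the command substitution would likely need escaping (for example, writing `$` as `%24`); that has not been verified against this project.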
Refactor symbol path handling in Open method.

        Include="for /D %25%25v in (%25HELIX_CORRELATION_PAYLOAD%25\shared\Microsoft.NETCore.App\*) do copy &quot;%25%25v\System.Private.CoreLib.dll&quot; &quot;%25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\runtime\&quot; >nul" />
    <!-- Windows: copy each debuggee DLL -->
    <_HelixCommandLines Condition="'$(TargetOS)' == 'windows'"
        Include="@(_UniqueDebuggee->'mkdir %25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\debuggees\%(Identity)')" />
    <_HelixCommandLines Condition="'$(TargetOS)' == 'windows'"
        Include="@(_UniqueDebuggee->'copy &quot;%25HELIX_WORKITEM_PAYLOAD%25\debuggees\%(Identity)\%(Identity).dll&quot; &quot;%25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\debuggees\%(Identity)\&quot; >nul')" />

On Windows, this `for /D ... do copy` loop can execute multiple times if multiple shared framework version directories exist, and subsequent copy operations may prompt on overwrite (potentially hanging the Helix command). Adding `copy /Y` (or otherwise ensuring only one version is copied, or removing any existing destination file) would make this more robust.

    -     Include="for /D %25%25v in (%25HELIX_CORRELATION_PAYLOAD%25\shared\Microsoft.NETCore.App\*) do copy &quot;%25%25v\System.Private.CoreLib.dll&quot; &quot;%25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\runtime\&quot; >nul" />
    - <!-- Windows: copy each debuggee DLL -->
    - <_HelixCommandLines Condition="'$(TargetOS)' == 'windows'"
    -     Include="@(_UniqueDebuggee->'mkdir %25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\debuggees\%(Identity)')" />
    - <_HelixCommandLines Condition="'$(TargetOS)' == 'windows'"
    -     Include="@(_UniqueDebuggee->'copy &quot;%25HELIX_WORKITEM_PAYLOAD%25\debuggees\%(Identity)\%(Identity).dll&quot; &quot;%25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\debuggees\%(Identity)\&quot; >nul')" />
    +     Include="for /D %25%25v in (%25HELIX_CORRELATION_PAYLOAD%25\shared\Microsoft.NETCore.App\*) do copy /Y &quot;%25%25v\System.Private.CoreLib.dll&quot; &quot;%25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\runtime\&quot; >nul" />
    + <!-- Windows: copy each debuggee DLL -->
    + <_HelixCommandLines Condition="'$(TargetOS)' == 'windows'"
    +     Include="@(_UniqueDebuggee->'mkdir %25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\debuggees\%(Identity)')" />
    + <_HelixCommandLines Condition="'$(TargetOS)' == 'windows'"
    +     Include="@(_UniqueDebuggee->'copy /Y &quot;%25HELIX_WORKITEM_PAYLOAD%25\debuggees\%(Identity)\%(Identity).dll&quot; &quot;%25HELIX_WORKITEM_PAYLOAD%25\dumps\local\symbols\debuggees\%(Identity)\&quot; >nul')" />

For cDAC dump testing, we need metadata for our locally built System.Private.CoreLib (for example, to read certain statics) and for the debuggees (for the metadata tests). To use heap dumps instead of full dumps, we add local symbol paths when we open the dump in ClrMD.