Skip to content

Refactor EraseUnions: DU-based domain model for union lowering#19518

Open
T-Gro wants to merge 45 commits intomainfrom
refactoring/erase-unions
Open

Refactor EraseUnions: DU-based domain model for union lowering#19518
T-Gro wants to merge 45 commits intomainfrom
refactoring/erase-unions

Conversation

@T-Gro
Copy link
Copy Markdown
Member

@T-Gro T-Gro commented Mar 30, 2026

@ALL: Pls provide feedback if you have any

Preparatory work to make the union lowering codebase ready for C# union type interop — when that lands, we'll need to decide case by case (pun intended) which IL representation strategy to use for each DU shape, and the old code made that very hard to reason about.

Related:

Replaces UnionReprDecisions (boolean predicates scattered across ~30 functions) with explicit classifications:

  • UnionLayout (9 cases) — how the union type is structured in IL
  • CaseStorage (4 cases) — where each individual case lives
  • DataAccess (4 cases) — replaces avoidHelpers: bool threaded through 16 functions
  • Active patterns with data for discrimination method, tag field presence, etc.

Splits into three files: classification algebra, IL instruction emission, type-def generation.

Zero IL output changes.

T-Gro and others added 30 commits March 30, 2026 17:14
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extract three helper functions from megafunctions in EraseUnions.fs:

- rewriteFieldsForStructFlattening: Nullable attribute rewriting for struct DU
  field flattening, extracted from mkClassUnionDef
- emitDebugProxyType: Debug proxy type generation for union alternatives,
  extracted from convAlternativeDef
- rootTypeNullableAttrs: Root type [Nullable(2)] attribute application,
  extracted from mkClassUnionDef

Zero behavior change - code is moved verbatim into named functions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Wire TypeDefContext into mkClassUnionDef for parameter bundling
- Change return type from 6-element tuple to AlternativeDefResult record
- Extract emitMakerMethod for non-nullary case maker methods
- Extract emitTesterMethodAndProperty for Is* test methods/properties
- Extract emitNullaryCaseAccessor for nullary case getter properties
- Extract emitConstantAccessor for unique singleton object accessors
- Extract emitNullaryConstField for nullary case static field definitions
- Extract emitNestedAlternativeType for nested case type definitions
- Rename convAlternativeDef to processAlternative (~46 lines)
- Format with fantomas

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Extract emitRootClassFields, emitRootConstructors, emitConstFieldInitializers,
emitTagInfrastructure, assembleUnionTypeDef, computeSelfAndTagFields, and
computeEnumTypeDef from the monolithic mkClassUnionDef function. The main
function is now a 54-line orchestrator that delegates to these helpers.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Delete DiscriminationTechnique DU, UnionReprDecisions generic class,
cuspecRepr/cudefRepr instances, NoTypesGeneratedViaThisReprDecider,
and layoutToTechnique bridge function.

All code paths now use UnionLayout type with exhaustive active patterns
and focused helper functions (altFoldsAsRootInstance, altOptimizesToRoot,
maintainConstantField, hasNullCase).

Simplify AlternativeDefResult.NullaryConstFields tuple by removing the
unused (ILTypeDef * IlxUnionInfo) element.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The NoHelpers branch extracted isStruct/isNull into booleans then used
if/elif chains. Replace with direct pattern matching on (layout, cidx)
and layout active patterns, making the logic readable at a glance.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
tyForAlt duplicated the entire altOptimizesToRoot classification logic
inline. Extract tyForAltIdx taking cidx for direct calls, and have
tyForAlt look up cidx. All internal callers that already have cidx
now call tyForAltIdx directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 'not alt.IsNullary && (match ctx.layout with ValueTypeLayout
-> true | ReferenceTypeLayout -> false)' with a direct match on
ctx.layout with a when guard, collapsing two match arms into one.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the boolean chain 'g.checkNullness && g.langFeatureNullness &&
(match layout ... -> true | ... -> false) && not alt.IsNullary' with a
direct pattern match on ctx.layout with when guard. Also merge the two
early-return conditions into one.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 'altFoldsAsRootInstance || (match layout ValueTypeLayout -> true
| ReferenceTypeLayout -> false)' with a clear match: value types always
put fields on root, reference types only when they fold as root instance.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Break the 5-line boolean expression into named conditions with comments
explaining when the root ctor is needed vs skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tening

The function took an isStruct bool and checked 'isStruct && cases > 1'
which is exactly TaggedStructUnion. Match on layout directly and remove
the now-unnecessary cud parameter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…stance

isSingleNonNullaryFoldedToRoot duplicated logic already in
altFoldsAsRootInstance for SmallRefUnion. Replace all 4 call sites with
altFoldsAsRootInstance which encodes the same semantics. Also simplify
caseFoldsToRootClass which was a thin wrapper around the eliminated fn.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
caseFoldsToRootClass was a thin wrapper around altFoldsAsRootInstance
for SmallRefUnion. All 3 callers are already inside SmallRefUnion match
arms, so they can call altFoldsAsRootInstance directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace nested if/elif chain with a match on (isList, alts.Length,
isStruct) tuple. Hoist allNullary computation. Each match arm maps
cleanly to one UnionLayout case.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 'alt.IsNullary && (match ValueTypeLayout -> false | ...) &&
(match CaseIsNull -> false | ...)' with a nested match that reads
naturally: null-represented cases don't need a constant field,
value types don't need one, ref types do.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace 'g.checkNullness && g.langFeatureNullness && (match CaseIsNull
-> true | ...)' with a direct match on (layout, num) with a when guard,
eliminating the AP-to-bool intermediate.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Introduce DataAccess = RawFields | ViaHelpers | ViaListHelpers that
collapses the per-call-site avoidHelpers:bool × per-union HasHelpers
enum into a single DU computed once at entry points.

- avoidHelpers parameter eliminated from 16 internal + 7 public functions
- doesRuntimeTypeDiscriminateUseHelper simplified from 3-way && to DU match
- emitLdDataTagPrim: 'match HasHelpers with AllHelpers when not avoidHelpers'
  becomes clean 'match access with ViaHelpers | ViaListHelpers'
- adjustFieldName split: DataAccess version for access path,
  adjustFieldNameForTypeDef for type-def path

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
nullCaseIdx was confusable with 'nullary' (no data). It actually means
UseNullAsTrueValue — a case represented as null at runtime. Renamed to
nullAsTrueValueIdx throughout.

ListTailOrNull renamed to FSharpList for clarity — this layout is
exclusively for F# list<'T>.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lary DU cases

TaggedRefUnion(baseTy, allNullary:bool) → TaggedRef baseTy + TaggedRefAllNullary baseTy
TaggedStructUnion(baseTy, allNullary:bool) → TaggedStruct baseTy + TaggedStructAllNullary baseTy

Eliminates the hidden 3-way logic where TaggedRefUnion(_, true) was
enum-like (all on root), TaggedRefUnion(_, false) split nullary→root
vs non-nullary→nested. Now explicit DU cases, no boolean field.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fix SpecialFSharpOptionHelpers mapping (ViaHelpers, not RawFields).

TaggedRefUnion(_, allNullary) → TaggedRef + TaggedRefAllNullary
TaggedStructUnion(_, allNullary) → TaggedStruct + TaggedStructAllNullary
SmallRefUnion(_, opt) → SmallRef + SmallRefWithNullAsTrueValue(_, idx)

UnionLayout now has 8 cases, zero boolean fields, zero option fields.
Every match arm is explicit — no hidden 3-way logic via bool destructuring.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove 3 dead SmallRef when-guards (altFoldsAsRootInstance always
  returns false for SmallRef — dead code since the DU split)
- Replace wildcard in altFoldsAsRootInstance with explicit cases
  for compiler-enforced exhaustiveness
- Eliminate hasNullCase function — callers match SmallRefWithNullAsTrueValue directly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nion

- Add nullnessCheckingEnabled helper replacing 5 scattered
  'g.checkNullness && g.langFeatureNullness' conjunctions
- Restructure classifyUnion: replace 4 when-guarded arms with
  nested match on (isStruct, allNullary) — exhaustive, no guards

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…er guard

- Inline doesRuntimeTypeDiscriminateUseHelper (1-line function, 2 call
  sites) directly into mkRuntimeTypeDiscriminate and mkRuntimeTypeDiscriminateThen
- Replace 'cud.UnionCases.Length <= 1 || (match SmallRefWithNull -> true)'
  with clean match on layout cases (SingleCaseRef, SingleCaseStruct,
  SmallRefWithNullAsTrueValue → skip tester)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lthrough

TaggedStructAllNullary + when alt.IsNullary was redundant — all cases
are nullary by definition. Split into explicit arm without guard.
Remove TaggedStructAllNullary from default fallthrough (already handled).
Remove dead 'when cud.UnionCases.Length <= 1' guard (covered by layout).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove duplicate XML doc comment at line 1785.
Remove dead FSharpList arm from emitRawConstruction default (already
handled by when alt.IsNullary and non-nullary Cons arms above).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
F1: Replace NullaryConstFields 5-tuple with NullaryConstFieldInfo record
    — named fields instead of opaque (IlxUnionCase*ILType*int*ILFieldDef*bool)

F5: Introduce CaseStorage DU (Null|Singleton|OnRoot|InNestedType|StructTag)
    — classifyCaseStorage computes once per case, used in emitCastToCase
      and processAlternative to eliminate duplicated decision trees

F7: Rename helpers HOW→WHAT for domain clarity:
    altFoldsAsRootInstance → caseFieldsOnRoot
    altOptimizesToRoot → caseRepresentedOnRoot
    maintainConstantField → needsSingletonField
    convNewDataInstrInternal → emitRawNewData

F8: emitDebugProxyType: 11 positional params → TypeDefContext + 2 params

F4: Extract ILStamping record from TypeDefContext, separating domain
    data (layout, cuspec, cud) from infrastructure callbacks

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
G1: Introduce CaseIdentity record + resolveCase — eliminates repeated
    (alt, altTy, altName) computation in 4 emit functions

G3: Use CaseStorage in emitRawConstruction — replaces nested layout
    match + 4 when guards with flat match on precomputed CaseStorage.
    Fix: needsSingletonField fallback for nullary SmallRef cases.

G4: Merge duplicate Singleton/InNestedType branches (OR-pattern)

G5: Remove mkTagFieldFormalType (identical to mkTagFieldType)

G9: Rename self* → rootCase* for domain clarity

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
T-Gro and others added 13 commits March 30, 2026 17:14
- [<Struct>] on CaseIdentity record — allocated per-instruction-site,
  immediately destructured, no need to heap-allocate
- inline on (|CaseIsNull|CaseIsAllocated|) AP — avoids tuple allocation
  at 8 call sites (tuple was (layout, cidx))
- Array.filter |> Array.length = 1 → Array.existsOne in caseFieldsOnRoot
  — eliminates intermediate array allocation per call
- cuspec.Alternatives → cuspec.AlternativesArray in emitLdDataTagPrim
  — avoids Array.toList allocation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add tyForAltIdxWith and resolveCaseWith that accept precomputed layout
and baseTy. Update emit functions (emitRawConstruction, emitIsCase,
emitBranchOnCase, emitCastToCase, emitCaseSwitch, emitLdDataTagPrim)
and type-def functions (emitNullaryConstField, emitNestedAlternativeType)
to use the precomputed values instead of recomputing per call.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Extract rewriteNullableAttrForFlattenedField from
  rewriteFieldsForStructFlattening — reduces nesting from 3→1 level,
  documents the byte semantics clearly
- Add comments: reverse iteration rationale in emitLdDataTagPrim,
  minNullaryIdx ctor sharing logic, isAbstract derivation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…thod

Replace inline re-derivation from raw UnionLayout with two-axis
classification: CaseStorage (WHERE) × DiscriminationMethod AP (HOW).

emitIsCase: match storage with Null→null-ceq, then match (storage,
layout) with OnRoot+RuntimeType→non-null-test, NoDiscrimination→
always-true, RuntimeType→isinst, TagField→tag-ceq, TailNull→tail-check.

emitBranchOnCase: same two-axis structure with branch instructions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ps params

- Add two-axis architecture comment documenting UnionLayout×CaseStorage model
- Add InteropCapability type + classifier for fslang-suggestions/1463 readiness
  (ClosedHierarchy|UnionProtocol|UnionProtocolWithTupleBoxing|NoInterop)
- Reduce mkMethodsAndPropertiesForFields from 8 params to 3 (ctx, ilTy, fields)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Dead code for a future feature that hasn't been approved.
Only code that is actually used belongs in the codebase.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ldsOnRootType APs

All three were never called by any live code path:
- _validateActivePatterns: 'compile-time check' trick — real match sites enforce exhaustiveness
- NonNullaryFoldsToRoot AP: only consumed by the removed validation function
- FieldsOnRootType AP: same

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Active patterns can carry data. The DiscriminationMethod AP now returns:
- DiscriminateByTagField baseTy
- DiscriminateByRuntimeType(baseTy, nullAsTrueValueIdx: int option)
- DiscriminateByTailNull baseTy
- NoDiscrimination baseTy

This eliminates 3 separate baseTyOfUnionSpec calls and 2 re-matches
on SmallRefWithNullAsTrueValue to extract nullIdx in emit functions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
EraseUnions.Types.fs (421 lines): Domain model, classification, APs
  UnionLayout, CaseStorage, DataAccess, CaseIdentity, all active
  patterns, classification functions, layout-based helpers.

EraseUnions.Emit.fs (442 lines): IL instruction emission
  Per-instruction-site IL: construct, discriminate, branch, cast,
  switch, load field, load tag. Public API consumed by IlxGen.fs.

EraseUnions.fs (1039 lines): Type definition generation
  Per-union-definition: generate ILTypeDef with nested types, methods,
  properties, fields, debug proxies. The orchestrator.

Dependency flow: Types ← Emit ← TypeDef (one-directional).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
BUG FIX: emitRawConstruction for SingleCaseStruct nullary — was emitting
tag-ctor for CaseStorage.OnRoot+IsNullary but SingleCaseStruct has no
tag field (NoTagField). Now checks HasTagField|NoTagField before
choosing ctor shape.

Review fixes (3-model voted, 8/10 unanimous):
- H1: altOfUnionSpec blanket 'with _' catch → bounds check
- H2: tyForAlt Array.findIndex → tryFindIndex + meaningful error
- H3+H4: Fix stale case counts in architecture comment (8→9, 5→4)
- H5: Remove unused _cuspec param from mkTagFieldType (10 call sites)
- H7: List.ofArray|>List.mapi → Array.mapi|>Array.toList
- H8: Rename quadruple negation hasFieldsOrTagButNoMethods → onlyMethodsOnRoot
- H9: Named constants for DynamicDependency flags 0x660/0x7E0

Also revert unrelated FSharp.Core doc/test changes (per expert review).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…on headers

P1: Move TypeDefContext/ILStamping/AlternativeDefResult/NullaryConstFieldInfo
    from Types.fs to EraseUnions.fs — they're type-def scaffolding, not
    classification algebra. Types.fs is now purely: DU types, classifiers, APs.

P2+P6: Add architecture overview at top of EraseUnions.fs with pipeline
    description and concrete DU→UnionLayout→CaseStorage example mappings
    (Option, Color, Result, Shape, Token).

P3: Add section header before nullable-attr rewriting functions.

P7: Fix duplicate comment in rewriteNullableAttrForFlattenedField.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…FieldNameForList

- mkMethodsAndPropertiesForFields → private (internal-only, flagged by Opus)
- Extract adjustFieldNameForList to Types.fs as shared naming core
  (adjustFieldNameForTypeDef and adjustFieldName both delegate to it,
  eliminating duplicated Head→HeadOrDefault / Tail→TailOrNull mapping)
- Flagged by 4/7 council models across 2 rounds

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SpecialFSharpOptionHelpers needs helpers for FIELD access (private
fields) but raw discrimination for TAG access (inline isinst, not
GetTag call). The old code made these decisions independently.

DataAccess.ViaOptionHelpers: field access uses get_X helpers (like
ViaHelpers), tag access uses raw isinst chain (like RawFields).
This preserves exact IL output — zero .bsl changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@T-Gro T-Gro requested a review from a team as a code owner March 30, 2026 15:16
@T-Gro T-Gro added the NO_RELEASE_NOTES Label for pull requests which signals, that user opted-out of providing release notes label Mar 30, 2026
@T-Gro T-Gro requested a review from abonie March 30, 2026 15:16
@github-actions
Copy link
Copy Markdown
Contributor

⚠️ Release notes required, but author opted out

Warning

Author opted out of release notes, check is disabled for this pull request.
cc @dotnet/fsharp-team-msft

@T-Gro
Copy link
Copy Markdown
Member Author

T-Gro commented Mar 31, 2026

/run fantomas

@github-actions
Copy link
Copy Markdown
Contributor

🔧 CLI Command Report

  • Command: /run fantomas
  • Outcome: success

✅ Patch applied:
- Files changed: 1
- Lines changed: 48

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

NO_RELEASE_NOTES Label for pull requests which signals, that user opted-out of providing release notes

Projects

Status: New

Development

Successfully merging this pull request may close these issues.

2 participants