Stop lazy alias resolver from overwriting real classes (#1662)#1689
Stop lazy alias resolver from overwriting real classes (#1662)#1689
Conversation
|
🚀 Preview deployment available at: https://d1e27c16.rdoc-6cd.pages.dev (commit: 485c8ed) |
There was a problem hiding this comment.
Pull request overview
Fixes a regression where constant-alias handling could overwrite (“clobber”) documentation for a real class/module with an alias copy, especially when aliases are resolved lazily and/or :nodoc: directives are involved.
Changes:
- Splits constant alias handling into an explicit accessor (
Constant#is_alias_for) and a non-mutating lazy lookup (Constant#resolved_alias_target) based on a new parse-time fieldis_alias_for_path. - Updates Prism and Ripper parsers to record RHS constant-paths (
is_alias_for_path) rather than re-deriving alias-ness from the textualvalue. - Adjusts
ClassModule#update_aliasesto useresolved_alias_targetas a fallback and adds tests covering:nodoc:and collision scenarios.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
test/rdoc/parser/prism_ruby_test.rb |
Updates existing alias tests to the new API and adds regression tests for :nodoc: and collision/clobber scenarios. |
test/rdoc/code_object/class_module_test.rb |
Adds unit tests asserting update_aliases won’t overwrite real classes and will skip :nodoc: constants. |
lib/rdoc/parser/ripper_ruby.rb |
Records constant.is_alias_for_path when creating module aliases. |
lib/rdoc/parser/prism_ruby.rb |
Propagates parse-time constant-path detection into add_constant(..., alias_path:) and uses it for alias registration. |
lib/rdoc/code_object/constant.rb |
Introduces is_alias_for_path and resolved_alias_target, and makes is_alias_for a pure explicit accessor. |
lib/rdoc/code_object/class_module.rb |
Uses the new lazy resolution fallback and adds a collision guard for the lazy path in update_aliases. |
AGENTS.md |
Documents the dual-parser architecture, alias resolution phases, and marshal/ri compatibility constraints. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
679db33 to
00e904a
Compare
|
@codex are you able to review this |
Fixes #1662. PR #1621 made RDoc::Constant#is_alias_for fall through to a lazy find_alias_for lookup that returned whatever class the constant's value named in the current store. The lazy result then flowed into RDoc::ClassModule#update_aliases, which unconditionally wrote a dup'd alias copy into @store.classes_hash, with two safeguards present elsewhere bypassed: * Context#add_module_alias refuses to clobber an existing class entry (the historic BasicObject = BlankSlate guard), but update_aliases did not. * The prism parser only registers an alias when the constant has document_self set (so :nodoc: is honored), but the lazy resolver ignored it. In combination these meant `Foo = Bar # :nodoc:`, where a real `Foo` class was parsed elsewhere, would silently replace the real class's documentation with an alias copy of `Bar` -- the literal failure mode in #1662 (the prism shim's `Ripper = Prism::Translation::Ripper` clobbering the real Ripper class docs). Architectural fix ----------------- Split RDoc::Constant#is_alias_for back into a pure accessor and a separate Constant#resolved_alias_target lookup. is_alias_for now returns only what was recorded explicitly (by Context#add_module_alias, by ClassModule#update_aliases, or by ri marshal load) and never mutates state. resolved_alias_target is the opportunistic forward-reference lookup, used only by update_aliases as a fallback when no explicit alias is recorded; it honors document_self so :nodoc: constants don't lazy-resolve. The lazy lookup itself uses a new Constant#is_alias_for_path attribute populated by the parsers, instead of regex-matching #value at lookup time. Both the prism parser (via ConstantReadNode/ConstantPathNode detection in constant_path_string) and the legacy ripper parser (via on_const-only token accumulation in parse_constant_body) already know whether the RHS is a constant reference at parse time; we now propagate that explicitly rather than rediscovering it from a stringly-typed value. Mechanical fix -------------- ClassModule#update_aliases falls back to const.resolved_alias_target when const.is_alias_for is unset, and gates the destination write behind a collision check that mirrors the one in Context#add_module_alias: skip if classes_hash/modules_hash already holds a real (non-alias) class at the destination name. The guard now applies uniformly to both the explicit and lazy paths. When the fallback resolves a forward-reference target, the resolved class is also written back to const.is_alias_for so downstream consumers (RDoc::Stats#report_constants, RDoc::Constant#marshal_dump) observe the alias relationship the lazy lookup found -- otherwise forward-ref aliases get reported as undocumented and the alias slot is serialized as nil into ri data. For that guard to compose with add_module_alias's pre-registration flow (which already places an alias copy at the destination expecting update_aliases to overwrite it with the properly-marked version), add_module_alias now sets is_alias_for on the alias copy at creation time. Existing alias copies are therefore distinguishable from real classes by the unconditional guard. Tests ----- Parser-level regressions cover :nodoc: assignment, real-class collision, the combined :nodoc:-plus-collision scenario from #1662, the :stopdoc:/:startdoc: workaround path, an explicit alias followed by a same-named class re-open (which under the new guard preserves both the alias target's methods and the methods added by the re-open), and the post-Store#complete is_alias_for persistence on a forward-reference alias. Each asserts both the negative (the would-be alias didn't clobber) and the positive (the alias target keeps its own methods and aliases list). Unit tests on update_aliases assert the same. The two existing PR #1621 tests (test_constant_alias_reverse_order, test_repeated_constant_alias) are updated to exercise the renamed resolved_alias_target API; the underlying forward-reference behavior is preserved.
Replace the bare two-line "Pluggable System" subsection of Architecture Notes with three substantive subsections capturing invariants future agents need: * "Parsers and Generators" -- names the two Ruby parser classes (PrismRuby default, RipperRuby legacy), explains the RDocParserPrismTestCases mixin pattern, and notes that the ripper variant is gated by RDOC_USE_RIPPER_PARSER (so a local bundle exec rake only exercises prism by default; CI runs ripper in a separate job). * "Code Object Model and Constant Aliases" -- describes the two-phase build (parse-time add_constant/add_module_alias vs. Store#complete -> ClassModule#update_aliases), the invariant that safeguards on one path must mirror on the other, and the known lexical-scope approximation in Context#find_enclosing_module_named. * "Marshal / ri Data Compatibility" -- names the in-tree consumers (ri CLI in lib/rdoc/ri/driver.rb, ri --server servlet in lib/rdoc/ri/servlet.rb) and the per-class MARSHAL_VERSION discipline needed to keep locally-cached .ri data loadable across rdoc upgrades.
In RDoc::Parser::RipperRuby#parse_constant_body, the on_const branch only calls create_module_alias (and so only sets is_alias_for_path on the constant) when the RHS const token is followed immediately by on_nl or EOF. Anything else parsed as a separate token after the const -- a trailing +# comment+, a blank line before another statement -- exits the loop via a different branch with is_alias_for_path still nil. Pre-ruby#1689 that didn't matter: Constant#is_alias_for fell back to a regex match on +value+ at lookup time. This PR removed that fallback (split into resolved_alias_target, which only fires when is_alias_for_path was set at parse time), so under RDOC_USE_RIPPER_PARSER=1 those aliases silently fail to resolve during Store#complete -- Stats reports them as undocumented and ri data drops the alias slot. Backstop with a regex check on the constant body text after parse_constant_body returns: if the body is purely a constant path, record it as is_alias_for_path. This mirrors the pre-PR Constant#find_alias_for regex exactly, so behavior for ripper-only inputs that worked before continues to work. Trailing-semicolon forms (+Foo = Bar;+) remain unresolved because the body text includes the +;+ and so doesn't match -- that's a pre-existing limitation, not a fresh regression. Prism is unaffected: its is_alias_for_path is set from the AST node type, which doesn't depend on adjacent whitespace/comment tokens. The new test_constant_alias_with_trailing_comment_resolves_after_complete runs against both parsers via the RDocParserPrismTestCases mixin and fails on the ripper variant before this change.
Codex flagged a ripper-only path where Constant#is_alias_for_path stays nil for +Foo = Bar # comment+: parse_constant_body's on_const branch only fires create_module_alias when the const token is followed by on_nl/EOF, so the trailing comment short-circuits the recording. Pre-PR this was masked by Constant#find_alias_for's lookup-time regex on +value+, which this PR removes. Skip the actual fix -- the ripper parser is on its way out (loads with a deprecation banner, opt-in via RDOC_USE_RIPPER_PARSER=1) and isn't worth a backstop. Keep the test in the shared parser-mixin (RDocParserPrismTestCases) so prism is still guarded against the same shape, and omit it under the legacy ripper variant.
00e904a to
1846a8a
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
* update_aliases persisted const.is_alias_for from the lazy resolved_alias_target lookup before the collision guard ran. When a real class already lived at the destination, the alias copy was correctly skipped but the constant was still mismarked, so Stats#report_constants and Constant#marshal_dump observed the alias relationship for an object the guard had just protected. Move the persistence below the guard so skipped aliases also skip the persistence. * In the prism parser, add_constant resolved owner/name from the full constant path but called @container.add_module_alias on the outer lexical container. For Outer::Foo = Bar parsed at top level, this registered the alias copy as classes_hash['Foo'] (using the container's child_name) instead of 'Outer::Foo', leaving a stray top-level entry that update_aliases later duplicated under the correct namespace. Call owner.add_module_alias instead. The ripper parser already handles this correctly because parse_constant reassigns its container to the resolved nested module. Tests: * test_constant_alias_collision_does_not_mismark_constant_as_alias asserts Foo's constant ends up with is_alias_for=nil when a real Foo class already exists alongside Foo = Bar. * test_qualified_constant_alias_registers_in_owner_namespace asserts Outer::Foo = Bar registers only at classes_hash['Outer::Foo'] and does not leak a top-level Foo entry.
a0405da to
485c8ed
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Fixes #1662.
Problem
The lazy alias resolver added in #1621 (
Constant#is_alias_for's fall-through tofind_alias_for) bypassed two safeguards that the explicit alias path has:add_module_alias's collision guard against clobbering existing classesdocument_self(:nodoc:) checkNet result:
Foo = Bar # :nodoc:parsed alongside a realclass Foowould silently overwrite the real class with an alias copy ofBar— which is how prism'sRipper = Prism::Translation::Rippershim wiped Ripper's docs.Fix
Split the API.
Constant#is_alias_foris back to a pure accessor; the lazy lookup moves to a separateConstant#resolved_alias_target.Make the lazy path opt-in. A new
Constant#is_alias_for_pathattribute is set by both parsers (prism viaConstantReadNode/ConstantPathNode, ripper via:on_consttoken info) using information they already had, so the regex-on-#valueguess goes away.Tighten
update_aliases. It now:resolved_alias_targetonly when there's no explicit alias.Stats#report_constantsandConstant#marshal_dumpobserve the relationship.BasicObject = BlankSlateguard inadd_module_alias. The guard applies uniformly to both paths.Compose the guard with pre-registration. For the unconditional guard to coexist with
add_module_alias's practice of pre-registering an alias copy inclasses_hash,add_module_aliasnow setsis_alias_foron that copy at creation time — so existing alias placeholders are distinguishable from real classes.Tests
Each scenario asserts both that the real class wasn't clobbered and that the alias target keeps its own methods:
:nodoc:assignmentRipperis overtaken byPrism::Translation::Ripper#1662 scenario (:nodoc:+ collision):stopdoc:/:startdoc:workaround path:nodoc:commentBoth prism and (
RDOC_USE_RIPPER_PARSER=1) ripper variants pass.Docs
Companion commit refactors AGENTS.md's Architecture Notes to document the parser duality, the two-phase alias build, the lexical-scope limitation, and
MARSHAL_VERSIONdiscipline.Followups (out of scope)
is_alias_forandis_alias_for_pathinto one field — needsMARSHAL_VERSIONbump.resolved_alias_targetdoesn't lean on the syntactic parent chain (already known-approximate perContext#find_enclosing_module_nameddocstring).