Hash pyi files after ruff post-processing in pyi_generator #6640
pyi_hashes.json entries were computed as each stub was written, before scan_all ran `ruff format` and `ruff check --fix` over the generated files. Since ruff materially rewrites the stubs (quoting, wrapping, import fixes), the registry recorded hashes of intermediate content that never exists on disk afterwards: any generator change affecting only the pre-format output flagged hash changes even when the final .pyi files were byte-for-byte identical.

_scan_file now just records which .pyi files were written, and scan_all computes the md5 hashes from the final on-disk content after ruff post-processing. written_files becomes a per-instance list (it was a mutable class attribute shared across generator instances) and the dead modules/root/current_module class attributes are dropped.

pyi_hashes.json is regenerated with the new scheme (one-time value churn for every entry; keys unchanged). Verified idempotent across repeated --force runs and explicit-target merge runs.
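The ordering change can be illustrated with a minimal sketch. This is not the pyi_generator code: `md5_of`, `hash_written_files`, and `fake_format` are hypothetical names, and a trivial quote-normalizing function stands in for the `ruff format` / `ruff check --fix` step so the example runs without ruff installed.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def md5_of(path: Path) -> str:
    """Return the md5 hex digest of a file's current on-disk bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def hash_written_files(
    written: list[Path],
    postprocess: Callable[[list[Path]], None],
) -> dict[str, str]:
    """Run post-processing first, then hash what is actually on disk."""
    postprocess(written)
    return {p.name: md5_of(p) for p in written}


def fake_format(paths: list[Path]) -> None:
    """Hypothetical stand-in for the formatter: normalize quotes to double."""
    for p in paths:
        p.write_text(p.read_text().replace("'", '"'))


with tempfile.TemporaryDirectory() as tmp:
    stub = Path(tmp) / "button.pyi"

    # Generator output, variant A: single quotes before formatting.
    stub.write_text("x: Literal['a', 'b']\n")
    early_a = md5_of(stub)  # old scheme: hash taken at write time
    final_a = hash_written_files([stub], fake_format)  # new scheme

    # Generator output, variant B: differs only in pre-format quoting.
    stub.write_text('x: Literal["a", "b"]\n')
    early_b = md5_of(stub)
    final_b = hash_written_files([stub], fake_format)
```

In this sketch the write-time hashes (`early_a`, `early_b`) differ even though both variants format to the same file, while the post-format hashes (`final_a`, `final_b`) agree, which is the spurious churn the description refers to.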
Greptile Summary
This PR fixes a bug where
Confidence Score: 5/5
Safe to merge: the change is narrowly scoped to hash timing and a class-attribute cleanup, with no behavioral impact on stub generation itself. The stub writing logic is unchanged; only the moment at which hashes are computed moves (from before ruff to after). The mutable class-attribute fix eliminates a potential cross-instance contamination. The new test exercises the full end-to-end path. No files require special attention.
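The cross-instance contamination mentioned above is the general Python pitfall of a mutable class attribute. A minimal sketch follows; `SharedScanner` and `PerInstanceScanner` are hypothetical classes for illustration, not the actual generator classes.

```python
class SharedScanner:
    # Class attribute: a single list object shared by every instance.
    written_files: list[str] = []

    def record(self, path: str) -> None:
        # append() mutates the shared class-level list in place.
        self.written_files.append(path)


class PerInstanceScanner:
    def __init__(self) -> None:
        # Instance attribute: each scanner gets its own fresh list.
        self.written_files: list[str] = []

    def record(self, path: str) -> None:
        self.written_files.append(path)


a, b = SharedScanner(), SharedScanner()
a.record("a.pyi")
shared_leak = list(b.written_files)  # b sees the file recorded through a

c, d = PerInstanceScanner(), PerInstanceScanner()
c.record("c.pyi")
isolated = list(d.written_files)  # d is unaffected by c
```

With the shared list, a second generator instance would report files written by the first; initializing the list in `__init__` keeps each instance's record separate.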
Reviews (2): Last reviewed commit: "Re-trigger CI"
Merging this PR will not alter performance
integration-app-harness-playwright (redis, 3.11) failed to pull the redis image from Docker Hub (context deadline exceeded) before any tests ran; all 105 other checks passed. Empty commit to re-roll. https://claude.ai/code/session_01APPJC9ZSmcHQy9WkzfxqVs