ZOOKEEPER-5051: Avoid NPE closing 4lw connection during startup #2391
PAC-MAJ wants to merge 1 commit into
Hi ZooKeeper maintainers,

Gentle ping on this PR for ZOOKEEPER-5051.

This fixes a startup race where a 4lw command connection can be closed before ZooKeeperServer.startdata() initializes zkDb, causing ZooKeeperServer.removeCnxn() to throw an NPE and potentially leave the client connection hanging. The change is intentionally small: guard removeCnxn() when zkDb is not initialized yet. I also added a deterministic regression test for the pre-startdata case.

The previous precommit failure looked like a Jenkins infrastructure/agent failure during result collection rather than a test failure. Could someone please rerun CI or advise if anything else is needed from my side?

Thanks!
PDavid left a comment
Thanks, this looks like a clean, minimal fix. 👍
Disclaimer: I contributed to ZooKeeper but I'm not (yet) a ZooKeeper committer so I cannot merge your PR. You'll still need ZooKeeper committer approval.
Problem
When a 4-letter command is sent while the ZooKeeper server is starting,
the connection close path can call ZooKeeperServer.removeCnxn() before
startdata() has initialized zkDb. This can trigger a NullPointerException,
and it also leaves any client that is waiting for the connection to close hanging.
Fix
Guard ZooKeeperServer.removeCnxn() so it only delegates to zkDb when zkDb has already been initialized.
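To make the shape of the guard concrete, here is a minimal sketch. It is not the ZooKeeper source: Server, Db, and Cnxn are hypothetical stand-ins, and the only thing carried over from this PR is the idea that zkDb is null until startdata() runs. Marking the field volatile and copying it into a local before the null check are assumptions of the sketch, not claims about the real class.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RemoveCnxnGuardSketch {

    /** Hypothetical stand-in for a server connection; not the real ServerCnxn. */
    static final class Cnxn {
        final String name;
        Cnxn(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for the database object that startdata() creates. */
    static final class Db {
        final AtomicInteger removed = new AtomicInteger();
        void removeCnxn(Cnxn cnxn) { removed.incrementAndGet(); }
    }

    /** Simplified model of the server: zkDb stays null until startdata() runs. */
    static final class Server {
        private volatile Db zkDb; // assumption: volatile, so the close path sees the write

        void startdata() { zkDb = new Db(); }

        // Unguarded version: dereferences zkDb even if startdata() has not run.
        void removeCnxnUnguarded(Cnxn cnxn) {
            zkDb.removeCnxn(cnxn);
        }

        // Guarded version: read the field once, then delegate only if it is set.
        void removeCnxn(Cnxn cnxn) {
            Db db = zkDb;
            if (db != null) {
                db.removeCnxn(cnxn);
            }
        }

        Db db() { return zkDb; }
    }

    public static void main(String[] args) {
        Server server = new Server();
        Cnxn fourLetterWord = new Cnxn("4lw");

        // Closing a connection before startdata() fails without the guard.
        try {
            server.removeCnxnUnguarded(fourLetterWord);
        } catch (NullPointerException e) {
            System.out.println("unguarded close before startdata(): NPE");
        }

        // With the guard, the same call is a no-op.
        server.removeCnxn(fourLetterWord);
        System.out.println("guarded close before startdata(): ok");

        // Once the database exists, the call is delegated as usual.
        server.startdata();
        server.removeCnxn(fourLetterWord);
        System.out.println("removed after startdata(): " + server.db().removed.get());
    }
}
```

Skipping the delegation is safe in this model because nothing can have been registered with a database that does not exist yet, so there is nothing to clean up.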
Test
Added ZooKeeperServerTest#testRemoveCnxnBeforeStartData. The test creates a ZooKeeperServer but intentionally does not call startdata(), then verifies that removeCnxn() does not throw.
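The test is deterministic because it never has to win a race: it simply skips startdata() and calls removeCnxn() directly. The sketch below shows that pattern in plain Java with no test framework. The Server class and the completesWithoutThrowing helper are hypothetical and exist only for illustration; the real test lives in ZooKeeperServerTest and is not reproduced here.

```java
public class RemoveCnxnBeforeStartDataSketch {

    /** Hypothetical stand-in for the server under test; zkDb is null until startdata(). */
    static final class Server {
        private Object zkDb;

        void startdata() { zkDb = new Object(); }

        boolean isInitialized() { return zkDb != null; }

        void removeCnxn(Object cnxn) {
            if (zkDb == null) {
                return; // nothing to clean up before startdata()
            }
            // A real implementation would delegate to zkDb here.
        }
    }

    /** Runs the action and reports whether it completed without throwing. */
    static boolean completesWithoutThrowing(Runnable action) {
        try {
            action.run();
            return true;
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Arrange: construct the server but deliberately skip startdata().
        Server server = new Server();
        if (server.isInitialized()) {
            throw new AssertionError("precondition: zkDb must not be initialized");
        }

        // Act and assert: closing a connection in this state must not throw.
        boolean ok = completesWithoutThrowing(() -> server.removeCnxn(new Object()));
        if (!ok) {
            throw new AssertionError("removeCnxn() threw before startdata()");
        }
        System.out.println("removeCnxn() before startdata() did not throw");
    }
}
```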