
HBASE-30161 Add paginated, single-RPC RegionLocator.getRegionLocations(startKey, limit) API for bulk meta-cache warmup #8237

Open
sanjeet006py wants to merge 5 commits into apache:master from sanjeet006py:HBASE-30161-async-master

Conversation

@sanjeet006py
Contributor

JIRA: HBASE-30161

Generated-by: Claude Opus 4.7

Sanjeet Malhotra added 3 commits May 14, 2026 22:21
…s(startKey, limit) API for bulk meta-cache warmup
… and add default-throws

Review comments on apache#8236 by haridsv:
- Adding an abstract method to RegionLocator / AsyncTableRegionLocator breaks
  external implementers. Convert the new method to a default that throws
  UnsupportedOperationException, with javadoc instructing callers to fall
  back to getAllRegionLocations().
- Reusing the getRegionLocations name overloads a method with completely
  different semantics (row -> all replicas of the containing region). Rename to
  getRegionLocationsPage(byte[] startKey, int limit) to make the
  pagination/range intent explicit.
- SnapshotRegionLocator no longer needs an override; it inherits the
  default-throws and callers fall back to getAllRegionLocations().

Tests renamed to call getRegionLocationsPage; semantics and assertions
unchanged.
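For illustration, the default-throws pattern described in this commit can be sketched as below. RegionLocatorSketch and its String-based results are simplified stand-ins invented for this sketch, not the real RegionLocator/AsyncTableRegionLocator interfaces.

```java
import java.util.Collections;
import java.util.List;

// Minimal sketch of adding a new interface method as a default that throws,
// so pre-existing external implementations keep compiling unchanged.
interface RegionLocatorSketch {
  // Pre-existing method that external implementers already provide.
  List<String> getAllRegionLocations();

  // New method added as a default; callers are expected to catch the
  // exception and fall back to the unpaged API.
  default List<String> getRegionLocationsPage(byte[] startKey, int limit) {
    throw new UnsupportedOperationException(
        "getRegionLocationsPage not supported; fall back to getAllRegionLocations()");
  }
}

public class DefaultThrowsDemo {
  public static void main(String[] args) {
    // An "old" implementation that only knows the original method.
    RegionLocatorSketch locator = () -> Collections.singletonList("region-a");
    List<String> locations;
    try {
      locations = locator.getRegionLocationsPage(new byte[0], 10);
    } catch (UnsupportedOperationException e) {
      // Fallback path the javadoc instructs callers to take.
      locations = locator.getAllRegionLocations();
    }
    System.out.println("locations=" + locations); // locations=[region-a]
  }
}
```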
Address point apache#4 of the canvas/code review: the paged meta scan was making
ceil(limit / hbase.meta.scanner.caching) ScannerNext RPCs whenever the
limit exceeded the configured caching, contradicting the
"at most one RPC per invocation" javadoc on
RegionLocator.getRegionLocationsPage / AsyncTableRegionLocator.getRegionLocationsPage.

Plumb a private isPagedScan flag through ClientMetaTableAccessor.scanMeta
and getMetaScan. When true, size both setLimit and setCaching to the
caller-supplied rowUpperLimit so the slice returns in a single ScannerNext
RPC regardless of the configured caching default. The unbounded scanMeta
overload (used by getTableHRegionLocations(metaTable, tableName) for
getAllRegionLocations) keeps its existing behavior.

Add TestRegionLocatorPagedScanRpcCount: starts a mini-cluster with
hbase.meta.scanner.caching=2, wraps the meta AsyncTable to count
AdvancedScanResultConsumer.onNext invocations, and asserts:
- limit <= caching   -> 1 ScannerNext (baseline)
- limit > caching    -> 1 ScannerNext (the regression check)
- unbounded scan     -> ceil(N / caching) ScannerNext (proves the flag
                        does not leak into the existing path)

Verified the test catches the regression: reverting the getMetaScan
change makes testSingleRpcWhenLimitExceedsCaching fail with onNext=3.
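The RPC-count arithmetic behind this fix can be modeled in a few lines. The sketch below covers only the math; effectiveCaching and scannerNextRpcs are illustrative names for this sketch, not actual ClientMetaTableAccessor code.

```java
// Toy model of the isPagedScan sizing rule: each ScannerNext RPC returns at
// most `caching` rows, so resizing caching to the caller's limit makes the
// paged slice come back in a single RPC.
public class PagedScanSizing {
  // When isPagedScan is set, caching is bumped to the caller's row limit.
  static int effectiveCaching(boolean isPagedScan, int rowUpperLimit, int defaultCaching) {
    return isPagedScan ? rowUpperLimit : defaultCaching;
  }

  // Number of ScannerNext RPCs needed to fetch rowsReturned rows.
  static int scannerNextRpcs(int rowsReturned, int caching) {
    return (rowsReturned + caching - 1) / caching; // ceil(rows / caching)
  }

  public static void main(String[] args) {
    int defaultCaching = 2; // hbase.meta.scanner.caching in the test cluster
    // Paged scan with limit 5 > caching 2: caching is resized, one RPC.
    System.out.println(scannerNextRpcs(5, effectiveCaching(true, 5, defaultCaching)));  // 1
    // Unbounded scan over 5 rows keeps the default: ceil(5/2) = 3 RPCs.
    System.out.println(scannerNextRpcs(5, effectiveCaching(false, 5, defaultCaching))); // 3
  }
}
```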
sanjeet006py pushed a commit to sanjeet006py/hbase that referenced this pull request May 15, 2026
…scans

Mirrors the master-PR test (TestRegionLocatorPagedScanRpcCount on
apache#8237). On the sync path we proxy:

  Connection.getTable(META_TABLE_NAME) -> Table.getScanner(Scan) ->
  ResultScanner.close()

so we can flip Scan.setScanMetricsEnabled(true) on the way in and read
ScanMetrics.countOfRPCcalls before the underlying scanner is closed. The
RPC count is the natural sync analogue of the
AdvancedScanResultConsumer.onNext invocation count we used on master.

Cluster runs with hbase.meta.scanner.caching = 2 against a table of
5 user regions, asserting:
- limit <= caching: 1 ScannerNext RPC (baseline)
- limit > caching: 1 ScannerNext RPC (regression check; without the
                   isPagedScan fix this would be ceil(5/2) = 3)
- unbounded scan:  ceil(NUM_REGIONS / caching) = 3 ScannerNext RPCs
                   (proves the isPagedScan flag does not leak into the
                   non-paged callers)

Verified the test catches the regression: reverting the isPagedScan
branch in MetaTableAccessor.getMetaScan makes
testSingleRpcWhenLimitExceedsCaching fail with countOfRPCcalls = 3.
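The wrap-and-count idea on the sync path can be sketched with a dynamic proxy. ScannerSource and RowScanner below are hypothetical stand-ins for Connection/Table/ResultScanner, and the counter is only a loose model of ScanMetrics.countOfRPCcalls; this is not the actual test code.

```java
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-ins for the proxied sync-client interfaces.
interface RowScanner {
  String next();
  void close();
}

interface ScannerSource {
  RowScanner getScanner(boolean metricsEnabled);
}

public class ScannerProxyDemo {
  // Wrap a ScannerSource so metrics are always enabled on the way in,
  // mirroring the "flip setScanMetricsEnabled(true)" step of the test.
  static ScannerSource forceMetrics(ScannerSource real) {
    return (ScannerSource) Proxy.newProxyInstance(
        ScannerSource.class.getClassLoader(),
        new Class<?>[] { ScannerSource.class },
        (proxy, method, methodArgs) -> real.getScanner(true));
  }

  public static void main(String[] args) {
    AtomicLong nextCalls = new AtomicLong();
    ScannerSource real = metricsEnabled -> new RowScanner() {
      @Override
      public String next() {
        nextCalls.incrementAndGet(); // stand-in for one ScannerNext RPC
        return "row";
      }

      @Override
      public void close() {
        // Read the counter before the underlying scanner goes away.
        System.out.println("rpcCalls=" + nextCalls.get());
      }
    };
    RowScanner scanner = forceMetrics(real).getScanner(false);
    scanner.next();
    scanner.next();
    scanner.next();
    scanner.close(); // prints rpcCalls=3
  }
}
```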
* Ordering: regions are returned in ascending region start-key order (the natural order of
* {@code hbase:meta} rows for a single table). Within each region, replicas are returned in
* ascending replica-id order (replica 0, then 1, then 2, ...). Split parents are filtered out,
* which may cause a page to contain fewer than {@code limit} regions but never disturbs ordering
Contributor


What about merged regions?

Contributor Author


Thanks for flagging — let me share my understanding and please correct me if I've got this wrong.

I think merge parents don't need filtering here because of how splits vs merges record cleanup state in hbase:meta:

  • Split: the parent row stays around with setOffline(true).setSplit(true) (see RegionStateStore#splitRegion) and carries info:splitA/info:splitB pointers. The catalog janitor seems to use that row to track whether daughters still hold HFile references, and only deletes the parent row + HDFS dir once references are gone. So during that window, meta scans do see the parent — which is why excludeOfflinedSplitParents exists.

  • Merge: from what I can see in RegionStateStore#mergeRegions, the same multi-mutate that commits the merge deletes every parent row and writes a single child row with info:merge_0000, info:merge_0001, … pointing back at the parents. So the "waiting on HFile cleanup" bookkeeping lives on the child row, not on the parents.

If that's right, a meta scan should never see merge parents, and the merged child row is itself the live region for that key range — which is what we'd want to return. This also seems to match getAllRegionLocations()'s existing behavior; both share CollectRegionLocationsVisitor(excludeOfflinedSplitParents=true).

Does this line up with your understanding? Happy to add a comment near the visitor call site noting this if it'd help future readers, or revisit the filter if I'm missing a case.
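For future readers, the split-vs-merge asymmetry above can be illustrated with a toy filter. MetaRowFilterDemo, MetaRow, and visibleRegions are names invented for this sketch; they do not correspond to HBase's actual CollectRegionLocationsVisitor.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of meta-row filtering: split parents linger in meta flagged
// offline+split until the catalog janitor removes them, while merge parents
// are deleted in the same mutation that writes the merged child, so a scan
// never sees them in the first place.
public class MetaRowFilterDemo {
  record MetaRow(String region, boolean offline, boolean split) {}

  static List<String> visibleRegions(List<MetaRow> metaRows, boolean excludeOfflinedSplitParents) {
    List<String> out = new ArrayList<>();
    for (MetaRow row : metaRows) {
      if (excludeOfflinedSplitParents && row.offline() && row.split()) {
        continue; // split parent awaiting catalog-janitor cleanup
      }
      out.add(row.region());
    }
    return out;
  }

  public static void main(String[] args) {
    List<MetaRow> meta = List.of(
        new MetaRow("parentAB", true, true),    // split parent still present
        new MetaRow("daughterA", false, false),
        new MetaRow("daughterB", false, false),
        new MetaRow("mergedCD", false, false)); // merge child; its parents are already gone
    System.out.println(visibleRegions(meta, true)); // [daughterA, daughterB, mergedCD]
  }
}
```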

addListener(future, (locs, error) -> {
  if (error != null || locs == null) {
    return;
  }
Contributor


What if there is an error?

Contributor Author


Good catch — let me explain the intent and please push back if it's not the right call.

The listener at AsyncTableRegionLocatorImpl.java:104 is a side-effect-only hook for cache warm-up. The future returned to the caller is the one from ClientMetaTableAccessor.getTableHRegionLocations, which already completes exceptionally on any meta-scan error (see getTableRegionsAndLocations's future.completeExceptionally(error) paths). So the error propagates to the caller through the returned future regardless of what the listener does.

On the listener side, if error != null there's simply nothing valid to cache, and if locs == null there's nothing to iterate. Either way, returning early is the safe behavior — we don't want to mask the error or double-fail the future, just skip cache population for this call. The next call to getRegionLocationsPage (or any locator path) will populate normally.

That said, I can add a LOG.debug (or LOG.warn) in the error branch so the silent skip is observable, e.g.:

if (error != null) {
  LOG.debug("Skipping cache warm-up for {}: {}", tableName, error.toString());
  return;
}
if (locs == null) {
  return;
}

Happy to add that if you think it's worth it.

Sanjeet Malhotra added 2 commits May 15, 2026 23:03
…nc-master

Resolves conflict in AbstractTestRegionLocator.java: takes upstream
JUnit 5 import direction (HBASE-30082) and adds assertNotNull/
assertThrows/assertTrue for the paged-locator tests; flips message
arg position to JUnit 5 ordering at four call sites.