Skip to content

logpuller: optimize resolved-ts dispatch allocations#4698

Merged
ti-chi-bot[bot] merged 4 commits intopingcap:masterfrom
asddongmen:0403-puller-cpu-surge
Apr 5, 2026
Merged

logpuller: optimize resolved-ts dispatch allocations#4698
ti-chi-bot[bot] merged 4 commits intopingcap:masterfrom
asddongmen:0403-puller-cpu-surge

Conversation

@asddongmen
Copy link
Copy Markdown
Collaborator

@asddongmen asddongmen commented Apr 3, 2026

What problem does this PR solve?

Issue Number: close #4697

What is changed and how it works?

  • Added a single-region fast path:
    • If len(resolvedTsEvent.Regions) == 1, directly push one-state event.
  • Changed initial resolvedStates capacity to:
    • capHint = min(len(resolvedTsEvent.Regions), resolvedTsStateBatchSize)
  • Kept large-batch protection:
    • resolvedTsStateBatchSize remains 1024.
  • Removed redundant allocation pattern:
    • After flush, do not always allocate a new slice.
    • Allocate the next batch buffer only when there are remaining regions to process.
  • Added tests and benchmarks in region_request_worker_test.go:
    • Single-region fast-path behavior
    • Large-batch split behavior
    • Legacy vs current benchmark comparisons

Benchmark Command

mkdir -p /tmp/go-cache /tmp/go-tmp && \
GOCACHE=/tmp/go-cache GOTMPDIR=/tmp/go-tmp \
go test ./logservice/logpuller \
  -run '^$' \
  -bench 'BenchmarkDispatchResolvedTsEvent' \
  -benchmem \
  -count=5

Average Comparison (5 runs)

Scenario Legacy ns/op Current ns/op ns/op Change Speedup Legacy B/op Current B/op B/op Change Legacy allocs/op Current allocs/op allocs Change
SingleRegion 664.94 30.19 -95.46% 22.02x 19074 137 -99.28% 2.00 1.00 -50.00%
SmallBatch 658.94 148.20 -77.51% 4.45x 19071 256 -98.66% 2.00 1.00 -50.00%
LargeBatch 36325.80 36532.80 +0.57% 0.99x 47899 38424 -19.78% 5.00 4.00 -20.00%
  • SingleRegion and SmallBatch show major CPU and allocation improvements.
  • LargeBatch keeps bounded memory behavior and reduces allocations; ns/op is roughly flat (+0.57%).

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

  Reduce CPU and allocation overhead in log puller by optimizing resolved-ts event dispatch for small batches.

Summary by CodeRabbit

  • Performance

    • Improved resolved-timestamp dispatch: smarter pre-sizing, buffer reuse, and adaptive batching for single-region fast path and large multi-region splits, reducing allocations and improving throughput.
  • Tests

    • Added unit tests and benchmarks validating single-region fast-path, large-set batch-splitting, and performance comparisons against the legacy dispatcher.

Signed-off-by: dongmen <414110582@qq.com>
@ti-chi-bot ti-chi-bot bot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Apr 3, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4e756c24-8e9c-4105-b60c-46467824015a

📥 Commits

Reviewing files that changed from the base of the PR and between 3383cc7 and f83f247.

📒 Files selected for processing (2)
  • logservice/logpuller/region_request_worker.go
  • logservice/logpuller/region_request_worker_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • logservice/logpuller/region_request_worker_test.go
  • logservice/logpuller/region_request_worker.go

📝 Walkthrough

Walkthrough

Compute resolvedStates capacity from incoming region count (capped by batch size), use index-based iteration to track remaining regions, avoid unnecessary reallocation by nil-ing the slice after pushing. Add tests and benchmarks validating single-region fast-path and large-batch splitting.

Changes

Cohort / File(s) Summary
Dispatch logic
logservice/logpuller/region_request_worker.go
Initialize resolvedStates capacity from len(resolvedTsEvent.Regions) (capped by resolvedTsStateBatchSize); switch to index-based loop to compute remaining regions; pass slice directly to push, set resolvedStates = nil after flush, and only reallocate when remaining elements exist.
Tests & benchmarks
logservice/logpuller/region_request_worker_test.go
Add mockRegionEventDynamicStream, helper to build test worker and synthetic ResolvedTs events, unit tests asserting single-region push and 2050 -> 1024/1024/2 splitting, and benchmarks comparing legacy fixed-1024 batching vs new dispatch logic for various region counts.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

size/M

Suggested reviewers

  • wk989898
  • lidezhu
  • flowbehappy

Poem

🐇 I hopped through slices, small and fleet,
Sized each batch to fit the beat.
Nil the tray, reuse what's near,
Fewer hops, the path is clear.
Pushes counted — merry cheer!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: optimizing memory allocations in the resolved-ts dispatch logic within the log puller.
Description check ✅ Passed The description includes issue number (close #4697), detailed explanation of changes, benchmark results, test additions, and a release note—fully complying with the template requirements.
Linked Issues check ✅ Passed All coding requirements from issue #4697 are met: single-region fast path, sized initial capacity, large-batch protection, reduced redundant allocations, and comprehensive tests/benchmarks validating the optimization.
Out of Scope Changes check ✅ Passed All changes are strictly scoped to resolving the allocation optimization issue; no unrelated code modifications are present outside the linked issue objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@ti-chi-bot ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 3, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request optimizes the dispatchResolvedTsEvent function by introducing a fast path for single-region events and refining memory allocation during batch processing of multi-region events. Review feedback suggests downgrading a new warning log to debug level to prevent potential log noise and further optimizing slice re-allocations by calculating a more precise capacity hint for remaining regions.

Comment on lines +313 to +317
log.Warn("region request worker receives a resolved ts event for an untracked region",
zap.Uint64("workerID", s.workerID),
zap.Uint64("subscriptionID", uint64(subscriptionID)),
zap.Uint64("regionID", regionID),
zap.Uint64("resolvedTs", resolvedTsEvent.Ts))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This log.Warn is new and could be noisy if many single-region ResolvedTs events are received for untracked regions. Consider changing it to log.Debug or removing it, as untracked regions are common during subscription changes.

Signed-off-by: dongmen <414110582@qq.com>
Signed-off-by: dongmen <414110582@qq.com>
@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Apr 4, 2026
@ti-chi-bot ti-chi-bot bot added the lgtm label Apr 4, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot bot commented Apr 4, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hongyunyan, lidezhu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Apr 4, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot bot commented Apr 4, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-04-04 05:50:59.857827585 +0000 UTC m=+589865.063187642: ☑️ agreed by hongyunyan.
  • 2026-04-04 11:17:58.290557924 +0000 UTC m=+609483.495917981: ☑️ agreed by lidezhu.

@asddongmen
Copy link
Copy Markdown
Collaborator Author

/test all

@asddongmen
Copy link
Copy Markdown
Collaborator Author

/test all

@ti-chi-bot ti-chi-bot bot merged commit 165df37 into pingcap:master Apr 5, 2026
25 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

logpuller: excessive CPU usage under resolved-ts-heavy workload due to fixed-size slice allocations

3 participants