[Repo Assist] perf: reuse ResizeArray buffer in chunkBy and chunkByAsync #373

Merged
dsyme merged 2 commits into main from
repo-assist/perf-chunkby-reuse-buffer-20260331-571b7516713643ae
Apr 1, 2026

@github-actions
Contributor

🤖 This is an automated pull request from Repo Assist.

Summary

TaskSeq.chunkBy and TaskSeq.chunkByAsync previously allocated a fresh ResizeArray<'T> at every chunk boundary:

yield prevKey, currentChunk.ToArray()
currentChunk <- ResizeArray<'T>()   // allocates every time
currentChunk.Add item

Since ToArray() already creates an independent copy of the chunk contents, the existing ResizeArray backing array can be safely reused by calling Clear() instead:

yield prevKey, currentChunk.ToArray()
currentChunk.Clear()                // reuses the backing array; no new heap allocation
currentChunk.Add item
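
To show the pattern in context, here is a minimal sketch of a chunk-by loop built around a single reused buffer. It is an eager, synchronous stand-in written for illustration only: the name chunkBySketch and its list-returning shape are assumptions, not the library's implementation, which is a lazy task sequence.

```fsharp
// Hypothetical, eager stand-in for TaskSeq.chunkBy; not the library's code.
let chunkBySketch<'T, 'Key when 'Key: equality>
    (projection: 'T -> 'Key)
    (source: seq<'T>)
    : ('Key * 'T[]) list =
    let chunks = ResizeArray<'Key * 'T[]>()
    // One buffer for the whole enumeration.
    let currentChunk = ResizeArray<'T>()
    let mutable prevKey = Unchecked.defaultof<'Key>
    let mutable started = false

    for item in source do
        let key = projection item

        if started && key <> prevKey then
            // ToArray copies the finished run, so the buffer can be emptied and reused.
            chunks.Add((prevKey, currentChunk.ToArray()))
            currentChunk.Clear()

        currentChunk.Add item
        prevKey <- key
        started <- true

    // Flush the final run, if the source was non-empty.
    if started then
        chunks.Add((prevKey, currentChunk.ToArray()))

    List.ofSeq chunks

let parityChunks = chunkBySketch (fun x -> x % 2) [ 1; 3; 2; 4; 5 ]
```

Here parityChunks holds three chunks, keyed 1, 0 and 1, and each array stays intact after the buffer behind it has been cleared and refilled.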

Root Cause

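ResizeArray<'T>() allocates a new list object on the heap, and the first Add on it allocates a fresh backing array (initial capacity 4, regrown as the chunk grows). Clear() resets the internal count to zero and retains the existing backing array; for element types that contain references it also clears the used slots so the old items can be collected, so it is cheap rather than strictly free. Because ToArray() captures a snapshot before the clear, the yielded arrays are unaffected.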
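
The snapshot behaviour described above can be checked in isolation. This is a small sketch using only standard ResizeArray members; the variable names are illustrative.

```fsharp
let buffer = ResizeArray<int>()
buffer.AddRange [ 1; 2; 3 ]

// ToArray copies the current contents into a new, independent array.
let snapshot = buffer.ToArray()
let capacityBefore = buffer.Capacity

// Clear empties the list but keeps its backing array.
buffer.Clear()
buffer.Add 99

// The snapshot is untouched by the later Clear and Add.
let snapshotIntact = (snapshot = [| 1; 2; 3 |])
// The buffer kept its capacity, so the Add above did not need a new backing array.
let capacityKept = (buffer.Capacity = capacityBefore)
```

Both snapshotIntact and capacityKept evaluate to true.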

Impact

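  • Eliminates one ResizeArray allocation per chunk boundary, along with the backing-array allocations that a fresh list would incur on its first Add and on later growth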
  • Reduces GC pressure proportionally to the number of distinct key transitions in the source sequence
  • Particularly noticeable for sequences with many short runs (e.g., alternating keys)
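
One rough way to observe the difference is to compare allocated bytes for the two patterns on a worst-case input where every item starts a new chunk. This is an illustrative sketch, not a benchmark from this PR: allocatedBy, freshBuffers and reusedBuffer are hypothetical helpers, and the numbers depend on the runtime. It assumes a runtime where GC.GetAllocatedBytesForCurrentThread is available (.NET Core 3.0 or later).

```fsharp
open System

// Hypothetical helper: bytes allocated on the current thread while running f.
let allocatedBy (f: unit -> unit) =
    let before = GC.GetAllocatedBytesForCurrentThread()
    f ()
    GC.GetAllocatedBytesForCurrentThread() - before

let boundaries = 100_000

// Old pattern: a fresh ResizeArray at every chunk boundary.
let freshBuffers () =
    let mutable chunk = ResizeArray<int>()

    for i in 1..boundaries do
        chunk.Add i
        chunk.ToArray() |> ignore
        chunk <- ResizeArray<int>()

// New pattern: one buffer, cleared at every chunk boundary.
let reusedBuffer () =
    let chunk = ResizeArray<int>()

    for i in 1..boundaries do
        chunk.Add i
        chunk.ToArray() |> ignore
        chunk.Clear()

let freshBytes = allocatedBy freshBuffers
let reusedBytes = allocatedBy reusedBuffer
printfn "fresh: %d bytes, reused: %d bytes" freshBytes reusedBytes
```

Both loops still pay for the ToArray() copies, so the gap between the two figures isolates the cost of the per-boundary list and backing-array allocations.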

Test Status

  • ✅ Build: dotnet build — succeeded, 0 warnings
  • ✅ Tests: dotnet test — 5017 passed, 2 skipped (skipped tests are pre-existing infrastructure-only skips)
  • ✅ Formatting: dotnet fantomas . --check — no issues

Generated by 🌈 Repo Assist at {run-started}. Learn more.

To install this agentic workflow, run

gh aw add githubnext/agentics/workflows/repo-assist.md@1f672aef974f4246124860fc532f82fe8a93a57e

Previously, each time a new chunk was started in chunkBy/chunkByAsync,
a fresh ResizeArray was allocated. Since ToArray() already captures an
independent copy of the data, the internal buffer can be safely reused
by calling Clear() instead of constructing a new ResizeArray.

This eliminates one heap allocation per chunk boundary, reducing GC
pressure for sequences with many distinct runs.

5017 tests pass, 2 skipped.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dsyme dsyme marked this pull request as ready for review April 1, 2026 16:04
@dsyme dsyme merged commit 24c7791 into main Apr 1, 2026
4 checks passed
@dsyme dsyme deleted the repo-assist/perf-chunkby-reuse-buffer-20260331-571b7516713643ae branch April 1, 2026 16:05