perf: offload BeautifulSoup parsing to a thread via asyncio.to_thread #1817

Merged: vdusek merged 1 commit into master from perf/beautifulsoup-parse-offload-to-thread on Mar 29, 2026

@vdusek vdusek commented Mar 28, 2026

Summary

  • The BeautifulSoup() constructor parses HTML synchronously, which blocks the event loop and degrades concurrency.
  • Wrapped parse() and parse_text() in asyncio.to_thread() to offload parsing to a worker thread, consistent with how ParselParser.parse() already works.
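
For readers unfamiliar with the pattern, here is a minimal sketch of offloading a synchronous parse with asyncio.to_thread(). It is not the code from this PR: TitleParser, parse_sync, and parse are hypothetical names, and the standard-library HTMLParser stands in for BeautifulSoup so the snippet runs without extra dependencies.

```python
import asyncio
import threading
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Hypothetical stand-in for a synchronous parser such as BeautifulSoup."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ''

    def handle_starttag(self, tag, attrs):
        if tag == 'title':
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == 'title':
            self._in_title = False

    def handle_data(self, data):
        # Collect text only while inside the <title> element.
        if self._in_title:
            self.title += data


def parse_sync(html: str) -> tuple[str, str]:
    """Blocking parse; also returns the name of the thread that ran it."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title, threading.current_thread().name


async def parse(html: str) -> tuple[str, str]:
    """Run the blocking parse in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(parse_sync, html)


if __name__ == '__main__':
    title, thread_name = asyncio.run(parse('<html><head><title>Hello</title></head></html>'))
    print(title, thread_name)
```

While the worker thread parses, the event loop can keep servicing other coroutines. Note that asyncio.to_thread() requires Python 3.9 or newer, and that for pure-Python parsing the GIL still limits CPU parallelism, so the gain is loop responsiveness rather than raw throughput.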

Test plan

  • All 78 existing BeautifulSoup crawler tests pass

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vdusek vdusek added the t-tooling and adhoc labels Mar 28, 2026
@vdusek vdusek self-assigned this Mar 28, 2026
@github-actions github-actions bot added this to the 137th sprint - Tooling team milestone Mar 28, 2026
@vdusek vdusek changed the title perf: offload BeautifulSoup parsing to a thread via asyncio.to_thread() perf: offload BeautifulSoup parsing to a thread via asyncio.to_thread Mar 28, 2026

codecov bot commented Mar 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.23%. Comparing base (02a18ea) to head (87a846b).
⚠️ Report is 3 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1817      +/-   ##
==========================================
- Coverage   92.23%   92.23%   -0.01%     
==========================================
  Files         157      157              
  Lines       10863    10884      +21     
==========================================
+ Hits        10020    10039      +19     
- Misses        843      845       +2     
Flag Coverage Δ
unit 92.23% <100.00%> (-0.01%) ⬇️

@vdusek vdusek requested review from Mantisus and Pijukatel March 28, 2026 09:49

@Mantisus Mantisus left a comment

LGTM

@vdusek vdusek merged commit d612ffa into master Mar 29, 2026
31 of 32 checks passed
@vdusek vdusek deleted the perf/beautifulsoup-parse-offload-to-thread branch March 29, 2026 08:34