
feat: Native Broadcast nested loop join support #4429

Open
coderfender wants to merge 33 commits into apache:main from coderfender:broadcast_nested_join_loop_support


@coderfender coderfender commented May 25, 2026

Contributor

Which issue does this PR close?

Closes #198

Wires Spark's BroadcastNestedLoopJoinExec to DataFusion's NestedLoopJoinExec.

Rationale for this change

Adds native support for BroadcastNestedLoopJoin so that these joins can run natively instead of falling back to Spark, which improves performance.

What changes are included in this PR?

How are these changes tested?

  1. Unit tests in CometJoinSuite
  2. Set up BNLJ benchmarks

@coderfender coderfender changed the title feat: Broadcast nested join loop support feat: Native Broadcast nested join loop support May 25, 2026
@coderfender coderfender force-pushed the broadcast_nested_join_loop_support branch from 1ef4611 to 57c26b8 Compare May 25, 2026 17:40
@coderfender coderfender changed the title feat: Native Broadcast nested join loop support feat: Native Broadcast nested loop join support May 25, 2026
@coderfender

Contributor Author

A bunch of TPC-DS queries seem to use BNLJ, which warrants regenerating the golden files.

@coderfender coderfender marked this pull request as ready for review May 28, 2026 22:17
coderfender commented May 28, 2026

Contributor Author

Benchmarks (local M5 Pro; edited after re-running the benchmarks with JIT disabled):

Running benchmark: range join (BETWEEN)
  Running case: Spark
  Stopped after 23 iterations, 2043 ms
  Running case: Comet
  Stopped after 33 iterations, 2051 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 26.3.2
Apple M5 Pro
range join (BETWEEN):                     Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                73             89          10         14.4          69.7       1.0X
Comet                                                49             62          11         21.4          46.8       1.5X

Running benchmark: inequality join (>)
  Running case: Spark
  Stopped after 21 iterations, 2038 ms
  Running case: Comet
  Stopped after 34 iterations, 2028 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 26.3.2
Apple M5 Pro
inequality join (>):                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                78             97          19         13.4          74.4       1.0X
Comet                                                48             60           6         21.8          46.0       1.6X

Running benchmark: left outer non-equi
  Running case: Spark
  Stopped after 27 iterations, 2028 ms
  Running case: Comet
  Stopped after 35 iterations, 2028 ms

OpenJDK 64-Bit Server VM 17.0.16+8-LTS on Mac OS X 26.3.2
Apple M5 Pro
left outer non-equi:                      Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Spark                                                63             75          11         16.6          60.1       1.0X
Comet                                                42             58          16         25.2          39.7       1.5X
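
For reference, the Relative column can be reproduced from the Per Row(ns) figures in the three tables above. This is a minimal sketch, assuming Relative is the baseline (Spark) per-row time divided by the per-row time of the case being compared; that assumption is consistent with the printed numbers. The object and function names are illustrative and not part of this PR.

```scala
object RelativeSpeedupSketch {
  // Per Row(ns) values copied from the tables above: (benchmark, Spark, Comet).
  val perRowNs: Seq[(String, Double, Double)] = Seq(
    ("range join (BETWEEN)", 69.7, 46.8),
    ("inequality join (>)", 74.4, 46.0),
    ("left outer non-equi", 60.1, 39.7))

  // Assumption: Relative = baseline time / candidate time, so values above 1.0
  // mean the candidate is faster than the baseline.
  def relative(baselineNs: Double, candidateNs: Double): Double =
    baselineNs / candidateNs

  // Round to one decimal place, matching the precision shown in the tables.
  def roundedRelative(baselineNs: Double, candidateNs: Double): Double =
    math.round(relative(baselineNs, candidateNs) * 10.0) / 10.0

  def main(args: Array[String]): Unit =
    perRowNs.foreach { case (name, spark, comet) =>
      println(s"$name: ${roundedRelative(spark, comet)}X")
    }
}
```

Because the ratio is taken over rounded table values, it can only be expected to match the reported column to the one decimal place shown.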

@coderfender

Contributor Author

TODO: Build vs. probe side swap for DF's BNLJ op

@coderfender

Contributor Author

Investigating test failures

@coderfender coderfender force-pushed the broadcast_nested_join_loop_support branch from eed19ae to eb3fe00 Compare May 29, 2026 22:29
@coderfender coderfender force-pushed the broadcast_nested_join_loop_support branch 2 times, most recently from 7221f56 to d5a4bf5 Compare May 29, 2026 23:35

@mbutrovich mbutrovich left a comment

Contributor


First round of feedback. Thanks @coderfender! This is looking really good!

" DataFusion's NestedLoopJoinExec does not provide. Affects: LeftOuter+BuildLeft," +
" RightOuter+BuildRight, FullOuter, LeftSemi+BuildLeft, LeftAnti+BuildLeft."

override def getSupportLevel(op: BroadcastNestedLoopJoinExec): SupportLevel =

Contributor


The support matrix here lines up exactly with Spark's BroadcastNestedLoopJoinExec.outputPartitioning supported set, and the reasoning in broadcastBuildReplicationReason is spot on. Every case returns either Compatible(None) or Unsupported(...), never Incompatible. Since isOperatorEnabled only consults the allow-incompat config on the Incompatible branch, the getOperatorAllowIncompatConfigKey("BroadcastNestedLoopJoinExec") line in the CometJoinSuite override is dead. Is that intentional, or leftover?

Contributor Author


Agreed. I was initially gating the join behind the config in the test suite to work out the support matrix through testing. I have since removed the now-redundant config.
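
The point discussed in this thread can be illustrated with a small, self-contained sketch. It assumes simplified stand-ins for `SupportLevel`, the join types, and the build sides; these are hypothetical types written for illustration, not Comet's or Spark's real definitions, and `supportLevel` and `isEnabled` are illustrative names. The unsupported combinations are the ones listed in the quoted message; treating every other combination as `Compatible(None)` follows the review comment.

```scala
object BnljSupportSketch {
  // Hypothetical, simplified stand-ins; not the project's real classes.
  sealed trait SupportLevel
  final case class Compatible(notes: Option[String]) extends SupportLevel
  final case class Incompatible(notes: Option[String]) extends SupportLevel
  final case class Unsupported(notes: Option[String]) extends SupportLevel

  sealed trait JoinType
  case object Inner extends JoinType
  case object LeftOuter extends JoinType
  case object RightOuter extends JoinType
  case object FullOuter extends JoinType
  case object LeftSemi extends JoinType
  case object LeftAnti extends JoinType

  sealed trait BuildSide
  case object BuildLeft extends BuildSide
  case object BuildRight extends BuildSide

  // Support matrix as described in the review: only Compatible or Unsupported.
  def supportLevel(joinType: JoinType, buildSide: BuildSide): SupportLevel =
    (joinType, buildSide) match {
      case (LeftOuter, BuildLeft) | (RightOuter, BuildRight) | (FullOuter, _) |
          (LeftSemi, BuildLeft) | (LeftAnti, BuildLeft) =>
        Unsupported(Some("combination listed as affected in the quoted message"))
      case _ => Compatible(None)
    }

  // Assumed gating logic: the allow-incompat flag is consulted only for Incompatible.
  def isEnabled(level: SupportLevel, allowIncompat: Boolean): Boolean =
    level match {
      case Compatible(_) => true
      case Incompatible(_) => allowIncompat
      case Unsupported(_) => false
    }

  def main(args: Array[String]): Unit = {
    val joinTypes = Seq(Inner, LeftOuter, RightOuter, FullOuter, LeftSemi, LeftAnti)
    val buildSides = Seq(BuildLeft, BuildRight)
    for (jt <- joinTypes; bs <- buildSides) {
      val level = supportLevel(jt, bs)
      // The flag never changes the outcome because Incompatible is never returned.
      println(s"$jt + $bs -> $level, enabled=${isEnabled(level, allowIncompat = false)}")
    }
  }
}
```

Under these assumptions, `isEnabled` returns the same result for every combination regardless of `allowIncompat`, which is why setting the allow-incompat key in the test suite had no effect.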

Comment thread spark/src/main/scala/org/apache/spark/sql/comet/operators.scala
Comment thread spark/src/test/scala/org/apache/comet/exec/CometJoinSuite.scala Outdated
Comment thread spark/src/test/scala/org/apache/comet/exec/CometJoinSuite.scala
@coderfender coderfender force-pushed the broadcast_nested_join_loop_support branch 2 times, most recently from e489bd8 to 4115ada Compare June 9, 2026 18:10

Development

Successfully merging this pull request may close these issues.

Support BroadcastNestedLoopJoinExec
