Mark BufferExec and AnalyzeExec as eager#22711
Open
geoffreyclaude wants to merge 3 commits into
Open
Conversation
298245c to
aab6e33
Compare
gabotechs
approved these changes
Jun 2, 2026
Contributor
gabotechs
left a comment
There was a problem hiding this comment.
👍 +1 from me, this looks more correct now.
Comment on lines
-1220
to
+1249
| plan.properties().evaluation_type == EvaluationType::Eager | ||
| if let Some(repartition) = plan.downcast_ref::<RepartitionExec>() { | ||
| !matches!(repartition.partitioning(), Partitioning::RoundRobinBatch(_)) | ||
| } else if let Some(coalesce) = plan.downcast_ref::<CoalescePartitionsExec>() { | ||
| coalesce.input().output_partitioning().partition_count() > 1 | ||
| } else if let Some(sort_preserving_merge) = | ||
| plan.downcast_ref::<SortPreservingMergeExec>() | ||
| { | ||
| sort_preserving_merge | ||
| .input() | ||
| .output_partitioning() | ||
| .partition_count() | ||
| > 1 | ||
| } else { | ||
| false | ||
| } |
Contributor
There was a problem hiding this comment.
Some operators that are correctly marked as Eager, but that this function was wrongly treating them as need_data_exchange() -> true:
Contributor
Author
There was a problem hiding this comment.
need_data_exchange was pretty broken then... As discussed privately, we should look into deprecating then deleting this method, as probably no-one is using it (at least correctly.)
Contributor
|
I'll leave this here marinating for a day so that other people can chime in, otherwise, I'll merge it tomorrow |
aab6e33 to
334cb57
Compare
99f62b2 to
334cb57
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Which issue does this PR close?
Rationale for this change
BufferExecandAnalyzeExecboth have eager evaluation behavior: they can drive child streams in spawned tasks instead of performing exactly one downstream poll worth of work at a time. TheirPlanPropertiescurrently reportEvaluationType::Lazy, so physical-plan metadata does not describe how these operators execute.While making that metadata accurate, this PR also keeps
need_data_exchange(...)scoped to its original purpose. #4585 introduced that helper as a way to identify physical operators that require exchange-style handling because they redistribute partitions or gather multiple input partitions into one output partition. #4586 implemented it for the native exchange/gather operators called out there: non-round-robinRepartitionExec,CoalescePartitionsExec, andSortPreservingMergeExec.The later cooperative-scheduling work in #16398 introduced
EvaluationTypeand madeneed_data_exchange(...)useevaluation_type == EvaluationType::Eager. That shortcut worked when the eager operators and exchange/gather operators were the same set. WithBufferExecandAnalyzeExeccorrectly classified as eager, the shortcut would makeneed_data_exchange(...)report operators that do eager child polling but do not perform a data exchange.So the intended split in this PR is:
EvaluationTypedescribes execution behavior.need_data_exchange(...)identifies partition redistribution or partition gathering.What changes are included in this PR?
BufferExecasEvaluationType::Eagerwhile keeping its existing cooperative scheduling metadata.AnalyzeExecasEvaluationType::Eagerin its computed plan properties.EvaluationTypedocs so eager evaluation is not defined by whether work starts inexecuteor on the first stream poll.need_data_exchange(...)as an exchange/gather predicate for the native exchange operators instead of deriving it from all eager operators.Are these changes tested?
Targeted tests pass:
I also added a focused regression test that
BufferExecdoes not require data exchange. I did not add tests that merely assert the assignedEvaluationType, as those would duplicate the implementation rather than cover behavior.Are there any user-facing changes?
No query-result changes are expected. This updates physical-plan metadata and keeps
need_data_exchange(...)scoped to operators that move data between partitions or gather partitions together.