
fix: array_size returns -1 instead of null for null input #4578

Merged
andygrove merged 4 commits into apache:main from andygrove:fix-array-size-null on Jun 3, 2026

@andygrove
Member

Which issue does this PR close?

Closes #4560.

Rationale for this change

array_size(NULL) should return NULL, but Comet returned -1.

array_size is RuntimeReplaceable and lowers to Size(child, legacySizeOfNull = false). With legacySizeOfNull = false, Spark returns NULL for a NULL input; with legacySizeOfNull = true it returns -1.
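The null handling described above can be sketched in plain Scala. This is a simplified model for illustration only: `SizeModel` and `SizeNullSemantics` are hypothetical stand-ins, not the Catalyst `Size` class, and `Option` is used to represent SQL NULL.

```scala
// Simplified stand-in for Spark's Size expression; NOT the real Catalyst class.
// None models SQL NULL.
final case class SizeModel(legacySizeOfNull: Boolean) {
  def eval(input: Option[Seq[Int]]): Option[Int] = input match {
    case Some(values) => Some(values.length)
    // NULL input: -1 in legacy mode, NULL otherwise.
    case None => if (legacySizeOfNull) Some(-1) else None
  }
}

object SizeNullSemantics {
  // array_size pins the flag to false, as the PR description states.
  def arraySize(input: Option[Seq[Int]]): Option[Int] =
    SizeModel(legacySizeOfNull = false).eval(input)

  def main(args: Array[String]): Unit = {
    println(SizeModel(legacySizeOfNull = true).eval(None)) // legacy size(NULL)
    println(arraySize(None))                               // array_size(NULL)
    println(arraySize(Some(Seq(1, 2, 3))))                 // array_size of a 3-element array
  }
}
```

In this model, only the flag carried by the expression instance decides the result for a NULL input, which is the property the fix relies on.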

Comet's Size serde built the else branch of its CASE WHEN from the session conf SQLConf.get.legacySizeOfNull rather than from the expression's own legacySizeOfNull field. Since spark.sql.legacy.sizeOfNull defaults to true, the else literal became -1 even though array_size had explicitly set the expression flag to false. The query runs natively via CometProject, so there was no fallback to Spark; the query simply returned the wrong value.

What changes are included in this PR?

CometSize.convert now reads expr.legacySizeOfNull instead of SQLConf.get.legacySizeOfNull. This honors the flag baked into each Size instance, which already accounts for both the array_size override (always false) and the ANSI handling Spark applies when constructing the expression. The regular size and cardinality paths are unaffected because their expressions carry the resolved session value.
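As a rough illustration of the difference between reading the session conf and reading the expression flag, the sketch below models only the choice of the else-branch literal. All names here (`SizeExpr`, `SizeSerdeSketch`, `elseLiteralBefore`, `elseLiteralAfter`) are hypothetical and do not correspond to Comet's actual serde code; `None` stands for a NULL literal.

```scala
// Hypothetical model of a Size expression that carries its own flag.
final case class SizeExpr(legacySizeOfNull: Boolean)

object SizeSerdeSketch {
  // Before the fix (modelled): the expression's flag is ignored and the
  // session-level value decides the else literal. That is the bug.
  def elseLiteralBefore(expr: SizeExpr, sessionLegacySizeOfNull: Boolean): Option[Int] =
    if (sessionLegacySizeOfNull) Some(-1) else None

  // After the fix (modelled): only the flag baked into the expression is consulted.
  def elseLiteralAfter(expr: SizeExpr): Option[Int] =
    if (expr.legacySizeOfNull) Some(-1) else None

  def main(args: Array[String]): Unit = {
    // array_size lowers to a Size with the flag forced to false.
    val fromArraySize = SizeExpr(legacySizeOfNull = false)
    // Assumed session state: spark.sql.legacy.sizeOfNull left at its default of true.
    val sessionValue = true

    println(elseLiteralBefore(fromArraySize, sessionValue)) // wrong: -1
    println(elseLiteralAfter(fromArraySize))                // correct: NULL

    // A plain size(...) call carries the resolved session value, so both agree.
    val fromSize = SizeExpr(legacySizeOfNull = sessionValue)
    println(elseLiteralBefore(fromSize, sessionValue) == elseLiteralAfter(fromSize))
  }
}
```

The two functions disagree only when the expression's flag differs from the session value, which matches the PR's claim that the regular size and cardinality paths are unaffected.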

How are these changes tested?

Added a test in CometArrayExpressionSuite asserting that array_size returns NULL for a NULL array input under both spark.sql.legacy.sizeOfNull=false and spark.sql.legacy.sizeOfNull=true. The existing test "size - respect to legacySizeOfNull" continues to pass, confirming the size path is unchanged.

Contributor

@comphead comphead left a comment


Thanks @andygrove

@andygrove andygrove merged commit 755dcd2 into apache:main Jun 3, 2026
133 of 134 checks passed
@hsiang-c
Contributor

hsiang-c commented Jun 3, 2026

Thanks @andygrove


Development

Successfully merging this pull request may close these issues.

array_size returns -1 instead of NULL for NULL input

3 participants