[fix](be) Materialize variant defaults when copying ranges #64085

Closed
eldenmoon wants to merge 1 commit into apache:master from eldenmoon:branch-doris-26104-26121

Conversation

@eldenmoon
Member

What problem does this PR solve?

Issue Number: close #0

Related PR: #0

Problem Summary: Copying a VARIANT subcolumn range could skip the source subcolumn's pending default suffix. The destination variant kept the requested logical row count while its finalized root column could remain physically shorter than the copied range. Exchange and join paths that copy blocks containing such VARIANT columns could then read missing rows, fail, or return unstable results. The fix appends the remaining default rows after copied physical parts so the logical and physical row counts stay aligned.
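The mismatch described above can be illustrated with a minimal, self-contained sketch. It is not the Doris implementation: `SketchSubcolumn`, `copy_physical_part`, and `copy_range_materialized` are hypothetical names, and a plain `std::vector<int>` stands in for the real column types. It only assumes that a subcolumn consists of physically stored rows followed by a suffix of logical default rows that have not been written yet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a VARIANT subcolumn: physically stored rows
// followed by a suffix of default rows that exist only logically.
struct SketchSubcolumn {
    std::vector<int> stored;           // rows that are physically present
    std::size_t pending_defaults = 0;  // trailing default rows not yet written

    std::size_t logical_size() const { return stored.size() + pending_defaults; }
};

// Copies only the physically stored rows that fall inside
// [start, start + length) and returns how many rows were appended.
// Used alone, this reproduces the short destination described in the summary.
inline std::size_t copy_physical_part(const SketchSubcolumn& src, std::size_t start,
                                      std::size_t length, std::vector<int>& dst) {
    const std::size_t begin = std::min(start, src.stored.size());
    const std::size_t end = std::min(start + length, src.stored.size());
    dst.insert(dst.end(), src.stored.begin() + begin, src.stored.begin() + end);
    return end - begin;
}

// Copies the physical part, then appends default values for the rows of the
// range that lie in the pending suffix, so dst grows by exactly `length`.
inline void copy_range_materialized(const SketchSubcolumn& src, std::size_t start,
                                    std::size_t length, std::vector<int>& dst,
                                    int default_value = 0) {
    assert(start + length <= src.logical_size());
    const std::size_t copied = copy_physical_part(src, start, length, dst);
    dst.insert(dst.end(), length - copied, default_value);
}
```

With three stored rows and two pending defaults, copying the four-row range starting at row 1 through `copy_physical_part` alone appends only two rows, while `copy_range_materialized` appends all four, which is the alignment between logical and physical row counts that the fix aims for.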

Release note

Fix an issue where queries using VARIANT columns through exchange or join paths could fail or return unstable results when copied VARIANT subcolumns contained pending default rows.

Check List (For Author)

  • Test
    • Unit Test: ./run-be-ut.sh --run --filter='ColumnVariantTest.insert_range_from_materializes_pending_default_suffix'
    • Build: ./build.sh --be
    • Manual test: ran a local-shuffle LEFT ANTI query in a loop (8 workers x 100 iterations); all 800 results were 0, with no ColumnVector or insert_range_from errors.
    • Manual test: a constructed complex VARIANT hash join returned identical result hashes with local shuffle enabled and disabled; with local shuffle enabled, the query loop was stable for 100 iterations.
    • Format: build-support/clang-format.sh and build-support/check-format.sh
  • Behavior changed:
    • Yes. VARIANT range copies now materialize pending default rows so copied columns remain physically aligned with logical row counts.
  • Does this need documentation?
    • No.

Copilot AI review requested due to automatic review settings June 3, 2026 12:51
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@eldenmoon
Member Author

run buildall

Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@hello-stephen
Contributor

TPC-H: Total hot run time: 29172 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 67c5920fb4dfb7d0b8e2fb6830c0f316f5a36880, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
query	run 1 (cold, ms)	run 2 (ms)	run 3 (ms)	best of runs 2-3 (hot, ms)
============================================
q1	17857	4058	4047	4047
q2	q3	10825	1350	812	812
q4	4687	478	341	341
q5	7558	877	591	591
q6	186	172	140	140
q7	770	867	654	654
q8	9406	1508	1576	1508
q9	5831	4422	4465	4422
q10	6767	1837	1564	1564
q11	441	269	249	249
q12	626	424	296	296
q13	18129	3397	2745	2745
q14	271	264	236	236
q15	q16	818	776	709	709
q17	881	861	954	861
q18	6773	5684	5560	5560
q19	1209	1199	1065	1065
q20	521	413	270	270
q21	6398	2837	2778	2778
q22	466	390	324	324
Total cold run time: 100420 ms
Total hot run time: 29172 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
query	run 1 (cold, ms)	run 2 (ms)	run 3 (ms)	best of runs 2-3 (hot, ms)
============================================
q1	5023	4691	4760	4691
q2	q3	4832	5304	4719	4719
q4	2176	2204	1411	1411
q5	4878	4800	4640	4640
q6	233	184	131	131
q7	1984	1784	1570	1570
q8	2389	2098	2151	2098
q9	7844	7508	7358	7358
q10	4706	4694	4196	4196
q11	540	380	349	349
q12	722	740	530	530
q13	3006	3365	2767	2767
q14	281	285	256	256
q15	q16	678	693	619	619
q17	1281	1261	1246	1246
q18	7296	6852	6808	6808
q19	1126	1093	1117	1093
q20	2219	2205	1932	1932
q21	5309	4553	4443	4443
q22	510	451	420	420
Total cold run time: 57033 ms
Total hot run time: 51277 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169682 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 67c5920fb4dfb7d0b8e2fb6830c0f316f5a36880, data reload: false

query	run 1 (cold, ms)	run 2 (ms)	run 3 (ms)	best of runs 2-3 (hot, ms)
query5	4312	650	475	475
query6	468	216	184	184
query7	4820	596	299	299
query8	377	242	202	202
query9	8753	4101	4072	4072
query10	479	316	270	270
query11	5929	2418	2142	2142
query12	168	109	104	104
query13	1305	618	442	442
query14	6474	5415	5115	5115
query14_1	4445	4436	4481	4436
query15	209	198	183	183
query16	998	456	411	411
query17	952	704	597	597
query18	2479	499	353	353
query19	223	188	148	148
query20	111	109	105	105
query21	220	141	125	125
query22	13652	13684	13390	13390
query23	17302	16538	16250	16250
query23_1	16205	16342	16300	16300
query24	7644	1797	1318	1318
query24_1	1306	1312	1344	1312
query25	591	474	417	417
query26	1348	342	173	173
query27	2618	600	336	336
query28	4501	2060	2034	2034
query29	1093	627	510	510
query30	318	242	203	203
query31	1125	1101	962	962
query32	116	64	62	62
query33	530	333	287	287
query34	1209	1186	656	656
query35	782	778	695	695
query36	1382	1397	1255	1255
query37	156	105	90	90
query38	3212	3136	3052	3052
query39	946	927	906	906
query39_1	887	859	868	859
query40	224	123	101	101
query41	65	63	67	63
query42	92	95	97	95
query43	324	334	282	282
query44	
query45	200	187	182	182
query46	1056	1213	769	769
query47	2364	2363	2169	2169
query48	410	440	295	295
query49	641	462	365	365
query50	954	358	264	264
query51	4341	4306	4285	4285
query52	91	91	84	84
query53	240	274	190	190
query54	274	220	194	194
query55	83	77	71	71
query56	254	234	218	218
query57	1439	1379	1306	1306
query58	243	208	217	208
query59	1619	1696	1500	1500
query60	280	253	250	250
query61	162	159	165	159
query62	695	665	574	574
query63	236	190	187	187
query64	2580	791	631	631
query65	
query66	1825	479	345	345
query67	29673	29788	29526	29526
query68	
query69	430	307	264	264
query70	979	926	927	926
query71	289	212	214	212
query72	2998	2737	2452	2452
query73	855	794	433	433
query74	5137	4945	4771	4771
query75	2659	2624	2244	2244
query76	2321	1165	785	785
query77	349	372	286	286
query78	12382	12445	11891	11891
query79	1474	1023	774	774
query80	1001	488	397	397
query81	510	272	239	239
query82	735	159	122	122
query83	359	277	258	258
query84	318	147	115	115
query85	963	530	442	442
query86	427	303	288	288
query87	3404	3337	3216	3216
query88	3703	2779	2810	2779
query89	447	369	332	332
query90	1838	182	173	173
query91	180	180	139	139
query92	67	64	55	55
query93	1440	1496	922	922
query94	641	361	315	315
query95	681	391	436	391
query96	1097	847	344	344
query97	2688	2691	2545	2545
query98	211	206	220	206
query99	1165	1190	1098	1098
Total cold run time: 252159 ms
Total hot run time: 169682 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (4/4) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.84% (27462/38226)
Line Coverage 55.43% (294123/530626)
Region Coverage 52.14% (245245/470330)
Branch Coverage 53.38% (106129/198824)

eldenmoon closed this Jun 4, 2026
3 participants