feat: add GroupColumn support for List / Struct / Map / FixedSizeList in multi-column GROUP BY

### Is your feature request related to a problem or challenge?

When a `GROUP BY` includes any column with a nested type (`List`, `LargeList`, `FixedSizeList`, `Struct`, `Map`), `multi_group_by::supported_type` returns false for that column and the whole grouping falls back to the byte-encoded `GroupValuesRows` path via `arrow_row::Rows`, even if every other column would be supported by `GroupValuesColumn`. Today's `supported_type` ([source](https://github.com/apache/datafusion/blob/main/datafusion/physical-plan/src/aggregates/group_values/multi_group_by/mod.rs#L1213-L1239)):

```rust
fn supported_type(data_type: &DataType) -> bool {
    matches!(*data_type,
        DataType::Int8 | Int16 | ... | Float64
        | Decimal128(_, _)
        | Utf8 | LargeUtf8 | Utf8View
        | Binary | LargeBinary | BinaryView
        | Date32 | Date64 | Time32(_) | Timestamp(_, _)
        | Boolean
    )
}
```

### Impact

A wide multi-column `GROUP BY` where most columns are individually supported by `GroupValuesColumn` (Utf8 / Date / Boolean / Float) but one column is a nested type (e.g. `LargeList<Struct<...>>`) is forced onto the slow `arrow_row::Rows` path. Group key memory grows materially compared to the equivalent column-format storage, and per-group comparisons become byte-by-byte rather than vectorized.

### Describe the solution you'd like

Add `GroupColumn` implementations for the missing nested types and extend `supported_type` accordingly. Equality is well-defined by Arrow's `make_equal`-style comparators; hashing of nested values has existing arrow utilities.

Suggested incremental order, easiest first:

1. `FixedSizeList<T>` where `T` is a `GroupColumn`-supported primitive (fixed-width per row, simplest)
2. `List<T>` and `LargeList<T>` with primitive child (variable-length, single offset buffer)
3. `Struct<...>` with all fields being `GroupColumn`-supported (composition of existing columns under a single null mask)
4. `Map<K, V>` (most complex due to entry ordering semantics)

Each can be its own PR. The existing `arrow_row::Rows` path remains the fallback for any nested type not yet supported, so this is purely an optimization.

### Describe alternatives you've considered

- **arrow_row fallback (current state)**: correct, but significantly more memory and CPU than `GroupValuesColumn` when the grouping mostly consists of cheap columns and the nested column is one of many.
- **Avoid grouping by the nested column at the SQL level**: works case-by-case (e.g. wrap in `ANY_VALUE`) but does not help when the nested column is the legitimate group key.

### Additional context

Issue #22526 is a separate, related pathology on the emit side. This issue is about the input/intern side (which `GroupValues` impl is selected). The two are independent and can land in either order.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add GroupColumn support for List / Struct / Map / FixedSizeList in multi-column GROUP BY #22682

Is your feature request related to a problem or challenge?

Impact

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

feat: add GroupColumn support for List / Struct / Map / FixedSizeList in multi-column GROUP BY #22682

Description

Is your feature request related to a problem or challenge?

Impact

Describe the solution you'd like

Describe alternatives you've considered

Additional context

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions