Skip to content

feat: Add support for MapArray in arrow_row#9486

Open
rluvaton wants to merge 4 commits intoapache:mainfrom
rluvaton:add-support-for-map-in-row-format
Open

feat: Add support for MapArray in arrow_row#9486
rluvaton wants to merge 4 commits intoapache:mainfrom
rluvaton:add-support-for-map-in-row-format

Conversation

@rluvaton
Copy link
Member

Which issue does this PR close?

Rationale for this change

Allow using arrow_row with more types

What changes are included in this PR?

Reused code from list encoding for MapArray by adding some generics

Are these changes tested?

Yes

Are there any user-facing changes?

Yes, MapArray is now supported in row_format

@github-actions github-actions bot added the arrow Changes to the arrow crate label Feb 26, 2026
@rluvaton rluvaton marked this pull request as ready for review February 26, 2026 16:03
@rluvaton
Copy link
Member Author

@Jefffrey can you please review?
apache/datafusion#15428 (comment)

/// Lists are encoded by first encoding all child elements to the row format.
///
/// A list value is then encoded as the concatenation of each of the child elements,
/// the Map encoding is the same with the only difference being that the child elements are key-value pairs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: capitalization & period.

Ok(Self::List(converter))
}
DataType::Map(f, _) => {
// The encoded contents will be inverted if descending is set to true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: backticks around descending (for consitency) + period.

DataType::Map(f, _) => {
// The encoded contents will be inverted if descending is set to true
// As such we set `descending` to false and negate nulls first if it
// it set to true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"it" seems ambiguous -- is it nulls_first or descending? Suggestion: "if descending was set to true".

SortField::new_with_options(struct_field.data_type().clone(), options)
})
.collect::<Vec<_>>();
assert_eq!(fields.len(), 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this safe? (in contrast with returning an Err)


#[derive(Clone, Copy)]
enum GenerateAllUniqueNullBehavior {
AllowUpToSingleNull { valid_percent: f64 },
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is probalistically adding a Null actually better than always inserting one?

.map(|_| {
let mut value;

loop {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you pass K=Int8Type with len=257 then you end up with an infinite loop.

loop {
match null_behavior {
GenerateAllUniqueNullBehavior::AllValid => {
value = Some(rng.random());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can factor out value = outside the match.

Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support MapArray in RowConverter

3 participants