feat: Add support for MapArray in arrow_row#9486
feat: Add support for MapArray in arrow_row#9486rluvaton wants to merge 4 commits intoapache:mainfrom
MapArray in arrow_row#9486Conversation
|
@Jefffrey can you please review? |
| /// Lists are encoded by first encoding all child elements to the row format. | ||
| /// | ||
| /// A list value is then encoded as the concatenation of each of the child elements, | ||
| /// the Map encoding is the same with the only difference being that the child elements are key-value pairs |
There was a problem hiding this comment.
Nit: capitalization & period.
| Ok(Self::List(converter)) | ||
| } | ||
| DataType::Map(f, _) => { | ||
| // The encoded contents will be inverted if descending is set to true |
There was a problem hiding this comment.
Nit: backticks around descending (for consitency) + period.
| DataType::Map(f, _) => { | ||
| // The encoded contents will be inverted if descending is set to true | ||
| // As such we set `descending` to false and negate nulls first if it | ||
| // it set to true |
There was a problem hiding this comment.
"it" seems ambiguous -- is it nulls_first or descending? Suggestion: "if descending was set to true".
| SortField::new_with_options(struct_field.data_type().clone(), options) | ||
| }) | ||
| .collect::<Vec<_>>(); | ||
| assert_eq!(fields.len(), 2); |
There was a problem hiding this comment.
Why is this safe? (in contrast with returning an Err)
|
|
||
| #[derive(Clone, Copy)] | ||
| enum GenerateAllUniqueNullBehavior { | ||
| AllowUpToSingleNull { valid_percent: f64 }, |
There was a problem hiding this comment.
Is probalistically adding a Null actually better than always inserting one?
| .map(|_| { | ||
| let mut value; | ||
|
|
||
| loop { |
There was a problem hiding this comment.
If you pass K=Int8Type with len=257 then you end up with an infinite loop.
| loop { | ||
| match null_behavior { | ||
| GenerateAllUniqueNullBehavior::AllValid => { | ||
| value = Some(rng.random()); |
There was a problem hiding this comment.
you can factor out value = outside the match.
Jefffrey
left a comment
There was a problem hiding this comment.
Generally looks good to me
Which issue does this PR close?
MapArrayinRowConverter#7879.Rationale for this change
Allow using arrow_row with more types
What changes are included in this PR?
Reused code from list encoding for MapArray by adding some generics
Are these changes tested?
Yes
Are there any user-facing changes?
Yes, MapArray is now supported in row_format