Replace ArrayDataBuilder with GenericListArray in ListArrayDecoder#9497
Replace ArrayDataBuilder with GenericListArray in ListArrayDecoder#9497liamzwbao wants to merge 1 commit intoapache:mainfrom
ArrayDataBuilder with GenericListArray in ListArrayDecoder#9497Conversation
|
Any particular benchmark you want me to run? |
alamb
left a comment
There was a problem hiding this comment.
Thanks @liamzwbao -- this looks good to me -- I have just one question
| // Safety | ||
| // Validated lengths above | ||
| Ok(unsafe { data.build_unchecked() }) | ||
| let array = GenericListArray::<O>::try_new(field, offsets, values, nulls)?; |
There was a problem hiding this comment.
Does try_new validate the offsets ? That could be a significant performance hit
Basically as long as this doesn't do additional validation I think it looks good to me
There was a problem hiding this comment.
Looks like it will do validation, but I think it's cheap and the local benchmark does not cause a regression. BTW, we don't have an unchecked API for ListArray, the new API simply unwrap the try_new.
arrow-rs/arrow-array/src/array/list_array.rs
Lines 205 to 251 in e4b68e6
| // Validated lengths above | ||
| Ok(unsafe { data.build_unchecked() }) | ||
| let array = GenericListArray::<O>::try_new(field, offsets, values, nulls)?; | ||
| Ok(array.into_data()) |
There was a problem hiding this comment.
I think this call simply creates an ArrayData (which is necessary given the API) so I am not sure it actually avoids any ArrayDatas
In order to avoid ArrayData we would probably need to change the signature of decode to return an ArrayRef directly (rather than ArrayData)
There was a problem hiding this comment.
Makes sense. I can change the signature of all the decoders if you think it's worth a try
This one should be good: https://github.com/apache/arrow-rs/blob/main/arrow-json/benches/json-reader.rs |
|
Hi @alamb, just noticed that we don't have benchmark for List yet. I will create one and get back to you later |
Which issue does this PR close?
ArrayDatawith direct Array construction, when possible #9298.Rationale for this change
While implementing
ListViewArrayDecoderin arrow-json, I noticed we could potentially retireArrayDataBuilderinsideListArrayDecoder. Therefore, I'd like to use a small PR here to make sure there's no regressionWhat changes are included in this PR?
Replace
ArrayDataBuilderwithGenericListArrayinListArrayDecoderAre these changes tested?
Covered by existing tests
Are there any user-facing changes?
No