54.0.0 (2024-12-18)
Breaking changes:
- avoid redundant parsing of repeated value in RleDecoder #6834 [parquet] (jp0317)
- Handling nullable DictionaryArray in CSV parser #6830 [arrow] (edmondop)
- fix(flightsql): remove Any encoding of DoPutUpdateResult #6825 [arrow] [arrow-flight] (davisp)
- arrow-ipc: Default to not preserving dict IDs #6788 [arrow] (brancz)
- Remove some very old deprecated functions #6774 [parquet] [arrow] (alamb)
- update to pyo3 0.23.0 #6745 [arrow] (psvri)
- Remove APIs deprecated since v 4.4.0 #6722 [arrow] [arrow-flight] (findepi)
- Return None when Parquet page indexes are not present in file #6639 [parquet] (etseidl)
- Add ParquetError::NeedMoreData mark ParquetError as non_exhaustive #6630 [parquet] (etseidl)
- Remove APIs deprecated since v 2.0.0 #6609 [arrow] (findepi)
Implemented enhancements:
- Parquet schema hint doesn't support integer types upcasting #6891 [parquet]
- Parquet UTF-8 max statistics are overly pessimistic #6867 [parquet]
- Add builder support for Int8 keys #6844 [arrow]
- Formalize the name of the nested Field in a list #6784 [parquet] [arrow] [arrow-flight]
- Allow disabling the writing of Parquet Offset Index #6778 [parquet]
- parquet::record::make_row is not exposed to users, leaving no option to users to manually create Row objects #6761 [parquet]
- Avoid from_num_days_from_ce_opt calls in timestamp_s_to_datetime if we don't need #6746 [arrow]
- Support Temporal -> Utf8View casting #6734 [arrow]
- Add Option To Coerce List Type on Parquet Write #6733 [parquet] [arrow]
- Support Numeric -> Utf8View casting #6714 [arrow]
- Support Utf8View <=> boolean casting #6713 [arrow]
Fixed bugs:
- Buffer::bit_slice loses length with byte-aligned offsets #6895 [arrow]
- parquet arrow writer doesn't track memory size correctly for fixed sized lists #6839 [parquet]
- Casting Decimal128 to Decimal128 with smaller precision produces incorrect results in some cases #6833 [arrow]
- Should empty nullable dictionary be parsed as null from arrow-csv? #6821 [arrow]
- Array take doesn't make fields nullable #6809
- Arrow Flight Encodes a Slice's List Offsets If the slice offset starts with zero #6803 [arrow]
- Parquet readers incorrectly interpret legacy nested lists #6756 [parquet]
- filter_bits under-allocates resulting boolean buffer #6750 [arrow]
- Multi-language support issues with Arrow FlightSQL client's execute_update and execute_ingest methods #6545 [arrow] [arrow-flight]
Documentation updates:
- Should we document at what rate deprecated APIs are removed? #6851 [parquet] [arrow]
- Fix docstring for Format::with_header in arrow-csv #6856 [arrow] (kylebarron)
- Add deprecation / API removal policy #6852 [parquet] [arrow] (alamb)
- Minor: add example for creating SchemaDescriptor #6841 [parquet] (alamb)
- chore: enrich panic context when BooleanBuffer fails to create #6810 [arrow] (tisonkun)
Closed issues:
- [FlightSQL] GetCatalogsBuilder does not sort the catalog names #6807 [arrow] [arrow-flight]
- Add a lint to automatically check for unused dependencies #6796 [arrow] [arrow-flight]
Merged pull requests:
- doc: add comment for timezone string #6899 [arrow] (xxchan)
- docs: fix typo #6890 [arrow] (rluvaton)
- Minor: Fix deprecation notice for arrow_to_parquet_schema #6889 [parquet] (etseidl)
- Add Field::with_dict_is_ordered #6885 [arrow] (alamb)
- Deprecate "max statistics size" property in WriterProperties #6884 [parquet] (etseidl)
- Add deprecation warnings for everything related to dict_id #6873 [parquet] [arrow] [arrow-flight] (brancz)
- Enable matching temporal as from_type to Utf8View #6872 [arrow] (Kev1n8)
- Enable string-based column projections from Parquet files #6871 [parquet] (etseidl)
- Improvements to UTF-8 statistics truncation #6870 [parquet] (etseidl)
- fix: make GetCatalogsBuilder sort catalog names #6864 [arrow] [arrow-flight] (niebayes)
- add buffered data_pages to parquet column writer total bytes estimation #6862 [parquet] (onursatici)
- Update prost-build requirement from =0.13.3 to =0.13.4 #6860 [arrow] [arrow-flight] (dependabot[bot])
- Minor: add comments explaining bad MSRV, output in json #6857 (alamb)
- perf: Use Cow in get_format_string in FFI_ArrowSchema #6853 [arrow] (andygrove)
- chore: add cast_decimal benchmark #6850 [arrow] (andygrove)
- arrow-array::builder: support Int8, Int16 and Int64 keys #6845 [arrow] (ajwerner)
- Add ArrowToParquetSchemaConverter, deprecate arrow_to_parquet_schema #6840 [parquet] (alamb)
- Remove APIs deprecated in 50.0.0 #6838 [arrow] (findepi)
- fix: decimal conversion loses value on lower precision #6836 [arrow] (himadripal)
- Update sysinfo requirement from 0.32.0 to 0.33.0 #6835 [parquet] (dependabot[bot])
- Optionally coerce names of maps and lists to match Parquet specification #6828 [parquet] (etseidl)
- Remove deprecated unary_dyn and try_unary_dyn #6824 [arrow] (findepi)
- Remove deprecated flight_data_from_arrow_batch #6823 [arrow] [arrow-flight] (findepi)
- [arrow-cast] Support cast boolean from/to string view #6822 [arrow] (tlm365)
- Hook up Avro Decoder #6820 [arrow] (tustvold)
- Fix arrow-avro compilation without default features #6819 [arrow] (findepi)
- Support shrink to empty #6817 [arrow] (tustvold)
- [arrow-cast] Support cast numeric to string view (alternate) #6816 [arrow] (alamb)
- Hide implicit optional dependency features in arrow-flight #6806 [arrow] [arrow-flight] (findepi)
- fix: Encoding of List offsets was incorrect when slice offsets begin with zero #6805 [arrow] (HawaiianSpork)
- Enable unused_crate_dependencies Rust lint, remove unused dependencies #6804 [arrow] [arrow-flight] (findepi)
- Minor: Fix docstrings for ColumnProperties::statistics_enabled property #6798 [parquet] (etseidl)
- Add option to disable writing of Parquet offset index #6797 [parquet] (etseidl)
- Remove unused dependencies #6792 [arrow] [arrow-flight] (findepi)
- Add Array::shrink_to_fit(&mut self) #6790 [arrow] (emilk)
- Formalize the default nested list field name to item #6785 [parquet] [arrow] [arrow-flight] (gruuya)
- Improve UnionArray logical_nulls tests #6781 [arrow] (gstvg)
- Improve list builder usage example in docs #6775 [arrow] (findepi)
- Update proc-macro2 requirement from =1.0.89 to =1.0.92 #6772 [arrow] [arrow-flight] (dependabot[bot])
- Allow NullBuffer construction directly from array #6769 [parquet] [arrow] (findepi)
- Include license and notice files in published crates #6767 [parquet] [arrow] [arrow-flight] (ankane)
- fix: remove redundant bit_util::ceil #6766 [arrow] (miroim)
- Remove 'make_row', expose a 'Row::new' method instead. #6763 [parquet] (jonded94)
- Read nested Parquet 2-level lists correctly #6757 [parquet] (etseidl)
- Split timestamp_s_to_datetime to date and time to avoid unnecessary computation #6755 [arrow] (jayzhan211)
- More trivial implementation of Box<dyn AsyncArrowWriter> and Box<dyn AsyncArrowReader> #6748 [parquet] (ethe)
- Update cache action to v4 #6744 (findepi)
- Remove redundant implementation of StringArrayType #6743 [arrow] (tlm365)
- Fix Dictionary logical nulls for RunArray/UnionArray Values #6740 [arrow] (findepi)
- Allow reading Parquet maps that lack a values field #6730 [parquet] (etseidl)
- Improve default implementation of Array::is_nullable #6721 [arrow] (findepi)
- Fix Buffer::bit_slice losing length with byte-aligned offsets #6707 [arrow] [arrow-flight] (itsjunetime)
* This Changelog was automatically generated by github_changelog_generator