Add to_binary methods for integer types #21613

Open
mcrumiller opened this issue Mar 5, 2025 · 0 comments
Labels
enhancement New feature or an improvement of an existing feature

mcrumiller commented Mar 5, 2025

Description

See #21549 (comment).

Currently, casting most data types to binary goes through pl.String, as in:

>>> x = pl.Series([0, 2**16-1], dtype=pl.UInt16)
>>> x.cast(pl.Binary)
shape: (2,)
Series: '' [binary]
[
        b"0"
        b"65535"
]

NumPy exposes the raw bytes through .tobytes() (one flat buffer) or .view(np.void) (one binary value per element):

>>> x = np.array([0, 2**16-1], dtype=np.uint16)
>>> x.tobytes()
b'\x00\x00\xff\xff'
>>> x.view(np.void)
array([b'\x00\x00', b'\xFF\xFF'], dtype='|V2')

It would be very useful, in the many contexts that require binary formats, to convert a series to the raw bytes of its values, as in:

>>> x = pl.Series([0, 2**16-2], dtype=pl.UInt16)
>>> x.to_binary()
shape: (2,)
Series: '' [binary]
[
        b"\x00\x00"
        b"\xfe\xff"
]
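
To make the requested behavior concrete, here is a minimal pure-Python sketch of the difference between the current string-based cast and the proposed raw-byte conversion. The helper names `to_binary_sketch` and `cast_via_string` are hypothetical and are not part of Polars; the little-endian byte order and the fixed 2-byte width (matching UInt16) are assumptions.

```python
def to_binary_sketch(values, width=2, byteorder="little"):
    """Hypothetical stand-in for the proposed to_binary():
    each integer becomes its fixed-width raw bytes.
    width=2 matches UInt16; byteorder is an assumption."""
    return [v.to_bytes(width, byteorder) for v in values]


def cast_via_string(values):
    """Mimics the current cast, which goes through the decimal
    string representation of each value."""
    return [str(v).encode("ascii") for v in values]


values = [0, 2**16 - 2]
raw = to_binary_sketch(values)   # two bytes per element
text = cast_via_string(values)   # ASCII digits, variable length

print(raw)
print(text)
```

The raw form always has the same length per element, which is what binary file formats and wire protocols generally expect; the string form varies with the number of decimal digits.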

As an alternative, Expr.reinterpret could be enhanced to allow for pl.Binary, although the input arguments would probably have to be amended.
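
As a rough illustration of why the arguments would need to change: a reinterpret keeps the bits and only changes how they are read. The sketch below uses the standard-library `struct` module rather than Polars, and assumes that the existing reinterpret only lets the caller choose between a signed and an unsigned integer of the same width, so it has no way to name pl.Binary as the target.

```python
import struct

# The two raw bytes behind the UInt16 value 65534 (little-endian assumed).
bits = struct.pack("<H", 2**16 - 2)

# Same bits read as a signed 16-bit integer: what a signed reinterpret yields.
as_signed = struct.unpack("<h", bits)[0]

# Same bits left as-is: what a reinterpret to binary would yield.
as_bytes = bytes(bits)

print(as_signed)
print(as_bytes)
```

A target-dtype argument (rather than a signed/unsigned switch) would be one way to cover both cases, though that is only a guess at how the API could be amended.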

@mcrumiller mcrumiller added the enhancement New feature or an improvement of an existing feature label Mar 5, 2025
@deanm0000 deanm0000 added this to the 2.0.0 milestone Mar 6, 2025