Tracking Issue for integer formatting into a fixed-size buffer #138215

Open
1 of 4 tasks
hanna-kruppe opened this issue Mar 8, 2025 · 6 comments

@hanna-kruppe
Contributor

hanna-kruppe commented Mar 8, 2025

Feature gate: #![feature(int_format_into)]

This is a tracking issue for efficient decimal integer formatting into a fixed-size buffer.

Public API

// core::num

struct NumBuffer { .. }
impl NumBuffer {
    fn new() -> Self;
}

impl $Int {
    fn format_into(self, &mut NumBuffer) -> &str;
}

Steps / History

Unresolved Questions

Footnotes

  1. https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

@hanna-kruppe hanna-kruppe added C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Mar 8, 2025
@hanna-kruppe
Contributor Author

The implementation should be a straightforward refactoring of the existing code, but I don't know if and when I'll have time to do that, so I'd welcome someone else picking it up.

@rustbot label E-easy E-help-wanted

@rustbot rustbot added E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. E-help-wanted Call for participation: Help is requested to fix this issue. labels Mar 8, 2025
@madhav-madhusoodanan
Contributor

Hi, I'd love to implement this
@rustbot claim

@Kijewski
Contributor

Kijewski commented Mar 8, 2025

In my experience with itoa, one problem with using equally sized buffers for all integer types from u8 to u128 is that a memcpy is generated even when there is no need for one. A u8 never stringifies to 20 bytes, and it does not benefit from "smarter" copy implementations. A 3-byte buffer for u8, a 4-byte buffer for i8, … would generate much better machine code. I expect the same to hold for this feature.
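To make the sizes mentioned above concrete, here is a minimal sketch that computes the longest decimal representation for an integer type of a given width. The function name `max_decimal_len` is made up for this illustration and is not part of the proposed API or of itoa.

```rust
/// Hypothetical helper: the maximum number of bytes needed to print an
/// integer of the given bit width in decimal, including a leading '-'
/// for signed types. Assumes `bits` is between 8 and 128.
const fn max_decimal_len(bits: u32, signed: bool) -> usize {
    // Largest magnitude to print: 2^(bits-1) for signed (the MIN value),
    // 2^bits - 1 for unsigned (the MAX value).
    let mut magnitude: u128 = if signed {
        1u128 << (bits - 1)
    } else {
        u128::MAX >> (128 - bits)
    };
    // Signed types need one extra byte for the minus sign.
    let mut len = if signed { 1 } else { 0 };
    while magnitude > 0 {
        magnitude /= 10;
        len += 1;
    }
    len
}
```

Under these assumptions the helper gives 3 for u8, 4 for i8, 20 for u64, and 40 for i128, which is the spread a single shared buffer has to cover.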

@hanna-kruppe
Contributor Author

As I understand it, the main problem with the memcpy is that the number of bytes to copy is variable (determined at runtime), not that the underlying buffer is oversized. A u32 may turn into anything between one and 10 ASCII digits, so there’s no nice, fixed sequence of loads and stores to emit instead of a memcpy call. It’s possible to come up with marginally better inline code for specific length bounds (an oversized buffer can actually be helpful for this), but the real solution is to change the algorithm to avoid the variable-length copy to begin with. If you compute the number of digits up-front, you can generate digits directly into the right place of the destination buffer and don’t need any memcpy.
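A rough sketch of the "compute the number of digits up-front" idea for u32 might look like the following. The names `decimal_len` and `write_u32` are hypothetical and this is not the standard library's implementation; it only illustrates that once the length is known, each digit can be stored at its final position with no copy afterwards.

```rust
/// Number of decimal digits in `n` (zero still prints as one digit).
fn decimal_len(n: u32) -> usize {
    // `checked_ilog10` returns `None` for zero.
    n.checked_ilog10().map_or(1, |log| log as usize + 1)
}

/// Writes `n` in decimal at the start of `dst` and returns the written part.
/// Ten bytes are enough for any u32.
fn write_u32(mut n: u32, dst: &mut [u8; 10]) -> &str {
    let len = decimal_len(n);
    // Digits come out least-significant first, so fill the known-length
    // prefix from its last byte toward its first.
    for slot in dst[..len].iter_mut().rev() {
        *slot = b'0' + (n % 10) as u8;
        n /= 10;
    }
    // Only ASCII digits were written, so this cannot fail.
    core::str::from_utf8(&dst[..len]).expect("ASCII digits are valid UTF-8")
}
```

The cost moves into the digit count, which is the kind of trade-off that makes this a more invasive change than reusing the existing formatting loop.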

However, that would be a more invasive change to the current standard library implementation and to the API sketch that was accepted in the ACP. Of course I’d be happy to see an even better solution for this functionality, but after thinking about it for quite some time, I’m not sure if this is possible without tough trade-offs. So I’d rather see Rust ship something that’s going to work well for the libraries that happily use itoa today than never ship anything because perfection is hard.

@madhav-madhusoodanan
Contributor

> Implementation should be a straightforward refactoring of the existing code, but I don't know if and when I'll have time to do that, so I'd welcome if someone else wants to implement it.

By "refactoring", did you mean actually creating an implementation from scratch?
Also, as I understand it, there isn't an existing Buffer implementation, so we'd need to create the NumBuffer type, right?

Looking forward to learning a lot by implementing this feature.

@hanna-kruppe
Contributor Author

hanna-kruppe commented Mar 9, 2025

I mean that the code that implements the formatting doesn’t have to change, except for being refactored to expose its existing functionality in a new way (accept a buffer as an argument instead of writing into its local variable buf, and return as_str to the caller instead of always passing it to f.pad_integral). You’re right that the NumBuffer type needs to be created from scratch, but that type doesn’t do anything complicated. It just needs to contain a sufficiently large array of MaybeUninit<u8> to replace the local variable buf in the formatting implementation.
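As a rough illustration of the shape described above, here is a stable-Rust sketch of a buffer type wrapping an array of MaybeUninit<u8>, plus a free function that formats into it. This `NumBuffer` is a stand-in written for this example, and both `format_u64_into` and the 40-byte size are assumptions, not the actual core::num code.

```rust
use core::mem::MaybeUninit;

/// Stand-in for the proposed type: just an uninitialized byte array.
/// 40 bytes is assumed here because it fits i128::MIN with its sign.
pub struct NumBuffer {
    buf: [MaybeUninit<u8>; 40],
}

impl NumBuffer {
    pub fn new() -> Self {
        NumBuffer { buf: [MaybeUninit::uninit(); 40] }
    }
}

/// Hypothetical free function standing in for `u64::format_into`.
pub fn format_u64_into(mut n: u64, out: &mut NumBuffer) -> &str {
    // Fill the caller's buffer from the end, one digit at a time.
    let mut pos = out.buf.len();
    loop {
        pos -= 1;
        out.buf[pos].write(b'0' + (n % 10) as u8);
        n /= 10;
        if n == 0 {
            break;
        }
    }
    let digits = &out.buf[pos..];
    // SAFETY: every byte in `pos..` was initialized by the loop above with
    // an ASCII digit, and MaybeUninit<u8> has the same layout as u8.
    unsafe {
        let bytes = core::slice::from_raw_parts(digits.as_ptr().cast::<u8>(), digits.len());
        core::str::from_utf8_unchecked(bytes)
    }
}
```

The returned &str borrows from the caller's buffer, which is what lets the caller use the digits without going through a Formatter.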
