-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
"Rusty" API is NOT Idiomatic Rust #25
Comments
Rust API guidelines can be found here: |
Hi!
That makes two of us! (me most likely even less so, this was my first rust code).
Indeed. As you can probably tell, it was written by a C programmer!.
Never thought of it like that, smells a bit like C++...
Hmm, given the choice between rusty::init_ctx(...); and uwr_init_ctx(...); I do much prefer the latter (maybe that's just my C brain...).
Right, 'rusty' is really just a more (for some value of more) rust like interface on top of the raw bindings.
Well, it's 'rusty' because it's more rusty than the raw bindings luw_* If we do write an even more rust like interface, it can be called rusted! Thanks for your comments. |
More of an issue is that just looking at the function signatures I'm 99% sure it is unsound. If I were to call |
You think something like (from the rust echo-request example) uwr_write_str!(ctx, "REQUEST_PATH = {}\n", uwr_get_http_path(ctx)); is unsafe? uwr_get_http_path(ctx) simply returns a pointer to a nul-terminated string contained in memory (well, the rust version of that, but that's how it is underneath). It needs ctx so it knows where to look in memory, ctx contains a pointer to the memory in question. I'm not sure why you think ctx is invalid... |
A safe function, invoked by safe code, must not cause undefined behavior, no matter which inputs are passed to that safe function. A violation of this rule is considered unsound. The idea of this rule is to enforce a clear separation between safe, free of undefined behavior, and unsafe code, which if written incorrectly may cause undefined behavior. See https://doc.rust-lang.org/nomicon/safe-unsafe-meaning.html. Rather than working with raw pointers, it would make far more sense for a Rust-based API to take references. A reference in Rust is always guaranteed to be valid, which means safe code can't accidentally pass a broken reference. (Only unsafe code can break this semantic by creating references in some roundabout, errant fashion). |
Seeing as it's just a wrapper around the C API it needs to be 'unsafe' at somepoint AFAICT and to make it otherwise would require a native Rust API. Given various constraints the C API wrapper was the chosen method. |
The key is that you should either:
Currently, your public API claims to be safe, but it has undocumented preconditions required of the users (e.g. that they pass in a |
On Sun, 29 Oct 2023 01:08:33 -0700 Nemo157 ***@***.***> wrote:
The key is that you should either:
* expose the fact that the API is unsafe by declaring it `unsafe fn` (and then describe in the documentation what preconditions are required by users to avoid undefined behavior)
That's workable and provides a nice wee code clean up. Thanks!
* wrap the C API types in Rust types with a safe public API that does not allow users to cause undefined behavior
Where is the undefined behaviour? Or do you _really_ mean UB (where the
compiler is free to do what it likes) or something else?
Currently, your public API claims to be safe, but it has undocumented preconditions required of the users (e.g. that they pass in a `ctx` pointing to a valid instance).
I don't see the issue here, this is a very common idiom...
Besides it's not undocumented, it's quite clear that you pass around an
initialised ctx structure...
|
In my example: unit_wasm::rusty::uwr_get_http_path(0x12345 as *const unit_wasm::rusty::luw_ctx_t) This is safe code constructing a pointer to an invalid
I don't see any documentation on |
That makes no sense. At this point ctx is either initialised or not and it's quite clear that ctx should be previously initialised, even without reading the docs or even knowing the API, surely common sense dictates as much? The fact |
Rust's safety rules do not require common sense of users, they are designed to guarantee memory safety in the face of bugs in safe code. |
Scary... |
I think you are both missing the forest for the trees. Typical applications written in Rust will not interface with raw pointers. They may use Since unit-wasm interfaces with a C API, it necessarily needs unsafe. A pattern you could use for FFI purposes would be to wrap a pointer -- which you internally guarantee to be valid -- in a proper struct. Then API users can pass around references to this struct safely. The API calls might utilize the pointer under the hood, but such details are hidden from the API user, which prevents mistakenly using de-initialized memory. E.g.: use std::ptr;
/// Low-level FFI struct
pub struct luw_ctx_t {
// TODO
}
/// Safe interface for UWR context
pub struct UwrContext {
// Guarantees that fields are fully initialized
inner: luw_ctx_t
}
impl UwrContext {
// Creates a context
pub fn new() -> Self {
use ptr::null_mut;
let mut inner = luw_ctx_t {
addr: null_mut(),
mem: null_mut(),
req: null_mut(),
resp: null_mut(),
resp_hdr: null_mut(),
resp_offset: 0,
req_buf: null_mut(),
hdrp: null_mut(),
reqp: null_mut(),
resp_hdr_idx: -1,
};
unsafe {
// SAFETY
// (Document why safety constraints are upheld)
let ptr = &mut inner as *mut luw_ctx_t;
// I don't know enough low-level C to fill out these details
let addr = todo!();
let offset = todo!();
luw_init_ctx(ptr, addr, offset);
}
Self { inner }
}
pub fn http_send_response(&self) {
unsafe {
// SAFETY
// self.inner is guaranteed to be valid and initialized
let ptr = &self.inner as *const luw_ctx_t;
luw_http_send_response(ptr);
}
}
pub fn http_add_header(&mut self, name: &str, value: &str) {
unsafe {
// SAFETY
// self.inner is guaranteed to be valid and initialized
let ptr = &mut self.inner as *mut luw_ctx_t;
luw_http_add_header(
ptr,
S2C!(name).as_ptr() as *const i8,
S2C!(value).as_ptr() as *const i8,
);
}
}
} The act of wrapping an unsafe interface in a fully safe API with no runtime overhead is Rust's great innovation over existing native languages. In Rust culture, safe code is the norm; unsafety is borne of necessity where the compiler cannot check invariants. Instead, unsafe code will commonly use |
Thanks for sharing! We are aware, that the current implementation of the C-API in Rust is not 100% ready! It was our first implementation of raw FFI bindings and is under objective to be changed over time. The code you have shared is indeed a concept I saw while researching other FFI Implementations. We can use this issue to discuss the necessary changes OR create a new issue for some specific parts and link them together! I like the idea of Rust Wrappers for the low level FFI. |
From discussions on GitHub, people would like to see (besides a more idiomatic Rust API) the Rust functions more explicitly marked as 'unsafe'. Uplift 'unsafe' from the function body into the function prototype. Link: <nginx#25> Signed-off-by: Andrew Clayton <[email protected]>
First, let me say that I am by no means a Rust expert. However, I have noticed several cases while reviewing the Unit WASM "Rusty" API where the code is not very conventional (lit. doesn't follow conventions) of typical Rust code.
Right off the bat, I see the use of the
uwr
namespace in the name of functions, likeuwr_init_ctx
. That makes the fully qualified nameunit_wasm::rusty::uwr_init_ctx
. This is an instance of smurf naming. Generally this would be avoided by importing the module instead of the function, as follows:Otherwise* there are cases where ergonomic constructs could be used (avoiding raw pointers, not requiring writing code that is unsafe, use of higher-level types, etc.). I know that the WASM host-plugin interface is inherently low-level (basically limited to C-language constructs) but I believe the convention for FFI-centered APIs is to split into a low-level crate (usually suffixed
-sys
) interaction with C interfaces and a high-level crate that provides ergonomic and safe abstractions. As I see it, the current wasm_unit "ffi" module performs the lowest-level interactions, but the "rusty" module is still very low-level itself. "rusty" even suffers as a medium-level abstraction because it still requires writing unsafe code to use (e.g. manipulating raw pointers).Thus I believe that the "rusty" API ought not be named "rusty" because the name suggests a more idiomatic interface, or the API should be redesigned to be more high-level and idiomatic, or at least medium-level providing safe abstractions to ease the development of a high-level API.
*there are more cases of naming breaking convention, but I feel it would be pedantic to list them here in full.
The text was updated successfully, but these errors were encountered: