Access to the API from C? #2265
Comments
In short, there isn't yet an API, largely because no one has shown significant interest that one be provided. I'm interested to hear more of the use case that you're thinking of. |
I have networking code written in plain old C that is used, well let's say "widely" and it isn't an understatement - let's call it curl. I'm right now researching the feasibility of switching over to using hyper for the HTTP/1 and HTTP/2 handling parts. For this to work, I would need a C based API that's sensible and stable going forward. A plausible first step to ponder on: optionally (as a build-time option) replace nghttp2 with hyper for HTTP/2. I was primarily curious if I had missed an existing such API but I now realize I hadn't. I also understand that you possibly don't want to do this for me and my single (still "theoretical") use case, but maybe I'm not alone. |
Yea, I knew you from curl, just didn't want to assume what you might want to do :) I've thought about exposing a C API a bunch of times, but hadn't had anyone really ask, so I could only guess at interest. I'm happy to help figure this out, though! We could probably spec out some Can I also ask how "interested" you are? Are you just exploring what things look like? Do you have a deadline you'd hope to have this "theoretical" stuff in curl? Do you think you or others would be able to contribute to building hyper C bindings? Knowing these things would help me put together a proposal. 😀 |
I would definitely need my own event loop (libcurl itself is event loop agnostic and allows apps to use what they want). On the level of my interest: it's a bit of a chicken-and-egg problem. If the API is "suitable enough" I'm very interested, as in I'm probably getting funding to work on this project. It's not a done deal though and there's no deadline at all at this point. Right now, without even an API to look at and think about, the project is not feasible at all (as I don't consider myself qualified enough to work effectively on the hyper side of the puzzle). I of course also have challenges of my own on "my side" of things that may or may not put obstacles in the way independently of your project and C API. I'm not trying to suggest that this is all just waiting for a C API as the only bump in the road. I don't even think I yet have a fair grasp of what exactly I would want the API to look like even if I would dream up my ideal API. Early days still. Dipping toes.
I will certainly join in and do my part - if nothing else for entirely selfish reasons so that the bindings end up as good as possible for my intended use case. After all, I have some experience in making and offering C APIs. I haven't actually asked around anywhere so I don't know what level of interest we can hope for from others. |
OK, so I'm working on a draft for a header file ( So far the basic stuff is pretty straightforward: requests, responses, headers, the client. The more complex thing to represent I feel is a In hyper, it's typical that IO is the reason the When looking in curl, there seem to be some similar concepts in the multi API. It looks like you collect together these transfers into a The last wrinkle is that in Rust |
Hi, someone I know directed me to this issue- I've done some rust work in the past and am interested in contributing if you think I could be useful. Just wondering- were y'all considering bindgen at all? |
libcurl provides a "multi" API that performs any amount of parallel transfers within the same thread. It exists in two flavors, one that assumes the use of select() and one that lets the user plug in and use their own event system - the latter for better scaling and performance when moving up in the thousands of simultaneous connections. We have users with libevent, libev, libuv, plain epoll and virtually every other known event-based system. So for curl it is important that support libraries are single-threaded (at least appearing so) and non-blocking. A typical non-blocking behavior for us, and perhaps one that you could mimic with your Futures, is to offer information in an API about how to poll it for completeness (like are we waiting for READ and/or WRITE on a socket) and/or a timeout when it should get polled even if no event has been detected. Then the curl event loop just polls the socket waiting for a trigger event or a timeout, and only if one of those happens it calls the library that then performs whatever it needs to do - and can then update the next "waiting conditions". |
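The "waiting conditions" pattern described in this comment can be sketched in Rust roughly as follows. This is only an illustration under stated assumptions: `Interest`, `WaitFor`, `Step`, `Transfer` and `drive` are hypothetical names, not part of hyper or libcurl, and the "socket" is simulated by a byte budget so the sketch runs without any real I/O.

```rust
use std::time::Duration;

/// What the library wants the application to poll the socket for (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Interest {
    Read,
    Write,
}

/// The "waiting conditions" handed back after each call into the library.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct WaitFor {
    pub interest: Interest,
    /// Call again after this long even if no socket event was seen.
    pub timeout: Duration,
}

/// Result of one non-blocking step.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Step {
    Wait(WaitFor),
    Done,
}

/// A toy transfer: first send `to_send` bytes, then receive `to_recv` bytes.
pub struct Transfer {
    to_send: usize,
    to_recv: usize,
}

impl Transfer {
    pub fn new(to_send: usize, to_recv: usize) -> Self {
        Transfer { to_send, to_recv }
    }

    /// Called by the application's event loop when the socket is ready or the
    /// timeout fired. Moves at most `budget` bytes and never blocks.
    pub fn perform(&mut self, budget: usize) -> Step {
        let n = budget.max(1); // always make progress in this toy model
        if self.to_send > 0 {
            self.to_send -= n.min(self.to_send);
        } else if self.to_recv > 0 {
            self.to_recv -= n.min(self.to_recv);
        }
        // Report the updated waiting conditions to the caller.
        let interest = if self.to_send > 0 {
            Interest::Write
        } else if self.to_recv > 0 {
            Interest::Read
        } else {
            return Step::Done;
        };
        Step::Wait(WaitFor { interest, timeout: Duration::from_secs(30) })
    }
}

/// Stand-in for the application's own event loop: a real one would hand
/// `WaitFor` to select/epoll/libuv and only call `perform` on an event.
pub fn drive(mut transfer: Transfer, budget: usize) -> Vec<Interest> {
    let mut seen = Vec::new();
    loop {
        match transfer.perform(budget) {
            Step::Wait(wait) => seen.push(wait.interest),
            Step::Done => return seen,
        }
    }
}
```

The point of the design is that the library never owns the loop: it only answers "what should I be woken up for next?", which is what lets an event-loop-agnostic caller such as libcurl plug it into whatever event system the application uses.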
I'm not done yet, but I've made some progress and pushed it into this branch here: https://github.com/hyperium/hyper/tree/hyper-capi Some relevant pieces:
It doesn't work yet, but does show how the API mostly looks. There's a |
I like the spirit and the initial direction but it also makes it more clear to me that this is pretty challenging. I'm not sure what the best way is to provide feedback or at what point, so forgive me for just blurting out a few random points/questions here:
negative error values for (unsigned) "size_t", really? That will cause code analyzers/compilers to get unhappy.
hyper_request_set_*: copies the given input string, presumably? Just suggesting it could be clarified in the description.
How to set/change request headers: Is there a default set? How would a client pass on request headers to use in the request?
How do we get (individual) response headers? The example just shows a write callback but that's for the body, right? How would a client get response headers?
For a request body, it reads with the callback until it returns 0?
How do you envision server push to work with this API?
How would trailers work?
If hyper recv/send on the socket, how does it signal connection close? I.e. if the connection is closed, perhaps prematurely, how is that passed to the application here?
How does the client know the transfer is done?
How is HTTPS (TLS really) glued into this?
hyper_request_set_uri: I fear "regular" URL interop issues here if hyper parses a URI that potentially is also parsed separately by the client application - I would prefer setting individual parts to mitigate that risk.
On connections and streams: In a curl world, curl would (most probably) ideally:
|
Thanks for the feedback! I spend a lot less time writing C, so I may not always think of the best way to express concepts in the language.
Yea, I originally started those callbacks with
These APIs are missing from the header file while I investigated the best way to pass and return strings in C. The plan in my head is to provide
The write callback is not just for the body, but the IO write function for all writing. It will also be called when writing headers, or connection frames in HTTP/2. This is because hyper doesn't assume the IO transport is a file descriptor, it could be anything. The same applies to the read callback. It should just read from a socket (or whatever other source). The body will be another pollable thing, similar to a
I haven't thought of that yet. It could possibly be something that is optionally polled out of the response future. Is it a needed feature in the first draft?
Trailers can also be polled out of the "body" type (in Rust), so it'd just be modeling that similar to polling for bytes.
The read/write callbacks do the actual socket IO, so if there was an error, they can return
When the user has the response, and has polled the body until it signals it's finished, then it's done.
That would be handled by the read and write callbacks on
I'm curious to hear more. hyper's
I agree completely. I've so far been modeling hyper's |
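To make the read/write callback model from this reply concrete, here is a minimal sketch of a transport that is driven purely through a C-style callback plus an opaque pointer. The names `WriteCb`, `Io`, `vec_write` and `demo` are hypothetical and chosen for illustration; they are not the functions from the hyper-capi branch. The "transport" is an in-memory `Vec<u8>` that accepts at most 4 bytes per call, to show that headers and body go through the same callback and that nothing assumes a file descriptor.

```rust
use std::os::raw::c_void;

/// Hypothetical callback type: accept up to `len` bytes from `buf` on the
/// transport identified by `userdata`, and return how many were accepted.
pub type WriteCb = extern "C" fn(userdata: *mut c_void, buf: *const u8, len: usize) -> usize;

/// Hypothetical transport handle: the library sees only the callback and an
/// opaque pointer, never a socket or file descriptor.
pub struct Io {
    userdata: *mut c_void,
    write: WriteCb,
}

impl Io {
    pub fn new(userdata: *mut c_void, write: WriteCb) -> Self {
        Io { userdata, write }
    }

    /// Push all of `data` through the callback, tolerating short writes.
    /// Returns false if the callback accepts nothing or reports nonsense.
    pub fn write_all(&mut self, mut data: &[u8]) -> bool {
        while !data.is_empty() {
            let n = (self.write)(self.userdata, data.as_ptr(), data.len());
            if n == 0 || n > data.len() {
                return false;
            }
            data = &data[n..];
        }
        true
    }
}

/// Example callback: the "transport" is a Vec<u8> taking 4 bytes at a time.
pub extern "C" fn vec_write(userdata: *mut c_void, buf: *const u8, len: usize) -> usize {
    // Safety: `demo` below passes a valid `*mut Vec<u8>` as userdata and a
    // valid `buf`/`len` pair that outlives this call.
    let sink = unsafe { &mut *(userdata as *mut Vec<u8>) };
    let n = len.min(4);
    let chunk = unsafe { std::slice::from_raw_parts(buf, n) };
    sink.extend_from_slice(chunk);
    n
}

/// Writes a request head and a body through the same callback.
pub fn demo() -> Vec<u8> {
    let mut sink: Vec<u8> = Vec::new();
    let mut io = Io::new(&mut sink as *mut Vec<u8> as *mut c_void, vec_write);
    let ok = io.write_all(b"POST / HTTP/1.1\r\ncontent-length: 2\r\n\r\n") && io.write_all(b"hi");
    assert!(ok);
    sink
}
```

Because the callback owns the actual I/O, a caller could just as well wrap a TLS session or a test buffer behind the same pointer, which is how the reply suggests HTTPS would be layered in.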
It's not a problem there because libcurl can never deliver that many bytes in a single callback invocation. |
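One way to reconcile an unsigned `size_t` return type with error signalling, in the spirit of this remark, is to reserve values that can never be a legitimate byte count. The sketch below is an assumption about how that could look, not hyper's or libcurl's actual definitions: `IO_PENDING`, `IO_ERROR`, `IoOutcome` and `decode` are hypothetical names.

```rust
/// Hypothetical sentinels at the very top of the `usize` range. No real
/// buffer is that large, so they cannot collide with a valid byte count.
pub const IO_PENDING: usize = usize::MAX;
pub const IO_ERROR: usize = usize::MAX - 1;

/// Decoded meaning of a callback's return value.
#[derive(Debug, PartialEq, Eq)]
pub enum IoOutcome {
    Bytes(usize),
    Pending,
    Error,
}

/// Interpret the value a read/write callback returned, given the size of the
/// buffer it was handed.
pub fn decode(ret: usize, buf_len: usize) -> IoOutcome {
    match ret {
        IO_PENDING => IoOutcome::Pending,
        IO_ERROR => IoOutcome::Error,
        n if n <= buf_len => IoOutcome::Bytes(n),
        // Claiming more bytes than the buffer holds is treated as an error.
        _ => IoOutcome::Error,
    }
}
```

This keeps the return type unsigned, so no negative value is ever stored in a `size_t`, while still leaving room for "would block" and "failed" results.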
URLs and URIs are seriously underspecified today to the degree that you never really know how a parser will deal with the "URL" you give it. See URL-interop for details. Therefore, there's a security problem waiting to happen when we mix multiple parser implementations. (Famously discussed in Orange Tsai's 2017 presentation: Exploiting Url Parsers) |
To aid with reviewing of the design, I've turned the branch into a pull request (#2278). I hope to make it more usable over the next couple weeks. |
Some of my own comments:
I would love to see Rust used in libcurl, but I also don’t want it to be a regression for libcurl’s users. If hyper could recover from OOM and have code size comparable to the C code it replaces, I would consider that a major advance for both hyper and the Rust ecosystem in general. |
I had no idea. That's a horrible way to deal with problems and not the decision to make for system libraries. Is that what hyper does too? If so, it'll be a concern.
I completely and wholeheartedly agree, and in fact you can consider me (as lead developer in the curl project) signed up on the mission to see us move on both these items. Put simply: Hyper cannot be a transparent and good choice for libcurl until it can be used in a way that doesn't cause any notable regressions or issues. The job set out here is thus not done until we're there. Regarding code size and curl being used in different environments etc: that is indeed true and we should not presume that all users want or can go with Rust solutions - not even when/if we can make them used completely transparently. curl has been used on over 70 operating systems on 20 different CPU architectures. It is quite simply much more portable and versatile than Rust is. The idea from my end is to make curl optionally use Hyper as a backend (selected at build-time). |
It's the default behaviour, but you can override it, this would just need to be exposed somehow in the hyper C API. The API is not currently stable, so you have to use a nightly Rust compiler. |
Sure, if you think that's a valuable feature to expose. I would be fine with just making the C API always return error on errors... |
Sorry, maybe exposed was the wrong word. It totally could be just an error when you’re using the API. I more meant that there would need to be a function to set the override hook, see the below example.
#![feature(alloc_error_hook)]
use std::panic::catch_unwind;

// Replace the default abort-on-OOM hook with one that panics, so the
// failure can be caught at the FFI boundary (nightly-only API).
#[no_mangle]
pub unsafe extern "C" fn unset_alloc_abort() {
    std::alloc::set_alloc_error_hook(|_| panic!("alloc error"));
}

#[repr(C)]
pub struct Foo;

// Returns NULL instead of unwinding into C if the allocation panics.
#[no_mangle]
pub unsafe extern "C" fn new_foo() -> *mut Foo {
    catch_unwind(|| {
        // Allocations
        Box::into_raw(Box::new(Foo))
    }).unwrap_or(std::ptr::null_mut())
} |
Catching panics is not only nice for OOM error handling, but it is necessary anyway as panicking across FFI boundaries is UB. |
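As a stable-Rust illustration of that point, the sketch below wraps the body of an exported function in `std::panic::catch_unwind` and turns any panic into an error code, so nothing unwinds into the C caller. `ERR_PANIC`, `guard` and `demo_lookup` are hypothetical names used only here, and a real export would additionally carry the no-mangle attribute shown in the earlier example.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

/// Hypothetical error code reported to C when the Rust side panicked.
pub const ERR_PANIC: i32 = -1;

/// Run `f`, converting a panic into `ERR_PANIC` instead of letting it unwind.
fn guard<F: FnOnce() -> i32>(f: F) -> i32 {
    // AssertUnwindSafe: the closure's captured state is not used again after
    // a panic, so observing broken invariants is not a concern in this sketch.
    catch_unwind(AssertUnwindSafe(f)).unwrap_or(ERR_PANIC)
}

/// Example entry point with a C ABI. Indexing out of bounds panics in Rust;
/// the guard turns that into an ordinary error return.
pub extern "C" fn demo_lookup(index: usize) -> i32 {
    guard(|| {
        let table = [10, 20, 30];
        table[index] // panics when index >= 3
    })
}
```

Note that this only works when the crate is built with the default unwinding panic strategy; with `panic = "abort"` there is nothing to catch and the process terminates.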
The other more insane version could be to longjump out of the error hook. |
you still want the library to not leak memory or any other resources, even in the OOM situation, so simply "jumping" out of it is rarely the best idea - said without having a clue about the Hyper internals. |
Currently Rust standard library doesn't attempt to handle OOM (for instance, |
Personally, I don’t believe we should rely on unwinding to handle OOM errors. The devices which have the greatest need to recover from OOM are also the ones where the code size overhead of unwinding is the greatest concern. I greatly prefer the Zig approach, where any function that takes an allocator is passed one explicitly, and where running out of memory is just another error. Maybe I will write an RFC for that someday. |
I strongly disagree. A system library, any system library, has the obligation to clean up after itself. In all and any circumstances. If a library function call fails, it needs to return an error and it needs to never leak memory whatever error happened. Applications using these libraries should be able to depend on that, and if the application thinks a memory error, or any other serious error, being returned from said library is a show-stopper then the application can decide to abort or exit. It is not the choice to make for the library. It never is. This gets even more emphasized if a library (like Hyper) is used by another library (like libcurl), where this second library thinks of this behavior and "rock solidness" to be something to be proud of and is even pushed as a "selling point" to users, and if the sub-library then cannot deliver on this promise then libcurl cannot either and then: 😞 |
@bagder the comment you replied to did not advocate for the library handling OOMs itself (by exiting, aborting or whatever), just for catching OOMs without unwinding. This would require use of data types with built-in fallible allocation, which is not yet the case for Rust's |
@jplatte Thanks for clarifying that for me! (reading this again, I see it might come off as an ironic comment but it isn't, as said before I'm a Rust cluebie and I appreciate getting told about facts I seem to have gotten wrong. 👍 from me). |
@bagder No offence taken. Your comment is one of the best arguments I have seen as to why OOM should not be a fatal error, even though much code and even some system libraries (glib, macOS’s Cocoa, librsvg, cmark, gumbo) consider it to be such. Ironically, I believe that Rust, due to its memory safety and automatic resource cleanup, is in a better position than C to recover from OOM errors, once libstd is updated to provide fallible allocation. |
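For readers wondering what "fallible allocation" looks like in practice: some of it did later reach stable Rust, after this discussion took place, in the form of `Vec::try_reserve` and `Vec::try_reserve_exact` (stable since Rust 1.57). The sketch below uses those to report an allocation failure as an ordinary `Result` with no unwinding involved. The helper names `copy_fallible` and `reserve_too_much` are hypothetical, and this says nothing about how hyper itself allocates.

```rust
use std::collections::TryReserveError;

/// Copy `src` into a new buffer, reporting allocation failure as an error
/// value instead of aborting the process.
pub fn copy_fallible(src: &[u8]) -> Result<Vec<u8>, TryReserveError> {
    let mut out = Vec::new();
    out.try_reserve_exact(src.len())?;
    // Capacity is already reserved, so this cannot trigger a reallocation.
    out.extend_from_slice(src);
    Ok(out)
}

/// Ask for an impossible amount of memory and report whether the request
/// failed cleanly (it does: the capacity exceeds what a Vec may hold).
pub fn reserve_too_much() -> bool {
    let mut v: Vec<u8> = Vec::new();
    v.try_reserve(usize::MAX).is_err()
}
```

A C binding built on this style could map the `Err` case directly to an out-of-memory error code, which is the behaviour asked for earlier in the thread.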
I've just tried some experiments, and it indeed works just fine to do some
I think it needs to be part of hyper's C API, since it requires setting a global (Rust's alloc error hook), and it'd be problematic if different libraries all try to set that automatically. |
I think we really need a Rust RFC that makes the default hook panic. |
Unwinding in OOM is undefined behavior, as |
That attribute is going away together with the "panic" default hook: rust-lang/rust#76448. |
With the merge of #2278, I've opened some new issues that are more fine grained (in the A-ffi label), and so I think this can be closed as "done". |
If a C program would like to use hyper for h1/h2, is there a C API offered?