-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support concurrent plugin execution through instances #81
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea, but I think it'd better if:
- We keep the plugin struct as is (with the same NewPlugin function we have now; i.e. keep API back-compat)
- As @bhelx has mentioned, maybe we can create a new
PluginModule
orModule
orPrecompiledPlugin
struct that acts as a intermediate state where you can intializePlugins
from quickly. AndNewPlugin
can usePrecompiledPlugin
under the covers - Do we have any benchmarks showing the performance gains of having this intermediate step?
- We will need to sure all of the tests still pass, maybe we need some more tests to ensure the thread safety of the intermediate struct
Another concern for me is that (while this might make sense in a general sense) this changes the general model of interaction with the Extism API, so this change might need to be replicated on all other SDKs for consistency 🤔 |
Some of this design is mutually exclusive with these changes. Under this design, you can't get the benefits of a shared plugin with multiple plugin instances while also preserving the API. What I would propose for preserving the API is moving basically all of these changes to an
You're suggesting that the instance be called
No, mainly because in my opinion the performance gains are obvious in the sense that we're not talking about a small loop optimization or something that may have negligible impact, we're talking about eliminating the need for multiple compilations when you need multiple concurrent instances. For example, if compiling my module takes 80MB of memory (which it actually does, this is a real example) and I need to run 10 instances concurrently, then prior to the changes in this PR, that would be 800MB of allocated memory just for compilation. The vast majority of that goes away when I can compile once and instantiate 10 times instead. We could definitely write some benchmarks to quantify exactly what the benefits are and the scenarios in which you would see those benefits if you feel that would be helpful, but if the question is "does this approach improve performance", I would argue -- definitely yes.
Agreed. I haven't even audited my changes to make sure they're conceptually thread-safe so once we're comfortable with an API, we'll want to do a thorough pass through this in addition to tests to make sure everything checks out.
Hm... yeah I see that all the SDKs are identical in their verbiage and usability. I'm sort of surprised if this use case -- concurrent execution and isolated execution instances -- hasn't been an issue across any of the other SDKs. |
Yeah, this change could help memory usage a lot! It seems worth porting this to the Rust/C SDKs too, but I think the naming is important if we're going to consider that. I agree with @mhmd-azeez that something like |
@zshipko understood on the naming, I'll get that adjusted. What's the direction we want to take on preserving the API? |
It would be nice if we could mostly preserve the current API and just add to it, but that might not be possible. I will experiment a little with the Rust SDK and come up with some more concrete suggestions. |
@zshipko curious if you've had a chance to give this any more thought. Are there areas of the Go SDK we can keep moving forward on in the meantime? |
I just pushed up a rough implementation of this for Rust/C yesterday: extism/extism#784 - I'm still working on some tests and benchmarks. Based on that it seems like we could expand the API with a I think the I'm also open to suggestions if there is something Go-specific that might make more sense. |
@zshipko Hm... I'm not sure how I feel about this. If I create Let me illustrate p1 := NewCompiledPlugin(...)
p2 := NewPluginFromCompiled(p1)
p2.Close(ctx) // should only close p2
p1.Close(ctx) // inadvertantly closes p2 as well I'm in the process of renaming the types in my PR to more closely align with what you're looking for, but I personally feel that this API does not clearly denote the parent/child relationship that needs to exist between compiled modules and instantiated modules. |
So I've been testing the changes in my branch at a few of our production customers and the performance improvements are fantastic -- single compilation + thread safety seems to be working well. I am noticing that there's some very inconsistent and elusive bugs happening specifically around the plugin.ReadBytes function. My running theory at this point is that some of the extism functions (like Edit: moving the extism module into the instance, and compiling a separate one for each plugin instance seems to have resolved the issue. I suspect there are ways to avoid that redundancy, but it would require a fundamental redesign of the approach extism takes here. |
@zshipko Alright, here's the latest update. After doing quite a bit of production testing with some key customers using these changes, we're preparing to release to our entire customer base behind a feature flag for more wide-spread testing. As for an explanation behind my most recent changes:
flowchart TD
subgraph CompiledModule
direction TB
CM_MainWASM["Compiled Main WASM"]
CM_ExtismWASM["Compiled Extism WASM"]
CM_EnvModule["Instantiated Stateless Env Module"]
end
subgraph InstantiatedModule
direction TB
IM_MainWASM["Instantiated Main WASM"]
IM_AnonExtismWASM["Instantiated Extism WASM (Anonymous)"]
IM_EnvModuleRef["Reference to Env Module in CompiledModule"]
end
CM_MainWASM --> IM_MainWASM
CM_ExtismWASM --> IM_AnonExtismWASM
CM_EnvModule --> IM_EnvModuleRef
IM_EnvModuleRef -- "Calls functions in Extism by passing context reference" --> IM_AnonExtismWASM
IM_MainWASM -- "Calls Env Module Functions" --> IM_EnvModuleRef
|
Thanks, this looks great! I am still just concerned about the naming since it should be consistent across all the SDKs, and right now the other SDKs don't have a concept of Compiled vs Instantiated plugins. It seems like renaming It also looks like your base commit is missing some changes from |
@zshipko Could you possibly provide a small pseudo-example of what you have in mind? I'm not sure that I follow exactly how this would work/look. |
It looks like that isn't even needed - but I have a commit here that shows how we would like to preserve most of the existing I also noticed that host functions are moved into |
Got it, this makes sense. I guess my only concern is this leaks memory. Instances of plugins can't close compiled plugins (for good reason if there are multiple instances). I think we can sort of hack around this by passing a closer that the instance that closes the compiled plugin when the instance is closed. We can make that private so that it can't be misused. |
Great point - some way to include that closer is definitely necessary! |
@zshipko okay, I've made those changes. The other thing that feels a little clunky is this. Any ideas? Edit: for context, the reason we need the module config per-instance, not per-compiled plugin is because (for example) the module config is used to specify the stdout/stdin. When running multiple instances you most likely want those outputs to be separate |
Hm, I can't really see anywhere else to put it. I think it's okay to leave that how it is since it's explained in the comment. |
@clarkmcc - If you're done making changes, I can do one more review and work on rebasing this PR |
@zshipko I am done. I was partially through a rebase but one or two unit tests started failing and I haven't had a change to dig in to figure out why. If you have a spare second to attempt a rebase, that would be very much appreciated. |
2d682fc
to
a3a0c61
Compare
just rebased and fixed the failing test - I will do a final review and merge this after discussing with the rest of the team early next week thanks so much for your work on this! |
Awesome, thanks for pulling everything together and for the reviews. Let me know if there's anything else I can do to help on this! |
See Discord thread.
It seems like there should be a
Plugin
type that is thread-safe, and aPluginInstance
type that is not thread-safe. ThePlugin
contains the runtime and the compiled module, and you'd have something likefunc(p *Plugin) NewInstance() *PluginInstance
which would instantiate a wazero module that would then be used in a single-threaded context.The idea here is I can compile my plugin one time, and call functions from the module as many times as I want, concurrently, without having to re-compile.