Contextual modules #7199
Comments
deprecating dir views: #6857 In a standalone module, the
Where does the
|
This is also allowed, right?
The top-level module (under
and perhaps dagger-in-dagger call that module's functions which would have access to the |
I like the idea of calling it "context". It's familiar terminology from Docker/Buildkit.
It would probably make sense to default to contextual modules in Standalone modules could be created with
Although I've been advocating for making contextual modules a first-class citizen in Dagger, I'm not sure how I would explain the difference to someone. Starting with the name: what does contextual mean? There is a context for standalone modules after all. When you think about "modules" in the software world in general, there is no such distinction between reusable packages and applications (e.g. Go modules).
So while I like the idea of making the user experience better for this use case, I'm not sure about the name and how to explain the difference to others. |
What happens when I try to |
??? That was fast! 😄 |
Sorry I fat-fingered while reading your insightful comments 😁 |
I'd just like to confirm this suggestion is a start to simplify my "Dagger functioning in a Temporal Activity" use case, where the Temporal Activity is the "wrapping context" needing a
From my lurking in the community, it has been said by the Dagger team a couple of times IIRC that this kind of usage is considered the "niche" way to use Dagger (and why Dagger has gone CLI-centric). However, I also agree with your suggestion and if my question above is confirmed, I applaud this push!!! 👏 😁 Scott |
I just hit this problem with a monorepo at work recently. I really like the idea. And I also have some questions
|
This part is correct:
But this part is not:
A standalone module by definition doesn't have a context. If it tries to access its context, it will get either an error or an empty directory (TBD).
Yes exactly.
I renamed |
Yes, all modules should always be loadable, and therefore installable. See the section "Loading":
So, in the context of your project:
If you remove the root
|
Agreed, I added this to the proposal. Thanks!
I tried to address this in the proposal, by making the comparison to Dockerfiles more explicit. |
I'm guessing you mean the module under
Yes that would be possible, BUT it would not be necessary, because you can simply
Example:
// contextual/.dagger/main.go
// Build foo and bar together
func Build() *Directory {
	fooSource := dag.Foo().Source()
	barSource := dag.Bar().Source()
	// continue integration logic
}

// contextual/foo/.dagger/main.go
func Source() *Directory {
	return dag.client().context(ClientContextOpts{Include: []string{"*.js", "*.mjs", "package.json"}})
}

// contextual/bar/.dagger/main.go
func Source() *Directory {
	return dag.client().context(ClientContextOpts{Include: []string{"*.go", "go.mod", "go.sum"}})
}
This way the include/exclude logic is neatly encapsulated into each module. This greatly improves the experience in large monorepos, because the dependency graph of the monorepo's components can be modeled directly as a dependency graph of dagger modules. |
From #7199 (comment):
Not in Go, but it’s actually pretty common in my experience, when you have a package registry and a manifest file. I see the terms “library” vs “application” more commonly. Libraries are meant for distribution (and thus installing and importing) while applications are meant to be facing users directly. There’s not necessarily a technical limitation that prevents you from distributing an application, or vice versa; it’s just that to distribute you have more requirements because you need the right package metadata or file structure. But there are also all sorts of conventions and best practices when building a library, for example not using upper-bound version constraints.
In our case it’s mostly intent, besides more access to the current “environment”. If it’s a reusable module, you want to be a good citizen in thinking about how others will use it, and not make any assumptions about their environment. An application module, on the other hand, is likely to be supporting a specific code base, so it benefits from having easier access to it, and you don’t care whether it’s reusable or not.
This brings me back to one of the early ideas in:
So I’m wondering if “contextual modules” could have a bit more access and solve a few other issues as well. As for the trust model, we could for example have a limitation that a main module can’t be installed unless it’s from a relative path inside the same git repo. That would allow bringing together application silos in a monorepo (like our own repo with the SDKs), without being able to run a module off the Internet that could be accessing something that it shouldn’t. Not necessarily pushing for this specific idea to solve that, especially not right now (my preference is for another solution). Just food for thought, as contextual modules look like a step in that direction by starting this distinction. |
I love how it makes things simpler, both in removing the need for I wonder if anyone will miss the
This isn’t the only use case for it. Views are useful in standalone modules too: Let’s say you have a function for linting go files so you only want But I’d rather define those patterns in code somehow, or make the view names match
👍
Bikeshedding
Name: “contextual module”
I’m not sure about this name. Since we’re making a distinction here, if we want to expand this module’s capabilities a bit at some point, we may need to change the name again as it’s very tied to the parent directory. To be clear, we already have a “context” directory currently, and it’s accessible, but only via normal OS file system access since
More context for those that don’t know what I mean
Modules have 3 directories:
In runtime modules, the working directory is in an empty
Directory name:
|
Thanks to the feedback here, and a mind-bending discussion on Discord, we are hitting on some interesting ideas. Here is a slightly updated thought experiment:
Comparison: |
Nice! I really want the experience for monorepos to be excellent. I think this could help.
Yes, in my opinion any module should always be importable by any other module (as long as you can reach the files of course).
That is kind of what we had before with the I'm looking for the perfect balance of control and constraints. If we find it, the experience will be 10x better than today. |
What about just "context"?
func Source() *Directory {
	return dag.Context().Directory(".", ContextDirectory{Include: []string{"*.go", "go.mod", "go.sum"}})
}

func Dockerfile() *File {
	return dag.Context().File("Dockerfile")
}
Note that I'm keeping the option open to adding more resources to the context, beyond files and directories :)
|
That's going to end up with a file tree like this for "standalone" modules:
Or with dev/test modules:
It feels kinda weird to be honest.
Also, how do you interact with it?
I'm not sure Also, the contents of From what I can tell there are two questions to be answered:
The answer to the first question seems easier to answer: wherever The second answer depends on the type of module (application or library). Maybe the complexity can be hidden inside Dagger: |
Exactly what I was about to suggest! |
Actually there would no longer be the need for
Maybe the feeling of weirdness is superficial, and will subside if it actually solves a bunch of painful problems for us? Give it a chance :)
This is mostly confusing because we're trying to hold two systems in our heads simultaneously: the system we know, and the one described here. Beginners won't have this problem though. If all they know is the new system, then I think it can be as simple to understand as today, and probably simpler. It's just a few simple rules:
|
I'm willing to give it a chance and fully admit that this was only my first reaction. But the feeling of weirdness is mostly due to this:
Maybe a better analogy would be Git with submodules, but we all hate submodules, so let's not do that. :)
Maybe I wasn't clear: the confusion comes from the fact that both modules are called |
What if standalone modules were really overlays for existing upstream repos? Taking your daggerverse repo as an example:
To enable this, modules could optionally configure a remote source as their context directory, with a new
dagger init gh --context https://github.com/cli/cli
dagger init bats --context https://github.com/bats-core/bats-core
dagger init kafka --context https://github.com/apache/kafka
dagger init checksum # no remote context
With this pattern, every daggerverse repo can basically become a distro :) |
Continuing the list of possible extensions of
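Example of job dispatch with
func Build(source *Directory, arch string) *Container {
	// Build locally when this engine already runs on the requested architecture.
	if dag.Context().Platform().Arch() == arch {
		return actualBuildLogic(source)
	}
	// Otherwise dispatch the same call to a peer engine with that architecture.
	return dag.Context().Peer(PeerOpts{Arch: arch}).Build(source, arch)
}
|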
Would it make sense to split this into two issues? Code organization and context seem like two separate ones. |
Aren't we back to the It feels like these are separate issues. |
I think it would be hard to split 1) contextual modules, and 2) how to manage the context, since "context" doesn't exist without contextual modules. They are part of the same design.
Obviously it's a related problem ("where do the directories go?") but "context" in this design is not the same as "source" in the current design or "root" in the earlier versions. These words are tightly coupled to the overall design they're part of. |
I don't see how |
Ah I see. Yes that's fine. I'll go over my last comments, and either update this proposal or start a new one. Agreed on scope. |
Follow-up after discussing with @kpenfound
|
What happened to the
How about the module source (
Maybe don't generate that function at all for standalone modules? Or would that introduce too much complexity into codegen?
Does this serve a specific purpose? Doing the same thing one way is better IMO. Maybe don't autopublish contextual modules to daggerverse.dev? |
Me too! That's an oversight, I'll update the proposal.
I thought about it, but I think it would create inconsistency. There's already a separate call for getting your source code. In fact that could be moved to
I want to leave the door open to the same module code being usable with or without context. Also it feels weird to change the API depending on a property of your module.
Mostly that if we picked only one way to do it, the more logical one to keep would be the actual path of the module. Which means we would be forced to type Anyway, module loading is definitely an area that needs some bikeshedding.
We would definitely not publish them in the same section. But daggerverse could now show you a fantastic catalog of examples of daggerized projects. Also, there are legitimate use cases for installing a contextual module. Why bother downloading a binary build of kubectl, when I can |
Done
I also updated that section, because expanding the context API beyond the context directory changes the definition of "context". Now every module always has a context: it just may or may not have a non-empty context directory. |
I finally got through most of this discussion. I like where we stand now in terms of evolution of this idea. I think the "context" will make more sense to a lot of folks (myself included). I love the similarities to docker and git.
This is funny, but something worth re-visiting if we are planning on exposing a bunch of contextual constructs to the module. The local environment would also fall in that category. Will definitely have to think through how to put some guardrails to prevent developers from writing context bounded modules that aren't very reproducible in a different context. But then again, that's where standalone modules come into play right? Feels good to have that flexibility. All in all, great idea! |
Plus one to this one. I like the high-level proposal and this solves so many issues I encountered before. I hope this will bring sub-modules the love they deserve. About future extensions, I agree with the scope but especially the I like this madness @shykes 🔥 |
My general thoughts are:
In terms of implementing, if a first step is just:
Then it's very straightforward. The context directory is already available internally, we just chose not to expose it to clients. And the rest is just re-arrangement/configuration bikeshedding. |
✅ I do think it was the right call to not rush to doing it right away. The resulting design will be better, because it will be more cleanly layered, and informed by specific user feedback.
I completely agree. The proposal has
I think it's important to make
That's actually an argument in favor of forbidding non-contextual modules from ever accessing a context directory. So if we add a
See above for why core feature vs. best practice.
✅
Agreed, in the current proposal that configuration is done via the module name (
✅
I have a question about that. @helderco also mentioned that we already have a concept of "context directory" internally. Are you referring to the same thing? And if so - are we sure that internal "context directory" is exactly the same as the user-facing context directory proposed here? If you're talking about the git repository root that contains the module - a directory the current loader needs in order to support local dependencies within a repo - then I see that as different. In this proposal, a monorepo could contain multiple contextual modules, each named |
Thanks Nipuna! This iteration feels right to me too.
I think the key is to shift how we use env variables:
So, in my opinion we have an opportunity to solve the underlying problem that prompts users to ask for |
The current internal "context directory" is different from this proposal, yes.
That's right. The internal "context" is either the repo root or the root directory if not in a repo.
Yes, that's right. They're different concepts but may overlap if Not familiar with how it factors into being able to reference another module in the same monorepo by relative path though, or for a |
My thinking is that we would keep that orthogonal, so relative imports and shared sdk material would work the same. But open to changing that ofc. |
@sipsma just double-checking how you feel about my response to this particular concern of yours? |
Problem
There are two types of Dagger modules: those that are standalone software projects, and those that exist in the context of another software project. Let's call those standalone modules and contextual modules, respectively.
Dagger is designed to support both, which is good, but causes friction for users in some areas.
dagger call MYFUNC --source=.), which is verbose.
source is.
. For contextual modules, it is ./dagger. The current default value is ./dagger, which hurts the experience of creating a standalone module. If we change it to ., it will hurt the experience of creating contextual modules. Either way, the user experience suffers.
Solution
I propose making contextual modules a first-class concept in Dagger. Here's how it would work.
What is a contextual module?
A contextual module is a module that exists in the context of a larger software project, and needs special access to its context directory to perform tasks.
This is similar to Dockerfiles, which usually exist in a context directory, which they can access with operations such as COPY. These Dockerfiles are contextual.
A module that doesn't have a context is called a standalone module.
Conventions for contextual modules
The only difference between a contextual module and a standalone module is its path. The contents of the module directory are always the same.
A contextual module is recognizable by its directory name: .dagger. The module's parent directory (which contains .dagger) is the context.
Creating a contextual module
By default, dagger init will create a contextual module: the current directory is used as context, and .dagger is created to contain the module. Use this when using Dagger to configure your project's CI.
To create a standalone module, call dagger init --standalone. This will initialize the module in the current directory, without creating .dagger. Use this when creating a standalone module that is its own software project.
Loading
Contextual modules can be loaded directly (at their exact path), or indirectly (at the path of their context).
If the context for a module is itself a module, the context wins.
To summarize the loader algorithm:
Installing
Since a contextual module can be loaded (either directly or indirectly), it can also be installed.
Accessing the context directory
Functions in a contextual module can access their context directory with a new core API call:
Example in Go:
Future expansion of context API
In the future, the context() API could be expanded to centralize access to the current execution context.
These are not in scope for this proposal, but here are examples to give a general idea of what could be added later:
- context().service() to connect to network services in the caller's context (this would replace host.service())
- context().module() to replace currentModule()
- context().terminal() to access the caller's terminal
- context().status() for a more advanced status API than "return error or not"? Perhaps a possible bridge to integrating with eg. Github Checks and CI job status
- context().ssh().agent() to get an ssh agent socket
- context().docker().socket() to get a docker engine unix socket
- context().docker().auth() to get docker credentials
- context().aws().auth() to get aws credentials
- context().platform() to get client's platform information
- context().gpu() to access GPUs (insert future webgpu device streaming here)
- context().watch() to watch for changes on the context directory (for running dev environments)
- context().environment() to get environment variables from the context (this is not an endorsement! my objections to doing this are well-known... but listing for completeness)
- context().editor() to open the user's file editor (for IDE integration?)
- context().window() or context.dom() to render a web window to the user (to add GUI capabilities, backstage-style)
- context().cache() for hypothetical future interaction with the caching subsystem (?)
- context().peer() for hypothetical future clustering features: lookup another engine to dispatch jobs to. A peer exposes its own dag object recursively.
FYI @vito @sipsma @jedevc @helderco, lots of speculation and extrapolation in this part, let me know if any of them triggers a positive or negative reaction. ☝️
Do standalone modules have a context?
.dagger directory) have a non-empty context directory.
Deprecations
With native support for contextual modules, some features become redundant and can be deprecated:
- dagger develop --source and dagger init --source
- source field in dagger.json
Status
Request For Bikeshedding :)
cc @sagikazarmark @helderco @vito @sipsma @jedevc @kpenfound @jpadams