
Better IDE integration #3806

Open
DanTup opened this issue Jan 30, 2025 · 41 comments
Assignees
Labels
type-enhancement A request for a change that isn't a bug

Comments

@DanTup
Contributor

DanTup commented Jan 30, 2025

If use of build_runner is expected to increase with macros going away, it could be good to have better integration with IDEs. Currently in VS Code we have some tasks that make it easier to start build_runner, but they're basically just run as shell commands for convenience and have no real interaction with the editor. Presumably if you have multiple projects that use it, you might end up running many processes too.

It could be nice if there were something more like the analysis server... a single process the IDE can spawn that can handle multiple projects in the workspace (workspace in the VS Code sense, not Pub Workspace... since a user could open a folder that has a mix of workspace + isolated pub packages, and a single build_runner process for all of them would probably be better).

I don't use build_runner much so I don't have any current ideas about what else could be better integrated, but maybe this issue would be a good place for others to add some ideas.

(I'll also note that the IDEs are spawning the Dart Tooling Daemon which knows what the open roots are... I don't know to what extent it might make sense to integrate via that... on one hand it would be nice to not be spawning so many separate processes.. but I'm also not sure DTD should turn into some kind of god-service).

@tylandercasper

This, to me, is the dream goal. If we could have a streamlined auto build_runner watch experience and something like "Don't require "part file.*.dart" for files in the same directory with the same base name" (#60018),

then we'd effectively have 99% of the advantages of macros without any of the downsides.

@jakemac53
Contributor

jakemac53 commented Jan 30, 2025

One area I have been exploring is an analyzer plugin, which you would use instead of build_runner (see this branch). I am waiting on some features in the SDK to land to help with some of the more pathological cases here: since the plugin gets invoked on every keystroke, you really want to be able to cancel pending async work.

But, I do think it has a lot of really great advantages even as it exists today:

  • Faster (no dependency tracking of its own, which is where most work goes in build_runner for big projects)
  • Simpler (no phases or ordering, everything just kind of runs when it can, in the background)
  • Can send diagnostics back to the IDE, which is something I really wanted from macros
    • Could do quick fixes to add part 'foo.g.dart'; etc, which should be a good middle ground between not having to write part directives at all and the current state.
  • Automatic (runs as a part of analysis server, no extra tool)

@tylandercasper

yeah, that would be amazing!

I think per-keystroke would be a bit of overkill (CPU intensive), but on save would be great. Diagnostics would be fantastic as well.
The big downside of a full-on analyzer plugin is: what happens if a generator crashes? Right now, build_runner watch just exits and you relaunch it when the issue is fixed, but if the whole analyzer went down it would be painful to restart. (My biggest issue with Macros)

A potential huge win for this would be if a developer could choose between analysis builds and build_runner builds. This would make developing the generator a lot easier since we could use build_runner for breakpoints and variable exploration, and analysis server for generators we're not debugging.

@davidmorgan davidmorgan self-assigned this Jan 31, 2025
@davidmorgan davidmorgan added the type-enhancement A request for a change that isn't a bug label Jan 31, 2025
@davidmorgan
Contributor

Thanks Dan.

Yes, we definitely want better IDE integration for codegen :) so I removed the question mark from the issue name ;)

There's lots of discussion but nothing concrete on this right now, I will share if/when we have something. And, yes, input welcome :)

@davidmorgan davidmorgan changed the title Better IDE integration? Better IDE integration Jan 31, 2025
@jakemac53
Contributor

I think per-keystroke would be a bit of overkill (CPU intensive), but on save would be great.

Yes, I agree per-keystroke isn't really what you want - however, I don't think plugins today can distinguish between the two. That doesn't mean they couldn't in the future, though.

Diagnostics would be fantastic as well. The big downside of a full-on analyzer plugin is: what happens if a generator crashes? Right now, build_runner watch just exits and you relaunch it when the issue is fixed, but if the whole analyzer went down it would be painful to restart. (My biggest issue with Macros)

The whole analyzer wouldn't go down, but the plugin might, so you would have to restart the analyzer to get the plugin working again. In general plugins are quite painful right now to develop, so lots of kinks to work out there, but I think they are pretty nice to use as a consumer.

A potential huge win for this would be if a developer could choose between analysis builds and build_runner builds. This would make developing the generator a lot easier since we could use build_runner for breakpoints and variable exploration, and analysis server for generators we're not debugging.

Yes an option is definitely to just let people choose between the plugin and build_runner. For example for Dart web projects they have to use build_runner for compilation anyways and probably would just stick with that for codegen too.

@davidmorgan
Contributor

How about an analyzer plugin that talks to build_runner?

@jakemac53
Contributor

You might be able to get an analyzer plugin going that talks to the build_daemon - build_runner itself doesn't use build_daemon today, though it really should.

The APIs also really don't match up well for existing plugins - they expect to be able to ask for diagnostics for a file on demand based on the in-memory contents (in the IDE), whereas build_runner operates based only on disk contents.

I think the only point of it would be to surface the diagnostics which would be nice but might not be worth supporting just for that.

@bwilkerson
Member

One of the critical questions here is this: In an ideal world, would our code generation support (whether that's based on build_runner or not) be able to generate code based on unsaved changes in the IDE (independent of the mechanism that triggers code generation), or would it only be able to generate code based on file content saved to disk?

I suspect that we want to support the former, but I don't work with code generators enough to know for sure.

@rrousselGit

How about an analyzer plugin that talks to build_runner?

I could make one if you want.
I have quite a lot of experience with both generators and plugins. That'd be interesting to look into

@tylandercasper

One of the critical questions here is this: In an ideal world, would our code generation support (whether that's based on build_runner or not) be able to generate code based on unsaved changes in the IDE (independent of the mechanism that triggers code generation), or would it only be able to generate code based on file content saved to disk?

I suspect that we want to support the former, but I don't work with code generators enough to know for sure.

Given the fact that even hot reload doesn't work on unsaved changes, I think this is out of scope. Especially with error generation it could even get annoying. The other thing is that if code generators have to write files to disk, then there's a reasonable expectation that they would be based off of the file that's on disk as well.

@rrousselGit

I've got a working POC of an analyzer_plugin that generates code.

Here's a video of an analyzer_plugin that generates a mixin with a comment listing all fields in the object. (It's a dumb generator)
The generator is used by the top-right file, and the generated code is in the bottom right.

You can see that as I modify the left file, the generated code is updated directly (even though the change is unsaved).

It's difficult to tell whether this is a better user experience, though; I only have a small test project.
That said, the update is fairly instantaneous in this limited context.

Screen.Recording.2025-02-07.at.22.54.34.mov

@amrgetment

Actually, IDE support is a brilliant solution for build_runner.
In the IDE, instead of using `build_runner build --build-filter`, I would just click "build" and voilà - it works only for the file I changed.

Image

@davidmorgan
Contributor

I've got a working POC of an analyzer_plugin that generates code.

Here's a video of an analyzer_plugin that generates a mixin with a comment listing all fields in the object. (It's a dumb generator) The generator is used by the top-right file, and the generated code is in the bottom right.

That's definitely an interesting thing to experiment with :)

Does your experiment use the generators directly or via build_runner?

My guess for how this should work in the end--if we go in this direction--is that it should use build_runner, and that work happens out of process. So rather than pulling the build system into the analyzer--which was a problem with macros--it would just call out to the build system.

It's true that plugins get their own process, but I think it's still not okay to put a build in there: builds are allowed to scale to any size, they shouldn't run in a process where they can block things that are supposed to be interactive.

Assume that build_runner performance issues will be solved :) and then it's useful to pull in as you get consistency with other ways of doing builds, and all the generator config for free.

Re: generation on in-memory files or on save, I find it hard to form a strong opinion here because I always save very frequently while editing to trigger dartfmt. Generating too often anyway doesn't make a lot of sense--it will cause churn in the output.

@davidmorgan
Contributor

Actually, IDE support is a brilliant solution for build_runner. In the IDE, instead of using `build_runner build --build-filter`, I would just click "build" and voilà - it works only for the file I changed.

Code generators using the analyzer element model can't actually work one file at a time in any meaningful sense.

They always run on top of fully analyzed code, which always (today) requires reading any transitive deps that changed.

That's the fundamental thing that needs addressing re: fine grained invalidation, which is much harder than a UI issue :) ... see #3810 "more granular invalidation".

@jakemac53
Contributor

It's true that plugins get their own process, but I think it's still not okay to put a build in there: builds are allowed to scale to any size, they shouldn't run in a process where they can block things that are supposed to be interactive.

Analyzer plugins do not generally block anything, they run in a separate process and can provide diagnostics whenever they are ready. This is a key part of why I actually think they are a great solution. It is effectively equivalent to build_runner in this regard.

They always run on top of fully analyzed code, which always (today) requires reading any transitive deps that changed.

This is most certainly true, but the tracking of that invalidation (the slow part today) is handled a lot more efficiently in analyzer than in build_runner. And running the builders is also actually typically quite fast, so it isn't actually a large issue.

@davidmorgan
Contributor

It's true that plugins get their own process, but I think it's still not okay to put a build in there: builds are allowed to scale to any size, they shouldn't run in a process where they can block things that are supposed to be interactive.

Analyzer plugins do not generally block anything, they run in a separate process and can provide diagnostics whenever they are ready. This is a key part of why I actually think they are a great solution. It is effectively equivalent to build_runner in this regard.

They block other plugins; if you make a build plugin that does build work in-process then you have now made all other plugins worse.

But I don't think that's a problem--it should be possible to farm the work out to another process. It would just be triggering builds and feeding back the output.

Maybe analyzer plugins should not be a separate process from the main analyzer: maybe they should just have a hard limit on execution time, to force that they do any work out of process.

@rrousselGit

Considering how memory-expensive the Analyzer is, IMO the fewer processes using the Analyzer we have, the better.
So I think it could make sense to generate code directly from a plugin (or the main analyzer, if possible).

@jakemac53
Contributor

jakemac53 commented Feb 10, 2025

They block other plugins; if you make a build plugin that does build work in-process then you have now made all other plugins worse.

Are you sure? Afaik each one is its own process (which is its own problem, but it does avoid this issue). The next version of plugins is supposed to resolve this though (so they will be all in one process).

But, either way it is pretty trivial to write it in such a way that it yields often to other things wanting to do work.

@bwilkerson
Member

In the legacy plugin support every plugin ran in its own isolate. This solved a lot of problems, but used too much memory because every isolate had its own copy of the analysis results.

In the experimental plugin support all of the plugins run together in a single isolate. The analysis results are duplicated, but only once for all plugins. This helps with the memory use, but does mean that every plugin can impact the performance of all the other plugins.

@davidmorgan
Contributor

The way we use the analyzer in google3 solves all these problems, we should probably do something similar.

The analyzer supports working from summaries on disk, which means it can quickly spin up, do some computation, write the results and exit. And, it also supports "worker mode" for a persistent process for cases where you need to repeatedly do work on the same data. You can dynamically spin up and spin down worker processes based on available resources.

It's very fast and it's easy to tune to the amount of RAM you want to use.

Every machine has multiple cores, we need to get away from single process builds. We definitely should not build a new way of doing builds that is designed as a single process.

@bwilkerson
Member

That sounds fine for the command-line analyzer, but I don't (at least yet) believe that this will work for the analysis server's use case. I don't think we can spin up a new analysis server every time the user wants code completion results, for example. I'm skeptical that it could deserialize the data from disk and compute the results within the required time frame, especially when taking into account the time for the VM to start running the server's code and the initialization process to connect the server to the client.

You and I need to talk.

@davidmorgan
Contributor

That sounds fine for the command-line analyzer, but I don't (at least yet) believe that this will work for the analysis server's use case. I don't think we can spin up a new analysis server every time the user wants code completion results, for example. I'm skeptical that it could deserialize the data from disk and compute the results within the required time frame, especially when taking into account the time for the VM to start running the server's code and the initialization process to connect the server to the client.

You and I need to talk.

I meant as a way to run generators--not necessarily moving any existing functionality.

@rrousselGit

Personally I'd:

  • merge analyzer_plugins with the main Analyzer isolate. This should cut gigabytes of memory.
    Users tend to have 8 GB of RAM used by the Analyzer, and another 8 GB for plugins, even if all plugins run in the same isolate.
    It also makes the Analyzer more responsive. Currently, plugins and the main Analyzer isolate boot sequentially:
    the Analyzer starts, then it spawns plugins, which themselves need to start. That makes plugins available a few seconds after the Analyzer.
  • If we're worried about plugins locking up the Analyzer, I'd have the IDE tell users "the plugin foo is unresponsive".
    This should be more than enough for users to blame the relevant plugin rather than the Analyzer.
  • Code-generators could be an analyzer plugin. This could make generation faster.
    And it has the cool benefit of enabling generators to emit diagnostics and fixes.

I really don't think "a plugin could do an infinite loop" is a concern.
We just need to relay to users that a plugin did something stupid.

That's not really different from, say, a lint rule in the SDK taking 2 seconds to complete.
Analyzer already knows which rules take longer to complete than others.

@davidmorgan
Contributor

I agree with most of that--except, it's okay to have codegen coordinated by an analyzer plugin, but it should not run in an analyzer plugin. There is no build system that would prefer to be forced to run in a single process it does not control :)

@rrousselGit

If running in an analyzer plugin is optional, I don't think that's an issue.

We would still have a CLI to manually run generation. And we'd likely have options to enable/disable generation through plug-in.

Macros were some sort of analyzer plug-ins to begin with.

@davidmorgan
Contributor

If running in an analyzer plugin is optional, I don't think that's an issue.

We would still have a CLI to manually run generation. And we'd likely have options to enable/disable generation through plug-in.

Macros were some sort of analyzer plug-ins to begin with.

One of the problems with macros was exactly this: macros were not fast enough to run "inline" in the analyzer, by which I mean, blocking whatever you do next with the analyzer.

So it's very much the same problem. Whatever is integrated into the analyzer's main loop, has to be fast; builds are not always fast. So it needs to be coordination of the build that's integrated, not the build itself.

@jakemac53
Contributor

I would push back and say I do think running builders inside an analyzer plugin is actually very reasonable. It results in the smallest amount of total work being done.

If you look at the actual work a builder wants to do compared with a typical custom lint, it's actually roughly the same, maybe even less in a lot of cases (builders often don't look at the full AST, just the element model).

So, anywhere it's ok to run plugins it should also be ok to run builders, from a performance perspective.

The one caveat to that is that we don't want to thrash the analyzer too much by emitting files in the middle of analysis - but I have been playing around with a way to cancel work that was previously scheduled and that seems to mostly solve the problem (basically, whenever a plugin is told about some changed files this allows it to stop its previous analysis - or code generation - of those same files immediately).

Also, once you have built all the files in your app then the amount of actual on disk churn is low anyways.
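[Editor's note: the cancellation approach described above could be sketched roughly like this. These are hypothetical types, not the actual plugin API; `runGenerators` is a stand-in for real code generation.]

```dart
// A rough sketch of the cancellation idea: each batch of changed files
// carries a token, and a newer batch cancels the previous one so stale
// generation can stop between files.
class CancelToken {
  bool isCancelled = false;
}

class GeneratorDriver {
  CancelToken? _current;

  /// Returns the files that were actually (re)generated before cancellation.
  List<String> runGenerators(List<String> changedFiles) {
    _current?.isCancelled = true; // newer work supersedes older work
    final token = _current = CancelToken();
    final generated = <String>[];
    for (final file in changedFiles) {
      if (token.isCancelled) break; // stop early if newer changes arrived
      generated.add('$file.g.dart'); // stand-in for real code generation
    }
    return generated;
  }
}
```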

@bwilkerson
Member

merge analyzer_plugins with the main Analyzer isolate. This should cut gigabytes of memory.

If you have ideas for how to do that I'd love to hear about them and what the tradeoffs would be.

Analyzer already knows which rules take longer to complete than others.

We do have the ability to measure the time taken by a lint, but we currently don't measure it in production so we can't use it to report problem lints to the user.

... it's okay to have codegen coordinated by an analyzer plugin ...

I look forward to hearing more about what kind of coordination you have in mind. But I agree with Jake that it isn't necessarily a bad idea to have code generation be running in a plugin.

The one caveat to that is that we don't want to thrash the analyzer too much by emitting files in the middle of analysis - but I have been playing around with a way to cancel work that was previously scheduled and that seems to mostly solve the problem (basically, whenever a plugin is told about some changed files this allows it to stop its previous analysis - or code generation - of those same files immediately).

We had hypothesized a slightly different approach (but haven't yet had time to try it). We thought it might make sense to add a small delay to the code handling watch events. Right now, when a watch event is discovered we immediately start analyzing the potentially affected files. But if we were to wait a short time, resetting the timer every time an event comes in, then we'd presumably do less work overall without a significant additional latency.

Not only would this potentially deal well with builders, it would also presumably improve our behavior around git operations that often change large numbers of files.
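[Editor's note: the delayed-watch-event idea could look roughly like this. This is a sketch, not the server's actual implementation; all names are invented.]

```dart
import 'dart:async';

// Debounce sketch: instead of reacting to every watch event immediately,
// reset a short timer on each event and only hand the accumulated batch
// to analysis once events stop arriving for `delay`.
class WatchDebouncer {
  WatchDebouncer(this.delay, this.onSettled);

  final Duration delay;
  final void Function(Set<String> changedPaths) onSettled;
  final Set<String> _pending = {};
  Timer? _timer;

  void onWatchEvent(String path) {
    _pending.add(path);
    _timer?.cancel(); // restart the quiet-period window on every new event
    _timer = Timer(delay, () {
      final batch = Set<String>.of(_pending);
      _pending.clear();
      onSettled(batch);
    });
  }
}
```

A git checkout that touches hundreds of files would then trigger one analysis pass over the whole batch instead of hundreds of individual ones.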

@davidmorgan
Contributor

So, anywhere it's ok to run plugins it should also be ok to run builders, from a performance perspective.

That doesn't scale. It might be fast for some examples, but it's not always going to be fast.

It does not make sense to design a combined build system and interactive tool that simply becomes unusable when you hit a certain code size / complexity. Build systems must scale.

If there are problems making it work as we want, let's fix those, but the end goal needs to be that builds never block interactive actions.

@DanTup
Contributor Author

DanTup commented Feb 13, 2025

merge analyzer_plugins with the main Analyzer isolate. This should cut gigabytes of memory.

I'd worry that running more code in the main isolate can only worsen responsiveness in the editor. I feel like recently I've seen fewer complaints about memory usage and more of "When I try to save in VS Code, the formatter blocks the save for a long time" (admittedly, I mostly see VS Code issues so that doesn't necessarily mean it's a more common issue). In most cases this is unrelated to the formatter but rather that the format-on-save request is held up by other code tying up the isolate (I think often long analysis caused by large library cycles).

It's unfortunate that the format request is held up this way (it's pretty disruptive in the editor), because either a) it produces no changes in which case the delay was for nothing or b) it produces changes that the editor applies and then triggers the analysis again.

Sometimes I wonder if the solution is more - not fewer - isolates 🙂. The formatting is done on a string (which the formatter then parses) and doesn't require any analysis results/resolution info. Probably it could be done on another isolate and keep Saving* more responsive, but it would also require the messages from the client were processed in an isolate that wasn't tied up by the analysis (I don't know how big that change would be, or how much complexity it would add).

* Some users enable fix-on-save which unfortunately does require a resolved AST and therefore having format on another isolate would likely not change anything for them.
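[Editor's note: the off-isolate formatting idea could be sketched like this. `formatSource` is a placeholder for the real formatter entry point, which also operates on a plain string.]

```dart
import 'dart:isolate';

// Placeholder for the real formatter; it needs only the source text.
String formatSource(String source) => '${source.trim()}\n';

// Because formatting requires no resolution results, it could run via
// Isolate.run without tying up the main analysis isolate.
Future<String> formatOffMainIsolate(String source) =>
    Isolate.run(() => formatSource(source));
```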

@jakemac53
Contributor

That doesn't scale. It might be fast for some examples, but it's not always going to be fast.

I understand where you are coming from with this statement, but if you look across all our tooling, it is almost all whole world and single threaded.

Fundamentally, there is no reason that code generation using this model would not scale at least as well as the rest of the tooling, if it is built directly into the analyzer and uses the same strategies. It is not doing anything fundamentally more expensive.

If we decide that the entire system does not scale, then whatever fix we do should also benefit code generators, if they are built directly into the same tooling.

@davidmorgan
Contributor

davidmorgan commented Feb 13, 2025

That doesn't scale. It might be fast for some examples, but it's not always going to be fast.

I understand where you are coming from with this statement, but if you look across all our tooling, it is almost all whole world and single threaded.

The SDK already does support multiple processes, it supports it for codegen, and we already use it that way in google3.

The single process model is dead--it made sense with the hardware of ten years ago, but the world has moved on. We should not build anything new following that model, and any new work should actively move away from it.

There is simply no way to beat a free 5-10x performance gain from today's consumer hardware.

@jakemac53
Contributor

I don't think there is a whole lot of point in continuing to argue this point right now, but suffice it to say that if you look at our existing tools that do use multiple local cores to compile modules in parallel, they are in fact slower than the single-threaded incremental compiler. The only time our existing modular compilation beats the incremental compiler is when you are sharing work across the compilation of many applications (a common example being tests).

In theory, yes more cores is better, I don't think anybody would argue that. However, it doesn't come for free because there is overhead in terms of doing the delegation of work across cores, and duplication of creating the same representations of the same objects across all the worker threads in order to do that work, and the fact that the actual work being done is often very small, so it ends up being dwarfed by all the other overhead.

This is especially true when our smallest unit of compilation is a library cycle, and most apps end up being one large library cycle; there is really no win to be had there.

@davidmorgan
Contributor

Yes, multi process needs work; but I think that's an argument for working on it, not an argument against working on it.

@jodinathan

So, anywhere it's ok to run plugins it should also be ok to run builders, from a performance perspective.

That doesn't scale. It might be fast for some examples, but it's not always going to be fast.

It does not make sense to design a combined build system and interactive tool that simply becomes unusable when you hit a certain code size / complexity. Build systems must scale.

If there are problems making it work as we want, let's fix those, but the end goal needs to be that builds never block interactive actions.

Maybe building should block everything else?
One of the issues I have with source generation is that the analyzer goes nuts while things are being generated, and I really wish it halted until the build has finished its job.

@davidmorgan
Contributor

Maybe building should block everything else? One of the issues I have with source generation is that the analyzer goes nuts while things are being generated, and I really wish it halted until the build has finished its job.

Yes, for sure, the analyzer needs to coordinate with codegen so it's not surprised by what's going on :)

@tylandercasper

I think it's worth clarifying: from a performance perspective, what exactly are we trying to optimize for?

It sounds like a lot of this is about getting it to perform fast enough that we can run on every keypress without causing any issues, but that's an incredibly high bar (and it seems like a lot of refactoring to achieve).

On the other hand, if we only rebuilt on save, then even if it takes a full second or two extra, nobody's going to care.

Is that a correct assessment, or am I missing something?

@rrousselGit

merge analyzer_plugins with the main Analyzer isolate. This should cut gigabytes of memory.

If you have ideas for how to do that I'd love to hear about them and what the tradeoffs would be.

I'd have the analyzer itself be a plug-in.

Then the analyzer binary would do nothing but spawn plug-ins and let them do the work.

This is the architecture used by build_runner for running generators.

@bwilkerson
Member

So the biggest tradeoffs that I can think of would be

  • an increased startup time because of having to generate a main, and
  • not being able to AOT compile the analysis server.

Anything I missed?

@rrousselGit

We should be able to AOT compile the analysis server anyway.
I expect that the compilation could be cached somewhere, such that if you use the same plugins (or none), you can reuse a previous build.

@bwilkerson
Member

We should be able to AOT compile the analysis server anyway.

Yes. Macros and the old plugin mechanism were the two blockers. Macros are gone and we're working on fixing plugins.

... such that if you use the same plugins (or none), you can reuse a previous build.

The problem is that the set of plugins we need to run can change every time you open a directory containing a package that uses a plugin we hadn't seen before. And every time the version of the analysis server or any of the plugins changes. There's probably some way to do some kind of caching, but that won't completely negate the need to sometimes recompile the server on start-up (or more frequently if we need to restart when the set of plugins grows because of opening a directory).
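[Editor's note: one possible shape for such a cache, purely illustrative; none of these names exist in the server. The idea is to key a compiled server on the server version plus the exact plugin set, so an unchanged combination can reuse a previous AOT build.]

```dart
import 'dart:convert';

// Illustrative only: derive an order-independent cache key from the server
// version and the plugin set; a cache hit means the previously compiled
// server binary can be reused instead of recompiling on start-up.
String serverCacheKey(String serverVersion, Iterable<String> pluginIds) {
  final ids = pluginIds.toList()..sort(); // order-independent key
  return base64Url.encode(utf8.encode('$serverVersion|${ids.join(',')}'));
}
```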
