-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Edge Browser Scheduled Job timeout #1752
base: master
Are you sure you want to change the base?
Conversation
7d16ca0
to
8195777
Compare
I would additionally reduce the timeout introduced in #1677 to 10 seconds (+5 seconds from your new timeout) just to make sure that your change here actually covers all failing tests and use cases. If it does then I would expect to see no timeouts hitting 10s, only timeouts after 5s. WDYT? |
Test Results 493 files - 1 493 suites - 1 9m 34s ⏱️ +10s For more details on these errors, see this check. Results for commit 98da9eb. ± Comparison against base commit 081eb9b. This pull request removes 37 tests.
♻️ This comment has been updated with latest results. |
8195777
to
b93ae44
Compare
Alright, I'll add it here. |
04d2500
to
ae4fcac
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to have another thorough look at what may cause Edge initialization to block the UI thread. When trying to test the timeout logic, I came across several places where I can introduce blocking operations that will not be caught by this proposal.
When systematically going through the public methods of Edge, I see that the calls to getWebView*()
and isWebView*Availabile()
are the ones that block in case initialization or any of the subsequently scheduled futures do not finish timely.
Then, taking a look at which calls may potentially block inside those methods, I see these two:
- waitForFutureToFinish()
- future.join()
The latter will be captured for the initialization future by the timeout added in this PR. The former will only partly be resolved, as inside waitForFutureToFinish()
there is a call to processNextOSMessage()
, which spins a loop that may be blocking as well. Thus, I think we need to break that potentially blocking operation as well. Maybe we can simply spin Display.readAndDisplay()
in the loop checking for future completion (inside waitForFutureToFinish()
) instead of having the additional loop inside processNextOSMessage()
?
Then, what about the other callers of processNextOSMessage()
? In particular, createEnvironment()
is called synchronously when creating the first Edge instance. May that one block as well? I also see further calls of that method of which I am not sure if they are correct. We need to check them as well.
webViewFuture.orTimeout(1, TimeUnit.MILLISECONDS).exceptionally(exception -> { | ||
// Throw exception on the Display thread directly to prevent CompletableFuture | ||
// to wrap the exception and throw it silently | ||
browser.getDisplay().execute(() -> SWT.error(SWT.ERROR_UNSPECIFIED, exception, "Edge Browser initialization timed out")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we run into a timeout, the initialization needs to be properly rolled back in terms of the state of Edge
being cleaned up and potentially OS resources to be freed, like in the abortion logic of createControllerInitializationCallback()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have extracted the environment release method and added that here as now we have a webview wrapper which makes sure that we dont have multiple futures dealing with multiple instances of webView but one future initializing everything as a whole. And I have verified, the initialization of webview* happens very quickly after obtaining the webview instance which gives the initialization an atomic behaviour. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally sounds good, but I am sure whether it works like it is implemented now. You will run into the timeout handler and release the environment while the webview initialization is still running asynchronously at the OS. Is that the right thing to do? I would rather expect that to eventually be done when the initialization callback is executed (at some point in time after the callback) or, maybe, after a rather high amount of further waiting time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. It's better to let the other handlers handle the dispose
I have changed the flow of how things are initialized and disposed in case there's a time out. Its a little something bit of extra change but now I can handle exceptions a bit better and the flow also looks clearer to me. Can you have a look?
webViewFuture.orTimeout(1, TimeUnit.MILLISECONDS).exceptionally(exception -> { | ||
// Throw exception on the Display thread directly to prevent CompletableFuture | ||
// to wrap the exception and throw it silently | ||
browser.getDisplay().execute(() -> SWT.error(SWT.ERROR_UNSPECIFIED, exception, "Edge Browser initialization timed out")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of only throwing an error, should we maybe hide the (uninitialized) browser widget and add a label informing the user about the failed initialization instead? Calling something like this inside the execution on display might do the trick:
private void replaceWithErrorLabel() {
browser.setVisible(false);
Label errorLabel = new Label(browser.getParent(), SWT.WRAP);
errorLabel.setForeground(browser.getDisplay().getSystemColor(SWT.COLOR_RED));
errorLabel.setText("Edge browser initialization failed");
errorLabel.setLocation(0, 0);
errorLabel.setSize(browser.getSize());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like his idea. But how about we combine this with also throwing an error even though it doesn't pop up in a dialog. The error is atleast visible in the Error Log of the IDE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also if we don't use this execute method here, it will sill throw the error but the error will be wrapped in CompletionException and not SWTException since, the errors in the CompletableFuture are caught implicitly and wrapped with CompletionException. In case of SWT.error as well, I see the exception to be silent and there's no dialog coming up. What do you prefer? Is that okay to not wrap the exception with SWT.error and let it throw the CompletableFuture CompletionException and just replace the browser with a label?
The image above is for when I dont throw a wrapped SWT exception and let the future handle it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still be in favor of logging some kind of error. Even though the user will primarily see the information in the UI, someone debugging issues may only have a look at the logs, which is why having such an error documented there is reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so replaceBrowser with label and log an error? the error is logged automatically by the future of exception. Do you mean that or should we throw an exception explicitly using, browser.getDisplay().execute(() -> SWT.error(SWT.ERROR_UNSPECIFIED, exception, "Edge Browser initialization timed out"));
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong opinion this, but I see two things to consider:
- If possible, we should avoid throwing a future-specific exception to consumers of SWT/Edge and rather provide proper information inside an SWTException to them.
- We should have a look at the where consumers may face that exception and how that conforms to the existing APIs they are using (usually those of
Browser
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if one is interested in the exception, get()
is better than join()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going with browser.getDisplay().execute(() -> SWT.error(SWT.ERROR_UNSPECIFIED, exception, "Edge Browser initialization timed out"));
since with this we make sure our SWT exception is not caught by CompletableFuture and silenced on timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very strange way to print an exception... and it will require a synchronous call to the SWT thread... why not use display.getErrorHandler ()
directly?!?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree that looks better. But do you think its is good to check whether to use RunTimeHandler or ErrorHandler?
8b86b9b
to
893b4dd
Compare
I evaluated your points and i have implemented everything into a webviewWrapper which is responsible for containing all the webviews and we need just one future (webViewWrapperFuture) now which provides us this wrapper from where all the webview* can be obtained, so now the dependency flow is like this: webViewWrapperFuture -> lastWebViewTask. for every webView*, it will be responsible - in case of a webViewWrapperFuture timeout, everything times out. |
I am wondering if addressing processNextOSMessage is necessary after this, since we are looping until the future is done eventually and it can happen either if the initialization completes or the timeout, so there always exists a path with which future will be done. In terms of temporal logic, this is verified: A F webViewWrapperFuture.isDone() The question about if processNextOSMessage is a blocking anything: what kind of messages are processed in there and what control do we have? Looks like, we are just calling display.readAndDispatch(), which is just processing the messages on the display thread from the OS, if something unrelated to Edge was blocking it here, it could have been blocked anywhere in the platform. And We have all the handlers well defined to handling the Edge messages in the callbacks properly, my concern is about the block:
Other than that, I think we should retest the new implementation and try to see if we still have tests / workspcaes freezing. If no freezes, at least we will have logs for the timeout. |
That sounds good. There are several good proposals in this PR now. Can we please separate into different smaller PRs to ease reviewing and testing? In particular, putting the web views inside a wrapper captured by one future is something that we might do as a simple "cleanup" and preparatory PR for this. |
This code will put the thread to sleep until some OS message is received. If for some reason no message arrives anymore, this will block the whole application. Not sure how likely it is, but I would be in favor of avoiding the risk. As mentioned above, it might be possible to simplify this code anway. |
This commit contributes to refactoring of all the webView* instances into a wrapper class WebViewWrapper for better encapsulation and fault proof initialization using a single future. contributes to eclipse-platform#62 and eclipse-platform#127
893b4dd
to
70f68cd
Compare
70f68cd
to
98da9eb
Compare
Can it be that this already happened? I'm looking at the failing tests and I see this thread-dump:
Maybe because the tests produce less OS messages and therefore the issue with the empty queue is more likely to happen? What I'm thinking is: an environment without a mouse (e.g. the test environment) probably generates way less OS events. |
@fedejeanne |
@@ -316,10 +316,23 @@ ICoreWebView2 initializeWebView(ICoreWebView2Controller controller) { | |||
return webView; | |||
} | |||
|
|||
private CompletableFuture<WebViewWrapper> initializeWebViewFutureWithTimeOut() { | |||
CompletableFuture<WebViewWrapper> webViewWrapperFuture = new CompletableFuture<>(); | |||
webViewWrapperFuture.orTimeout(3, TimeUnit.SECONDS).exceptionally(exception -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
webViewWrapperFuture.orTimeout(3, TimeUnit.SECONDS).exceptionally(exception -> { | |
webViewWrapperFuture.orTimeout(3, TimeUnit.SECONDS).exceptionallyAsync(exception -> {...}, browser.getDisplay()) |
Just a general remark here:The usage of Instead one should probably better use a Task-List design here, so that different task are processed one after the other, this could be for example a (single thread) If Beside that, if a browser control fails to initialize, I think it is fine to just show nothing / something broken / whatever to the user so no need to replace it with something else. |
I think the key is actually: What public API methods really require blocking operations. These methods should not use Lines 111 to 121 in 5d535ce
Lines 136 to 140 in 5d535ce
the wake is important to break out the sleep in a timely way! |
Thanks for the feedback! The proposals make sense and revising the design again might be a good thing. But we should move that discussion into a separate issue. This PR is part of efforts to mitigate the risk of freezes / UI thread blocks by using Edge for the upcoming release. For that reason, we need to improve the current design instead of revising the design, which we may defer to the next development cycle. |
@HeikoKlare as a workaround replace all parts that do a
Then you don't need special timeout handling and won't see UI freeze. |
it looks odd, but I think private <T> void waitForFutureToFinish(CompletableFuture<T> future) {
while(!future.isDone()) {
Display display = Display.getCurrent();
future.handle((nil1, nil2) -> {
if (!display.isDisposed()) {
try {
display.wake();
} catch (SWTException e) {
// ignore then, this can happen due to the async nature between our check for
// disposed and the actual call to wake the display can be disposed
}
}
return null;
});
MSG msg = new MSG();
while (!OS.PeekMessage (msg, 0, 0, 0, OS.PM_NOREMOVE)) {
if (!future.isDone()) {
display.sleep();
}
}
display.readAndDispatch();
}
} would probably prevent blocking (future itself does not use (a)sync exec). |
This PR will probably be superseded by #1776 , right? |
98da9eb
to
3c7da0e
Compare
3c7da0e
to
df7ae8c
Compare
No description provided.