Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

node-api: run Node-API finalizers in GC second pass #42515

Closed
wants to merge 1 commit into from

Conversation

vmoroz
Copy link
Member

@vmoroz vmoroz commented Mar 29, 2022

The issue

Currently Reference finalizers are run inside of SetImmediate.
In case if user code creates a lot of native objects in the main script, it could cause a significant memory pressure, even if the objects are properly released. This is because they are "collected" only inside of SetImmediate that follows the script run.
See issue: nodejs/node-addon-api#1140

In the a74a6e3 commit the processing of finalizers was moved from the GC second pass to the SetImmediate because finalizers may throw JavaScript exceptions which can affect behavior of other functions.
If the JavaScript exception is thrown inside of SetImmediate, then it causes an unhandled exception which can be handled process-wide with help of process.on('uncaughtException', ...) event handler.

The solution

In this PR we are switching back to processing finalizers in the GC second pass which may happen any time after GC calls.
To address the issue of handling JS exceptions from finalizers we do the following in this PR:

  • By default, all JS exceptions from finalizers cause the Node.js unhandled exceptions. We apply the same error handling mechanism as used by immediate tasks created by SetImmediate. Thus, we address the previous issue with the finalizer JS errors interrupting other functions, and align the error handling behavior with SetImmediate.
  • In addition to it, we are adding new node_api_set_finalizer_error_handler public API to setup an error handler per napi_env instance which is created per each native module. Each native module can handle its finalizer JS errors on its own.

Tests

New test_finalizer_exception test is added to js-native-api to verify the new behavior.
All other js-native-api, node-api, and cctests are passing.

Documentation

The n-api.md documentation is updated with the new node_api_set_finalizer_error_handler public API.
See the n-api.md for the details about the node_api_set_finalizer_error_handler function.

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/gyp
  • @nodejs/node-api

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. labels Mar 29, 2022
Copy link
Member

@addaleax addaleax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to it, we are adding new node_api_set_finalizer_error_handler public API to setup an error handler per napi_env instance which is created per each native module. Each native module can handle its finalizer JS errors on its own.

Why? This seems like a bad idea. Addons that need to handle errors should explicitly check for exceptions rather than installing semi-global handlers for that.

@addaleax
Copy link
Member

In the a74a6e3 commit the processing of finalizers was moved from the GC second pass to the SetImmediate because finalizers may throw JavaScript exceptions which can affect behavior of other functions.

This is only half the truth – the significantly bigger issue is that the finalizers may run at any point during other synchronous JavaScript execution if GC happens to kick in.

@legendecas legendecas added the node-api Issues and PRs related to the Node-API. label Mar 30, 2022
@legendecas
Copy link
Member

FWIW, in most cases, the second pass callbacks are called in the platform idle tasks queue. However, there are cases where the second pass callbacks are invoked synchronously :

  1. reached isolate external memory limit, or
  2. with Isolate::RequestGarbageCollectionForTesting (or the equivalent one global.gc())

I'd agree that allowing arbitrary JavaScript to run during second callbacks is a bad idea. I'm wondering if there are chances that we can call into addons with disallowing JavaScript execution to give addons a chance to release external memory, especially when the program reached the isolate's external memory limit?

@vmoroz
Copy link
Member Author

vmoroz commented Mar 30, 2022

In addition to it, we are adding new node_api_set_finalizer_error_handler public API to setup an error handler per napi_env instance which is created per each native module. Each native module can handle its finalizer JS errors on its own.

Why? This seems like a bad idea. Addons that need to handle errors should explicitly check for exceptions rather than installing semi-global handlers for that.

It is a "safety net" for developers who work on a big native addons.

When a big team works on an addon, they individually may not know all aspects of the addon. Having an error handler per addon may help the team to weed out any accidental issues that they introduced. Otherwise, an unhandled error will cause Node.js to crash globally, and someone else in production environment must do the extra work to attribute that crash to the specific addon and notify the team.

@devsnek
Copy link
Member

devsnek commented Mar 30, 2022

i'm also not a fan of this approach. it sounds like your issue is less about finalizer errors and more about review culture on these teams.

@vmoroz
Copy link
Member Author

vmoroz commented Mar 30, 2022

In the a74a6e3 commit the processing of finalizers was moved from the GC second pass to the SetImmediate because finalizers may throw JavaScript exceptions which can affect behavior of other functions.

This is only half the truth – the significantly bigger issue is that the finalizers may run at any point during other synchronous JavaScript execution if GC happens to kick in.

This is the main goal of this PR. Without running the finalizers as a part of the GC we cannot free the native resources on-time and in some scenarios it causes a pseudo memory leak (see nodejs/node-addon-api#1140).

Is your concern of running JS code as a part of finalizers as @legendecas described above?
If we force the finalizers that run as part of GC to not run JS-related code, then would you agree with such approach?

@vmoroz
Copy link
Member Author

vmoroz commented Mar 30, 2022

i'm also not a fan of this approach. it sounds like your issue is less about finalizer errors and more about review culture on these teams.

Agree. Having a better review culture can address some issues.

Though, in real code that may have multiple levels of indirections it is practically impossible to find all places that may accidently cause JS errors thrown, unless we completely prohibit touching JS code in the finalizers.

Note that our docs, samples, tests, or examples never mention or show how to handle JS errors in finalizers.
I doubt that many developers are even aware of that problem.

Also, I do not completely understand why handling unhandled exceptions in the global process-wide level is a good and a practical thing but giving the opportunity to the native addon developers to do the same for their code is a bad practice.
Is it because these JS errors maybe caused by the JS code that the native addon developers do not control?
E.g. the finalizer tries to set a property, but the object happened to be frozen by some other JS code.
In such case the local addon finalizer error handler can hide JS code issue from another module.

@vmoroz
Copy link
Member Author

vmoroz commented Mar 30, 2022

I'd agree that allowing arbitrary JavaScript to run during second callbacks is a bad idea. I'm wondering if there are chances that we can call into addons with disallowing JavaScript execution to give addons a chance to release external memory, especially when the program reached the isolate's external memory limit?

It sounds like we may need to have two classes of finalizers:

  • "pure" finalizers that do not touch JS that can run synchronously by GC second or even the first pass.
  • "traditional" finalizers that may touch JS and which must always be run asynchronously by SetImmediate.

Is it correct?

Copy link
Contributor

@gabrielschulhof gabrielschulhof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the finalizer error handler have to be a napi_value? Why not a C function pointer?

@vmoroz
Copy link
Member Author

vmoroz commented Apr 1, 2022

This PR was discussed today in Node-API meeting. The focus was mostly on whether we must introduce new "pure" finalizers.
The consensus was that we are going to proceed with their implementation. To summarize the work planned:

  • A new PR will be added to implement the "pure" finalizers, while this PR will be rescoped only to the custom finalizer error handlers.
  • Pure finalizers are going to use the same finalizer callback function signature, but the napi_env parameter is going to be NULL. This is to discourage calling other Node-API functions. In future we can address the safety of pure finalizers by different means and may start passing non-null value.
  • Marking all Node-API function as safe when they are called from the pure finalizers seems to be expensive at this point and we are going to address it in future.
  • We are going to create duplicates of all functions that accept finalizer callbacks with node_api_ prefix instead of napi_ prefix. These new functions are going to have a new flags parameter that control the finalizer behavior.
  • Non-pure finalizer are going to be called from the SetImmediate as we do it today.
  • Pure finalizers are going to be called from the GC finalizer call.

@vmoroz
Copy link
Member Author

vmoroz commented Apr 8, 2022

PR #42651 adds the "pure" finalizers.
This PR is going to be modified to keep only finalizer error handlers.

@vmoroz
Copy link
Member Author

vmoroz commented Apr 8, 2022

Why does the finalizer error handler have to be a napi_value? Why not a C function pointer?

Seeing that unhandled exceptions can be handled today from JS code, I thought that making the finalizer error handler to be a JS function will make it follow the same lines.

@vmoroz
Copy link
Member Author

vmoroz commented May 13, 2022

Closing this PR since the idea of having a handler for finalizer errors seems to be unpopular.

@vmoroz vmoroz closed this May 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. lib / src Issues and PRs related to general changes in the lib or src directory. needs-ci PRs that need a full CI run. node-api Issues and PRs related to the Node-API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants