diff --git a/rfcs/0014-task-workers.md b/rfcs/0014-task-workers.md
index c4046262d9..333f882657 100644
--- a/rfcs/0014-task-workers.md
+++ b/rfcs/0014-task-workers.md
@@ -15,7 +15,7 @@

 ## Summary

-Concept for a new API provided to UI5 Tooling build tasks, enabling easy use of Node.js [Worker Threads](https://nodejs.org/api/worker_threads.html) to execute CPU intensive operations outside of the main thread.
+Concept for a new API provided to UI5 Tooling tasks, enabling easy use of Node.js [Worker Threads](https://nodejs.org/api/worker_threads.html) to execute CPU intensive operations outside of the main thread.

 ## Motivation

@@ -30,65 +30,128 @@ The pool should also be re-used when multiple projects are being built, either i
 ### Terminology

 * **`Worker`**: A Node.js [Worker thread](https://nodejs.org/api/worker_threads.html) instance
-* **`Build Task`**: A UI5 Tooling build task such as "minify" or "buildThemes" (standard tasks) or any [custom task](https://sap.github.io/ui5-tooling/stable/pages/extensibility/CustomTasks/)
-* **`Task Processor`**: A module associated with a UI5 Tooling Build Task (standard or custom) that can be executed in a `Worker`
-* **`Build Context`**: An already existing ui5-project module, coupled to the lifecycle of a Graph Build. It shall be extended to provide access to the `Work Dispatcher` by forwarding requests from `Build Tasks`
-* **`Thread Runner`**: A ui5-project module that will be loaded in a `Worker`. It handles communication with the main thread and executes a `Task Processor` on request
-* **`Work Dispatcher`**: A ui5-project singleton module which uses a library like [`workerpool`](https://github.com/josdejong/workerpool) to spawn and manage `Worker` instances in order to have them execute any `Task Processor` requested by the Build Task
-    - Handles the `Worker` lifecycle
+* **`Task`**: A UI5 Tooling task such as `minify` or `buildThemes` (both standard tasks) or any [custom task](https://sap.github.io/ui5-tooling/stable/pages/extensibility/CustomTasks/)
+* **`Task Processor`**: A module associated with a UI5 Tooling task (standard or custom) that can be executed in a worker
+* **`Build Context`**: An already existing ui5-project module, coupled to the lifecycle of a Graph Build. It shall be extended to provide access to the Work Dispatcher by forwarding requests from tasks
+* **`Thread Runner`**: A `@ui5/project` module that will be loaded in a worker. It handles communication with the main thread and executes a task processor on request
+* **`Work Dispatcher`**: A `@ui5/project` singleton module which uses a library like [`workerpool`](https://github.com/josdejong/workerpool) to spawn and manage worker instances in order to have them execute any task processor requested by the task
+    - Handles the worker lifecycle

 ![](./resources/0014-task-workers/Overview.png)

 ### Key Design Decisions

-* Task Processors shall be called with a well defined signature as described [below](#task-processor)
-* A Task Processor should not be exposed to Worker-specific API
-    - I.e. it can be executed on the main thread as well as in a Worker
-    - This allows users as well as UI5 Tooling logic to control whether Workers are used or not
-        - For example in CI environments where only one CPU core is available to the build, Workers are expected to produce overhead
-        - Users might want to disable Workers to easily debug issues in Processors
-        - The UI5 Tooling build itself might already be running in a Worker
+* Task processors shall be called with a defined signature as described [below](#task-processor)
+* A task processor should not be exposed to Worker-specific API
+    - i.e. it can be executed on the main thread as well as in a Worker
+    - This allows UI5 Tooling to dynamically decide whether to use Workers or not
+        + For example in CI environments where only one CPU core is available to the build, Workers might cause unnecessary overhead
+        + Users might want to disable Workers to easily debug issues in processors
+        + The UI5 Tooling build itself might already be running in a Worker
 * The Work Dispatcher and Thread Runner modules will handle all inter-process communication
     - This includes serializing and de-serializing `@ui5/fs/Resource` instances
-* Custom tasks can opt into this feature by defining one ore more "Task Processor" modules in its ui5.yaml
-* A Task can only invoke its own Task Processor(s)
+* Custom tasks can opt into this feature by defining one or more task processor modules in their ui5.yaml
+* A task can only invoke its own task processor(s)
+* The work dispatcher or thread runners have no understanding of dependencies between the workloads
+    - Tasks are responsible for waiting on the completion of their processors
+    - Task processors should be executed in a first in, first out order

 ### Assumptions

-* A Task Processor is assumed to utilize a CPU thread by 90-100%
+* A task processor is assumed to utilize a single CPU thread by 90-100%
     - Accordingly they are also assumed to execute little to no I/O operations
-    - This means one Worker should never execute more than one Task Processor at the same time
-* A Task Processor is stateless
+* A Worker should never execute more than one task processor at a time
+* Task processors are generally stateless

 ### Task Processor

-Similar to Tasks, Task Processors shall be invoked with a well defined signature:
+[Processors](https://sap.github.io/ui5-tooling/stable/pages/Builder/#processors) are an established concept in UI5 Tooling but not yet exposed to custom tasks. The basic idea is that tasks act as the glue code that connects a more generic processor to UI5 Tooling. For example, UI5 Tooling processors make use of very little UI5 Tooling API, making them easily re-usable in different environments like plain Node.js scripts.

-* **`resources`**: An array of `@ui5/fs/Resource` provided by the Build Task
-* **`options`**: An object provided by the Build Task
-* **`fs`**: An optional fs-interface provided by the Build Task
-* *[To be discussed] **`workspace`**: An optional workspace __reader__ provided by the Build Task*
-* *[To be discussed] **`dependencies`**: An optional dependencies reader provided by the Build Task*
-* *[To be discussed] **`reader`**: An optional generic reader provided by the Build Task*
+With this RFC, we extend this concept to custom tasks. A task can define one or more processors and execute them with a defined API. Their execution is managed by UI5 Tooling, which might execute them on the main thread or in a worker.
+
+#### Input Parameters
+
+* **`resources`**: An array of `@ui5/fs/Resource` provided by the task
+* **`options`**: An object provided by the task
+* **`fs`**: An optional fs-interface provided by the task
+* **`resourceFactory`**: Specification-version dependent object providing helper functions to create and manage resources.
+    - **`resourceFactory.createResource`**: Creates a `@ui5/fs/Resource` (similar to [TaskUtil#resourceFactory.createResource](https://sap.github.io/ui5-tooling/stable/api/@ui5_project_build_helpers_TaskUtil.html#~resourceFactory))
+    - No other API for now and no general "ProcessorUtil" or similar, since processors should remain as independent of UI5 Tooling as possible
+
+**_Potential future additions:_**
+* _**`workspace`**: An optional workspace __reader__ provided by the task_
+* _**`dependencies`**: An optional dependencies reader provided by the task_
+* _**`reader`**: An optional generic reader provided by the task_
+
+#### Return Values
+
+The allowed return values are rather generic. But since UI5 Tooling needs to serialize and de-serialize the values while transferring them back to the main thread, there are some limitations.
+
+The thread runner shall validate that the **return value is one of the following**:
+1. A value that adheres to the requirements stated in [Serializing Data](#serializing-data)
+2. A flat object (`[undefined, Object].includes(value.constructor)`, to detect `Object.create(null)` and `{}`) with property values adhering to the requirements stated in [Serializing Data](#serializing-data)
+3. An array (`Array.isArray(value)`) with values adhering to the requirements stated in [Serializing Data](#serializing-data)
+
+Note that nested objects and nested arrays shall not be allowed until we become aware of any demand for them.
+
+Processors should be able to return primitives and `@ui5/fs/Resource` instances directly:
+```js
+return createResource({
+    path: "resource/path",
+    string: "content"
+});
+```
+
+It should also be possible to return simple objects with primitive values or `@ui5/fs/Resource` instances:
+
+```js
+return {
+    code: "string",
+    map: "string",
+    counter: 3,
+    someResource: createResource({
+        path: "resource/path",
+        string: "content"
+    }),
+};
+```
+
+Alternatively, processors might also return a list of primitives or `@ui5/fs/Resource` instances:
+
+```js
+return [
+    createResource({
+        path: "resource/path",
+        string: "content"
+    }),
+    createResource({
+        path: "resource/path",
+        string: "content"
+    }),
+    //...
+];
+```
+
+#### Example

 ```js
 /**
  * Task Processor example
  *
  * @param {Object} parameters Parameters
- * @param {@ui5/fs/Resource[]} parameters.resources Array of resources provided by the build task
+ * @param {@ui5/fs/Resource[]} parameters.resources Array of resources provided by the task
  * @param {Object} parameters.options Options provided by the calling task
  * @param {@ui5/fs/fsInterface} parameters.fs [fs interface]{@link module:@ui5/fs/fsInterface}-like class that internally handles communication with the main thread
- * @returns {Promise} Promise resolving with either a flat object containing Resource instances as values, or an array of Resources
+ * @param {@ui5/project/ProcessorResourceFactory} parameters.resourceFactory Helper object providing functions for creating and managing resources
+ * @returns {Promise} Promise resolving with either a flat object containing Resource instances as values, or an array of Resources
  */
-module.exports = function({resources, options, fs}) {
+module.exports = function({resources, options, fs, resourceFactory}) {
     // [...]
 };
 ````

 ### Task Configuration
-
 ```yaml
 specVersion: "3.3"
 kind: extension
@@ -101,9 +164,18 @@ task:
     computePi: lib/tasks/piProcessor.js
 ```

-
 ### Task API

+Tasks defining processors in their `ui5.yaml` configuration shall be provided with a new `processors` object, allowing them to trigger execution of the configured processors.
+
+The `processors.execute` function shall accept the following parameters:
+* `resources` _(optional)_: Array of `@ui5/fs/Resource` instances if required by the processor
+* `options` _(optional)_: An object with configuration for the processor.
+* `reader` _(optional)_: An instance of `@ui5/fs/AbstractReader` which will be used to read resources requested by the task processor. If supplied, the task processor will be provided with a `fs` parameter to read those resources
+
+
+The `execute` function shall validate that `resources` only contains `@ui5/fs/Resource` instances and that `options` adheres to the requirements stated in [Serializing Data](#serializing-data).
+
 ```js
 /**
  * Custom task example
@@ -119,22 +191,29 @@ task:
  */
 module.exports = function({workspace, options, processors}) {
     const res = await processors.execute("computePi", {
-        resources: [workspace.byPath("/already-computed.txt")]
-        options: {
+        resources: [workspace.byPath("/already-computed.txt")], // Input resources
+        options: { // Processor configuration
             digits: 1_000_000_000_000_000_000_000
         },
-        fs: fsInterface(workspace) // To allow reading additional files if necessary
+        reader: workspace // To allow the processor to read additional files if necessary
     });
     await workspace.write(res);
     // [...]
 };
 ````

+### Serializing Data
+
+In order to ensure that all data supplied to and returned from a processor can be serialized correctly, the following checks must be implemented:
+
+For an object, all property values, and for an array, all elements, must be either [**primitives**](https://developer.mozilla.org/en-US/docs/Glossary/Primitive) (except `symbol`?) or **`@ui5/fs/Resource`** instances (do not use `instanceof` checks since Resource instances might differ depending on the specification version).
+
+Note: Instances of `@ui5/fs/Resource` might lose their original `stat` value since it is not fully serializable. Any serializable information will be preserved however.

 ## How we teach this

-**TODO**
+* Documentation for custom task developers on how to decide whether a task should use processors or not, for instance depending on its CPU demand

 ## Drawbacks

diff --git a/rfcs/resources/0014-task-workers.graffle b/rfcs/resources/0014-task-workers.graffle
index ef4b0b3bbe..ab607254ad 100644
Binary files a/rfcs/resources/0014-task-workers.graffle and b/rfcs/resources/0014-task-workers.graffle differ
diff --git a/rfcs/resources/0014-task-workers/Overview.png b/rfcs/resources/0014-task-workers/Overview.png
index abbe2881b7..eeac7fe0cf 100644
Binary files a/rfcs/resources/0014-task-workers/Overview.png and b/rfcs/resources/0014-task-workers/Overview.png differ
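
Review note on the "Return Values" and "Serializing Data" sections of the RFC text in this diff: the three validation rules could be expressed roughly as follows. This is only a sketch and not part of the change. All function names are hypothetical, detecting resources by the presence of `getPath` and `getBuffer` is an assumed stand-in for the "no `instanceof`" requirement, and `symbol` is excluded here even though the RFC still marks that as an open question.

```javascript
// Hypothetical helpers; the RFC does not define these names.

// Assumption: a Resource is recognized by duck typing instead of `instanceof`,
// since Resource classes might differ between specification versions.
function isResourceLike(value) {
    return typeof value === "object" && value !== null &&
        typeof value.getPath === "function" &&
        typeof value.getBuffer === "function";
}

// Assumption: symbols are rejected (the RFC leaves this open).
function isPrimitive(value) {
    return value === null ||
        ["undefined", "boolean", "number", "bigint", "string"].includes(typeof value);
}

// Rule from "Serializing Data": a primitive or a Resource-like object.
function isSerializable(value) {
    return isPrimitive(value) || isResourceLike(value);
}

// Flat object check quoted in the RFC: matches `{}` and `Object.create(null)`.
function isFlatObject(value) {
    return typeof value === "object" && value !== null &&
        [undefined, Object].includes(value.constructor);
}

// Rules 1 to 3 from "Return Values"; nested objects and arrays are rejected.
function isValidReturnValue(value) {
    if (isSerializable(value)) {
        return true;
    }
    if (Array.isArray(value)) {
        return value.every(isSerializable);
    }
    if (isFlatObject(value)) {
        return Object.values(value).every(isSerializable);
    }
    return false;
}
```

With this sketch, `isValidReturnValue({code: "a", counter: 3})` is accepted while `isValidReturnValue({nested: {a: 1}})` and `isValidReturnValue([[1]])` are rejected, which matches the "no nesting" note. The same `isSerializable` check could also serve for validating `options` passed to `processors.execute`.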