
feat: Add AbortSignal support across library components #1193

Draft · sroussey wants to merge 2 commits into main
Conversation

@sroussey (Contributor)

  • Introduce abort_signal parameter to multiple methods and constructors
  • Update file retrieval and loading mechanisms to support request cancellation
  • Add AbortSignal handling in pipelines, models, tokenizers, and utility functions

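On the download side, the core of the change is forwarding the signal into `fetch()`. A minimal sketch, assuming a hypothetical `loadFile` helper and the PR's `abort_signal` option name (this is illustrative wiring, not the library's actual code):

```javascript
// Hypothetical sketch of forwarding an `abort_signal` option into fetch().
// `loadFile` is an illustrative name, not the library's actual API.
async function loadFile(url, options = {}) {
    // fetch() natively supports cancellation via its `signal` option.
    const response = await fetch(url, { signal: options.abort_signal });
    if (!response.ok) {
        throw new Error(`Failed to fetch ${url}: ${response.status}`);
    }
    return new Uint8Array(await response.arrayBuffer());
}
```

If the signal has already fired, `fetch()` rejects with an `AbortError` before any network request is made.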
@sroussey (Contributor, Author) left a comment

@xenova -- thoughts?

@emojiiii (Contributor)

Looks like the same as #1190

@sroussey (Contributor, Author)

> Looks like the same as #1190

I will have a look

@sroussey (Contributor, Author)

@xenova, a few thoughts on what I did:

I added `abort_signal` to `PretrainedOptions`. Most places simply pass the options through, but many rebuild them, so you will see more changes than expected.
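The rebuilding problem can be sketched like this (the option shape and helper name are assumptions, not the library's actual `PretrainedOptions`): every place that reconstructs an options object has to carry `abort_signal` along explicitly, or the signal is silently dropped at that layer.

```javascript
// Hypothetical helper: rebuild an options object while preserving abort_signal.
function rebuildOptions(options, overrides = {}) {
    return {
        ...options,
        ...overrides,
        // The spread preserves abort_signal when `overrides` omits the key,
        // but an explicit `abort_signal: undefined` would clobber it; this
        // line keeps the original signal in that case too.
        abort_signal: overrides.abort_signal ?? options.abort_signal,
    };
}
```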

I also added it to the `ModelTokenizerProcessorConstructorArgs` constructor, and there are edge cases like `_call_text_to_spectrogram`. Not sure I like this, though. :/

So far, this PR only handles downloading models, etc.

Do you know where I should look in ONNX/ORT to cancel generation?

@sroussey (Contributor, Author)

> > Looks like the same as #1190
>
> I will have a look

I can see reasons to have a custom `fetch()`, but I think those are orthogonal to the ability to abort.

But perhaps we could have both.

I want to be able to abort generation as well as downloads, in which case `fetch` alone is not enough.
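"Both" could look something like this sketch, where a user-supplied fetch replacement still receives the signal. Both option names here, `custom_fetch` and `abort_signal`, are hypothetical:

```javascript
// Hypothetical: combine a custom fetch() implementation with abort support.
// Whatever fetch is used, the signal is passed through, so downloads stay
// abortable either way.
async function getFile(url, { custom_fetch = fetch, abort_signal = null } = {}) {
    return custom_fetch(url, { signal: abort_signal });
}
```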

@sroussey (Contributor, Author)

For reference: microsoft/onnxruntime#23703

@sroussey (Contributor, Author)

Also, I made `abort_signal` required in various places... that is to help me find the places I have missed. I will make it optional when done.

- Add optional `abort_signal` parameter to `read_audio()` function
- Update `RawImage.read()` and `RawImage.fromURL()` to support AbortSignal
- Modify `VLChatProcessor` to include `abort_signal` configuration
- Extend image and audio preparation methods in pipelines to pass AbortSignal
@joecal

joecal commented Feb 15, 2025

> > > Looks like the same as #1190
> >
> > I will have a look
>
> I can see reasons to have a custom fetch() but I think those are orthogonal to the ability to abort.
>
> But perhaps we could have both.
>
> I want to be able to abort generation as well as downloads, in which case fetch is not enough.

I've been trying to figure out a way to stop text generation, and so far I've come up with the following:

Something like this `abortOnSignal` helper function to wrap async operations so they can be aborted:

```javascript
/**
 * A helper function to wrap asynchronous operations so they can be stopped/aborted by an AbortSignal.
 * Note: an abort rejects the wrapper promise; it does not cancel the underlying operation itself.
 * @param {Promise<unknown>} promise Any promise/async operation to be aborted.
 * @param {AbortSignal | null | undefined} signal The abort signal from an AbortController that can cancel the wrapped promise/async operation.
 * @returns {Promise<unknown>} Either the param promise or a rejected promise with AbortError if aborted.
 */
export function abortOnSignal(promise, signal) {
    if (!signal) {
        return promise;
    }
    if (signal.aborted) {
        // An 'abort' listener added after the signal has already fired never
        // runs, so the already-aborted case must be handled up front.
        return Promise.reject(new DOMException('Aborted', 'AbortError'));
    }
    return new Promise((resolve, reject) => {
        const abortHandler = () => {
            reject(new DOMException('Aborted', 'AbortError'));
        };
        signal.addEventListener('abort', abortHandler, { once: true });
        promise.then(resolve, reject)
            .finally(() => {
                signal.removeEventListener('abort', abortHandler);
            });
    });
}
```

Then add an abort-signal parameter to the `sessionRun` function in `models.js`, and wrap the `session.run` invocation with `abortOnSignal`, passing the signal through:

```javascript
/**
 * Executes an InferenceSession using the specified inputs.
 * NOTE: `inputs` must contain at least the input names of the model.
 *  - If additional inputs are passed, they will be ignored.
 *  - If inputs are missing, an error will be thrown.
 *
 * @param {Object} session The InferenceSession object to run.
 * @param {Object} inputs An object that maps input names to input tensors.
 * @param {AbortSignal | null | undefined} signal Optional abort signal from an AbortController to cancel the InferenceSession.
 * @returns {Promise<Object>} A Promise that resolves to an object that maps output names to output tensors.
 * @private
 */
async function sessionRun(session, inputs, signal = null) {
    const checkedInputs = validateInputs(session, inputs);
    try {
        // pass the original ort tensor
        const ortFeed = Object.fromEntries(Object.entries(checkedInputs).map(([k, v]) => [k, v.ort_tensor]));
        let output = await abortOnSignal(session.run(ortFeed), signal);
        output = replaceTensors(output);
        return output;
    } catch (e) {
        if (e.name === 'AbortError') {
            // Not a model error: rethrow immediately so the abort isn't
            // reported with the misleading input dump below.
            console.log('Generation aborted by user.');
            throw e;
        }
        // Error messages can be long (nested) and uninformative. For this reason,
        // we apply minor formatting to show the most important information
        const formatted = Object.fromEntries(Object.entries(checkedInputs)
            .map(([k, { type, dims, data }]) => [k, {
                // Extract these properties from the underlying ORT tensor
                type, dims, data,
            }]));

        // This usually occurs when the inputs are of the wrong type.
        console.error(`An error occurred during model execution: "${e}".`);
        console.error('Inputs given to model:', formatted);
        throw e;
    }
}
```

To use it, pass the signal into the generator as an argument:

```javascript
const prompt = "How many letter 'r's are in the word 'strawberry'?";
const controller = new AbortController();
const generator = await pipeline('text-generation', MODEL, {
    dtype: 'q4f16',
    device: 'webgpu',
});
// maxTokens and streamer are defined elsewhere in the caller's code.
const generationArgs = { max_new_tokens: maxTokens, streamer: streamer, signal: controller.signal };
await generator(prompt, generationArgs);
// Later, e.g. from a "Stop" button handler: controller.abort();
```

Here's a little demo of it working (ignore the text-output nonsense; I think my laptop processor is messed up somehow):

[video attachment: 0214.mp4]

It works at stopping the generation, but the way I have it now feels kind of messy, passing the signal around so much.

Take `TextGenerationPipeline`, for example. The signal gets passed in through its `_call` args into `this.model.generate`, then into `this.forward`, then into `this._forward` (which is `decoderForward`), then into `decoderForward`'s invocation of `sessionRun`, and finally into the `abortOnSignal` wrapper around `session.run`.

I'm sure there's a cleaner way to do it. Would love to get your take on this approach.
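Condensed, the threading described above looks something like this; every function here is a stand-in for the real method, just to show how the signal has to pass through each layer:

```javascript
// Stand-ins for the real pipeline/model methods, showing how the signal
// must be threaded through every layer to reach session.run().
async function sessionRunStub(inputs, signal) {
    if (signal?.aborted) throw new DOMException('Aborted', 'AbortError');
    return inputs; // stand-in for the model outputs
}
async function decoderForwardStub(inputs, signal) { return sessionRunStub(inputs, signal); }
async function forwardStub(inputs, signal) { return decoderForwardStub(inputs, signal); }
async function generateStub(inputs, signal) { return forwardStub(inputs, signal); }
async function callStub(inputs, { signal } = {}) { return generateStub(inputs, signal); }
```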

@sroussey (Contributor, Author)

I haven't tried it myself yet, but it is definitely undocumented. I don't see signal or anything in the docs.

https://onnxruntime.ai/docs/api/js/interfaces/InferenceSession.SessionOptions.html#freeDimensionOverrides

@sroussey (Contributor, Author)

sroussey commented Feb 15, 2025

Oh, I see now... you are simulating an abort. The text generation keeps going though.

If you are using progress callbacks, you can just ignore them (after the user hits stop) to the same effect.
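The callback-ignoring approach can be sketched like this (`guardedCallback` is a made-up helper name, not a library API):

```javascript
// Wrap a progress/streamer callback so updates are dropped once the
// user's AbortController has fired.
function guardedCallback(callback, signal) {
    return (...args) => {
        if (signal?.aborted) return; // ignore updates after "stop"
        callback(...args);
    };
}
```

Note this only hides further output; the underlying generation keeps running.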

@joecal

joecal commented Feb 15, 2025

> Oh, I see now... you are simulating an abort. The text generation keeps going though.
>
> If you are using progress callbacks, you can just ignore them (after the user hits stop) to the same effect.

Oh, the text generation actually does stop completely when the abort signal is triggered. I did some testing with Chrome DevTools, and the CPU usage drops to 0% when aborted, showing that the generation process fully terminates.

The way it works is:

  1. When the abort signal triggers, the `abortOnSignal` wrapper rejects the promise wrapping the ONNX `session.run()` with an `AbortError`
  2. This error bubbles up through the generation loop in `generate()`, stopping the entire process
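A tiny self-contained model of that bubbling (`fakeGenerate` is a stand-in for the real generation loop, with one iteration per `sessionRun` step):

```javascript
// Each iteration stands in for one sessionRun() step; once the signal has
// fired, the next check throws and the whole loop unwinds.
async function fakeGenerate(steps, signal) {
    const tokens = [];
    for (let i = 0; i < steps; i++) {
        if (signal?.aborted) throw new DOMException('Aborted', 'AbortError');
        tokens.push(i); // stand-in for one generated token
        await Promise.resolve(); // yield, like awaiting session.run()
    }
    return tokens;
}
```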

But yeah, I'm not super happy about having to pass the signal through so many layers. I was trying to follow the pattern used by the Fetch API, where you need to get the abort signal all the way down to the actual async operation being aborted. It would be cool if, in the future, the `InferenceSession` could support the abort signal internally.

But maybe there's a cleaner way to structure this. For example, we could store the signal at the pipeline level instead of threading it through all those args. Let me know if you have any other ideas for improving it.
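The pipeline-level idea could look roughly like this sketch; the class and field names are made up, not the library's actual types:

```javascript
// Hypothetical sketch of storing the signal at the pipeline level: the
// pipeline keeps the controller's signal, and lower layers read it from
// `this` instead of taking it as a parameter on every call.
class AbortablePipeline {
    constructor(model, { abort_signal = null } = {}) {
        this.model = model;
        this.abort_signal = abort_signal;
    }
    async run(inputs) {
        // Any layer with access to the pipeline can perform this check,
        // so the signal no longer has to be threaded through every arg.
        if (this.abort_signal?.aborted) {
            throw new DOMException('Aborted', 'AbortError');
        }
        return this.model.run(inputs);
    }
}
```

The trade-off is that one signal then applies to the pipeline instance as a whole rather than to an individual call.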

@joecal

joecal commented Feb 17, 2025

> It seems like you're looking for https://github.com/huggingface/transformers.js/blob/main/src/models.js#L1573, you can refer to https://github.com/huggingface/transformers.js-examples/blob/main/deepseek-r1-webgpu/src/worker.js#L114.

I've been using `EosTokenCriteria` and found it works great at stopping text generation at known, specific token outputs. However, when I encountered the funny-looking text-output issue in the video I posted above and then saw these abort-signal PRs, it got me thinking: it would be cool if we could use an abort signal to stop any long-running async task in the entire pipeline, whether the task is downloading with fetch, token generation, or any other time-consuming asynchronous task. What do you all think?
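Building on the stopping-criteria mechanism linked above, an abort signal could plug in as a criteria. This sketch includes a minimal stand-in for the base class so it runs standalone; the real transformers.js `StoppingCriteria` interface may differ:

```javascript
// Minimal stand-in for the library's StoppingCriteria base class, included
// only so this sketch is self-contained.
class StoppingCriteria {
    _call(input_ids, scores) { throw new Error('not implemented'); }
}

// Stops every sequence in the batch once the AbortSignal fires, letting
// the existing stopping-criteria machinery unwind the generation loop.
class SignalStoppingCriteria extends StoppingCriteria {
    constructor(signal) {
        super();
        this.signal = signal;
    }
    _call(input_ids, scores) {
        const stop = this.signal?.aborted ?? false;
        return input_ids.map(() => stop);
    }
}
```

Compared to rejecting mid-`session.run`, this stops cleanly at token boundaries without threading the signal through every layer.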
