Add support for MLprogram in ort_coreml #116
Conversation
It enables FP16 computation on the ANE, instead of everything being allocated to the CPU. However, MLProgram is not well supported yet and covers far fewer operators than the regular NeuralNetwork format.
Needs onnxruntime >= 1.20.0.
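As an aside, a minimal runtime guard for that version requirement could look like the sketch below. It is not part of this PR, and it assumes the onnxruntime Python package is the build being loaded:

```python
# Minimal sketch: guard against older ONNX Runtime builds, which do not
# expose the MLProgram model format for the Core ML EP.
import onnxruntime as ort

major, minor = map(int, ort.__version__.split(".")[:2])
assert (major, minor) >= (1, 20), \
    "MLProgram in the CoreML execution provider requires onnxruntime >= 1.20.0"
```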
Interesting, and thanks for the information.
Please use snake case and place the ml_program param at the end of the param list.
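Something along these lines, as a sketch only; the surrounding field names are illustrative placeholders rather than the actual vsmlrt.Backend.ORT_COREML definition, with only fp16 and ml_program taken from this PR's usage:

```python
from dataclasses import dataclass

@dataclass
class ORT_COREML:
    # existing options (placeholder names, except fp16 which the test
    # script below uses)
    fp16: bool = False
    num_streams: int = 1  # placeholder
    # new option: snake_case, appended last so existing positional callers
    # keep working
    ml_program: int = 0
```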
Test on M2 Pro: ml_program=1 + fp16=True gives ANE at 120% usage, 10.52 fps.

Script:

import vapoursynth as vs
from vapoursynth import core
import vsmlrt
src = core.lsmas.LWLibavSource('/path/to/source').resize.Spline36(1920//2, 1080//2, format=vs.RGBS, matrix_in_s="709") # same performance if format=RGBH
fin = vsmlrt.Waifu2x(clip=src, noise=-1, scale=2, backend=vsmlrt.Backend.ORT_COREML(ml_program=1, fp16=True), model=vsmlrt.Waifu2xModel.anime_style_art_rgb)  # anime_style_art_rgb uses the simplest ops, almost all of which are supported by the tested backends
fin.set_output(0)
Thanks for your contribution!
ONNX Runtime's Core ML execution provider supports two model formats: NeuralNetwork and MLProgram. NeuralNetwork is the default choice as it supports a wider range of operators, but it does not support FP16 precision (so with an FP16 model, all nodes fall back to the CPUExecutionProvider).
The MLProgram format, while newer and currently supporting fewer operators, does support FP16 and is under active development (recent GitHub PRs suggest that it will mature rapidly, adding tens of new operators). Although it might be slower now due to limited operator support, once it achieves comprehensive coverage, the potential CPU/GPU acceleration through FP16 could make it outperform the NeuralNetwork format.
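For illustration, here is roughly how that choice shows up in ONNX Runtime's Python API. This is a sketch, not how the ort_coreml backend is wired internally, and the provider option names ("ModelFormat", "MLComputeUnits") are assumed from the ORT 1.20+ CoreML EP documentation:

```python
import onnxruntime as ort

# Select the MLProgram model format for the Core ML EP; nodes it cannot
# handle fall back to the CPU EP listed after it.
sess = ort.InferenceSession(
    "model.onnx",  # hypothetical model path
    providers=[
        ("CoreMLExecutionProvider", {
            "ModelFormat": "MLProgram",   # default is "NeuralNetwork"
            "MLComputeUnits": "ALL",      # allow CPU, GPU and ANE
        }),
        "CPUExecutionProvider",
    ],
)
print(sess.get_providers())
```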
In ONNX Runtime, the ONNX model is converted to a Core ML model and saved to disk, which is then loaded via Apple's Core ML framework. By choosing FP16 inputs with the MLProgram format, we can significantly reduce both memory and disk usage, since the Core ML model is stored in the more compact FP16 representation. While the ANE always performs computations in FP16 internally regardless of input precision (making FP16 acceleration unnecessary for the neural engine itself), the storage benefits remain valuable.
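As a rough illustration of producing an FP16 model up front, one possible approach uses the onnxconverter-common package; this is a sketch and not necessarily what fp16=True does internally:

```python
import onnx
from onnxconverter_common import float16

# Convert weights/activations to FP16 ahead of time; the resulting ONNX file,
# and the Core ML model ORT generates from it, are roughly half the size.
model = onnx.load("model_fp32.onnx")                  # hypothetical path
model_fp16 = float16.convert_float_to_float16(model)
onnx.save(model_fp16, "model_fp16.onnx")
```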
Moreover, FP16 inputs may accelerate computations on the GPU and CPU, as both support FP16 (though it is not enabled by default). However, the exact behavior of FP16 handling in ONNX Runtime remains unclear due to its complex execution flow: ORT first decides which nodes to assign to CoreML and uses the CPUExecutionProvider for the rest, and then Core ML further distributes its nodes among CPU, GPU, and ANE.
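One way to inspect the first half of that partitioning is verbose session logging, sketched below; the exact log output depends on the ORT build, and I am assuming verbose logging prints per-node EP assignment:

```python
import onnxruntime as ort

so = ort.SessionOptions()
so.log_severity_level = 0  # verbose: ORT logs which EP each node is assigned to
sess = ort.InferenceSession(
    "model.onnx",  # hypothetical model path
    sess_options=so,
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)
```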
For more details on FP16 behavior, refer to this documentation: 16-bit precision in Core ML on ANE.
MLProgram-relevant PRs: microsoft/onnxruntime#19347, microsoft/onnxruntime#22068, microsoft/onnxruntime#22480, microsoft/onnxruntime#22710, and so on.