-
-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: onnx runtime upgrade #478
base: develop
Are you sure you want to change the base?
Conversation
Sure! I will do it for you. |
@kallebysantos I think your inference api branch is ahead of this PR, what do you think? If you don't intend to merge this PR, I think it's probably a good idea to close it. (But if that was a temporary branch for testing, I think it would be fine to leave this PR open.) |
Hi @nyannyacha thank you for reviewing it, you're partially right let me explain:
There's another thing, I would like to support String Tensors cause I will probably need them for inference api - Maybe I could move this task to be included on the next PR while we discuss better about this topic. I found some issues in adding String Tensors into the current code base without use serde |
bfe239b
to
7970f6a
Compare
Sidenote: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
068fb62
to
36ae2ca
Compare
I believe the reason Coveralls reported that this PR's coverage was reduced is probably due to the tests related to s3 fs. |
- `ndarray_linalg`: Since `ort` deps to `ndarray=0.16`, decided to remove this extra dependency. Normalization can be done in a simple small funtion instead of deriving it from an external crate.
using the `rc-9` apis to get the same results from older `rc-2`
Since `ONNX`version has been upgraded, the snapshots need to be recreated
- Since `ndarray_linalg` has been removed we need to manually apply L2 normalization
- Applying `dot product` between two embeddings to ensures that `mean_pool` and `normalize` is working properly
Co-authored-by: Nyannyacha <[email protected]> Co-authored-by: kallebysantos <[email protected]>
36ae2ca
to
0893e43
Compare
Hello @nyannyacha, I did rebase to Deno v1$ ORT_DYLIB_PATH="$(pwd)/temp/libonnxruntime.1.20.dylib" ./scripts/run.sh
event manager running
main function started
2025-02-22T16:02:56.831585Z ERROR ThreadId(32) cpu_timer: CPU timer: not enabled (need Linux)
{
timestamp: "2025-02-22T16:02:56.832Z",
event_type: "Boot",
event: [Object: null prototype] { boot_time: 105 },
metadata: [Object: null prototype] {
service_path: "./examples/ort-rust-backend",
execution_id: "10bb7da6-05c9-4498-8386-cbbf66d4ef21"
}
}
dtype not specified for "model". Using the default dtype (fp32) for this device (auto).
{
timestamp: "2025-02-22T16:03:01.009Z",
event_type: "Shutdown",
event: [Object: null prototype] {
reason: "EarlyDrop",
cpu_time_used: 1313,
memory_used: [Object: null prototype] {
total: 145202195,
heap: 9199016,
external: 136003179,
mem_check_captured: [Object: null prototype] {
current: [Object: null prototype] {
totalHeapSize: 12025856,
totalHeapSizeExecutable: 524288,
totalPhysicalSize: 11206656,
totalAvailableSize: 148545372,
totalGlobalHandlesSize: 24576,
usedGlobalHandlesSize: 20000,
usedHeapSize: 9200480,
mallocedMemory: 540772,
externalMemory: 136003179,
peakMallocedMemory: 8513536
},
exceeded: false
}
}
},
metadata: [Object: null prototype] {
service_path: "./examples/ort-rust-backend",
execution_id: "10bb7da6-05c9-4498-8386-cbbf66d4ef21"
}
} $ curl localhost:9998/ort-rust-backend
{"ort_tensor":{"cpuData":{"0":-0.03076750412583351,"1":0.0007690626662224531, ... Deno v2$ ORT_DYLIB_PATH="$(pwd)/temp/libonnxruntime.1.20.dylib" ./scripts/run.sh
event manager running
main function started
{
deno: "supabase-edge-runtime-0.1.0 (compatible with Deno v2.1.4)",
v8: "11.6.189.12",
typescript: "5.1.6"
}
2025-02-22T15:39:12.856039Z ERROR ThreadId(35) cpu_timer: CPU timer: not enabled (need Linux)
{
timestamp: "2025-02-22T15:39:12.856Z",
event_type: "Boot",
event: [Object: null prototype] { boot_time: 107 },
metadata: [Object: null prototype] {
service_path: "./examples/ort-rust-backend",
execution_id: "a85bfb82-cd8c-4ce6-ac14-4fc5ae207d1e"
}
}
dtype not specified for "model". Using the default dtype (fp32) for this device (auto).
2025-02-22T15:39:15.709103Z ERROR sb-supervisor base::worker::supervisor::strategy_per_worker: memory limit reached for the worker: isolate: a85bfb82-cd8c-4ce6-ac14-4fc5ae207d1e
2025-02-22T15:39:15.776976Z ERROR main base::worker::pool: failed to send request to user worker: request has been cancelled by supervisor
2025-02-22T15:39:15.777106Z ERROR ThreadId(03) ext_workers: user worker failed to respond: request has been cancelled by supervisor
WorkerRequestCancelled: request has been cancelled by supervisor
at async Function.allSettled (<anonymous>)
at async UserWorker.fetch (ext:user_workers/user_workers.js:78:63)
at async callWorker (file:///var/tmp/sb-compile-edge-runtime/edge-runtime/examples/main/index.ts:131:14)
at async respond (ext:runtime/http.js:202:16) {
name: "WorkerRequestCancelled"
}
{
timestamp: "2025-02-22T15:39:16.352Z",
event_type: "Shutdown",
event: [Object: null prototype] {
reason: "Memory",
cpu_time_used: 440,
memory_used: [Object: null prototype] {
total: 151218656,
heap: 15113016,
external: 136105640,
mem_check_captured: [Object: null prototype] {
current: [Object: null prototype] {
totalHeapSize: 25051136,
totalHeapSizeExecutable: 524288,
totalPhysicalSize: 24936448,
totalAvailableSize: 135253480,
totalGlobalHandlesSize: 24576,
usedGlobalHandlesSize: 20640,
usedHeapSize: 21983728,
mallocedMemory: 1867952,
externalMemory: 136817301,
peakMallocedMemory: 8480768
},
exceeded: true
}
}
},
metadata: [Object: null prototype] {
service_path: "./examples/ort-rust-backend",
execution_id: "a85bfb82-cd8c-4ce6-ac14-4fc5ae207d1e"
}
} Same happens if I checkout |
Yeah, when we upgraded to deno 2, the used heap size increased by about 2.4 x (9.2 MiB -> 21.9 MiB). So, it exceeded the 150 MiB memory limit, which was already tight. |
What kind of change does this PR introduce?
Refactor, upgrade
What is the current behavior?
Current the
ort rust backend
is usingort rc-2
&onnx v1.19.2
What is the new behavior?
This PR introduces:
ort
: library upgrade fromrc-2
torc-9
onnx
: support from1.19.2
to1.20.1
ndarray_linalg
: removal of unnecessary library, that last release was about 2y agoTODO:
String
tensors.Upgrade to upcomingLets useort rc-10
rc-9
for now, then I push another PR forrc-10
Need help:
I would like to ask @nyannyacha 💚, if possible, to add updated
tests snapshots
for other platforms as well dok6
tests comparing to the latest version of that.