[Bug]: ov::InferRequest::infer() not thread safe when having multiple models #24509

Closed
3 tasks done
daaguirre opened this issue May 14, 2024 · 8 comments · Fixed by #24562
Labels: bug (Something isn't working) · category: inference (OpenVINO Runtime library - Inference) · support_request

@daaguirre commented May 14, 2024

OpenVINO Version

2023.3.0

Operating System

Ubuntu 20.04 (LTS)

Device used for inference

CPU

Framework

None

Model used

No response

Issue description

When running multiple inferences of multiple models concurrently with OpenVINO 2023.3.0, the call to ov::InferRequest::infer() is not thread safe. In the previous version, 2023.0.0, this method is thread safe.

It's important to point out that to trigger the error, a model must first be run in the parent thread; the crash happens when the parent thread then spawns more threads that run the same or different models concurrently. So if the for loop marked with the comment "comment out below for loop to prevent crash" is removed, the crash does not occur in the attached code snippet.

Note that this crash is random and does not happen on every execution, which is why the provided code snippet is wrapped in a for loop to force the error.

Step-by-step reproduction

The following code snippet crashes with a segfault when linked against 2023.3.0 and works correctly with 2023.0.0.

The error can be reproduced with any generic model:
efficientnet: https://drive.google.com/file/d/1a7AoEi165ZF1dJNfD_OklQVs_QpsHcA2/view?usp=sharing
googlenet: https://drive.google.com/file/d/15lxSLWmM4EUrc63ReGD-Kwj6DlcEl6a-/view?usp=sharing

#include <algorithm>
#include <cstdint>
#include <functional>
#include <iostream>
#include <memory>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

#include <openvino/runtime/core.hpp>
#include <openvino/core/preprocess/pre_post_process.hpp>

void RunInParallel(std::function<void()> func, uint32_t num_threads) {
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (size_t i = 0; i < num_threads; ++i) {
        threads.emplace_back(func);
    }
    for (auto& t : threads) {
        t.join();
    }
}

struct Nnet {
    explicit Nnet(const std::string& model_path) {
        ov::Core core;
        std::shared_ptr<ov::Model> model = core.read_model(model_path);
        ov::element::Type input_type = ov::element::f32;
        // input_shape = model->inputs()[0].get_partial_shape();
        auto partial_shape = model->inputs()[0].get_partial_shape();
        std::vector<size_t> input_shape = {1, static_cast<size_t>(partial_shape[1].get_length()),
                                           static_cast<size_t>(partial_shape[2].get_length()),
                                           static_cast<size_t>(partial_shape[3].get_length())};
        ov::preprocess::PrePostProcessor ppp(model);
        for (const auto& in : model->inputs()) {
            for (const auto& name : in.get_names()) {
                auto& actual_input = ppp.input(name);
                actual_input.tensor()
                        .set_shape(std::vector<int64_t>{input_shape.begin(), input_shape.end()})
                        .set_element_type(input_type)
                        .set_layout("NHWC");
            }
        }
        model = ppp.build();

        ov::AnyMap properties;
        properties.emplace(ov::inference_num_threads(2));
        compiled_model = core.compile_model(model, "CPU", properties);
        input_size = input_shape[0] * input_shape[1] * input_shape[2] * input_shape[3];
    }

    static std::vector<float> randomVector(size_t size) {
        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<> dis(0, 255);
        std::vector<float> vector;
        vector.reserve(size);
        for (size_t i = 0; i < size; ++i) {
            vector.push_back(static_cast<float>(dis(gen)));
        }
        return vector;
    }

    float Run() const {
        std::vector<float> input = randomVector(input_size);

        ov::InferRequest infer_request;

        try {
            infer_request = compiled_model.create_infer_request();
        } catch (std::exception& e) {
            throw std::runtime_error("failed to create infer request: " + std::string(e.what()));
        }

        ov::Tensor input_tensor = infer_request.get_input_tensor();
        std::copy(input.begin(), input.end(), input_tensor.data<float>());
        try {
            infer_request.infer();
        } catch (std::exception& e) {
            throw std::runtime_error("failed to execute inference request: " + std::string(e.what()));
        }
        const ov::Tensor& output_tensor = infer_request.get_output_tensor();
        return output_tensor.data<float>()[0];
    }

    mutable ov::CompiledModel compiled_model;
    size_t input_size;
};

struct Runner {
    explicit Runner(const std::vector<std::string>& models) {
        nnets.reserve(models.size());
        for (const auto& model : models) {
            nnets.push_back(std::make_shared<Nnet>(model));
        }
    }

    void Run() {
        size_t num_threads = nnets.size();
        std::vector<std::thread> threads;
        threads.reserve(num_threads);
        // comment out below for loop to prevent crash
        for (auto nnet : nnets) {
            nnet->Run();
        }
        for (auto nnet : nnets) {
            threads.emplace_back([nnet]() { return nnet->Run(); });
        }
        for (auto& t : threads) {
            t.join();
        }
    }

    std::vector<std::shared_ptr<Nnet>> nnets;
};

int main() {
    std::cout << "Runner:" << std::endl;
    std::vector<std::string> models{
            "efficientnet-lite4-11.onnx",
            "googlenet-9.onnx",
    };
    Runner runner(models);
    for (size_t i = 0; i < 20; ++i) {
        std::cout << "i: " << i << std::endl;
        RunInParallel([&runner]() { runner.Run(); }, 2);
    }

    return 0;
}

Relevant log output

ThreadSanitizer:DEADLYSIGNAL
==22332==ERROR: ThreadSanitizer: SEGV on unknown address (pc 0x7fa049230f21 bp 0x7fa02b0fcd50 sp 0x7fa02b0fcc70 T22457)
==22332==The signal is caused by a READ memory access.
==22332==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
ThreadSanitizer:DEADLYSIGNAL
    #0 <null> <null> (libopenvino.so.2330+0xa30f21) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #1 <null> <null> (libopenvino.so.2330+0xa66164) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #2 <null> <null> (libtbb.so.12+0xe499) (BuildId: e5d9a936de2ae01503a3a5235d75f26ef30831a5)
    #3 ov::threading::CPUStreamsExecutor::execute(std::function<void ()>) <null> (libopenvino.so.2330+0xa6edc2) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #4 <null> <null> (libopenvino.so.2330+0xa2c8db) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #5 ov::IAsyncInferRequest::run_first_stage(__gnu_cxx::__normal_iterator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >*, std::vector<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >, std::allocator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> > > > >, __gnu_cxx::__normal_iterator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >*, std::vector<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >, std::allocator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> > > > >, std::shared_ptr<ov::threading::ITaskExecutor>) <null> (libopenvino.so.2330+0xa308b0) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #6 ov::IAsyncInferRequest::infer() <null> (libopenvino.so.2330+0xa33e4d) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #7 ov::InferRequest::infer() <null> (libopenvino.so.2330+0xac70fb) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #8 Nnet::Run() const /home/daguirre/dev/idlive-doc-sdk/sdk/utils/public/force_ov_crash.cpp:75:27 (force_ov_crash+0xd7e14) (BuildId: a2c49a9f5e43d6bf7a0c37aa7c8daf3ae5289edf)
    #9 Runner::Run()::'lambda'()::operator()() const /home/daguirre/dev/idlive-doc-sdk/sdk/utils/public/force_ov_crash.cpp:103:58 (force_ov_crash+0xd98df) (BuildId: a2c49a9f5e43d6bf7a0c37aa7c8daf3ae5289edf)
    #10 float std::__invoke_impl<float, Runner::Run()::'lambda'()>(std::__invoke_other, Runner::Run()::'lambda'()&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61:14 (force_ov_crash+0xd98df)
    #11 std::__invoke_result<Runner::Run()::'lambda'()>::type std::__invoke<Runner::Run()::'lambda'()>(Runner::Run()::'lambda'()&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96:14 (force_ov_crash+0xd98df)
    #12 float std::thread::_Invoker<std::tuple<Runner::Run()::'lambda'()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:279:13 (force_ov_crash+0xd98df)
    #13 std::thread::_Invoker<std::tuple<Runner::Run()::'lambda'()> >::operator()() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:286:11 (force_ov_crash+0xd98df)
    #14 std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runner::Run()::'lambda'()> > >::_M_run() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:231:13 (force_ov_crash+0xd98df)
    #15 <null> <null> (libstdc++.so.6+0xdc252) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2)
    #16 __tsan_thread_start_func <null> (force_ov_crash+0x51768) (BuildId: a2c49a9f5e43d6bf7a0c37aa7c8daf3ae5289edf)
    #17 start_thread nptl/./nptl/pthread_create.c:442:8 (libc.so.6+0x94ac2) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348)
    #18 <null> misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 (libc.so.6+0x12684f) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348)

ThreadSanitizer can not provide additional info.
SUMMARY: ThreadSanitizer: SEGV (/home/daguirre/dev/idlive-doc-sdk/prebuilt/openvino/openvino-2023.3.0/x64-gcc7.5-glibc2.27/lib/libopenvino.so.2330+0xa30f21) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5) 
==22332==ABORTING

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@riverlijunjie (Contributor) commented:

I can reproduce a similar issue on the master branch with the test code above. The race condition occurs when one thread is calling infer() while another thread is calling create_infer_request(). I have narrowed it down to ov::threading::CPUStreamsExecutor::Impl::CustomThreadLocal::_stream_map and will work on a solution to fix this race condition.
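Since the race described above is between concurrent create_infer_request() and infer(), one possible mitigation while waiting for a fix is to create all infer requests serially, before any worker thread starts inferring. This is a hedged sketch, not a verified fix; the helper name `MakeRequests` and the one-request-per-worker layout are illustrative assumptions, not from this issue:

```cpp
#include <cstddef>
#include <vector>
#include <openvino/runtime/core.hpp>

// Sketch of a possible mitigation (not a guaranteed fix): create one
// InferRequest per worker serially, before any thread calls infer(), so
// create_infer_request() never races with a concurrent infer().
std::vector<ov::InferRequest> MakeRequests(ov::CompiledModel& model,
                                           size_t num_workers) {
    std::vector<ov::InferRequest> requests;
    requests.reserve(num_workers);
    for (size_t i = 0; i < num_workers; ++i) {
        requests.push_back(model.create_infer_request());  // serial creation
    }
    return requests;  // hand exactly one request to each worker thread
}
```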

@daaguirre (Author) commented:

Great, thank you!

@riverlijunjie (Contributor) commented:

@daaguirre could you give this PR a try? #24562

@daaguirre (Author) commented:

Thank you! I'll give it a try tomorrow and let you know.
Additionally, is it possible to use the async API to bypass this error?

@riverlijunjie (Contributor) commented:

> async API

The async API likely faces the same problem.
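For reference, the asynchronous path replaces the blocking infer() with start_async() followed by wait(). A minimal sketch of what the async variant of Nnet::Run()'s inference step could look like (per the reply above, it likely exercises the same stream-executor code path, so this is shown for completeness rather than as a workaround; the function name `RunAsync` is illustrative):

```cpp
#include <algorithm>
#include <vector>
#include <openvino/runtime/core.hpp>

// Async variant of the inference step in Nnet::Run(), using the standard
// ov::InferRequest::start_async()/wait() API instead of the blocking
// infer(). Per the comment above, it likely hits the same race.
float RunAsync(ov::CompiledModel& compiled_model, const std::vector<float>& input) {
    ov::InferRequest request = compiled_model.create_infer_request();
    ov::Tensor input_tensor = request.get_input_tensor();
    std::copy(input.begin(), input.end(), input_tensor.data<float>());
    request.start_async();  // enqueue the inference
    request.wait();         // block until this request completes
    return request.get_output_tensor().data<float>()[0];
}
```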

@daaguirre (Author) commented:

> @daaguirre could you give this PR a try? #24562

Hi! Sorry for the late response. I already tried the PR and it is working now, thank you!

What is the plan for releasing this fix?

@riverlijunjie (Contributor) commented:

It will be in the 2024.2 release. Do you need it ported back to 2023.3?

ilya-lavrenov added this to the 2024.2 milestone May 23, 2024
github-merge-queue bot pushed a commit that referenced this issue May 23, 2024
### Details:
- *The code `t_stream_count_map[(void*)this] = item.first;` was lost in #19832.*
- *The thread-safety issue happens in the workflow below:*
  1. create thread A
  2. call CustomThreadLocal::local() in thread A -> create stream A (the count of stream A is 2)
  3. destroy thread A (the count of stream A is 1)
  4. create thread B (with the same thread id as thread A)
  5. call CustomThreadLocal::local() in thread B -> reuse stream A (the count of stream A is 1, so it's broken)
- *Add a test case; also fix https://github.com/openvinotoolkit/openvino/pull/19986/files#r1332774754*

### Tickets:
- Closes #24509

---------

Signed-off-by: HU Yuan2 <[email protected]>
Co-authored-by: Wanglei Shen <[email protected]>
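To make the workflow above concrete, here is a deliberately simplified model of the failure mode. It is illustrative only: the names `Stream`, `stream_map`, and `get_local_stream` are hypothetical and do not correspond to the actual CPUStreamsExecutor implementation.

```cpp
#include <map>
#include <memory>
#include <thread>

struct Stream { int id; };  // stand-in for a per-thread CPU stream

// Simplified stand-in for CustomThreadLocal: streams cached by thread id.
std::map<std::thread::id, std::shared_ptr<Stream>> stream_map;

std::shared_ptr<Stream> get_local_stream() {
    auto tid = std::this_thread::get_id();
    auto it = stream_map.find(tid);
    if (it != stream_map.end()) {
        // Thread ids can be recycled: a brand-new thread may find a stream
        // cached for a previously destroyed thread. The lost bookkeeping
        // line meant the count that keeps this stream alive was never
        // restored, so the stream could be reclaimed while still in use.
        return it->second;
    }
    auto stream = std::make_shared<Stream>();
    stream_map.emplace(tid, stream);  // use_count == 2: map entry + caller
    return stream;
}
```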
@daaguirre (Author) commented:

Oh yes please, could we have a patch for 2023.3? Thank you!

ilya-lavrenov added the category: inference (OpenVINO Runtime library - Inference) label May 23, 2024
tiger100256-hu added a commit to tiger100256-hu/openvino that referenced this issue May 24, 2024
…t#24562)

- Closes openvinotoolkit#24509

Signed-off-by: HU Yuan2 <[email protected]>
Co-authored-by: Wanglei Shen <[email protected]>
wangleis added a commit that referenced this issue May 27, 2024
### Details:
- *Port PR #24562 to release 2023.3*
- *Add support for InferAPI1 and api_version*

### Tickets:
- *[issue-24509](#24509)*

---------

Signed-off-by: HU Yuan2 <[email protected]>
Co-authored-by: Wanglei Shen <[email protected]>