[Bug]: ov::InferRequest::infer() not thread safe when having multiple models #24509

Closed
3 tasks done
daaguirre opened this issue May 14, 2024 · 8 comments · Fixed by #24562
Labels: bug (Something isn't working) · category: inference (OpenVINO Runtime library - Inference) · support_request

@daaguirre commented May 14, 2024

OpenVINO Version

2023.3.0

Operating System

Ubuntu 20.04 (LTS)

Device used for inference

CPU

Framework

None

Model used

No response

Issue description

When running multiple inferences of multiple models concurrently with OpenVINO 2023.3.0, the call to ov::InferRequest::infer() is not thread safe. In the previous version, 2023.0.0, this method is thread safe.

It's important to point out that to trigger the error, a model must first be run in the parent thread; the crash happens when the parent thread then spawns more threads that run the same or different models concurrently. So if the for loop marked with the comment "comment out below for loop to prevent crash" is removed, the crash does not occur in the attached code snippet.

Note that this crash is random and does not happen on every execution, which is why the provided code snippet is wrapped in a for loop to force the error.

Step-by-step reproduction

The following code snippet crashes with a segfault when linked against 2023.3.0 and works correctly with 2023.0.0.

The error can be reproduced with any generic model:
efficientnet: https://drive.google.com/file/d/1a7AoEi165ZF1dJNfD_OklQVs_QpsHcA2/view?usp=sharing
googlenet: https://drive.google.com/file/d/15lxSLWmM4EUrc63ReGD-Kwj6DlcEl6a-/view?usp=sharing

#include <algorithm>
#include <cstdint>
#include <functional>
#include <iostream>
#include <memory>
#include <random>
#include <stdexcept>
#include <string>
#include <thread>
#include <vector>

#include <openvino/runtime/core.hpp>
#include <openvino/core/preprocess/pre_post_process.hpp>

void RunInParallel(std::function<void()> func, uint32_t num_threads) {
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (size_t i = 0; i < num_threads; ++i) {
        threads.emplace_back(func);
    }
    for (auto& t : threads) {
        t.join();
    }
}

struct Nnet {
    explicit Nnet(const std::string& model_path) {
        ov::Core core;
        std::shared_ptr<ov::Model> model = core.read_model(model_path);
        ov::element::Type input_type = ov::element::f32;
        // input_shape = model->inputs()[0].get_partial_shape();
        auto partial_shape = model->inputs()[0].get_partial_shape();
        std::vector<size_t> input_shape = {1, static_cast<size_t>(partial_shape[1].get_length()),
                                           static_cast<size_t>(partial_shape[2].get_length()),
                                           static_cast<size_t>(partial_shape[3].get_length())};
        ov::preprocess::PrePostProcessor ppp(model);
        for (const auto& in : model->inputs()) {
            for (const auto& name : in.get_names()) {
                auto& actual_input = ppp.input(name);
                actual_input.tensor()
                        .set_shape(std::vector<int64_t>{input_shape.begin(), input_shape.end()})
                        .set_element_type(input_type)
                        .set_layout("NHWC");
            }
        }
        model = ppp.build();

        ov::AnyMap properties;
        properties.emplace(ov::inference_num_threads(2));
        compiled_model = core.compile_model(model, "CPU", properties);
        input_size = input_shape[0] * input_shape[1] * input_shape[2] * input_shape[3];
    }

    static std::vector<float> randomVector(size_t size) {
        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<> dis(0, 255);
        std::vector<float> vector;
        vector.reserve(size);
        for (size_t i = 0; i < size; ++i) {
            vector.push_back(static_cast<float>(dis(gen)));
        }
        return vector;
    }

    float Run() const {
        std::vector<float> input = randomVector(input_size);

        ov::InferRequest infer_request;

        try {
            infer_request = compiled_model.create_infer_request();
        } catch (std::exception& e) {
            throw std::runtime_error("failed to create infer request: " + std::string(e.what()));
        }

        ov::Tensor input_tensor = infer_request.get_input_tensor();
        std::copy(input.begin(), input.end(), input_tensor.data<float>());
        try {
            infer_request.infer();
        } catch (std::exception& e) {
            throw std::runtime_error("failed to execute inference request: " + std::string(e.what()));
        }
        const ov::Tensor& output_tensor = infer_request.get_output_tensor();
        return output_tensor.data<float>()[0];
    }

    mutable ov::CompiledModel compiled_model;
    size_t input_size;
};

struct Runner {
    explicit Runner(const std::vector<std::string>& models) {
        nnets.reserve(models.size());
        for (const auto& model : models) {
            nnets.push_back(std::make_shared<Nnet>(model));
        }
    }

    void Run() {
        size_t num_threads = nnets.size();
        std::vector<std::thread> threads;
        threads.reserve(num_threads);
        // comment out below for loop to prevent crash
        for (auto nnet : nnets) {
            nnet->Run();
        }
        for (auto nnet : nnets) {
            threads.emplace_back([nnet]() { return nnet->Run(); });
        }
        for (auto& t : threads) {
            t.join();
        }
    }

    std::vector<std::shared_ptr<Nnet>> nnets;
};

int main() {
    std::cout << "Runner:" << std::endl;
    std::vector<std::string> models{
            "efficientnet-lite4-11.onnx",
            "googlenet-9.onnx",
    };
    Runner runner(models);
    for (size_t i = 0; i < 20; ++i) {
        std::cout << "i: " << i << std::endl;
        RunInParallel([&runner]() { runner.Run(); }, 2);
    }

    return 0;
}

Relevant log output

ThreadSanitizer:DEADLYSIGNAL
==22332==ERROR: ThreadSanitizer: SEGV on unknown address (pc 0x7fa049230f21 bp 0x7fa02b0fcd50 sp 0x7fa02b0fcc70 T22457)
==22332==The signal is caused by a READ memory access.
==22332==Hint: this fault was caused by a dereference of a high value address (see register values below).  Disassemble the provided pc to learn which register was used.
ThreadSanitizer:DEADLYSIGNAL
    #0 <null> <null> (libopenvino.so.2330+0xa30f21) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #1 <null> <null> (libopenvino.so.2330+0xa66164) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #2 <null> <null> (libtbb.so.12+0xe499) (BuildId: e5d9a936de2ae01503a3a5235d75f26ef30831a5)
    #3 ov::threading::CPUStreamsExecutor::execute(std::function<void ()>) <null> (libopenvino.so.2330+0xa6edc2) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #4 <null> <null> (libopenvino.so.2330+0xa2c8db) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #5 ov::IAsyncInferRequest::run_first_stage(__gnu_cxx::__normal_iterator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >*, std::vector<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >, std::allocator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> > > > >, __gnu_cxx::__normal_iterator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >*, std::vector<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> >, std::allocator<std::pair<std::shared_ptr<ov::threading::ITaskExecutor>, std::function<void ()> > > > >, std::shared_ptr<ov::threading::ITaskExecutor>) <null> (libopenvino.so.2330+0xa308b0) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #6 ov::IAsyncInferRequest::infer() <null> (libopenvino.so.2330+0xa33e4d) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #7 ov::InferRequest::infer() <null> (libopenvino.so.2330+0xac70fb) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5)
    #8 Nnet::Run() const /home/daguirre/dev/idlive-doc-sdk/sdk/utils/public/force_ov_crash.cpp:75:27 (force_ov_crash+0xd7e14) (BuildId: a2c49a9f5e43d6bf7a0c37aa7c8daf3ae5289edf)
    #9 Runner::Run()::'lambda'()::operator()() const /home/daguirre/dev/idlive-doc-sdk/sdk/utils/public/force_ov_crash.cpp:103:58 (force_ov_crash+0xd98df) (BuildId: a2c49a9f5e43d6bf7a0c37aa7c8daf3ae5289edf)
    #10 float std::__invoke_impl<float, Runner::Run()::'lambda'()>(std::__invoke_other, Runner::Run()::'lambda'()&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:61:14 (force_ov_crash+0xd98df)
    #11 std::__invoke_result<Runner::Run()::'lambda'()>::type std::__invoke<Runner::Run()::'lambda'()>(Runner::Run()::'lambda'()&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/invoke.h:96:14 (force_ov_crash+0xd98df)
    #12 float std::thread::_Invoker<std::tuple<Runner::Run()::'lambda'()> >::_M_invoke<0ul>(std::_Index_tuple<0ul>) /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:279:13 (force_ov_crash+0xd98df)
    #13 std::thread::_Invoker<std::tuple<Runner::Run()::'lambda'()> >::operator()() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:286:11 (force_ov_crash+0xd98df)
    #14 std::thread::_State_impl<std::thread::_Invoker<std::tuple<Runner::Run()::'lambda'()> > >::_M_run() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_thread.h:231:13 (force_ov_crash+0xd98df)
    #15 <null> <null> (libstdc++.so.6+0xdc252) (BuildId: e37fe1a879783838de78cbc8c80621fa685d58a2)
    #16 __tsan_thread_start_func <null> (force_ov_crash+0x51768) (BuildId: a2c49a9f5e43d6bf7a0c37aa7c8daf3ae5289edf)
    #17 start_thread nptl/./nptl/pthread_create.c:442:8 (libc.so.6+0x94ac2) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348)
    #18 <null> misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 (libc.so.6+0x12684f) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348)

ThreadSanitizer can not provide additional info.
SUMMARY: ThreadSanitizer: SEGV (/home/daguirre/dev/idlive-doc-sdk/prebuilt/openvino/openvino-2023.3.0/x64-gcc7.5-glibc2.27/lib/libopenvino.so.2330+0xa30f21) (BuildId: 95678c5563f059cd7cfa6a95d9fc601abe3f4bf5) 
==22332==ABORTING

Issue submission checklist

  • I'm reporting an issue. It's not a question.
  • I checked the problem with the documentation, FAQ, open issues, Stack Overflow, etc., and have not found a solution.
  • There is reproducer code and related data files such as images, videos, models, etc.
@riverlijunjie (Contributor) commented:

I can reproduce a similar issue on the master branch with the test code above. The race condition occurs when one thread is calling infer() while another thread is calling create_infer_request(). I have narrowed it down to ov::threading::CPUStreamsExecutor::Impl::CustomThreadLocal::_stream_map and will work on a solution to fix this race condition.
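Since the race described above is between concurrent create_infer_request() and infer(), one possible mitigation while waiting for a fix is to create all infer requests serially, before any worker thread starts inferring. This is a hedged sketch, not a verified fix; the helper name `MakeRequests` and the one-request-per-worker layout are illustrative assumptions, not from this issue:

```cpp
#include <cstddef>
#include <vector>
#include <openvino/runtime/core.hpp>

// Sketch of a possible mitigation (not a guaranteed fix): create one
// InferRequest per worker serially, before any thread calls infer(), so
// create_infer_request() never races with a concurrent infer().
std::vector<ov::InferRequest> MakeRequests(ov::CompiledModel& model,
                                           size_t num_workers) {
    std::vector<ov::InferRequest> requests;
    requests.reserve(num_workers);
    for (size_t i = 0; i < num_workers; ++i) {
        requests.push_back(model.create_infer_request());  // serial creation
    }
    return requests;  // hand exactly one request to each worker thread
}
```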

@daaguirre (Author) commented:

Great, thank you!

@riverlijunjie (Contributor) commented:

@daaguirre could you give this PR a try? #24562

@daaguirre (Author) commented:

Thank you! I'll give it a try tomorrow and let you know.
Additionally, is it possible to use the async API to bypass this error?

@riverlijunjie (Contributor) commented:

> async API

The async API likely faces the same problem.
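For reference, the asynchronous path replaces the blocking infer() with start_async() followed by wait(). A minimal sketch of what the async variant of Nnet::Run()'s inference step could look like (per the reply above, it likely exercises the same stream-executor code path, so this is shown for completeness rather than as a workaround; the function name `RunAsync` is illustrative):

```cpp
#include <algorithm>
#include <vector>
#include <openvino/runtime/core.hpp>

// Async variant of the inference step in Nnet::Run(), using the standard
// ov::InferRequest::start_async()/wait() API instead of the blocking
// infer(). Per the comment above, it likely hits the same race.
float RunAsync(ov::CompiledModel& compiled_model, const std::vector<float>& input) {
    ov::InferRequest request = compiled_model.create_infer_request();
    ov::Tensor input_tensor = request.get_input_tensor();
    std::copy(input.begin(), input.end(), input_tensor.data<float>());
    request.start_async();  // enqueue the inference
    request.wait();         // block until this request completes
    return request.get_output_tensor().data<float>()[0];
}
```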

@daaguirre (Author) commented:

> @daaguirre could you give this PR a try? #24562

Hi! Sorry for the late response. I already tried the PR and it is working now, thank you!

What is the plan for releasing this fix?

@riverlijunjie (Contributor) commented:

It will be in the 2024.2 release. Do you need it ported back to 2023.3?

ilya-lavrenov added this to the 2024.2 milestone May 23, 2024
github-merge-queue bot pushed a commit that referenced this issue May 23, 2024
### Details:
- *The code `t_stream_count_map[(void*)this] = item.first;` was lost in #19832.*
- *The thread-safety issue happens in the workflow below:*
  1. create thread A
  2. call CustomThreadLocal::local() in thread A -> create stream A (the count of stream A is 2)
  3. destroy thread A (the count of stream A is 1)
  4. create thread B (with the same thread id as thread A)
  5. call CustomThreadLocal::local() in thread B -> reuse stream A (the count of stream A is 1, so it's broken)
- *Add a test case; also fix https://github.com/openvinotoolkit/openvino/pull/19986/files#r1332774754*

### Tickets:
- Closes #24509

---------

Signed-off-by: HU Yuan2 <[email protected]>
Co-authored-by: Wanglei Shen <[email protected]>
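To make the workflow above concrete, here is a deliberately simplified model of the failure mode. It is illustrative only: the names `Stream`, `stream_map`, and `get_local_stream` are hypothetical and do not correspond to the actual CPUStreamsExecutor implementation.

```cpp
#include <map>
#include <memory>
#include <thread>

struct Stream { int id; };  // stand-in for a per-thread CPU stream

// Simplified stand-in for CustomThreadLocal: streams cached by thread id.
std::map<std::thread::id, std::shared_ptr<Stream>> stream_map;

std::shared_ptr<Stream> get_local_stream() {
    auto tid = std::this_thread::get_id();
    auto it = stream_map.find(tid);
    if (it != stream_map.end()) {
        // Thread ids can be recycled: a brand-new thread may find a stream
        // cached for a previously destroyed thread. The lost bookkeeping
        // line meant the count that keeps this stream alive was never
        // restored, so the stream could be reclaimed while still in use.
        return it->second;
    }
    auto stream = std::make_shared<Stream>();
    stream_map.emplace(tid, stream);  // use_count == 2: map entry + caller
    return stream;
}
```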
@daaguirre (Author) commented:

Oh yes please, could we have a patch for 2023.3? Thank you!

ilya-lavrenov added the category: inference (OpenVINO Runtime library - Inference) label May 23, 2024
tiger100256-hu added a commit to tiger100256-hu/openvino that referenced this issue May 24, 2024
…t#24562)

- Closes openvinotoolkit#24509

Signed-off-by: HU Yuan2 <[email protected]>
Co-authored-by: Wanglei Shen <[email protected]>
wangleis added a commit that referenced this issue May 27, 2024
### Details:
- *Port PR #24562 to release 2023.3*
- *Add support for InferAPI1 and api_version*

### Tickets:
- *[issue-24509](#24509)*

---------

Signed-off-by: HU Yuan2 <[email protected]>
Co-authored-by: Wanglei Shen <[email protected]>