Skip to content

Latest commit

 

History

History

deepstream-imagedata-multistream-cupy

################################################################################
# SPDX-FileCopyrightText: Copyright (c) 2022-2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

Prerequisites:
- DeepStreamSDK 7.1
- Python 3.10
- Gst-python
- NumPy package <2.0, >=1.22 (2.0 and above not supported)
- OpenCV package
- CuPy for Cuda 12.1 or later

** This application is currently only supported on x86.

To install required packages:
$ sudo apt update
$ sudo apt install python3-numpy python3-opencv -y
$ pip3 install cupy-cuda12x

To run:
  $ python3 deepstream_imagedata-multistream_cupy.py -i <uri1> [uri2] ... [uriN]
e.g.
  $ python3 deepstream_imagedata-multistream_cupy.py -i file:///home/ubuntu/video1.mp4 file:///home/ubuntu/video2.mp4
  $ python3 deepstream_imagedata-multistream_cupy.py -i rtsp://127.0.0.1/video1 rtsp://127.0.0.1/video2

This document describes the sample deepstream-imagedata-multistream-cupy application.

This sample builds on top of the deepstream-imagedata-multistream sample to demonstrate how to:

* Access imagedata buffer from GPU in a multistream source as CuPy array
* Modify the images in-place. Changes made to the buffer will reflect in the downstream but  
  color format, resolution and CuPy transpose operations are not permitted.
* Extract the stream metadata, imagedata, which contains useful information about the
  frames in the batched buffer.
* Use multiple sources in the pipeline.
* Use a uridecodebin so that any type of input (e.g. RTSP/File), any GStreamer
  supported container format, and any codec can be used as input.
* Configure the stream-muxer to generate a batch of frames and infer on the
  batch for better resource utilization.

NOTE:
- This application is only supported on x86, not Jetson.
- Only RGBA color format is supported for access from Python. Color conversion
  is added in the pipeline for this reason.
- See the last paragraph of the README for details about GPU buffer access

This sample accepts one or more H.264/H.265 video streams as input. It creates
a source bin for each input and connects the bins to an instance of the
"nvstreammux" element, which forms the batch of frames. The batch of
frames is fed to "nvinfer" for batched inferencing. The batched buffer is
composited into a 2D tile array using "nvmultistreamtiler." The rest of the
pipeline is similar to the deepstream-test3 and deepstream-imagedata sample.

The "width" and "height" properties must be set on the stream-muxer to set the
output resolution. If the input frame resolution is different from
stream-muxer's "width" and "height", the input frame will be scaled to muxer's
output resolution.

The stream-muxer waits for a user-defined timeout before forming the batch. The
timeout is set using the "batched-push-timeout" property. If the complete batch
is formed before the timeout is reached, the batch is pushed to the downstream
element. If the timeout is reached before the complete batch can be formed
(which can happen in case of rtsp sources), the batch is formed from the
available input buffers and pushed. Ideally, the timeout of the stream-muxer
should be set based on the framerate of the fastest source. It can also be set
to -1 to make the stream-muxer wait infinitely.

The "nvmultistreamtiler" composite streams based on their stream-ids in
row-major order (starting from stream 0, left to right across the top row, then
across the next row, etc.).

As opposed to the deepstream-imagedata-multistream app, the pipeline is run on device
memory instead of unified memory since we access the buffer directly on GPU from CuPy
rather than as a numpy array. The process to retrieve the buffer as a CuPy array can be
broken down into the following steps: 
 1. Bindings call to retrieve buffer info pyds.get_nvds_buf_surface_gpu()
 2. Pointer retrieval using ctypes
 3. Construction of cupy.cuda.UnownedMemory object from pointer
 4. Construction of cupy.cuda.MemoryPointer from UnownedMemory
 5. Construction of cupy.ndarray from MemoryPointer and other buffer info retrieved from bindings call

When performing operations on the image array, we use a CUDA null stream to prevent access of buffer memory by other CUDA operations.