An open-source, high-performance library for image processing, with both CPU and GPU optimizations. PRs are welcome.

For more details on DeltaCV, please go to

Author: Haibo. Contributions are welcome.

1. Shared Memory


  • Boost




#include "deltaCV/cpu/shm.hpp"

For more details, see my blog.



  • OpenCV (this library does not depend on OpenCV, but the input of a function is often of type cv::Mat)
  • SSE
  • AVX

Compile options

You need to put these compile options in your CMakeLists.txt.

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=haswell")
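A quick way to confirm that the flag reached the compiler is to inspect the predefined macros that GCC and Clang derive from `-march`. The following is a minimal sketch; `enabledSimd` is a hypothetical helper written for this README and is not part of DeltaCV.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not a DeltaCV function): lists the SIMD instruction
// sets the compiler was allowed to use for this translation unit, based on
// the macros GCC and Clang predefine from -march / -m flags.
inline std::string enabledSimd() {
    std::string s;
#if defined(__SSE2__)
    s += "SSE2 ";
#endif
#if defined(__AVX__)
    s += "AVX ";
#endif
#if defined(__AVX2__)
    s += "AVX2 ";
#endif
    if (s.empty()) s = "none";
    assert(!s.empty());
    return s;
}
```

When the file is built with `-march=haswell`, the returned string should contain `AVX2`; without the flag, a typical x86-64 build reports only `SSE2`.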


All samples are in cpu/examples/.

  • inRange
  • ycrcbWithSeg
  • weightedGrayWithSeg
  • grayBRWithSeg
  • grayBRWithSegStandard
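To give a feel for the kind of SIMD code behind a function such as inRange, here is a minimal single-channel sketch using SSE2 intrinsics. It is an illustration only and not DeltaCV's implementation: the name `inRangeU8`, the single-channel layout, and the 255/0 output convention are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Sketch only (hypothetical name): out[i] = 255 if lo <= in[i] <= hi, else 0.
inline void inRangeU8(const std::vector<std::uint8_t>& in,
                      std::uint8_t lo, std::uint8_t hi,
                      std::vector<std::uint8_t>& out) {
    assert(lo <= hi);
    out.resize(in.size());
    std::size_t i = 0;
#if defined(__SSE2__)
    const __m128i vlo = _mm_set1_epi8(static_cast<char>(lo));
    const __m128i vhi = _mm_set1_epi8(static_cast<char>(hi));
    // Process 16 pixels per iteration.
    for (; i + 16 <= in.size(); i += 16) {
        const __m128i v = _mm_loadu_si128(
            reinterpret_cast<const __m128i*>(in.data() + i));
        // SSE2 has no unsigned byte compare, so use:
        //   v >= lo  <=>  max(v, lo) == v
        //   v <= hi  <=>  min(v, hi) == v
        const __m128i ge = _mm_cmpeq_epi8(_mm_max_epu8(v, vlo), v);
        const __m128i le = _mm_cmpeq_epi8(_mm_min_epu8(v, vhi), v);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data() + i),
                         _mm_and_si128(ge, le));
    }
#endif
    // Scalar tail (and full fallback when SSE2 is unavailable).
    for (; i < in.size(); ++i)
        out[i] = (in[i] >= lo && in[i] <= hi) ? 255 : 0;
}
```

An AVX2 variant would follow the same pattern with 32 pixels per iteration, which is one reason the `-march=haswell` flag matters.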

Performance Table

Image Size: 1024 x 1280(H x W)

| Function | OpenCV (ms) | DeltaCV (ms) | Speed-up |
|---|---|---|---|
| inRange | 1.06 - 1.18 | 0.29 - 0.30 | 3.5 - 4.0 |
| ycrcbWithSeg | 6.68 - 6.75 | 0.88 - 0.90 | 7.4 - 7.6 |
| weightedGrayWithSeg | 1.56 - 1.69 | 0.39 - 0.46 | 3.39 - 4.33 |
| grayBRWithSeg | 3.28 - 3.35 | 0.69 - 0.71 | 4.6 - 4.8 |
| grayBRWithSegStandard | 1.19 - 1.22 | 0.23 - 0.25 | 4.76 - 5.30 |
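The speed-up ranges in this table are consistent with dividing the fastest OpenCV time by the slowest DeltaCV time (lower bound) and the slowest OpenCV time by the fastest DeltaCV time (upper bound). The sketch below shows that arithmetic; `speedUpRange` is a hypothetical helper, not part of the library.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: given baseline timings [baseLo, baseHi] and optimized
// timings [optLo, optHi] in milliseconds, returns {min, max} speed-up.
inline std::pair<double, double> speedUpRange(double baseLo, double baseHi,
                                              double optLo, double optHi) {
    assert(optLo > 0.0 && optHi >= optLo && baseHi >= baseLo);
    // Worst case: fastest baseline vs. slowest optimized run.
    // Best case: slowest baseline vs. fastest optimized run.
    return {baseLo / optHi, baseHi / optLo};
}
```

For the weightedGrayWithSeg row, `speedUpRange(1.56, 1.69, 0.39, 0.46)` gives roughly 3.39 and 4.33, matching the table.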



  • CUDA
  • OpenCV


All samples are in gpu/examples/.

  • binarization
  • colorSpace
  • edgeDetection
  • erode_dilate
  • getHist
  • equalizeHist
  • blur

Performance Table

Image Size: 480 x 640(H x W)

| Function | GPU (ms), NVIDIA GTX 1070 8G | CPU (ms), OpenCV on i5 7500 | Speed-up |
|---|---|---|---|
| RGB2GRAY | 0.008 - 0.010 | 0.340 - 0.360 | 34 - 45 |
| RGB2HSV | 0.150 - 0.200 | 3.900 - 4.400 | 19.5 - 29.3 |
| thresholdBinarization | 0.005 - 0.008 | 0.035 - 0.045 | 4.4 - 9.0 |
| ostu | 0.16 - 0.17 | 1.280 - 1.432 | 8.0 - 8.9 |
| sobel / scharr | 0.032 - 0.038 | - | - |
| erode / dilate (3*3 rect) | 0.045 - 0.049 | - | - |
| getHist (bin: 256) | 0.145 - 0.149 | - | - |
| equalizeHist (bin: 256) | 0.16 - 0.17 | 0.31 - 0.32 | 1.8 - 2.0 |
| blur (3*3 Gaussian kernel) | 0.036 - 0.040 | - | - |

Function List

Color space transformation

  • RGB2GRAY(uchar3* dataIn,unsigned char* dataOut,int imgRows,int imgCols): in gpu/src/ Converts RGB images to grayscale images.

  • RGB2HSV(uchar3* dataIn,uchar3* dataOut,int imgRows,int imgCols,uchar3 minVal,uchar3 maxVal): in gpu/src/ Converts RGB images to HSV images and applies threshold segmentation based on minVal and maxVal.
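As a CPU reference for checking the output of a grayscale conversion such as RGB2GRAY, the sketch below uses the common BT.601 weights (0.299 R + 0.587 G + 0.114 B) in 8-bit fixed point. The weights, the rounding, the `Rgb8` struct, and the R, G, B channel order are assumptions made for illustration; the GPU kernel may differ in any of them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for CUDA's uchar3; channel order assumed to be R, G, B.
struct Rgb8 {
    std::uint8_t r, g, b;
};

// Reference conversion (not DeltaCV code). The integer weights 77, 150, 29
// sum to 256 and approximate 0.299, 0.587, 0.114; +128 rounds to nearest.
inline std::vector<std::uint8_t> rgbToGrayRef(const std::vector<Rgb8>& in,
                                              int imgRows, int imgCols) {
    assert(imgRows >= 0 && imgCols >= 0);
    assert(in.size() == static_cast<std::size_t>(imgRows) *
                            static_cast<std::size_t>(imgCols));
    std::vector<std::uint8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const int y = 77 * in[i].r + 150 * in[i].g + 29 * in[i].b + 128;
        out[i] = static_cast<std::uint8_t>(y >> 8);
    }
    return out;
}
```

Comparing this output against the GPU result with a tolerance of one gray level is usually enough to absorb rounding differences.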


  • thresholdBinarization(unsigned char* dataIn,unsigned char* dataOut,short int imgRows,short int imgCols,unsigned char thresholdMin,unsigned char thresholdMax,unsigned char valMin,unsigned char valMax): in gpu/src/ Similar to the OpenCV function threshold(). It covers 5 modes: THRESH_BINARY, THRESH_BINARY_INV, THRESH_TRUNC, THRESH_TOZERO, THRESH_TOZERO_INV.
 * Compare with the 'threshold()' function in OpenCV
 * When:
 *      thresholdMin = thresholdMax and valMin = 0  ==> THRESH_BINARY
 *      thresholdMin = thresholdMax and valMax = 0  ==> THRESH_BINARY_INV
 *      thresholdMax = valMax and thresholdMin = 0  ==> THRESH_TRUNC
 *      thresholdMax = 255 and valMin = 0  ==> THRESH_TOZERO
 *      thresholdMin = 0 and valMax = 0  ==> THRESH_TOZERO_INV
  • ostu_gpu(unsigned char* dataIn,unsigned char* dataOut,unsigned int* hist,float* sum_Pi,float* sum_i_Pi,float* u_0,float* varance,int* thres,short int imgRows,short int imgCols): in gpu/src/ Binarization using Otsu's method.
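The mode table for thresholdBinarization above can be reproduced by one per-pixel rule with two thresholds. The sketch below is inferred from that table only, not taken from the DeltaCV source: the function name `thresholdPixel` and the exact comparison operators (`>` and `<=`) are assumptions. With this rule, THRESH_TRUNC and THRESH_TOZERO_INV additionally need valMin = 0 so that a source value of 0 stays 0.

```cpp
#include <cassert>
#include <cstdint>

// Inferred per-pixel rule (hypothetical, not DeltaCV code):
//   src >  thresholdMax  -> valMax
//   src <= thresholdMin  -> valMin
//   otherwise            -> src unchanged
inline std::uint8_t thresholdPixel(std::uint8_t src,
                                   std::uint8_t thresholdMin,
                                   std::uint8_t thresholdMax,
                                   std::uint8_t valMin,
                                   std::uint8_t valMax) {
    assert(thresholdMin <= thresholdMax);
    if (src > thresholdMax) return valMax;
    if (src <= thresholdMin) return valMin;
    return src;
}
```

For a threshold of 100, `thresholdPixel(src, 100, 100, 0, 255)` behaves like THRESH_BINARY, and `thresholdPixel(src, 0, 100, 0, 100)` behaves like THRESH_TRUNC.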

Edge Detection

  • sobel(unsigned char* dataIn,unsigned char* dataOut,short int imgRows,short int imgCols): in gpu/src/ Edge detection using the Sobel operator.

  • scharr(unsigned char* dataIn,unsigned char* dataOut,short int imgRows,short int imgCols): in gpu/src/ Edge detection using the Scharr operator.
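For readers who want a CPU baseline to compare the edge-detection kernels against, here is a minimal Sobel sketch. It is not DeltaCV's code: the name `sobelRef`, the |gx| + |gy| magnitude, the saturation to 255, and leaving border pixels at 0 are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Reference 3*3 Sobel (hypothetical helper). Output is |gx| + |gy|,
// saturated to 255; the one-pixel border is left at 0.
inline std::vector<std::uint8_t> sobelRef(const std::vector<std::uint8_t>& in,
                                          int imgRows, int imgCols) {
    assert(imgRows >= 0 && imgCols >= 0);
    assert(in.size() == static_cast<std::size_t>(imgRows) *
                            static_cast<std::size_t>(imgCols));
    std::vector<std::uint8_t> out(in.size(), 0);
    for (int y = 1; y + 1 < imgRows; ++y) {
        for (int x = 1; x + 1 < imgCols; ++x) {
            // p(dy, dx): neighbour of (y, x) as int.
            auto p = [&](int dy, int dx) {
                return static_cast<int>(
                    in[static_cast<std::size_t>((y + dy) * imgCols + (x + dx))]);
            };
            const int gx = (p(-1, 1) + 2 * p(0, 1) + p(1, 1)) -
                           (p(-1, -1) + 2 * p(0, -1) + p(1, -1));
            const int gy = (p(1, -1) + 2 * p(1, 0) + p(1, 1)) -
                           (p(-1, -1) + 2 * p(-1, 0) + p(-1, 1));
            const int mag = std::abs(gx) + std::abs(gy);
            out[static_cast<std::size_t>(y * imgCols + x)] =
                static_cast<std::uint8_t>(mag > 255 ? 255 : mag);
        }
    }
    return out;
}
```

The Scharr operator has the same structure; only the weights change from (1, 2, 1) to (3, 10, 3).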

Erode and Dilate

  • erode(unsigned char* dataIn,unsigned char* dataOut,short int imgRows,short int imgCols,short int erodeElementRows,short int erodeElementCols): in gpu/src/

  • dilate(unsigned char* dataIn,unsigned char* dataOut,short int imgRows,short int imgCols,short int dilateElementRows,short int dilateElementCols): in gpu/src/
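Erosion and dilation with a rectangular element reduce to a minimum or maximum over a window, which the following CPU sketch makes explicit. It is a reference under stated assumptions, not the GPU kernel: `morphRectRef` is a hypothetical name, the anchor is assumed to be the element centre, and neighbours outside the image are simply ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference rectangular erode / dilate (hypothetical helper).
// isErode == true takes the window minimum, false takes the maximum.
inline std::vector<std::uint8_t> morphRectRef(
    const std::vector<std::uint8_t>& in, int imgRows, int imgCols,
    int elementRows, int elementCols, bool isErode) {
    assert(imgRows > 0 && imgCols > 0 && elementRows > 0 && elementCols > 0);
    assert(in.size() == static_cast<std::size_t>(imgRows) *
                            static_cast<std::size_t>(imgCols));
    std::vector<std::uint8_t> out(in.size());
    const int ay = elementRows / 2;  // assumed anchor: element centre
    const int ax = elementCols / 2;
    for (int y = 0; y < imgRows; ++y) {
        for (int x = 0; x < imgCols; ++x) {
            std::uint8_t acc = isErode ? 255 : 0;
            for (int dy = 0; dy < elementRows; ++dy) {
                for (int dx = 0; dx < elementCols; ++dx) {
                    const int yy = y + dy - ay;
                    const int xx = x + dx - ax;
                    // Skip neighbours that fall outside the image.
                    if (yy < 0 || yy >= imgRows || xx < 0 || xx >= imgCols)
                        continue;
                    const std::uint8_t v =
                        in[static_cast<std::size_t>(yy * imgCols + xx)];
                    acc = isErode ? std::min(acc, v) : std::max(acc, v);
                }
            }
            out[static_cast<std::size_t>(y * imgCols + x)] = acc;
        }
    }
    return out;
}
```

Calling it with a 3*3 element on a small test image gives a ground truth to diff against the output of erode and dilate.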


  • getHist(unsigned char* dataIn, unsigned int* hist): in gpu/src/
  • [wrapper]equalizeHist_gpu(unsigned char* dataIn,unsigned int* hist,unsigned int* sum_ni,unsigned char* dataOut,short int imgRows,short int imgCols,dim3 tPerBlock,dim3 bPerGrid): in gpu/src/
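The parameter names `hist` and `sum_ni` suggest a histogram followed by a cumulative sum, which is the textbook form of histogram equalization. The sketch below implements that textbook form on the CPU; it is an assumption about the approach, not DeltaCV's code, and OpenCV's equalizeHist uses a slightly different normalization, so results can differ by a few gray levels. `histogramRef` and `equalizeRef` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 256-bin histogram of an 8-bit image (hypothetical helper).
inline std::array<unsigned, 256> histogramRef(
    const std::vector<std::uint8_t>& in) {
    std::array<unsigned, 256> hist{};
    for (std::uint8_t v : in) ++hist[v];
    return hist;
}

// Textbook equalization: out = cumulativeCount(v) * 255 / pixelCount,
// using truncating integer division.
inline std::vector<std::uint8_t> equalizeRef(
    const std::vector<std::uint8_t>& in) {
    assert(!in.empty());
    const std::array<unsigned, 256> hist = histogramRef(in);
    std::array<unsigned long long, 256> cumulative{};
    unsigned long long running = 0;
    for (std::size_t i = 0; i < 256; ++i) {
        running += hist[i];
        cumulative[i] = running;
    }
    std::vector<std::uint8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = static_cast<std::uint8_t>(cumulative[in[i]] * 255 / in.size());
    return out;
}
```

For the four pixels 0, 0, 128, 255 the cumulative counts are 2, 2, 3, 4, so the equalized values are 127, 127, 191, 255.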

Gaussian Blur

  • guassianBlur3_gpu(unsigned char* dataIn,unsigned char* dataOut,short int imgRows,short int imgCols,dim3 tPerBlock,dim3 bPerGrid): in gpu/src/ Gaussian blur with a 3*3 kernel.
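A CPU reference for a 3*3 Gaussian blur can be sketched as follows. The kernel (1 2 1; 2 4 2; 1 2 1) / 16, the round-to-nearest step, and copying border pixels unchanged are assumptions chosen for illustration; the kernel values and border handling used by the GPU function are not stated here. `gaussianBlur3Ref` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference 3*3 Gaussian blur (hypothetical helper, assumed kernel /16).
inline std::vector<std::uint8_t> gaussianBlur3Ref(
    const std::vector<std::uint8_t>& in, int imgRows, int imgCols) {
    assert(imgRows >= 0 && imgCols >= 0);
    assert(in.size() == static_cast<std::size_t>(imgRows) *
                            static_cast<std::size_t>(imgCols));
    static const int kernel[3][3] = {{1, 2, 1}, {2, 4, 2}, {1, 2, 1}};
    std::vector<std::uint8_t> out(in);  // border pixels keep their input value
    for (int y = 1; y + 1 < imgRows; ++y) {
        for (int x = 1; x + 1 < imgCols; ++x) {
            int sum = 0;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    sum += kernel[dy + 1][dx + 1] *
                           in[static_cast<std::size_t>((y + dy) * imgCols +
                                                       (x + dx))];
            // Weights sum to 16; +8 rounds to nearest before the shift.
            out[static_cast<std::size_t>(y * imgCols + x)] =
                static_cast<std::uint8_t>((sum + 8) >> 4);
        }
    }
    return out;
}
```

A flat image passes through unchanged, and a single bright pixel spreads into its eight neighbours in a 1 : 2 : 4 ratio, which makes both cases convenient sanity checks.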