A framework for saliency-guided High Efficiency Video Coding (HEVC), based on Region of Interest (ROI) detection and tracking.
- Real-time performance
- Energy-efficient
- Accurate ROI tracking capabilities
- Optimized bitrates
- C/C++ development tools (gcc, make)
- Autotools (autoconf, automake, libtool)
- Python 3.6+
- Git
- ZeroMQ
sudo apt-get install libzmq3-dev
# Initialize and update the Kvazaar submodule
git submodule init
git submodule update
# Apply the required patch
cd kvazaar
git apply ../patches/kvazaar_framework_7fe86344.patch
cd ..
# Compile and install
cd kvazaar
./autogen.sh
./configure
make -j
sudo make install
sudo ldconfig
cd ..
NOTE: For the most up-to-date installation information, please refer to the Kvazaar repository.
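After `sudo make install` and `sudo ldconfig`, it can be useful to confirm that both the encoder binary and the shared library are visible to the system. The following is a minimal sketch using only the Python standard library; the function name `check_kvazaar_install` is hypothetical, and the names `kvazaar` (binary and library) are assumed to be the defaults produced by the build above.

```python
import ctypes.util
import shutil


def check_kvazaar_install(binary="kvazaar", library="kvazaar"):
    """Report where the encoder binary and shared library were found.

    Both default names are assumptions about what the build installs.
    A value of None means the item was not found.
    """
    return {
        # Looks the executable up on PATH, like `which` does.
        "binary": shutil.which(binary),
        # Asks the system linker tooling for a matching shared library name.
        "library": ctypes.util.find_library(library),
    }


if __name__ == "__main__":
    for item, location in check_kvazaar_install().items():
        print(f"{item}: {location if location else 'NOT FOUND'}")
```

If the library is reported as missing right after installation, re-running `sudo ldconfig` is the first thing to try.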
pip install -r requirements.txt
Important: To grant read access to the Intel RAPL energy counters, run:
sudo chmod -R a+r /sys/class/powercap/intel-rapl
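Once read permission is granted, the counters can be read as plain text files. The sketch below illustrates how energy could be measured around a workload through the Linux powercap interface; it is not the framework's own measurement code. The helper names are hypothetical, and the zone `intel-rapl:0` (usually CPU package 0) is an assumption that may differ on your machine.

```python
import time
from pathlib import Path

# Assumed zone; list /sys/class/powercap/intel-rapl to see what your system exposes.
DEFAULT_ZONE = Path("/sys/class/powercap/intel-rapl/intel-rapl:0")


def read_uj(zone, name):
    """Read an integer microjoule value from a powercap attribute file."""
    return int((Path(zone) / name).read_text().strip())


def energy_delta_uj(start, end, max_range):
    """Energy between two counter readings, allowing for one wraparound."""
    if end >= start:
        return end - start
    # The counter wrapped past max_energy_range_uj back toward zero.
    return (max_range - start) + end


def measure_joules(workload, zone=DEFAULT_ZONE):
    """Run workload() and return the energy the zone consumed, in joules."""
    max_range = read_uj(zone, "max_energy_range_uj")
    start = read_uj(zone, "energy_uj")
    workload()
    end = read_uj(zone, "energy_uj")
    return energy_delta_uj(start, end, max_range) / 1e6


if __name__ == "__main__":
    if DEFAULT_ZONE.exists():
        joules = measure_joules(lambda: time.sleep(1.0))
        print(f"Energy over 1 s of idle: {joules:.3f} J")
    else:
        print("RAPL zone not found; is this an Intel system with powercap enabled?")
```

The reading covers everything running in that zone during the interval, not only the workload, so measurements are most meaningful on an otherwise idle machine.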
The pretrained weights are provided in the weights directory of this repository. Download them to get started quickly with ROI detection and tracking:
| Model | Type | Trained On | Size |
|---|---|---|---|
| yolov8n-head | Detection | HollywoodHeads | 24.5 MB |
| light-tracker-head | Tracking | UVG-ROI dataset | 40.8 kB |
| enhanced-tracker-head | Tracking | UVG-ROI dataset | 19.4 MB |
python train_tracker.py
NOTE: For the most up-to-date training instructions and parameters, please refer to the Ultralytics documentation.
python train_yolo.py
python test_tracker.py
python kvazaar_roi_encode.py
python kvazaar_traditional_encode.py
If you use this framework in your research, please cite this paper:
@ARTICLE{10820524,
author={Partanen, Tero and Hoang, Minh and Mercat, Alexandre and Sainio, Joose and Vanne, Jarno},
journal={IEEE Journal on Emerging and Selected Topics in Circuits and Systems},
title={Energy-Efficient Saliency-Guided Video Coding Framework for Real-Time Applications},
year={2025},
volume={15},
number={1},
pages={44-57},
keywords={Encoding;Streaming media;Video coding;Image coding;Saliency detection;Energy efficiency;Energy consumption;Visualization;Object tracking;Computational modeling;Saliency-guided encoding;region-of-interest (ROI);ROI tracking;deep learning (DL);motion vector (MV)},
doi={10.1109/JETCAS.2024.3525339}}