UCLA-VAST/Merlin-UCLA

Latest commit

0b256a0 · Jan 16, 2024

Merlin-UCLA

Use Merlin-UCLA with Docker

Download

docker pull ghcr.io/ucla-vast/merlin-ucla:latest

Configuration

Please update the file run_docker.sh with your own paths.

Run in interactive mode

sh run_docker.sh

or

# Please change these paths
xilinx_path=/opt/xilinx
tools_path=/opt/tools
XILINX_VITIS=/opt/tools/xilinx/Vitis/2021.1
XILINX_XRT=/opt/xilinx/xrt
XILINX_VIVADO=/opt/tools/xilinx/Vivado/2021.1
LM_LICENSE_FILE=

########################

CURRENT_PATH=$(cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd)
imagename="ghcr.io/ucla-vast/merlin-ucla:latest"

docker run -v /home:/home -v $tools_path:$tools_path -v $xilinx_path:$xilinx_path -e XILINX_VITIS=$XILINX_VITIS -e XILINX_XRT=$XILINX_XRT -e XILINX_VIVADO=$XILINX_VIVADO -e LM_LICENSE_FILE=$LM_LICENSE_FILE -w="$CURRENT_PATH" -it "$imagename"

Install Merlin-UCLA

Prerequisite dependencies:

  1. python >= 3.6.8
  2. cmake >= 3.19.0
  3. boost == 1.67.0
  4. clang == 6.0.0
  5. gcc == 4.9.4
  6. llvm == 6.0.0

How to build:

  1. In merlin_setting.sh, change MERLIN_COMPILER_HOME to the absolute path of your checkout
  2. Download gcc 4.9.4 to the $gcc_path specified in merlin_setting.sh and compile it, then add the path of the built libraries to LD_LIBRARY_PATH in merlin_setting.sh
  3. Download llvm 6.0.0, clang 6.0.0 and boost 1.67.0 to the paths specified in merlin_setting.sh, and compile all of them with the gcc 4.9.4 built in step 2
  4. source merlin_setting.sh
  5. cd trunk/build
  6. cmake3 -DCMAKE_BUILD_TYPE=Release ..
  7. make -j

Run Merlin-UCLA

Please first source the settings scripts of all the required tools, e.g. source /opt/tools/xilinx/Vitis_HLS/2021.1/settings64.sh.

Run

merlincc

To print the help:

merlincc -h

Compilation Options

To select the platform please use the option:

--platform=<the platform>
# for example
--platform=vitis::/opt/xilinx/platforms/xilinx_u200_xdma_201830_2/xilinx_u200_xdma_201830_2.xpfm

To change frequency please add the option:

--kernel_frequency <frequency in MHz>
# for example
--kernel_frequency 250

To have reductions automatically rewritten as tree reductions, which complete in logarithmic time, please add the option:

-funsafe-math-optimizations
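
As a rough illustration of what a tree reduction changes, here is a plain C sketch (not Merlin output; the function names and N are made up for this example). A sequential sum forms a chain of N-1 dependent additions, while a tree sum needs only log2(N) levels of independent pairwise additions. Floating-point addition is not associative, so the two orders can round differently, which is presumably why the transformation is tied to an "unsafe math" flag.

```c
#include <stddef.h>

#define N 8 /* hypothetical size, must be a power of two for sum_tree */

/* Sequential reduction: every addition waits for the previous one. */
float sum_sequential(const float x[N]) {
    float acc = 0.0f;
    for (size_t i = 0; i < N; i++) {
        acc += x[i];
    }
    return acc;
}

/* Tree reduction: log2(N) levels, all additions within a level are independent. */
float sum_tree(const float x[N]) {
    float buf[N];
    for (size_t i = 0; i < N; i++) {
        buf[i] = x[i];
    }
    for (size_t stride = N / 2; stride > 0; stride /= 2) {
        for (size_t i = 0; i < stride; i++) {
            buf[i] += buf[i + stride];
        }
    }
    return buf[0];
}
```

With N = 8 the sequential version has a dependency chain of 8 additions, the tree version has 3 levels.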

To change the burst single size threshold please add the option:

--attribute burst_single_size_threshold=<size>
# for example
--attribute burst_single_size_threshold=36700160

To change the burst total size threshold please add the option:

--attribute burst_total_size_threshold=<size>
# for example
--attribute burst_total_size_threshold=36700160

You can add include directories with the option -I. For example, we use the following options:

CFLAGS="-I $XILINX_HLS/include" merlincc \
  --attribute burst_total_size_threshold=36700160 \
  --attribute burst_single_size_threshold=36700160 \
  --kernel_frequency 250 \
  -funsafe-math-optimizations \
  --platform=vitis::/opt/xilinx/platforms/xilinx_u200_xdma_201830_2/xilinx_u200_xdma_201830_2.xpfm \
  -I $XILINX_HLS/lnx64/tools/gcc/lib/gcc/x86_64-unknown-linux-gnu/4.6.3/include/ \
  -I $XILINX_HLS/include/ \
  -I /opt/merlin/sources/merlin-compiler/trunk/source-opt/include/apint_include/ \
  -c -o mykernel_merlincc_polyopt --report=estimate

Pragmas

All pragmas are placed above the loop they apply to, unlike in Xilinx Vitis, where loop pragmas go inside the loop body.
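
The placement difference can be sketched as follows. The function names and N are hypothetical, and pairing #pragma ACCEL parallel with #pragma HLS unroll is an assumption made only to show where each pragma goes. A regular C compiler ignores both pragmas, so the two functions behave identically on a CPU.

```c
#define N 64 /* hypothetical array size */

/* Merlin style: the pragma sits above the loop. */
#pragma ACCEL kernel
void scale_merlin(const int in[N], int out[N]) {
#pragma ACCEL parallel factor=4
    for (int i = 0; i < N; i++) {
        out[i] = 3 * in[i];
    }
}

/* Vitis HLS style: the pragma sits inside the loop body. */
void scale_vitis(const int in[N], int out[N]) {
    for (int i = 0; i < N; i++) {
#pragma HLS unroll factor=4
        out[i] = 3 * in[i];
    }
}
```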

Mandatory Pragma

Please add #pragma ACCEL kernel above the top-level kernel function. For example:

#pragma ACCEL kernel
void kernel_bicg(int m, int n, float A[2100][1900], float s[1900], float q[2100], float p[1900], float r[2100])
{
  int i;
  int j;
  for (i = 0; i < 1900; i++) {
    s[i] = 0;
  }
  for (i = 0; i < 2100; i++) {
    q[i] = 0.0;
    for (j = 0; j < 1900; j++) {
      s[j] = s[j] + r[i] * A[i][j];
      q[i] = q[i] + A[i][j] * p[j];
    }
  }
}
}

Hardware directives

Parallel / Unroll

#pragma ACCEL parallel factor=<uf>

If factor is not specified, the loop is fully unrolled.
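
A minimal sketch of the full-unroll case, assuming a small inner loop with a fixed trip count (row_sums, ROWS and COLS are names invented for this example). The second function is a hand-written approximation of what full unrolling means, not code generated by Merlin.

```c
#define ROWS 16 /* hypothetical sizes */
#define COLS 4

/* No factor given: the inner loop is fully unrolled. */
#pragma ACCEL kernel
void row_sums(int a[ROWS][COLS], int out[ROWS]) {
    for (int i = 0; i < ROWS; i++) {
        int acc = 0;
#pragma ACCEL parallel
        for (int j = 0; j < COLS; j++) {
            acc += a[i][j];
        }
        out[i] = acc;
    }
}

/* Hand-unrolled reference: the COLS iterations become COLS parallel operands. */
void row_sums_unrolled(int a[ROWS][COLS], int out[ROWS]) {
    for (int i = 0; i < ROWS; i++) {
        out[i] = a[i][0] + a[i][1] + a[i][2] + a[i][3];
    }
}
```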

Pipeline

#pragma ACCEL pipeline flatten

The loop below the pragma will be pipelined, and the loops nested inside it will be fully unrolled.

Note: #pragma ACCEL pipeline flatten can also be written as #pragma ACCEL pipeline if all the inner loops are already fully unrolled.
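
The note can be sketched with a small matrix-vector product (matvec_flatten, matvec_explicit, ROWS and COLS are hypothetical names). Based on the description above, the two functions should request the same hardware: the first uses pipeline flatten, the second spells out the unrolling of the inner loop with #pragma ACCEL parallel and then uses a plain pipeline.

```c
#define ROWS 32 /* hypothetical sizes */
#define COLS 8

/* Pipeline the i loop; the j loop below it is fully unrolled by "flatten". */
#pragma ACCEL kernel
void matvec_flatten(int a[ROWS][COLS], const int x[COLS], int y[ROWS]) {
#pragma ACCEL pipeline flatten
    for (int i = 0; i < ROWS; i++) {
        int acc = 0;
        for (int j = 0; j < COLS; j++) {
            acc += a[i][j] * x[j];
        }
        y[i] = acc;
    }
}

/* Same intent written explicitly: unroll the inner loop, then pipeline. */
void matvec_explicit(int a[ROWS][COLS], const int x[COLS], int y[ROWS]) {
#pragma ACCEL pipeline
    for (int i = 0; i < ROWS; i++) {
        int acc = 0;
#pragma ACCEL parallel
        for (int j = 0; j < COLS; j++) {
            acc += a[i][j] * x[j];
        }
        y[i] = acc;
    }
}
```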

Tile 1D / Strip mining

#pragma ACCEL tile factor=<tile size>

The loop below the pragma will be strip-mined, and the innermost loop that is created will have a trip count equal to the tile size.
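
Strip mining can be written out by hand as below. This is a sketch with made-up names (add_one, N, TILE) and assumes the trip count is a multiple of the tile size; the second function shows the loop structure described above, not the exact code Merlin emits.

```c
#define N 64   /* hypothetical trip count */
#define TILE 4 /* hypothetical tile size, divides N */

/* Original loop with the tile pragma above it. */
#pragma ACCEL kernel
void add_one(const int in[N], int out[N]) {
#pragma ACCEL tile factor=4
    for (int i = 0; i < N; i++) {
        out[i] = in[i] + 1;
    }
}

/* Hand strip-mined equivalent: the new inner loop has trip count TILE. */
void add_one_strip_mined(const int in[N], int out[N]) {
    for (int io = 0; io < N / TILE; io++) {
        for (int ii = 0; ii < TILE; ii++) {
            int i = io * TILE + ii;
            out[i] = in[i] + 1;
        }
    }
}
```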

Memory communication

#pragma ACCEL cache variable=<name array>

The transfer from off-chip to on-chip memory will be done at the position of the cache pragma.
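
One possible reading of this, sketched in plain C: placing the pragma above the j loop asks for the data used by that loop to be copied on chip at that point, once per iteration of i. The names (row_sum_cache, row_sum_manual, a_buf) are hypothetical, and the assumption that exactly one row is transferred is ours; the actual buffer size is decided by the compiler.

```c
#include <string.h>

#define ROWS 16 /* hypothetical sizes */
#define COLS 32

/* The cache pragma marks where the transfer of "a" should happen. */
#pragma ACCEL kernel
void row_sum_cache(int a[ROWS][COLS], int out[ROWS]) {
    for (int i = 0; i < ROWS; i++) {
        int acc = 0;
#pragma ACCEL cache variable=a
        for (int j = 0; j < COLS; j++) {
            acc += a[i][j];
        }
        out[i] = acc;
    }
}

/* Hand-written approximation: copy one row into a local buffer at that position. */
void row_sum_manual(int a[ROWS][COLS], int out[ROWS]) {
    int a_buf[COLS]; /* stands in for an on-chip buffer */
    for (int i = 0; i < ROWS; i++) {
        int acc = 0;
        memcpy(a_buf, a[i], sizeof a_buf); /* transfer at the pragma position */
        for (int j = 0; j < COLS; j++) {
            acc += a_buf[j];
        }
        out[i] = acc;
    }
}
```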

Double buffer

#pragma ACCEL pipeline

The array will be transferred at the position of the pragma, using double buffering.
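
Double buffering can be pictured with the following CPU-only sketch (all names and sizes are hypothetical, and the hand-written version is an approximation of the idea, not Merlin output). Two buffers alternate roles: while one is being computed on, the next tile is loaded into the other. On hardware the load and the compute overlap in time; in this sequential C version they simply run one after the other.

```c
#include <string.h>

#define TILES 8 /* hypothetical number of tiles */
#define TILE 16 /* hypothetical tile size */

/* The pipeline pragma above the outer loop requests the double-buffered transfer. */
#pragma ACCEL kernel
void scale_tiles(const int in[TILES * TILE], int out[TILES * TILE]) {
#pragma ACCEL pipeline
    for (int t = 0; t < TILES; t++) {
        for (int i = 0; i < TILE; i++) {
            out[t * TILE + i] = 2 * in[t * TILE + i];
        }
    }
}

/* Hand-written ping-pong buffers illustrating the same schedule. */
void scale_tiles_pingpong(const int in[TILES * TILE], int out[TILES * TILE]) {
    int buf[2][TILE];
    memcpy(buf[0], &in[0], sizeof buf[0]); /* prologue: load tile 0 */
    for (int t = 0; t < TILES; t++) {
        int cur = t % 2;
        int nxt = (t + 1) % 2;
        if (t + 1 < TILES) {
            /* prefetch the next tile into the idle buffer */
            memcpy(buf[nxt], &in[(t + 1) * TILE], sizeof buf[nxt]);
        }
        for (int i = 0; i < TILE; i++) {
            out[t * TILE + i] = 2 * buf[cur][i];
        }
    }
}
```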