Venus RISC-V Assembly is the assembly dialect that runs on the Venus RISC-V simulator.
This tutorial is designed to help you convert Venus RISC-V Assembly into assembly for a real chip, the Kendryte 210 (K210). By the end, you should be able to run RISC-V Assembly that you adapted from Venus on a real K210.
If you are interested in writing a compiler that targets RISC-V Assembly, or in writing RISC-V Assembly by hand, you may also want to glance over this tutorial.
Please ensure you are comfortable with (in descending order of importance) the Linux shell, RISC-V Assembly, GDB, GCC, the GNU Toolchain, C/C++ memory allocation, a little bit of operating systems, and CMake.
Since Venus is usually used for educational purposes, this tutorial assumes that you have undergraduate-level knowledge of Computer Architecture and Organization and of Principles of Compilers.
Notice that this tutorial is closer to an experience report than a detailed step-by-step guide, so you will need to do a lot of debugging and development work yourself. Basic knowledge and patience are very important to success.
When I first decided to port Venus assembly to a real chip's assembly, I investigated the available RISC-V development boards. I decided to choose a board that has the potential to run Linux (as of 2020.12, the RISC-V chips on most development boards are too slow to run Linux) but is not too expensive. Thus, the Maix Bit was chosen.
If you are willing to use the same development board as me, please read and complete this tutorial to build a bare metal debugging and development environment. Notice that you need a debugger
to complete the tutorial. The cost of the development board and debugger is about $20.
You may notice that the Maix Bit has a 64-bit architecture, while Venus simulates 32-bit instructions. This means we need some extra work to deal with the OFFSET problem, that is, changing 4 to 8.
(In fact, the cost-performance ratio of 32-bit development boards is not as good as that of the K210, so I still chose the K210 here.)
TL;DR:
- Sipeed Maix Bit is about $12.9
- Sipeed RV Debugger is about $7.6
& complete this tutorial
    [OpenOCD is connecting with debugger]------------------------|
                                                                 |---|
    [minicom/serial communication is connecting with debugger]---|   |
     printf()/scanf()                                                |
                                                                     |
                  modify                                             |
    (Kendryte Standalone SDK Project Test:                           |
     RISC-V Assembly file: Test/test.s)                              |
                  |                                                  |
                  | cmake PROJ=Test                                  |
                  V                                                  |
    1. (test) for debugging on chip & 2. (test.bin)                  |
        |                             for flashing to chip,          |
        |                             NOT used here                  |
        |-------------------------------------------------------->>[GDB]
                        as file input
- Keep OpenOCD and minicom alive.
- Make any RISC-V code changes in `Test/test.s` in the SDK.
- Then, run `cmake` again once you have changed `Test/test.s`.
- The build result `test` is used by GDB.
- Check the output, then go back to editing the RISC-V code in `Test/test.s`.
If you have any questions about the shape of the RISC-V assembly to be generated, the best reference is the code in `k210asms` of this project.
A good starting point for the GNU Assembler: GNU Assembler Examples (although it covers x86-64 asm) and The GNU Assembler.
Be sure to read the GDB Cheat Sheet first.
If you doubt that the register values are correct, you can run your Venus assembly on https://chocopy.org/venus.html and compare the register values in Venus with the register values in GDB.
i r (info registers)
If you need `pc`, the current line, or `sp` (see https://sourceware.org/gdb/current/onlinedocs/gdb/Registers.html):
p/x $pc
x/i $pc
set $sp += 4
To print the current frame:
info frame
To print the stack:
backtrace
To print heap memory (see https://ftp.gnu.org/old-gnu/Manuals/gdb-5.1.1/html_chapter/gdb_9.html#SEC56), use the unit size `g` (giant word, 8 bytes); `h` would only read a 2-byte halfword:
x/xg addr
prints hex as a giant word (64-bit)
x/ug addr
prints unsigned decimal as a giant word (64-bit)
x/dg addr
prints signed decimal as a giant word (64-bit)
It may be a bit annoying to find the line that causes a crash. In my experience, you need to step through the code manually; otherwise the program falls into `trap_entry()` in the `bsp` of the K210 after the crash.
However, I notice that PlatformIO, an IDE for embedded systems, is very complete and mature. Its debugging and development experience may be great and worth a try. Although I did not try it, I hope to see a PlatformIO experience report for this porting project :)
Alternatively, consider using the Kendryte IDE (I am not sure about the quality of the beta version).
Here, we will focus on how to make the RISC-V code in `Test/test.s` run on the K210.
Always check the RISC-V Reference Card.
First, we will deal with the 32->64 conversion issue.
I will assume that you are working in a compiler project, and I will guide you through modifying the program that outputs RISC-V code. In fact, my compiler project is based on the RISC-V version of UCLA CS132.
- If you hardcoded the `OFFSET` of `fp`/`sp` to `4` in the memory allocation part (object, array) of your program, add a `const int OFFSET=8` and change all these `4` to `OFFSET`.
  If you already use an `OFFSET` of `fp`/`sp` in your program, just change it to `8`. Be careful with the locations and interface functions that calculate offsets.
- In most cases, you have some hardcoded offsets in strings, for example `-4(fp)` or `8(sp)`. They should be doubled to `-8(fp)` or `16(sp)`.
- If your compiler project has a program that generates `IR` before the current program, the memory allocation part (object, array) of that IR-generating program should also have an offset of `4`. These offsets should be changed to `8` as in Step 1.
- RISC-V allows mixing loads and stores of different widths. That is, we can use both `lw t0, 8(fp)` and `ld t0, 8(fp)` in 64-bit code.
  Then a mismatch like this may occur:

      ----------16(fp)
      ---------- 8(fp)
      value: 0x0000000000800168
      t0:    0xffffffffffffffff
      ==========================================
      After lw t0, 8(fp), t0: 0xffffffff00800168
      After ld t0, 8(fp), t0: 0x0000000000800168

  (This can be solved by adding `li t0, 0` before `lw t0, 8(fp)`, but elegance is lost.)
  Although this feature is a way to save space, for the simplicity of our implementation, we will NOT use `lw` and `sw` here.
  All `lw` and `sw` should be replaced by `ld` and `sd`. If you use exact match first and then global replacement, there should be no problem.
- If you have some strings declared with `.asciz`, `.string`, etc., change the corresponding `.align 2` to `.align 3`.
  If you see `core dump: misaligned load`, check this part again.
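The purely textual part of these steps (doubling hardcoded offsets and replacing `lw`/`sw` with `ld`/`sd`) can be sketched as a small script. This is only a minimal sketch in Python, not code from this project: the helper name `port_line` is hypothetical, and it assumes that every numeric offset in front of a parenthesised base register is a multiple of the 32-bit word size. It does not touch frame sizes such as `addi sp, sp, -16` or any `.align` directive, so those still need to be checked by hand.

```python
import re

# A memory operand such as "-4(fp)" or "8(sp)": numeric offset + base register.
_OFFSET_RE = re.compile(r"(-?\d+)\(([a-z][a-z0-9]*)\)")
# lw / sw matched as whole words only ("exact match" before global replacement).
_WORD_OP_RE = re.compile(r"\b([ls])w\b")


def port_line(line: str) -> str:
    """Rewrite one line of 32-bit assembly for 64-bit slots (hypothetical helper)."""
    # Leave anything after '#' untouched so comments are not rewritten.
    code, sep, comment = line.partition("#")
    # Double every offset: 4-byte slots become 8-byte slots.
    code = _OFFSET_RE.sub(lambda m: f"{int(m.group(1)) * 2}({m.group(2)})", code)
    # lw -> ld, sw -> sd.
    code = _WORD_OP_RE.sub(lambda m: m.group(1) + "d", code)
    return code + sep + comment


if __name__ == "__main__":
    for src in ["lw t0, 8(fp)", "sw ra, -4(fp)", "slli t0, t0, 2"]:
        print(f"{src:20} -> {port_line(src)}")
```

Running the rewrite on the generated assembly is a stopgap; changing `OFFSET` in the compiler itself, as described above, is the cleaner fix.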
See RISC-V Assembly Programmer's Manual
Notice that many directives in Venus are not supported officially and many official directives are not supported by K210.
- `.equiv` and `.equ` are not supported by K210. Replace every use of these constants with its value, then delete all the definitions.
- For the simplicity of our implementation, errors/exceptions/exit will not be implemented. Just delete all related parts.
- At the beginning of the program, if you have something like:

      .text
      - jal Main
      - li a0, @exit
      - ecall

  just change it to

      .text

- Make sure the main function is called `main`, not `Main`. Also, add an infinite loop before the last line `jr ra` of the `main` function. This prevents `pc` from falling out of the boundary of the program, which is convenient for debugging and burning (pressing the `RESET` button on the Maix Bit reruns the program easily). The end of `main`:

      something
      + .inf_loop:
      + j .inf_loop
      jr ra

- At the beginning of every function (including `main`), there should be:

      .align 1
      .globl your_function_name
      your_function_name:
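If your compiler emits functions from one place, the header and the idle loop for `main` can be added there. The following is a minimal sketch in Python under that assumption; `emit_function` is a hypothetical helper, and it expects `body` to contain the function's instructions without the final `jr ra`.

```python
def emit_function(name, body, is_main=False):
    """Return the assembly lines of one function (hypothetical helper)."""
    # Header required at the beginning of every function.
    lines = [".align 1", f".globl {name}", f"{name}:"]
    lines += body
    if is_main:
        # Park pc in an endless loop so it never runs past the program.
        lines += [".inf_loop:", "j .inf_loop"]
    lines.append("jr ra")
    return lines


if __name__ == "__main__":
    print("\n".join(emit_function("main", ["li a0, 0"], is_main=True)))
```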
Delete the old Util Functions.
Do not use `ecall` for K210.
Note the parameter register and the return register in the comments.
Notice that the functions' prefix is `.`; this follows the convention of the output of `riscv64-unknown-elf-gcc -S`.
Add this part to the end of your program:
.align 1
.globl .print_int
# need save a0, a1 before call
# a1: num -> void
.print_int:
sd fp, -16(sp) # Store old fp
mv fp, sp # Set new fp
addi sp, sp, -16
sd ra, -8(fp)
la a0, .rep_int
call printf
ld t6, _impure_ptr
ld t6, 16(t6) # index from t6 (loaded above); a7 is not preserved across printf
mv a0, t6
call fflush
ld ra, -8(fp) # Restore ra register
ld fp, -16(fp) # Restore old fp
addi sp, sp, 16
jr ra
.align 1
.globl .alloc
# need save a0, a1 before call
# a0: num -> a0: pointer
.alloc:
sd fp, -16(sp) # Store old fp
mv fp, sp # Set new fp
addi sp, sp, -16
sd ra, -8(fp)
# int size is 4, but we are in 64-bit, use 8
li a1, 8
call calloc
ld ra, -8(fp) # Restore ra register
ld fp, -16(fp) # Restore old fp
addi sp, sp, 16
jr ra
.section .rodata
.rep_int:
.string "%d\n"
.align 3
Since the function called in the Util Function may destroy all caller-saved registers, we need to save and restore all `a` registers and `t` registers.
!!! Notice that I did not use `a0`, `a1`, `t6` in register allocation, so I did not save them.
!!! You should save and restore all caller-saved registers you use.
This is just an example:
addi sp, sp, -96
sd a2, 0(sp)
sd a3, 8(sp)
sd a4, 16(sp)
sd a5, 24(sp)
sd a6, 32(sp)
sd a7, 40(sp)
sd t0, 48(sp)
sd t1, 56(sp)
sd t2, 64(sp)
sd t3, 72(sp)
sd t4, 80(sp)
sd t5, 88(sp)
jal .print_int
ld a2, 0(sp)
ld a3, 8(sp)
ld a4, 16(sp)
ld a5, 24(sp)
ld a6, 32(sp)
ld a7, 40(sp)
ld t0, 48(sp)
ld t1, 56(sp)
ld t2, 64(sp)
ld t3, 72(sp)
ld t4, 80(sp)
ld t5, 88(sp)
addi sp, sp, 96
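In a compiler, the save/restore sequence around a Util Function call can be generated from the list of caller-saved registers in use rather than written out by hand. The following is a minimal sketch in Python; `wrap_call` is a hypothetical helper, and rounding the frame up to a multiple of 16 bytes is an assumption taken from the standard RISC-V calling convention (the 12 registers above need exactly 96 bytes).

```python
def wrap_call(target, regs, slot=8, align=16):
    """Return lines that save regs, call target, then restore regs."""
    # Round the frame size up to the alignment (assumption: 16 bytes).
    frame = (slot * len(regs) + align - 1) // align * align
    lines = [f"addi sp, sp, -{frame}"]
    # Save each register in its own 8-byte slot.
    lines += [f"sd {r}, {i * slot}(sp)" for i, r in enumerate(regs)]
    lines.append(f"jal {target}")
    # Restore from the same slots, then release the frame.
    lines += [f"ld {r}, {i * slot}(sp)" for i, r in enumerate(regs)]
    lines.append(f"addi sp, sp, {frame}")
    return lines


# Caller-saved registers used by the register allocator in the example above.
USED = ["a2", "a3", "a4", "a5", "a6", "a7",
        "t0", "t1", "t2", "t3", "t4", "t5"]

if __name__ == "__main__":
    print("\n".join(wrap_call(".print_int", USED)))
```

With `USED` as listed, the generated lines match the hand-written sequence above.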
The same applies when calling `.alloc`: the function called inside it may destroy all caller-saved registers, so save and restore every caller-saved register you use. Again, `a0`, `a1`, `t6` are not saved here because I did not use them in register allocation.
For a Java compiler: in the Java spec, the default value of the elements of `new int[]` is `0`. Thus, `calloc()` is used here instead of `malloc()`.
Notice that there is NO free step for the pointers. If you would like to free the pointers, you need to create a Util Function yourself.
This is just an example:
addi sp, sp, -96
sd a2, 0(sp)
sd a3, 8(sp)
sd a4, 16(sp)
sd a5, 24(sp)
sd a6, 32(sp)
sd a7, 40(sp)
sd t0, 48(sp)
sd t1, 56(sp)
sd t2, 64(sp)
sd t3, 72(sp)
sd t4, 80(sp)
sd t5, 88(sp)
jal .alloc
ld a2, 0(sp)
ld a3, 8(sp)
ld a4, 16(sp)
ld a5, 24(sp)
ld a6, 32(sp)
ld a7, 40(sp)
ld t0, 48(sp)
ld t1, 56(sp)
ld t2, 64(sp)
ld t3, 72(sp)
ld t4, 80(sp)
ld t5, 88(sp)
addi sp, sp, 96
- Write a minimal program similar to `Hello_World` in C.
  Then, add the C function you would like to run in RISC-V to your program.
  Do not write your program in C++, or the compiled product can become very complicated and difficult to understand.
- Use `riscv64-unknown-elf-gcc -S` to compile your C program.
  You will then get the product of compilation, that is, RISC-V assembly.
- By looking for the relationship between the C source code and the RISC-V assembly, we can infer the assembly fragment we need.
  Note the parameter register, the return register, and `lw`/`ld`, `sw`/`sd`.
This project's idea is mainly derived from the RISC-V version of UCLA CS132 Compiler Construction. The target language of the compiler project is Venus RISC-V assembly. This project can verify that the RISC-V assembly output by the modified compiler can run on the RISC-V chip K210.
Copyright (C) 2020 Qingpeng Li
This work is licensed under a Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) License.