This post records how I worked through the RISC-V assembly part of MIT 6.S081 Lab 4 (traps).
1. Assignment
RISC-V assembly (easy)
It will be important to understand a bit of RISC-V assembly, which you were exposed to in 6.004. There is a file user/call.c in your xv6 repo. make fs.img compiles it and also produces a readable assembly version of the program in user/call.asm.
Read the code in call.asm for the functions g, f, and main. The instruction manual for RISC-V is on the reference page. Here are some questions that you should answer (store the answers in a file answers-traps.txt):
- Which registers contain arguments to functions? For example, which register holds 13 in main's call to printf?
- Where is the call to function f in the assembly code for main? Where is the call to g? (Hint: the compiler may inline functions.)
- At what address is the function printf located?
- What value is in the register ra just after the jalr to printf in main?
- Run the following code.
c
unsigned int i = 0x00646c72;
printf("H%x Wo%s", 57616, &i);
What is the output? Here's an ASCII table that maps bytes to characters.
The output depends on the fact that the RISC-V is little-endian. If the RISC-V were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value? Here's a description of little- and big-endian and a more whimsical description.
- In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?
c
printf("x=%d y=%d", 3);
2. Solution
Let's first look at user/call.c:
c
#include "kernel/param.h"
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
int g(int x) {
  return x+3;
}
int f(int x) {
  return g(x);
}
void main(void) {
  printf("%d %d\n", f(8)+1, 13);
  exit(0);
}
We build it with the following command, which also produces call.asm:
c
make fs.img
The resulting call.asm:
c
0000000000000000 <g>:
#include "kernel/param.h"
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
int g(int x) {
0: 1141 addi sp,sp,-16
2: e422 sd s0,8(sp)
4: 0800 addi s0,sp,16
return x+3;
}
6: 250d addiw a0,a0,3
8: 6422 ld s0,8(sp)
a: 0141 addi sp,sp,16
c: 8082 ret
000000000000000e <f>:
int f(int x) {
e: 1141 addi sp,sp,-16
10: e422 sd s0,8(sp)
12: 0800 addi s0,sp,16
return g(x);
}
14: 250d addiw a0,a0,3
16: 6422 ld s0,8(sp)
18: 0141 addi sp,sp,16
1a: 8082 ret
000000000000001c <main>:
void main(void) {
1c: 1141 addi sp,sp,-16
1e: e406 sd ra,8(sp)
20: e022 sd s0,0(sp)
22: 0800 addi s0,sp,16
printf("%d %d\n", f(8)+1, 13);
24: 4635 li a2,13
26: 45b1 li a1,12
28: 00000517 auipc a0,0x0
2c: 7c850513 addi a0,a0,1992 # 7f0 <malloc+0xe6>
30: 00000097 auipc ra,0x0
34: 61a080e7 jalr 1562(ra) # 64a <printf>
exit(0);
38: 4501 li a0,0
3a: 00000097 auipc ra,0x0
3e: 298080e7 jalr 664(ra) # 2d2 <exit>
- Which registers contain arguments to functions? For example, which register holds 13 in main's call to printf?
In RISC-V, function arguments are passed in registers a0 through a7. In call.asm this call to printf uses a0, a1 and a2, and 13 is placed in a2.
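As a rough illustration of the integer calling convention, the Python sketch below assigns arguments to a0 through a7 in order. The helper name assign_arg_registers is made up for this example; 0x7f0 is the format-string address that call.asm loads into a0.

```python
# Sketch only: models how integer/pointer arguments map to a0..a7.
ARG_REGISTERS = [f"a{n}" for n in range(8)]

def assign_arg_registers(args):
    """Pair each argument with its register (hypothetical helper)."""
    if len(args) > len(ARG_REGISTERS):
        raise ValueError("further arguments would be passed on the stack")
    return dict(zip(ARG_REGISTERS, args))

# main's call after inlining: printf(fmt, 12, 13), with fmt at 0x7f0.
regs = assign_arg_registers([0x7F0, 12, 13])
for name, value in regs.items():
    print(f"{name} = {value:#x}")
```

The third argument, 13, lands in a2, which matches the li a2,13 in the listing.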
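Where is the call to function f in the assembly code for main? Where is the call to g? (Hint: the compiler may inline functions.)
There is no call to f or g in main. The compiler inlined both and evaluated f(8)+1 at compile time, which gives 12, so main simply loads the constant (g is likewise inlined into f, whose body is just addiw a0,a0,3):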
c
26: 45b1 li a1,12
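To see where the 12 comes from, here is the same arithmetic written out in Python. It only mirrors the value the compiler arrived at; it does not model how the compiler performs inlining or constant folding.

```python
def g(x):
    return x + 3

def f(x):
    return g(x)

# f(8) + 1 = (8 + 3) + 1
folded = f(8) + 1
print(folded)
```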
At what address is the function printf located?
0x64a.
One way to find it is to look up printf in call.asm:
c
000000000000064a <printf>:
void
printf(const char *fmt, ...)
...
Another way is to look at the assembly for main:
c
30: 00000097 auipc ra,0x0
34: 61a080e7 jalr 1562(ra) # 64a <printf>
The comment after # also points to 0x64a. How is it computed? Start with auipc ra,0x0; riscv-spec.pdf says:
AUIPC (add upper immediate to pc) is used to build pc-relative addresses and uses the U-type format. AUIPC forms a 32-bit offset from the 20-bit U-immediate, filling in the lowest 12 bits with zeros, adds this offset to the address of the AUIPC instruction, then places the result in register rd.
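So the execution of AUIPC boils down to:
- Take the 20-bit U-immediate from the instruction and shift it left by 12 bits (the low 12 bits are zero) to form a 32-bit offset;
- Compute "address of the AUIPC instruction + 32-bit offset";
- Write the result to the destination register rd.
In other words, auipc rd, imm computes: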
c
rd = PC + (imm << 12)
Here auipc ra,0x0 gives ra = PC + (0 << 12) = 0x30 (the PC is the address in the first column of the listing, 0x30). For the next instruction, jalr 1562(ra), the manual says:
The indirect jump instruction JALR (jump and link register) uses the I-type encoding. The target address is obtained by adding the sign-extended 12-bit I-immediate to the register rs1, then setting the least-significant bit of the result to zero. The address of the instruction following the jump (pc+4) is written to register rd. Register x0 can be used as the destination if the result is not required.
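The jump target is ra (0x30 here) plus the sign-extended 12-bit immediate (1562 here); the lowest bit of the sum is then cleared so that the target is aligned to a 2-byte boundary: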
c
0x30+1562=0x30+0x61a=0x64a
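The same calculation can be sketched in Python by decoding the immediates straight from the instruction words in the listing. The function names are my own, and the sketch only covers the two instructions discussed here, not a full decoder.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

def auipc(pc, insn):
    """rd = pc + (U-immediate << 12); the U-immediate is bits 31:12."""
    offset = sign_extend(insn & 0xFFFFF000, 32)
    return (pc + offset) & MASK64

def jalr_target(rs1_value, insn):
    """target = rs1 + sign-extended I-immediate (bits 31:20), bit 0 cleared."""
    imm12 = sign_extend(insn >> 20, 12)
    return ((rs1_value + imm12) & ~1) & MASK64

# 30: 00000097 auipc ra,0x0   /   34: 61a080e7 jalr 1562(ra)
ra = auipc(0x30, 0x00000097)
printf_addr = jalr_target(ra, 0x61A080E7)

# 3a: 00000097 auipc ra,0x0   /   3e: 298080e7 jalr 664(ra)
exit_addr = jalr_target(auipc(0x3A, 0x00000097), 0x298080E7)

print(hex(ra), hex(printf_addr), hex(exit_addr))
```

Running the same functions on the auipc/jalr pair for exit gives 0x2d2, which agrees with the # 2d2 <exit> comment in call.asm.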
What value is in the register ra just after the jalr to printf in main?
As quoted above: the address of the instruction following the jump (pc+4) is written to register rd.
c
34: 61a080e7 jalr 1562(ra) # 64a <printf>
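The jalr instruction itself is at address 0x34, so right after the jump to printf, ra holds pc+4 = 0x38, the address of the li a0,0 that follows.
Checking ra in gdb confirms that it is 0x38.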
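A small Python sketch of why the step is 4 here: in the standard RISC-V encoding, an instruction whose two lowest bits are 11 is 32 bits long, while other values mark a 16-bit compressed instruction. The helper names are hypothetical and longer encodings are ignored.

```python
def instruction_length(insn):
    """Length in bytes, covering only the standard 16- and 32-bit encodings."""
    return 4 if (insn & 0b11) == 0b11 else 2

def link_address(pc, insn):
    """Address of the instruction after the one at pc."""
    return pc + instruction_length(insn)

# 34: 61a080e7 jalr 1562(ra)
ra_after_call = link_address(0x34, 0x61A080E7)
# 38: 4501 li a0,0 is a compressed instruction, so the next one is at 0x3a
after_li = link_address(0x38, 0x4501)

print(hex(ra_after_call), hex(after_li))
```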

Run the following code. What is the output?
c
unsigned int i = 0x00646c72;
printf("H%x Wo%s", 57616, &i);
We put this code into call.c and run it; the output is HE110 World.

%x prints its argument in hexadecimal, and 57616 in hexadecimal is E110, which a quick check in Python confirms.
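A check along these lines is enough (a minimal sketch; uppercase formatting is chosen only to match the output observed above):

```python
value = 57616
upper = format(value, "X")  # hexadecimal digits, uppercase
print("H" + upper)
```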

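%s prints its argument as a string. RISC-V as used by xv6 is little-endian, so 0x00646c72 is stored in memory as the bytes 0x72, 0x6c, 0x64, 0x00. The first three are the characters r, l and d, and the last is the terminating \0; this can also be verified with Python.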

gdb shows the same thing: i is at address 0x2fcc, and the byte stored there is 0x72 (r).
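The byte layout can be reproduced in Python with int.to_bytes; this is a sketch of the memory image only, not of xv6's printf.

```python
i = 0x00646C72

# Lay the 32-bit value out in memory order, lowest address first.
stored = i.to_bytes(4, byteorder="little")

# %s reads bytes until the first NUL.
text = stored.split(b"\x00", 1)[0].decode("ascii")

print([hex(b) for b in stored], repr(text))
```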

If the RISC-V were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?
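On a big-endian machine the bytes of i would have to be reversed: i = 0x726c6400. 57616 does not need to change, because byte order affects only how an integer is laid out in memory, not its numeric value.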
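The big-endian value can be derived from the bytes that %s needs to see in memory. A minimal sketch using int.from_bytes:

```python
wanted = b"rld\x00"  # bytes in memory order, NUL-terminated

i_big = int.from_bytes(wanted, byteorder="big")
i_little = int.from_bytes(wanted, byteorder="little")

print(hex(i_big), hex(i_little))
```

The little-endian result is the original 0x00646c72, and the big-endian result is the value to use instead.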
In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?
c
printf("x=%d y=%d", 3);
First put it into call.c and run it; in this run it prints y=5253.
c
// user/call.c
#include "kernel/param.h"
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"
int g(int x) {
  return x+3;
}
int f(int x) {
  return g(x);
}
void main(void) {
  //printf("%d %d\n", f(8)+1, 13);
  //unsigned int i = 0x00646c72;
  //printf("H%x Wo%s", 57616, &i);
  printf("x=%d y=%d", 3);
  exit(0);
}

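The arguments are passed in registers: a0 holds the address of the format string and a1 holds 3. No third argument was supplied, so the caller never sets a2, and it still holds whatever value earlier code left there. printf reads it anyway for the second %d.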

Printing a0, a1 and a2 in gdb shows that a2 holds 5253 at the time of the call, which is exactly the value printed after y=.
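The effect can be imitated with a toy model in Python. This is only an illustration under simplifying assumptions: toy_printf is a made-up function that treats each %d as "read the next argument register", and the stale value 5253 is copied from the run above.

```python
def toy_printf(fmt, regs):
    """Toy model: each %d consumes the next argument register after a0."""
    out = []
    next_reg = 1  # a0 holds the format string itself
    pos = 0
    while pos < len(fmt):
        if fmt.startswith("%d", pos):
            out.append(str(regs[f"a{next_reg}"]))
            next_reg += 1
            pos += 2
        else:
            out.append(fmt[pos])
            pos += 1
    return "".join(out)

# The caller sets a0 and a1 only; a2 keeps a leftover value (5253 in this run).
regs = {"a0": "x=%d y=%d", "a1": 3, "a2": 5253}
result = toy_printf(regs["a0"], regs)
print(result)
```

Changing the leftover value in a2 changes what is printed after y=, which is why the answer is not a specific value.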

For a more detailed walk-through of the assembly, see the Zhihu article "MIT 6.S081 2020 LAB4记录" by 炼金术士, which explains it clearly.
Done!