[MIT-OS6.S081 Assignment 4-1] Lab4 traps: RISC-V assembly

This post records how I worked through the RISC-V assembly part of MIT-OS6.S081 Lab4 traps.

Contents

1. Assignment
- RISC-V assembly (easy)
2. Answers

1. Assignment

RISC-V assembly (easy)

It will be important to understand a bit of RISC-V assembly, which you were exposed to in 6.004. There is a file user/call.c in your xv6 repo. make fs.img compiles it and also produces a readable assembly version of the program in user/call.asm.

Read the code in call.asm for the functions g, f, and main. The instruction manual for RISC-V is on the reference page. Here are some questions that you should answer (store the answers in a file answers-traps.txt):

Which registers contain arguments to functions? For example, which register holds 13 in main's call to printf?

Where is the call to function f in the assembly code for main? Where is the call to g? (Hint: the compiler may inline functions.)

At what address is the function printf located?

What value is in the register ra just after the jalr to printf in main?

Run the following code.

	unsigned int i = 0x00646c72;
	printf("H%x Wo%s", 57616, &i);

What is the output? Here's an ASCII table that maps bytes to characters.

The output depends on that fact that the RISC-V is little-endian. If the RISC-V were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value? Here's a description of little- and big-endian and a more whimsical description.

In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?

	printf("x=%d y=%d", 3);

2. Answers

Let's first look at user/call.c:

#include "kernel/param.h"
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int g(int x) {
  return x+3;
}

int f(int x) {
  return g(x);
}

void main(void) {
  printf("%d %d\n", f(8)+1, 13);
  exit(0);
}

Building with the following command generates user/call.asm:

make fs.img

The resulting call.asm:

0000000000000000 <g>:
#include "kernel/param.h"
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int g(int x) {
   0:	1141                	addi	sp,sp,-16
   2:	e422                	sd	s0,8(sp)
   4:	0800                	addi	s0,sp,16
  return x+3;
}
   6:	250d                	addiw	a0,a0,3
   8:	6422                	ld	s0,8(sp)
   a:	0141                	addi	sp,sp,16
   c:	8082                	ret

000000000000000e <f>:

int f(int x) {
   e:	1141                	addi	sp,sp,-16
  10:	e422                	sd	s0,8(sp)
  12:	0800                	addi	s0,sp,16
  return g(x);
}
  14:	250d                	addiw	a0,a0,3
  16:	6422                	ld	s0,8(sp)
  18:	0141                	addi	sp,sp,16
  1a:	8082                	ret

000000000000001c <main>:

void main(void) {
  1c:	1141                	addi	sp,sp,-16
  1e:	e406                	sd	ra,8(sp)
  20:	e022                	sd	s0,0(sp)
  22:	0800                	addi	s0,sp,16
  printf("%d %d\n", f(8)+1, 13);
  24:	4635                	li	a2,13
  26:	45b1                	li	a1,12
  28:	00000517          	auipc	a0,0x0
  2c:	7c850513          	addi	a0,a0,1992 # 7f0 <malloc+0xe6>
  30:	00000097          	auipc	ra,0x0
  34:	61a080e7          	jalr	1562(ra) # 64a <printf>
  exit(0);
  38:	4501                	li	a0,0
  3a:	00000097          	auipc	ra,0x0
  3e:	298080e7          	jalr	664(ra) # 2d2 <exit>

Which registers contain arguments to functions? For example, which register holds 13 in main's call to printf?

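Function arguments are passed in registers; the RISC-V calling convention uses a0 through a7 for integer arguments. In call.asm, main's call to printf uses a0, a1 and a2, and 13 is placed in a2.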

Where is the call to function f in the assembly code for main? Where is the call to g? (Hint: the compiler may inline functions.)

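There is no call to f or g in main. The compiler inlined both functions (f itself contains g's addiw a0,a0,3 rather than a call) and evaluated f(8)+1 = 12 at compile time, so the result appears as an immediate: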

26:	45b1                	li	a1,12
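
As a quick sanity check on the folded constant, g and f can be re-stated in Python. This is only a sketch that mirrors the C source in user/call.c; it says nothing about how the compiler actually performs the inlining.

```python
# Python re-statement of g and f from user/call.c, used only to
# check the constant that ends up in `li a1,12`.
def g(x):
    return x + 3

def f(x):
    return g(x)

folded = f(8) + 1
print(folded)  # 12
```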

At what address is the function printf located?

0x64a.

One way to find the answer is to look up printf's address in call.asm:

000000000000064a <printf>:

void
printf(const char *fmt, ...)
...

Another way is to look at the call site in main:

  30:   00000097                auipc   ra,0x0
  34:   61a080e7                jalr    1562(ra) # 64a <printf>

The comment after # also gives 0x64a. To see how that value is computed, start with auipc ra,0x0. From riscv-spec.pdf:

AUIPC (add upper immediate to pc) is used to build pc-relative addresses and uses the U-type format. AUIPC forms a 32-bit offset from the 20-bit U-immediate, filling in the lowest 12 bits with zeros, adds this offset to the address of the AUIPC instruction, then places the result in register rd.

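So AUIPC executes in three steps:

- Take the 20-bit U-immediate from the instruction and shift it left by 12 bits (the low 12 bits become 0) to form a 32-bit offset.
- Compute "address of the AUIPC instruction + 32-bit offset".
- Write the result to the destination register rd.

In other words, auipc rd, imm computes:

rd = PC + (imm << 12)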
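
The formula can be tried out with a small Python model. This is a minimal sketch, assuming RV64 (64-bit registers); the function name `auipc` and its parameters are made up for illustration.

```python
def auipc(pc, imm20):
    """Model of AUIPC on RV64: rd = pc + sign_extend(imm20 << 12)."""
    offset = (imm20 & 0xFFFFF) << 12
    if offset & 0x80000000:      # bit 31 set: the 32-bit offset is negative
        offset -= 1 << 32
    return (pc + offset) & 0xFFFFFFFFFFFFFFFF

print(hex(auipc(0x30, 0x0)))  # 0x30
print(hex(auipc(0x30, 0x1)))  # 0x1030
```

With an immediate of 0, as in main, the instruction simply copies its own address into rd.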

Here auipc ra,0x0 gives ra = PC + (0 << 12) = 0x30 (the PC is the address in the first column, 0x30). The next instruction is jalr 1562(ra), which the manual describes as follows:

The indirect jump instruction JALR (jump and link register) uses the I-type encoding. The target address is obtained by adding the sign-extended 12-bit I-immediate to the register rs1, then setting the least-significant bit of the result to zero. The address of the instruction following the jump (pc+4) is written to register rd. Register x0 can be used as the destination if the result is not required.

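The jump target is ra (0x30 here) plus the sign-extended 12-bit immediate (1562 here), after which the least-significant bit is cleared so that the target is 2-byte aligned: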

0x30+1562=0x30+0x61a=0x64a
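
The same arithmetic can be reproduced from the raw instruction words in call.asm. The sketch below is illustrative only: the helper names `sign_extend` and `call_target` are hypothetical, and it assumes the auipc and jalr use the same register, as they do in main.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def call_target(auipc_pc, auipc_word, jalr_word):
    """Jump target of an auipc/jalr pair that share one register."""
    upper = sign_extend(auipc_word & 0xFFFFF000, 32)  # U-immediate, already in bits 31:12
    lower = sign_extend(jalr_word >> 20, 12)          # I-immediate from bits 31:20
    return (auipc_pc + upper + lower) & ~1            # jalr clears bit 0

print(hex(call_target(0x30, 0x00000097, 0x61A080E7)))  # 0x64a (printf)
print(hex(call_target(0x3A, 0x00000097, 0x298080E7)))  # 0x2d2 (exit)
```

The second call uses the auipc/jalr pair at 0x3a and 0x3e and lands on 0x2d2, matching the `# 2d2 <exit>` annotation in the listing.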

What value is in the register ra just after the jalr to printf in main?

As quoted above: The address of the instruction following the jump (pc+4) is written to register rd.

34:   61a080e7                jalr    1562(ra) # 64a <printf>

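The jalr instruction itself is at address 0x34 (printf is at 0x64a), so right after the jalr executes, ra = pc + 4 = 0x38, which is the address of the next instruction in main (li a0,0).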
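
The "+4" holds because this jalr is a full 32-bit instruction, not a 16-bit compressed one. A small sketch of that check follows; `instr_length` is a hypothetical helper that only distinguishes the standard 16-bit and 32-bit encodings and ignores the longer formats the specification reserves.

```python
def instr_length(first_halfword):
    """Length in bytes of a RISC-V instruction (16- or 32-bit encodings only)."""
    return 4 if (first_halfword & 0b11) == 0b11 else 2

jalr_pc = 0x34
jalr_word = 0x61A080E7
assert instr_length(jalr_word & 0xFFFF) == 4   # jalr here is not compressed
ra_after = jalr_pc + 4                         # link address written to ra
print(hex(ra_after))  # 0x38
```

By the same rule, the `4501` at 0x38 (li a0,0) is a 2-byte compressed instruction, which is why the listing continues at 0x3a.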

Checking in gdb confirms that ra is indeed 0x38.

Run the following code. What is the output?

	unsigned int i = 0x00646c72;
	printf("H%x Wo%s", 57616, &i);

Putting this code into call.c and running it prints HE110 World.

%x formats its argument in hexadecimal, and 57616 in hexadecimal is E110 (checked in Python).
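
A check along these lines reproduces the conversion (a reconstruction, not necessarily the exact session used for this post):

```python
value = 57616
print(hex(value))          # 0xe110
print(format(value, "X"))  # E110
```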

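%s prints its argument as a string. xv6 on RISC-V is little-endian, so 0x00646c72 is stored in memory as the bytes 0x72, 0x6c, 0x64, 0x00. The first three bytes are the characters r, l, d, and the last is the string terminator \0 (also checked in Python).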
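
The byte order can be illustrated with Python's struct module. This is a sketch of the memory layout only; the variable names are chosen for illustration.

```python
import struct

i = 0x00646C72
raw = struct.pack("<I", i)   # "<I": little-endian unsigned 32-bit
print(raw)                   # b'rld\x00'

# %s reads bytes up to the first NUL, so only "rld" is printed.
text = raw.split(b"\x00")[0].decode("ascii")
print("Wo" + text)           # World
```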

This can also be checked in gdb: i is at address 0x2fcc, and the byte at that address is 0x72 (r).

If the RISC-V were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

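On a big-endian machine the bytes of i would have to be reversed: 0x726c6400. 57616 does not need to change, because byte order does not affect the numeric value of an integer, only how it is laid out in memory.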
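
One way to derive the big-endian constant is to start from the bytes that %s must find in memory and read them back under each byte order. A minimal sketch, again using struct:

```python
import struct

target = b"rld\x00"   # bytes that must sit at &i for %s to print "rld"

little_endian_i = struct.unpack("<I", target)[0]
big_endian_i = struct.unpack(">I", target)[0]

print(hex(little_endian_i))  # 0x646c72
print(hex(big_endian_i))     # 0x726c6400
```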

In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?

printf("x=%d y=%d", 3);

Trying it in call.c first, the output shows y=5253.

// user/call.c
#include "kernel/param.h"
#include "kernel/types.h"
#include "kernel/stat.h"
#include "user/user.h"

int g(int x) {
  return x+3;
}

int f(int x) {
  return g(x);
}

void main(void) {
  //printf("%d %d\n", f(8)+1, 13);

  //unsigned int i = 0x00646c72;
  //printf("H%x Wo%s", 57616, &i);

  printf("x=%d y=%d", 3);
  exit(0);
}

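The arguments are passed in registers, not pushed on the stack by the caller: a0 holds the format string and a1 holds 3. Nothing is passed for the second %d, so a2 is never set for this call and printf ends up reading whatever stale value earlier code left there.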

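Printing a0, a1 and a2 in gdb shows that a2 is what printf consumes as the missing argument, which is why 5253 is printed. The value is not fixed; it depends on what ran before the call.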
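
The mechanism can be mimicked with a toy model in Python. This is not how xv6's printf is implemented; `toy_printf` and the `regs` dictionary are hypothetical and only illustrate that each %d consumes the next argument register whether or not the caller set it.

```python
def toy_printf(fmt, regs):
    """Toy model: each %d takes its value from the next argument register."""
    arg_regs = iter(["a1", "a2", "a3", "a4", "a5", "a6", "a7"])
    out = []
    pos = 0
    while pos < len(fmt):
        if fmt.startswith("%d", pos):
            out.append(str(regs[next(arg_regs)]))
            pos += 2
        else:
            out.append(fmt[pos])
            pos += 1
    return "".join(out)

# a1 is set by the caller; a2 is never written for this call, so it still
# holds a leftover value (5253 in the run described above).
regs = {"a0": "x=%d y=%d", "a1": 3, "a2": 5253}
print(toy_printf(regs["a0"], regs))  # x=3 y=5253
```

Changing the leftover value in `regs["a2"]` changes what follows `y=`, which is the sense in which the answer is "not a specific value".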

For a more detailed walk-through of the assembly, see the Zhihu article "MIT 6.S081 2020 LAB4记录" by 炼金术士, which explains it quite clearly.

Done!