[MIT 6.S081 OS, Assignment 4.3] Lab4-traps-Alarm

This post documents my implementation of the Alarm exercise in MIT 6.S081 Lab 4 (traps).

Contents

- 1. Assignment requirements
  - Alarm (hard)
    - test0: invoke handler
    - test1/test2(): resume interrupted code
- 2. Implementation
  - 2.1 Code
    - 2.1.1 test0
    - 2.1.2 test1/test2

1. Assignment requirements

Alarm (hard)

In this exercise you'll add a feature to xv6 that periodically alerts a process as it uses CPU time. This might be useful for compute-bound processes that want to limit how much CPU time they chew up, or for processes that want to compute but also want to take some periodic action. More generally, you'll be implementing a primitive form of user-level interrupt/fault handlers; you could use something similar to handle page faults in the application, for example. Your solution is correct if it passes alarmtest and usertests.

You should add a new sigalarm(interval, handler) system call. If an application calls sigalarm(n, fn), then after every n "ticks" of CPU time that the program consumes, the kernel should cause application function fn to be called. When fn returns, the application should resume where it left off. A tick is a fairly arbitrary unit of time in xv6, determined by how often a hardware timer generates interrupts. If an application calls sigalarm(0, 0), the kernel should stop generating periodic alarm calls.

You'll find a file user/alarmtest.c in your xv6 repository. Add it to the Makefile. It won't compile correctly until you've added sigalarm and sigreturn system calls (see below).
alarmtest calls sigalarm(2, periodic) in test0 to ask the kernel to force a call to periodic() every 2 ticks, and then spins for a while. You can see the assembly code for alarmtest in user/alarmtest.asm, which may be handy for debugging. Your solution is correct when alarmtest produces output like this and usertests also runs correctly:

$ alarmtest
test0 start
........alarm!
test0 passed
test1 start
...alarm!
..alarm!
...alarm!
..alarm!
...alarm!
..alarm!
...alarm!
..alarm!
...alarm!
..alarm!
test1 passed
test2 start
................alarm!
test2 passed
$ usertests
...
ALL TESTS PASSED
$

When you're done, your solution will be only a few lines of code, but it may be tricky to get it right. We'll test your code with the version of alarmtest.c in the original repository. You can modify alarmtest.c to help you debug, but make sure the original alarmtest says that all the tests pass.

test0: invoke handler

Get started by modifying the kernel to jump to the alarm handler in user space, which will cause test0 to print "alarm!". Don't worry yet what happens after the "alarm!" output; it's OK for now if your program crashes after printing "alarm!". Here are some hints:

You'll need to modify the Makefile to cause alarmtest.c to be compiled as an xv6 user program.

The right declarations to put in user/user.h are:

int sigalarm(int ticks, void (*handler)());
int sigreturn(void);

Update user/usys.pl (which generates user/usys.S), kernel/syscall.h, and kernel/syscall.c to allow alarmtest to invoke the sigalarm and sigreturn system calls.

For now, your sys_sigreturn should just return zero.

Your sys_sigalarm() should store the alarm interval and the pointer to the handler function in new fields in the proc structure (in kernel/proc.h).

You'll need to keep track of how many ticks have passed since the last call (or are left until the next call) to a process's alarm handler; you'll need a new field in struct proc for this too. You can initialize proc fields in allocproc() in proc.c.

Every tick, the hardware clock forces an interrupt, which is handled in usertrap() in kernel/trap.c.

You only want to manipulate a process's alarm ticks if there's a timer interrupt; you want something like

if(which_dev == 2) ...

Only invoke the alarm function if the process has a timer outstanding. Note that the address of the user's alarm function might be 0 (e.g., in user/alarmtest.asm, periodic is at address 0).

You'll need to modify usertrap() so that when a process's alarm interval expires, the user process executes the handler function. When a trap on the RISC-V returns to user space, what determines the instruction address at which user-space code resumes execution?

It will be easier to look at traps with gdb if you tell qemu to use only one CPU, which you can do by running

make CPUS=1 qemu-gdb

You've succeeded if alarmtest prints "alarm!".

test1/test2(): resume interrupted code

Chances are that alarmtest crashes in test0 or test1 after it prints "alarm!", or that alarmtest (eventually) prints "test1 failed", or that alarmtest exits without printing "test1 passed". To fix this, you must ensure that, when the alarm handler is done, control returns to the instruction at which the user program was originally interrupted by the timer interrupt. You must ensure that the register contents are restored to the values they held at the time of the interrupt, so that the user program can continue undisturbed after the alarm. Finally, you should "re-arm" the alarm counter after each time it goes off, so that the handler is called periodically.

As a starting point, we've made a design decision for you: user alarm handlers are required to call the sigreturn system call when they have finished. Have a look at periodic in alarmtest.c for an example. This means that you can add code to usertrap and sys_sigreturn that cooperate to cause the user process to resume properly after it has handled the alarm.

Some hints:

Your solution will require you to save and restore registers---what registers do you need to save and restore to resume the interrupted code correctly? (Hint: it will be many).

Have usertrap save enough state in struct proc when the timer goes off that sigreturn can correctly return to the interrupted user code.

Prevent re-entrant calls to the handler----if a handler hasn't returned yet, the kernel shouldn't call it again. test2 tests this.
Once you pass test0, test1, and test2 run usertests to make sure you didn't break any other parts of the kernel.

2. Implementation

2.1 Code

2.1.1 test0

You'll need to modify the Makefile to cause alarmtest.c to be compiled as an xv6 user program.

Append $U/_alarmtest to the UPROGS list in the Makefile (last line below):

UPROGS=\
	$U/_cat\
	$U/_echo\
	$U/_forktest\
	$U/_grep\
	$U/_init\
	$U/_kill\
	$U/_ln\
	$U/_ls\
	$U/_mkdir\
	$U/_rm\
	$U/_sh\
	$U/_stressfs\
	$U/_usertests\
	$U/_grind\
	$U/_wc\
	$U/_zombie\
	$U/_alarmtest\

The right declarations to put in user/user.h are:

int sigalarm(int ticks, void (*handler)());
int sigreturn(void);

Add the declarations above to user/user.h:

// system calls
...
int sleep(int);
int uptime(void);
int sigalarm(int, void (*)()); // add this line
int sigreturn(void); // add this line

Update user/usys.pl (which generates user/usys.S), kernel/syscall.h, and kernel/syscall.c to allow alarmtest to invoke the sigalarm and sigreturn system calls.

Open user/usys.pl and add:

...
entry("chdir");
entry("dup");
entry("getpid");
entry("sbrk");
entry("sleep");
entry("uptime");
entry("sigalarm");  # add this line
entry("sigreturn"); # add this line

Open kernel/syscall.h and add:

...
#define SYS_mknod  17
#define SYS_unlink 18
#define SYS_link   19
#define SYS_mkdir  20
#define SYS_close  21
#define SYS_sigalarm 22 // add this line
#define SYS_sigreturn 23 // add this line

Open kernel/syscall.c and add:

extern uint64 sys_wait(void);
extern uint64 sys_write(void);
extern uint64 sys_uptime(void);
extern uint64 sys_sigalarm(void); // add this line
extern uint64 sys_sigreturn(void); // add this line

static uint64 (*syscalls[])(void) = {
...
[SYS_sleep]   sys_sleep,
[SYS_uptime]  sys_uptime,
[SYS_open]    sys_open,
[SYS_write]   sys_write,
[SYS_mknod]   sys_mknod,
[SYS_unlink]  sys_unlink,
[SYS_link]    sys_link,
[SYS_mkdir]   sys_mkdir,
[SYS_close]   sys_close,
[SYS_sigalarm] sys_sigalarm, // add this line
[SYS_sigreturn] sys_sigreturn, // add this line
};
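
To make the role of this table concrete, here is a minimal user-space sketch of number-based dispatch. It is not xv6 code: mini_trapframe, demo_syscall, the demo_* handlers and the NSYS bound are hypothetical stand-ins, and only the two syscall numbers come from the text above. The sketch uses the standard C designated-initializer form `[index] = value`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Syscall numbers as defined in kernel/syscall.h above; NSYS is an assumed bound.
enum { SYS_sigalarm = 22, SYS_sigreturn = 23, NSYS = 24 };

// Hypothetical cut-down trapframe: a7 carries the syscall number,
// a0 receives the return value.
struct mini_trapframe {
  uint64_t a0;
  uint64_t a7;
};

static uint64_t demo_sigalarm(void)  { return 0; }
static uint64_t demo_sigreturn(void) { return 0; }

// Table indexed by syscall number; slots not listed are NULL.
static uint64_t (*demo_syscalls[NSYS])(void) = {
  [SYS_sigalarm]  = demo_sigalarm,
  [SYS_sigreturn] = demo_sigreturn,
};

// Dispatch on a7 and store the result in a0 (all-ones, i.e. -1, if unknown).
static void demo_syscall(struct mini_trapframe *tf)
{
  assert(tf != NULL);
  uint64_t num = tf->a7;
  if (num < NSYS && demo_syscalls[num] != NULL)
    tf->a0 = demo_syscalls[num]();
  else
    tf->a0 = (uint64_t)-1;
}
```

Because the dispatcher overwrites a0 with whatever the handler returns, a handler's return value is what user code sees in a0 once the call completes.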

For now, your sys_sigreturn should just return zero.

Add to kernel/sysproc.c:

uint64
sys_sigreturn(void)
{
  return 0;
}

Your sys_sigalarm() should store the alarm interval and the pointer to the handler function in new fields in the proc structure (in kernel/proc.h).

sys_sigalarm has to store the alarm interval and the handler pointer in the process. First add the fields to struct proc in kernel/proc.h:

struct proc {
  struct spinlock lock;

  // p->lock must be held when using these:
  enum procstate state;        // Process state
  struct proc *parent;         // Parent process
  void *chan;                  // If non-zero, sleeping on chan
  int killed;                  // If non-zero, have been killed
  int xstate;                  // Exit status to be returned to parent's wait
  int pid;                     // Process ID

  // these are private to the process, so p->lock need not be held.
  uint64 kstack;               // Virtual address of kernel stack
  uint64 sz;                   // Size of process memory (bytes)
  pagetable_t pagetable;       // User page table
  struct trapframe *trapframe; // data page for trampoline.S
  struct context context;      // swtch() here to run process
  struct file *ofile[NOFILE];  // Open files
  struct inode *cwd;           // Current directory
  char name[16];               // Process name (debugging)
  int sigalarmTicksTarget;     // add this line: alarm interval in ticks
  void (*sigalarmHandler)();   // add this line: user handler
};

Then add sys_sigalarm to kernel/sysproc.c:

uint64
sys_sigalarm(void)
{
  int ticks;
  uint64 handler;
  if (argint(0, &ticks) < 0)
    return -1;
  if (argaddr(1, &handler) < 0)
    return -1;
  struct proc* p  = myproc();
  p->sigalarmTicksTarget = ticks;
  p->sigalarmHandler = (void(*)())handler;
  return 0;
}

Both argint and argaddr call static uint64 argraw(int n):

static uint64
argraw(int n)
{
  struct proc *p = myproc();
  switch (n) {
  case 0:
    return p->trapframe->a0;
  case 1:
    return p->trapframe->a1;
  case 2:
    return p->trapframe->a2;
  case 3:
    return p->trapframe->a3;
  case 4:
    return p->trapframe->a4;
  case 5:
    return p->trapframe->a5;
  }
  panic("argraw");
  return -1;
}

From assignment 4.1 we know that a0 holds the first argument, a1 the second, and so on. sigalarm takes two arguments, so ticks is fetched with argint(0, ...) and handler with argaddr(1, ...).
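
As a rough illustration of this argument mapping, the sketch below models a call like sigalarm(2, fn) in plain user-space C. The mini_trapframe type and the demo_* helpers are hypothetical stand-ins for xv6's trapframe, argraw, argint and argaddr, and 0x1000 is an arbitrary made-up handler address.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical cut-down trapframe holding only the argument registers.
struct mini_trapframe {
  uint64_t a0, a1, a2, a3, a4, a5;
};

// Return the n-th raw syscall argument (n in 0..5), mirroring argraw().
static uint64_t demo_argraw(const struct mini_trapframe *tf, int n)
{
  const uint64_t regs[6] = { tf->a0, tf->a1, tf->a2, tf->a3, tf->a4, tf->a5 };
  assert(n >= 0 && n < 6);
  return regs[n];
}

// Fetch argument n as a 32-bit int (truncates the 64-bit register).
static void demo_argint(const struct mini_trapframe *tf, int n, int *ip)
{
  *ip = (int)demo_argraw(tf, n);
}

// Fetch argument n as a 64-bit address.
static void demo_argaddr(const struct mini_trapframe *tf, int n, uint64_t *ap)
{
  *ap = demo_argraw(tf, n);
}
```

With a0 = 2 and a1 = 0x1000 in the frame, demo_argint(&tf, 0, &ticks) yields 2 and demo_argaddr(&tf, 1, &handler) yields 0x1000.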

You'll need to keep track of how many ticks have passed since the last call (or are left until the next call) to a process's alarm handler; you'll need a new field in struct proc for this too. You can initialize proc fields in allocproc() in proc.c.

In other words, we need to track the ticks elapsed since the last handler call. Add one more field to struct proc:

struct proc {
  struct spinlock lock;

  // p->lock must be held when using these:
  enum procstate state;        // Process state
  struct proc *parent;         // Parent process
  void *chan;                  // If non-zero, sleeping on chan
  int killed;                  // If non-zero, have been killed
  int xstate;                  // Exit status to be returned to parent's wait
  int pid;                     // Process ID

  // these are private to the process, so p->lock need not be held.
  uint64 kstack;               // Virtual address of kernel stack
  uint64 sz;                   // Size of process memory (bytes)
  pagetable_t pagetable;       // User page table
  struct trapframe *trapframe; // data page for trampoline.S
  struct context context;      // swtch() here to run process
  struct file *ofile[NOFILE];  // Open files
  struct inode *cwd;           // Current directory
  char name[16];               // Process name (debugging)
  int sigalarmTicksTarget;        // added earlier
  void (*sigalarmHandler)();      // added earlier
  int sigalarmTicksSinceLastCall; // add this line: ticks since the last handler call
};

Update the earlier sys_sigalarm to reset the counter:

uint64
sys_sigalarm(void)
{
  int ticks;
  uint64 handler;
  if (argint(0, &ticks) < 0)
    return -1;
  if (argaddr(1, &handler) < 0)
    return -1;
  struct proc* p  = myproc();
  p->sigalarmTicksTarget = ticks;
  p->sigalarmHandler = (void(*)())handler;
  p->sigalarmTicksSinceLastCall = 0; // add this line
  return 0;
}

In allocproc in proc.c, initialize the three new fields:

static struct proc*
allocproc(void)
{
  ...

  // Set up new context to start executing at forkret,
  // which returns to user space.
  memset(&p->context, 0, sizeof(p->context));
  p->context.ra = (uint64)forkret;
  p->context.sp = p->kstack + PGSIZE;

  p->sigalarmTicksTarget = 0; // add this line
  p->sigalarmHandler = 0; // add this line
  p->sigalarmTicksSinceLastCall = 0; // add this line

  return p;
}

Reset the same fields in freeproc as well:

static void
freeproc(struct proc *p)
{
  if(p->trapframe)
    kfree((void*)p->trapframe);
  p->trapframe = 0;
  if(p->pagetable)
    proc_freepagetable(p->pagetable, p->sz);
  p->pagetable = 0;
  p->sz = 0;
  p->pid = 0;
  p->parent = 0;
  p->name[0] = 0;
  p->chan = 0;
  p->killed = 0;
  p->xstate = 0;
  p->state = UNUSED;
  p->sigalarmTicksTarget = 0; // add this line
  p->sigalarmHandler = 0; // add this line
  p->sigalarmTicksSinceLastCall = 0; // add this line
}

Every tick, the hardware clock forces an interrupt, which is handled in usertrap() in kernel/trap.c.

You only want to manipulate a process's alarm ticks if there's a timer interrupt; you want something like

if(which_dev == 2) ...

Only invoke the alarm function if the process has a timer outstanding. Note that the address of the user's alarm function might be 0 (e.g., in user/alarmtest.asm, periodic is at address 0).

You'll need to modify usertrap() so that when a process's alarm interval expires, the user process executes the handler function. When a trap on the RISC-V returns to user space, what determines the instruction address at which user-space code resumes execution?

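Let's look at usertrap. Its comment says it handles any interrupt, exception, or system call coming from user space. We need to add logic under which_dev == 2 that updates the process's alarm ticks and, once the elapsed ticks reach the interval passed to sigalarm, redirects the process to the handler. Note that the handler's address may legitimately be 0, so whether an alarm is armed has to be decided from the interval, not from the handler pointer. The address at which user code resumes is determined by the sepc register, which usertrapret() loads from p->trapframe->epc, so that is the field to overwrite.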

//
// handle an interrupt, exception, or system call from user space.
// called from trampoline.S
//
void
usertrap(void)
{
  int which_dev = 0;

  if((r_sstatus() & SSTATUS_SPP) != 0)
    panic("usertrap: not from user mode");

  // send interrupts and exceptions to kerneltrap(),
  // since we're now in the kernel.
  w_stvec((uint64)kernelvec);

  struct proc *p = myproc();
  
  // save user program counter.
  p->trapframe->epc = r_sepc();
  
  if(r_scause() == 8){
    // system call

    if(p->killed)
      exit(-1);

    // sepc points to the ecall instruction,
    // but we want to return to the next instruction.
    p->trapframe->epc += 4;

    // an interrupt will change sstatus &c registers,
    // so don't enable until done with those registers.
    intr_on();

    syscall();
  } else if((which_dev = devintr()) != 0){
    // ok
  } else {
    printf("usertrap(): unexpected scause %p pid=%d\n", r_scause(), p->pid);
    printf("            sepc=%p stval=%p\n", r_sepc(), r_stval());
    p->killed = 1;
  }

  if(p->killed)
    exit(-1);

  // give up the CPU if this is a timer interrupt.
  if(which_dev == 2)
    yield();

  usertrapret();
}

Modify the which_dev == 2 branch:

void
usertrap(void)
{
  int which_dev = 0;

  ...

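  if(which_dev == 2)
  {
    // count this tick if an alarm is armed (interval > 0)
    if (p->sigalarmTicksTarget > 0)
    {
      // increment before comparing, so the handler fires after exactly
      // sigalarmTicksTarget ticks rather than one tick late
      p->sigalarmTicksSinceLastCall++;
      if (p->sigalarmTicksSinceLastCall >= p->sigalarmTicksTarget)
      {
        p->sigalarmTicksSinceLastCall = 0;
        // usertrapret() will resume user code at the handler
        p->trapframe->epc = (uint64)p->sigalarmHandler;
      }
    }
    // give up the CPU if this is a timer interrupt.
    yield();
  }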

  usertrapret();
}

Build and run alarmtest: test0 passes! test1 still fails, though.
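
The tick-counting rule on its own can be modeled in a few lines of user-space C. This is only a sketch under stated assumptions: alarm_state, alarm_tick and count_fires are hypothetical names, and the model increments the counter before comparing it, so the first alarm fires after exactly `interval` ticks.

```c
#include <assert.h>

// Hypothetical per-process alarm bookkeeping (names are illustrative).
struct alarm_state {
  int interval;          // ticks between handler calls; 0 means disabled
  int ticks_since_last;  // ticks counted since the handler last fired
};

// Call once per timer tick. Returns 1 if the handler should run now.
static int alarm_tick(struct alarm_state *a)
{
  if (a->interval <= 0)
    return 0;                      // no alarm armed
  a->ticks_since_last++;
  if (a->ticks_since_last >= a->interval) {
    a->ticks_since_last = 0;       // re-arm for the next period
    return 1;
  }
  return 0;
}

// Count how many times the handler would fire over nticks timer ticks.
static int count_fires(int interval, int nticks)
{
  assert(nticks >= 0);
  struct alarm_state a = { .interval = interval, .ticks_since_last = 0 };
  int fires = 0;
  for (int i = 0; i < nticks; i++)
    fires += alarm_tick(&a);
  return fires;
}
```

For example, count_fires(2, 10) gives 5 and count_fires(0, 10) gives 0, matching the rule that sigalarm(0, 0) disables the alarm.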

2.1.2 test1/test2

Let's go through the hints together:

Your solution will require you to save and restore registers---what registers do you need to save and restore to resume the interrupted code correctly? (Hint: it will be many).

We need to save the relevant registers before entering the sigalarm handler.

Have usertrap save enough state in struct proc when the timer goes off that sigreturn can correctly return to the interrupted user code.

The saved state will live in a field of struct proc. After sigreturn is called, execution continues at the point where the user code was interrupted.

Prevent re-entrant calls to the handler----if a handler hasn't returned yet, the kernel shouldn't call it again. test2 tests this.

We must prevent re-entering the handler, so we need a flag that records whether the handler is currently running.

Which registers should be saved? The answer is all of p->trapframe: its type contains every user register we need, including the epc used earlier:

struct trapframe {
  /*   0 */ uint64 kernel_satp;   // kernel page table
  /*   8 */ uint64 kernel_sp;     // top of process's kernel stack
  /*  16 */ uint64 kernel_trap;   // usertrap()
  /*  24 */ uint64 epc;           // saved user program counter
  /*  32 */ uint64 kernel_hartid; // saved kernel tp
  /*  40 */ uint64 ra;
  /*  48 */ uint64 sp;
  /*  56 */ uint64 gp;
  /*  64 */ uint64 tp;
  /*  72 */ uint64 t0;
  /*  80 */ uint64 t1;
  /*  88 */ uint64 t2;
  /*  96 */ uint64 s0;
  /* 104 */ uint64 s1;
  /* 112 */ uint64 a0;
  /* 120 */ uint64 a1;
  /* 128 */ uint64 a2;
  /* 136 */ uint64 a3;
  /* 144 */ uint64 a4;
  /* 152 */ uint64 a5;
  /* 160 */ uint64 a6;
  /* 168 */ uint64 a7;
  /* 176 */ uint64 s2;
  /* 184 */ uint64 s3;
  /* 192 */ uint64 s4;
  /* 200 */ uint64 s5;
  /* 208 */ uint64 s6;
  /* 216 */ uint64 s7;
  /* 224 */ uint64 s8;
  /* 232 */ uint64 s9;
  /* 240 */ uint64 s10;
  /* 248 */ uint64 s11;
  /* 256 */ uint64 t3;
  /* 264 */ uint64 t4;
  /* 272 */ uint64 t5;
  /* 280 */ uint64 t6;
};

Let's see how struct proc declares it:

struct proc {
  ...
  struct trapframe *trapframe; // data page for trampoline.S
  ...
}

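Oh, it's a pointer! So we can similarly keep a preTrapframe of type struct trapframe *. To avoid re-entering the handler we also need something like a lock, named isInSigalarm: while isInSigalarm is 1, sigreturn has not been called yet and epc must not be redirected again; sigreturn sets isInSigalarm back to 0. The fields look like this: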

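struct proc {
  ...
  struct trapframe *trapframe; // data page for trampoline.S
  ...
  int sigalarmTicksTarget;
  uint64 sigalarmHandler;          // now kept as a uint64 address, so the earlier casts go away
  int sigalarmTicksSinceLastCall;
  struct trapframe *preTrapframe;  // add this line: trapframe saved when the alarm fires
  int isInSigalarm;                // add this line: 1 while the handler is running
};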
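
One detail worth spelling out: the saved pointer has to refer to its own storage, and the register values have to be copied into it. Copying only the pointer would alias the live trapframe and save nothing. A minimal user-space sketch of the difference, using a hypothetical two-register mini_trapframe and a made-up save_frame helper:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Hypothetical two-register stand-in for struct trapframe.
struct mini_trapframe {
  uint64_t epc;
  uint64_t a0;
};

// Copy the live registers into separately allocated backup storage.
static void save_frame(struct mini_trapframe *backup,
                       const struct mini_trapframe *live)
{
  assert(backup != live);  // an alias would not be a snapshot
  memmove(backup, live, sizeof(*backup));
}
```

After save_frame(&backup, &live), later writes to live leave backup untouched, whereas a plain `struct mini_trapframe *alias = &live;` keeps showing the modified values.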

In allocproc, initialize preTrapframe the same way trapframe is initialized:

static struct proc*
allocproc(void)
{
  struct proc *p;
  ...
  // Allocate a trapframe page.
  if((p->trapframe = (struct trapframe *)kalloc()) == 0){
    release(&p->lock);
    return 0;
  }
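  // added: allocate a page to hold the saved trapframe
  if((p->preTrapframe = (struct trapframe *)kalloc()) == 0){
    freeproc(p); // also frees the trapframe page allocated above
    release(&p->lock);
    return 0;
  }
  p->isInSigalarm = 0;
  // end of added code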
  ...
}

Don't forget to free it in freeproc:

static void
freeproc(struct proc *p)
{
  if(p->trapframe)
    kfree((void*)p->trapframe);
  p->trapframe = 0;
  // added: free the saved trapframe page
  if (p->preTrapframe)
    kfree((void*)p->preTrapframe);
  p->preTrapframe = 0;
  // end of added code
  ...
  p->isInSigalarm = 0; // add this line
}

Next, add the register-saving logic to the if(which_dev == 2) branch of usertrap:

void
usertrap(void)
{
  ...
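  if(which_dev == 2)
  {
    if (p->sigalarmTicksTarget > 0)
    {
      // increment before comparing, so the handler fires after exactly
      // sigalarmTicksTarget ticks rather than one tick late
      p->sigalarmTicksSinceLastCall++;
      // fire only if the interval has elapsed and no handler is still running
      if (p->sigalarmTicksSinceLastCall >= p->sigalarmTicksTarget && !p->isInSigalarm)
      {
        p->sigalarmTicksSinceLastCall = 0;
        p->isInSigalarm = 1;
        // snapshot every user register so sigreturn can restore them
        memmove((void*)(p->preTrapframe), (void*)(p->trapframe), sizeof(struct trapframe));
        p->trapframe->epc = p->sigalarmHandler;
      }
    }
    yield();
  }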
  ...
}

Update sys_sigreturn to restore the saved trapframe and clear isInSigalarm; sys_sigalarm also resets isInSigalarm:

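uint64
sys_sigreturn(void)
{
  struct proc* p = myproc();
  memmove((void*)(p->trapframe), (void*)(p->preTrapframe), sizeof(struct trapframe));
  p->isInSigalarm = 0;
  // syscall() stores our return value in trapframe->a0, so return the
  // restored a0 to avoid clobbering the interrupted code's register.
  return p->trapframe->a0;
}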

uint64
sys_sigalarm(void)
{
  int ticks;
  uint64 handler;
  if (argint(0, &ticks) < 0)
    return -1;
  if (argaddr(1, &handler) < 0)
    return -1;
  struct proc* p  = myproc();
  p->sigalarmTicksTarget = ticks;
  p->sigalarmHandler = handler;
  p->sigalarmTicksSinceLastCall = 0;
  p->isInSigalarm = 0;
  return 0;
}

Rebuild and run alarmtest again: it passes!
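
To summarize how the timer path and sigreturn cooperate, here is a small user-space model of the whole mechanism. It is a sketch, not the kernel code: mini_trapframe, mini_proc, timer_tick and do_sigreturn are hypothetical names, the trapframe is cut down to three registers, and the model assumes the counter is incremented before it is compared and that sigreturn hands back the restored a0.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical cut-down trapframe and process; field names are illustrative.
struct mini_trapframe { uint64_t epc, a0, sp; };

struct mini_proc {
  struct mini_trapframe tf;     // live user registers
  struct mini_trapframe saved;  // snapshot taken when the alarm fires
  int interval;                 // alarm period in ticks, 0 = disabled
  int ticks;                    // ticks since the handler last fired
  int in_handler;               // 1 while the handler has not called sigreturn
  uint64_t handler;             // user address of the handler
};

// Timer-interrupt path: count the tick and maybe redirect to the handler.
static void timer_tick(struct mini_proc *p)
{
  assert(p != 0);
  if (p->interval <= 0)
    return;
  p->ticks++;
  if (p->ticks >= p->interval && !p->in_handler) {
    p->ticks = 0;
    p->in_handler = 1;
    p->saved = p->tf;          // struct assignment copies every register
    p->tf.epc = p->handler;    // resume user code at the handler
  }
}

// sigreturn path: restore the snapshot and allow the next alarm.
// Returns the restored a0 so that a dispatcher which writes the
// return value into a0 leaves the register unchanged.
static uint64_t do_sigreturn(struct mini_proc *p)
{
  p->tf = p->saved;
  p->in_handler = 0;
  return p->tf.a0;
}
```

In this model, ticks that arrive while in_handler is set never overwrite the snapshot, which is the property test2 checks, and after do_sigreturn the live frame again holds the interrupted epc, a0 and sp.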

Then run the grading command given in the assignment:

./grade-lab-traps alarmtest

Done!