MIT 6.824 (6.5840) Lab 1 Notes + Source Code

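What the starter code already provides as a reference

mrsequential.go: read its source several times.

Other people's write-ups: their notes are better than mine, so look there for the overall approach.

MIT - 6.824 全课程 + Lab 博客总览 (MIT 6.824 full course and lab blog overview), CSDN blog

MapReuce 详解与复现, 完成 MIT 6.824(6.5840) Lab1 (MapReduce explained and reproduced: completing MIT 6.824 (6.5840) Lab 1), Juejin (juejin.cn)

mit 6.824 lab1 笔记 (MIT 6.824 Lab 1 notes), Juejin (juejin.cn)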

MapReduce

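Each worker process has to do the following:

Request a task from the master process, read the input data from one or more files, execute the task, and write the task's output to one or more files.

Besides assigning tasks to workers, the master process (the coordinator in the current version of the lab) must check whether each worker finishes its task within a time limit (10 seconds in this lab). If a worker does not, the task is handed to another worker.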

worker

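The Worker function covers both jobs, map and reduce (mapf and reducef).

It needs to know which of the two it is currently supposed to run.

The worker asks for a task through GetTask.

A task is modeled as a struct, and one of its fields is the task type.

type Task struct { Type int }, where 0 means map and 1 means reduce, declared as a const enumeration.

Another field is the task number.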
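A minimal sketch of that task struct, assuming illustrative names (TaskType, MapTask, ReduceTask and the ID field are my own choices here; the final source below uses plain int constants Map, Reduce and Over instead):

```go
package main

import "fmt"

// TaskType distinguishes the two kinds of work.
type TaskType int

const (
	MapTask    TaskType = iota // 0
	ReduceTask                 // 1
)

// String makes the type readable in logs.
func (t TaskType) String() string {
	switch t {
	case MapTask:
		return "map"
	case ReduceTask:
		return "reduce"
	}
	return "unknown"
}

// Task carries the type and the task number.
type Task struct {
	Type TaskType
	ID   int
}

func main() {
	t := Task{Type: ReduceTask, ID: 3}
	fmt.Printf("%v task #%d\n", t.Type, t.ID) // prints "reduce task #3"
}
```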

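How is a task obtained?

GetTask calls the coordinator's AssignTask method, passing in the worker's own information (which fields exactly? still unclear to me at this point) and receiving the task information in the reply.

For example the worker's current state: idle or busy.

In call(rpcname, args, reply) bool, rpcname names the corresponding method of Coordinator in coordinator.go.

(I did not yet understand what this is for.)

So GetTask uses call to fetch the task information from the coordinator?

Then what is rpc.go for?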
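To make the rpcname question concrete, here is a small self-contained sketch of a net/rpc round trip. Everything in it is an assumption for illustration: the Coord, AskArgs and AskReply names, the reply values, and the use of an in-memory net.Pipe connection (the lab itself dials a UNIX socket with rpc.DialHTTP). The point is only that the string "Type.Method" selects the handler, and that the argument and reply structs are what both sides must share.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
)

// AskArgs and AskReply play the role of the shared RPC types.
type AskArgs struct{ WorkerState int }

type AskReply struct {
	TaskID   int
	FileName string
}

// Coord stands in for the coordinator.
type Coord struct{}

// AssignTask has the shape net/rpc requires: an exported method with two
// arguments (the second a pointer) and an error return value.
func (c *Coord) AssignTask(args *AskArgs, reply *AskReply) error {
	reply.TaskID = 7
	reply.FileName = "pg-example.txt" // made-up file name
	return nil
}

// askForTask serves Coord on one end of an in-memory pipe and calls
// AssignTask by name from the other end.
func askForTask() (AskReply, error) {
	srv := rpc.NewServer()
	if err := srv.Register(&Coord{}); err != nil {
		return AskReply{}, err
	}
	serverEnd, clientEnd := net.Pipe()
	go srv.ServeConn(serverEnd)

	client := rpc.NewClient(clientEnd)
	defer client.Close()

	var reply AskReply
	err := client.Call("Coord.AssignTask", &AskArgs{WorkerState: 0}, &reply)
	return reply, err
}

func main() {
	reply, err := askForTask()
	fmt.Println(reply.TaskID, reply.FileName, err)
}
```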

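Use ihash(key) % NReduce to choose the reduce task number for each KeyValue emitted by Map. The number is not random: the same key always maps to the same reduce task, and the hash spreads different keys across the tasks.

With n input files and m reduce tasks, does that make n × m intermediate files? (Yes: each map task writes one file per reduce task.)

How many files are kept?

Map produces the key/value pairs.

Apply ihash to each key and write the pair to the file with the matching number (1-10 in these early notes; the final code uses 0-9).

Reduce tasks are assigned by that number.

Each reduce task writes its results to a single file.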
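The partitioning step described above can be sketched as follows. ihash is copied from the lab skeleton; partition is a hypothetical helper name that groups pairs the same way writeKVs does in the source below.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type KeyValue struct {
	Key   string
	Value string
}

// ihash is the hash function supplied by the lab skeleton.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// partition splits kvs into nReduce buckets; bucket r holds the pairs
// destined for reduce task r.
func partition(kvs []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kvs {
		r := ihash(kv.Key) % nReduce
		buckets[r] = append(buckets[r], kv)
	}
	return buckets
}

func main() {
	kvs := []KeyValue{{"apple", "1"}, {"pear", "1"}, {"apple", "1"}}
	for r, b := range partition(kvs, 10) {
		if len(b) > 0 {
			fmt.Printf("reduce %d: %v\n", r, b)
		}
	}
}
```

Because the bucket depends only on the key, every occurrence of a key from every map task ends up with the same reduce task, which is what lets a single reduce task see all values for that key.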

map

go
mapf func(filename string, content string) []KeyValue

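filename is the name of the input file.

content is the content of that file; it has to be read before mapf is called.

mapf returns the intermediate []KeyValue slice, which is then written to files named mr-X-Y.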

reduce

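go
reducef func(key string, values []string) string

Does key here correspond to the task number produced by ihash? No: key is an intermediate key itself (a word, in word count); ihash(key) % NReduce only decides which reduce task receives that key.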
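The way a reduce task feeds reducef can be sketched like this, following the same sort-then-group pattern as mrsequential.go. reduceAll and countReduce are hypothetical names; countReduce is a word-count style reducer used only to have something to call.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

type KeyValue struct {
	Key   string
	Value string
}

// reduceAll sorts kvs by key, groups the values of equal keys, and calls
// reducef once per distinct key. It returns one "key output" line per key.
func reduceAll(kvs []KeyValue, reducef func(string, []string) string) []string {
	sort.Slice(kvs, func(a, b int) bool { return kvs[a].Key < kvs[b].Key })
	var lines []string
	for i := 0; i < len(kvs); {
		j := i
		var values []string
		for j < len(kvs) && kvs[j].Key == kvs[i].Key {
			values = append(values, kvs[j].Value)
			j++
		}
		lines = append(lines, fmt.Sprintf("%v %v", kvs[i].Key, reducef(kvs[i].Key, values)))
		i = j
	}
	return lines
}

// countReduce returns the number of values seen for a key.
func countReduce(key string, values []string) string {
	return strconv.Itoa(len(values))
}

func main() {
	kvs := []KeyValue{{"b", "1"}, {"a", "1"}, {"b", "1"}}
	for _, line := range reduceAll(kvs, countReduce) {
		fmt.Println(line) // prints "a 1" then "b 2"
	}
}
```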

coordinator

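It assigns tasks.

How many map tasks to create: one per input file.

How many reduce tasks: set by configuration, 10 here.

The coordinator is what generates the unique task IDs.

Coordinator struct definition

It is shared by the different RPC handlers, so its contents are common state?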

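go
type Coordinator struct {
	files   []string
	nReduce int
	state   int // which phase the job is currently in
}

// dic is the per-task bookkeeping entry.
type dic struct {
	status int // 0, 1 or 2
	id     int // task ID (type assumed; the note left it open)
}

// Planned lookup keyed by file name; the note stops at: map[file string]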

How should concurrency be handled?

How can the coordinator find out a worker's status?

The worker reports to the coordinator on its own initiative.
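One common answer to the concurrency question is to guard the shared task table with a sync.Mutex, since net/rpc runs handlers in separate goroutines. The sketch below is an assumption-laden illustration: the coordinator type, claim and claimConcurrently are hypothetical names, and plain goroutines stand in for RPC handlers.

```go
package main

import (
	"fmt"
	"sync"
)

// coordinator holds shared state that several goroutines touch.
type coordinator struct {
	mu    sync.Mutex
	state []int // 0 = idle, 1 = busy
}

// claim marks one idle task as busy and returns its index, or -1 if no
// idle task is left. The lock makes check-and-set one atomic step.
func (c *coordinator) claim() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	for i, s := range c.state {
		if s == 0 {
			c.state[i] = 1
			return i
		}
	}
	return -1
}

// claimConcurrently lets several goroutines race for nTasks tasks and
// reports how many times each task was handed out.
func claimConcurrently(nTasks, workers int) []int {
	c := &coordinator{state: make([]int, nTasks)}
	counts := make([]int, nTasks)
	var countMu sync.Mutex
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				id := c.claim()
				if id < 0 {
					return
				}
				countMu.Lock()
				counts[id]++
				countMu.Unlock()
			}
		}()
	}
	wg.Wait()
	return counts
}

func main() {
	fmt.Println(claimConcurrently(5, 3)) // every task is handed out exactly once
}
```

Without the lock, two goroutines could both see the same task as idle and both claim it; running the lab with the race detector (go run -race) is a quick way to spot such accesses.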

rpc

How should the request parameter args be defined?

type args struct {

	// the worker's identity

	// the worker's state

}
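One detail worth checking when defining the argument struct: net/rpc encodes values with encoding/gob by default, and gob only transmits exported (capitalized) fields, which is why rpc.go says to capitalize all names. The sketch below demonstrates this with a hypothetical WorkerArgs type; it uses gob directly rather than a full RPC call.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// WorkerArgs is an illustrative argument struct.
type WorkerArgs struct {
	WorkerID int // exported: transmitted
	state    int // unexported: skipped by gob
}

// roundTrip encodes in with gob and decodes it again, which is roughly
// what happens to RPC arguments on their way to the coordinator.
func roundTrip(in WorkerArgs) (WorkerArgs, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return WorkerArgs{}, err
	}
	var out WorkerArgs
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(WorkerArgs{WorkerID: 4, state: 1})
	fmt.Println(out.WorkerID, out.state, err) // prints "4 0 <nil>"
}
```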

Troubleshooting

  1. 6.5840/mr.writeKVs({0xc000e90000, 0x1462a, 0x16800?}, 0xc00007d100, {0xc39d00, 0x0, 0x0?})

    /home/wang2/6.5840/src/mr/worker.go:109 +0x285

  2. *** Starting wc test.

    panic: runtime error: index out of range [1141634764] with length 0

    Fix: index with ihash(key) % NReduce instead of the raw hash value.

  3. runtime error: integer divide by zero

    Fix: reply.NReduce = c.nReduce // set NReduce in the reply

  4. cat: 'mr-out*': No such file or directory

    --- saw 0 workers rather than 2

    --- map parallelism test: FAIL

    cat: 'mr-out*': No such file or directory

    --- map workers did not run in parallel

    --- map parallelism test: FAIL

    cat: 'mr-out*': No such file or directory

    --- too few parallel reduces.

    --- reduce parallelism test: FAIL

    Cause: a file-handle problem.

  5. sort: cannot read: 'mr-out*': No such file or directory

    cmp: EOF on mr-wc-all which is empty

    2024/07/19 11:15:10 dialing:dial unix /var/tmp/5840-mr-1000: connect: connection refused

    2024/07/19 11:15:10 dialing:dial unix /var/tmp/5840-mr-1000: connect: connection refused

    2024/07/19 11:15:10 dialing:dial unix /var/tmp/5840-mr-1000: connect: connection refused

  6. Concurrent access to the Coordinator struct

    dialing:dial-http unix /var/tmp/5840-mr-1000: read unix @->/var/tmp/5840-mr-1000: read: connection reset by peer

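Source code

Since the notes above are hardly more useful than no hints at all, the source is posted here; look at it only when you are truly stuck.
GitHub repository link

Note: the hints in the Lab 1 handout are very important.

I was lost at the start as well, and tracking down bugs was painful. Good luck to everyone.

The code still fails occasionally, but I did not want to keep working on it.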

worker.go

go
package mr

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"sort"
	"strings"
	"time"
)
import "log"
import "net/rpc"
import "hash/fnv"

// for sorting by key.
type ByKey []KeyValue

// for sorting by key.
func (a ByKey) Len() int           { return len(a) }
func (a ByKey) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }
func (a ByKey) Less(i, j int) bool { return a[i].Key < a[j].Key }

const (
	Map = iota
	Reduce
	Over
)
const (
	Idle = iota
	Busy
	Finish
)

// Map functions return a slice of KeyValue.

type KeyValue struct {
	Key   string
	Value string
}

// use ihash(key) % NReduce to choose the reduce
// task number for each KeyValue emitted by Map.

func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// main/mrworker.go calls this function.

func Worker(mapf func(string, string) []KeyValue, reducef func(string, []string) string) {
	// Your worker implementation here.

	Ch := make(chan bool)
	for {
		var Res = &Args{State: Idle} // start in the idle state
		var TaskInformation = &TaskInfo{}
		GetTask(Res, TaskInformation)

		// stop requesting once the whole job has finished
		if TaskInformation.TaskType == Over {
			break
		}
		//fmt.Println("do it!")
		go DoTask(TaskInformation, mapf, reducef, Ch)

		sign := <-Ch
		//fmt.Println("sign:", sign)

		if sign == true {
			//fmt.Println("Finish one,ID:", TaskInformation.TaskId)
			Done(Res, TaskInformation)

		} else {
			//TaskInformation.Status = Idle
			//fmt.Println("err one,ID:", TaskInformation.TaskId)
			call("Coordinator.Err", Res, TaskInformation)
			Res = &Args{State: Idle}
		}
		time.Sleep(time.Second)
	}

	// uncomment to send the Example RPC to the coordinator.
	//CallExample()

}

func GetTask(Args *Args, TaskInformation *TaskInfo) {
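	// ask the coordinator for a task
	for {
		if !call("Coordinator.AssignTask", Args, TaskInformation) {
			// the coordinator cannot be reached: assume the job is over
			TaskInformation.TaskType = Over
			return
		}
		//fmt.Println(TaskInformation)

		// an Over reply carries no task and leaves Status at Idle,
		// so it has to be checked before the Status test below
		if TaskInformation.TaskType == Over {
			return
		}

		if TaskInformation.Status != Idle {
			Args.State = Busy
			Args.Tasktype = TaskInformation.TaskType
			Args.TaskId = TaskInformation.TaskId
			//fmt.Println("TaskInfo:", TaskInformation)
			//fmt.Println("Args:", Args)
			call("Coordinator.Verify", Args, TaskInformation)
			break
		}

		time.Sleep(time.Second)
	}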
	//fmt.Printf("Type:%v,Id:%v\n", TaskInformation.TaskType, TaskInformation.TaskId)
}

func writeKVs(KVs []KeyValue, info *TaskInfo, fConts []*os.File) {
	//fConts := make([]io.Writer, info.NReduce)
	KVset := make([][]KeyValue, info.NReduce)

	//fmt.Println("start write")

	//for j := 1; j <= info.NReduce; j++ {
	//
	//	fileName := fmt.Sprintf("mr-%v-%v", info.TaskId, j)
	//	os.Create(fileName)
	//
	//	f, _ := os.Open(fileName)
	//	fConts[j-1] = f
	//
	//	defer f.Close()
	//}
	var Order int
	for _, v := range KVs {
		Order = ihash(v.Key) % info.NReduce
		KVset[Order] = append(KVset[Order], v)
	}
	//fmt.Println("kvset:", KVset)

	for i, v := range KVset {

		for _, value := range v {
			data, _ := json.Marshal(value)
			_, err := fConts[i].Write(data)

			//fmt.Println("data: ", data)
			//fmt.Println("numbers:", write)
			if err != nil {
				return
			}

		}
	}
	//fmt.Println("finish write")
}

func read(filename string) []byte {
	//fmt.Println("read", filename)
	file, err := os.Open(filename)
	if err != nil {
		log.Fatalf("cannot open %v: %v", filename, err)
	}
	defer file.Close() // defer only after the open has succeeded
	content, err := io.ReadAll(file)
	if err != nil {
		log.Fatalf("cannot read %v: %v", filename, err)
	}

	return content
}

// DoTask runs either a mapf or a reducef task.

func DoTask(info *TaskInfo, mapf func(string, string) []KeyValue, reducef func(string, []string) string, Ch chan bool) {
	//fConts := make([]io.Writer, info.NReduce)

	//fmt.Println("start", info.TaskId)
	//go AssignAnother(Ch)
	switch info.TaskType {
	case Map:
		info.FileContent = string(read(info.FileName))
		//fmt.Println(info.FileContent)
		KVs := mapf(info.FileName, info.FileContent.(string))
		//fmt.Println("map:", KVs)
		// sort by key
		sort.Sort(ByKey(KVs))

		var fConts []*os.File // changed to []*os.File

		//0-9
		for j := 0; j < info.NReduce; j++ {

			// temporary name; renamed when the task completes
			fileName := fmt.Sprintf("mr-%v-%v-test", info.TaskId, j)
			//_, err := os.Create(fileName)
			//if err != nil {
			//	fmt.Println(err)
			//	return
			//}
			//
			//f, _ := os.Open(fileName)
			//fConts[j-1] = f
			//
			//defer f.Close()
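			f, err := os.Create(fileName) // use os.Create directly; it returns a writable handle
			if err != nil {
				fmt.Println(err)
				Ch <- false // report the failure so Worker does not block on Ch
				return
			}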

			//fmt.Println("creatfile:  ", fileName)

			//fConts[j] = f
			fConts = append(fConts, f)
			defer os.Rename(fileName, strings.TrimSuffix(fileName, "-test"))
			defer f.Close()
		}

		writeKVs(KVs, info, fConts)

	case Reduce:
		fileName := fmt.Sprintf("testmr-out-%v", info.TaskId)
		fileOS, err := os.Create(fileName)
		//fmt.Println("create success")
		if err != nil {
			fmt.Println("Error creating file:", err)
			Ch <- false // report the failure so Worker does not block on Ch
			return
		}
		defer os.Rename(fileName, strings.TrimPrefix(fileName, "test"))
		defer fileOS.Close()
		var KVs []KeyValue
		// read the intermediate files
		for i := 0; i < info.Nmap; i++ {
			fileName := fmt.Sprintf("mr-%v-%v", i, info.TaskId)
			//fmt.Println(fileName)

			file, err := os.Open(fileName)
			if err != nil {
				fmt.Println(err)
				continue // nothing to decode from a file that could not be opened
			}
			defer file.Close()

			dec := json.NewDecoder(file)

			for {
				var kv KeyValue
				if err := dec.Decode(&kv); err != nil {
					break
				}
				//fmt.Println(kv)
				KVs = append(KVs, kv)
			}
		}
		//var KVsRes []KeyValue
		sort.Sort(ByKey(KVs))
		// group values by key and pass them to reducef
		i := 0
		for i < len(KVs) {
			j := i + 1
			for j < len(KVs) && KVs[j].Key == KVs[i].Key {
				j++
			}
			values := []string{}
			for k := i; k < j; k++ {
				values = append(values, KVs[k].Value)
			}
			// this is the correct format for each line of Reduce output.
			output := reducef(KVs[i].Key, values)
			// the reduce result for this key
			//KVsRes = append(KVsRes, KeyValue{KVs[i].Key, output})

			fmt.Fprintf(fileOS, "%v %v\n", KVs[i].Key, output)

			i = j
		}

	}
	Ch <- true

}

func Done(Arg *Args, Info *TaskInfo) {

	//Info.Status = Idle
	call("Coordinator.WorkerDone", Arg, Info)

	// reset the args through the pointer so the caller sees the change
	*Arg = Args{State: Idle}
}

func AssignAnother(Ch chan bool) {
	time.Sleep(2 * time.Second)

	Ch <- false
}

// example function to show how to make an RPC call to the coordinator.
//
// the RPC argument and reply types are defined in rpc.go.
func CallExample() {

	// declare an argument structure.
	args := ExampleArgs{}

	// fill in the argument(s).
	args.X = 99

	// declare a reply structure.
	reply := ExampleReply{}

	// send the RPC request, wait for the reply.
	// the "Coordinator.Example" tells the
	// receiving server that we'd like to call
	// the Example() method of struct Coordinator.
	ok := call("Coordinator.Example", &args, &reply)
	if ok {
		// reply.Y should be 100.
		fmt.Printf("reply.Y %v\n", reply.Y)
	} else {
		fmt.Printf("call failed!\n")
	}
}

// send an RPC request to the coordinator, wait for the response.
// usually returns true.
// returns false if something goes wrong.

func call(rpcname string, args interface{}, reply interface{}) bool {
	// c, err := rpc.DialHTTP("tcp", "127.0.0.1"+":1234")
	sockname := coordinatorSock()
	//fmt.Println("Worker is dialing", sockname)
	c, err := rpc.DialHTTP("unix", sockname)
	if err != nil {
		//log.Fatal("dialing:", err)
		return false
	}
	defer c.Close()

	err = c.Call(rpcname, args, reply)
	if err != nil {
		//fmt.Println(err)
		return false
	}

	return true
}

coordinator.go

go
package mr

import (
	"log"
	"sync"
	"time"
)
import "net"
import "os"
import "net/rpc"
import "net/http"

type Coordinator struct {
	// Your definitions here.
	files      []string
	nReduce    int
	MapTask    map[int]*Task
	ReduceTask []int
	OK         bool
	Lock       sync.Mutex
}

type Task struct {
	fileName string
	state    int
}

var TaskMapR map[int]*Task

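// Verify waits 3 seconds and, if the task has still not been reported
// finished, marks it idle again so that it can be reassigned.
func (c *Coordinator) Verify(Arg *Args, Reply *TaskInfo) error {
	if Arg.Tasktype != Map && Arg.Tasktype != Reduce {
		return nil
	}
	time.Sleep(3 * time.Second) // sleep before taking the lock

	c.Lock.Lock()
	defer c.Lock.Unlock()
	switch Arg.Tasktype {
	case Map:
		if c.MapTask[Arg.TaskId].state != Finish {
			c.MapTask[Arg.TaskId].state = Idle
		}
	case Reduce:
		if c.ReduceTask[Arg.TaskId] != Finish {
			c.ReduceTask[Arg.TaskId] = Idle
		}
	}
	return nil
}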

// Your code here -- RPC handlers for the worker to call.
func (c *Coordinator) AssignTask(Arg *Args, Reply *TaskInfo) error {
	c.Lock.Lock()
	defer c.Lock.Unlock()
	// only serve workers that report themselves as idle
	if Arg.State == Idle {

		//Args.State = Busy
		// hand out map tasks first
		for i, task := range c.MapTask {
			//fmt.Println(*task, "Id:", i)
			if task.state == Idle {
				//Arg.Tasktype = Map
				//Arg.TaskId = i + 1
				Reply.TaskType = Map
				Reply.FileName = task.fileName

				//fmt.Println(task.fileName)

				Reply.TaskId = i          // task IDs (the map keys) start at 0
				Reply.NReduce = c.nReduce // set NReduce
				Reply.Status = Busy
				task.state = Busy
				//fmt.Println("map,Id:", i)
				return nil
			}
		}

		// reduce starts only after every map task has finished
		for _, task := range c.MapTask {
			if task.state != Finish {
				//fmt.Println("waiting for map tasks to finish")
				return nil
			}
		}

		//fmt.Println("MapDone")

		// hand out reduce tasks
		for i, v := range c.ReduceTask {
			//fmt.Println(c.ReduceTask)
			if v == Idle {
				Arg.Tasktype = Reduce
				Arg.TaskId = i
				Reply.TaskType = Reduce
				Reply.TaskId = i
				Reply.NReduce = c.nReduce // set NReduce
				Reply.Status = Busy
				Reply.Nmap = len(c.files)
				c.ReduceTask[i] = Busy
				//fmt.Println(c.ReduceTask[i])
				//fmt.Println("reduce", i)
				return nil
			}
		}

		// the job is complete once every reduce task has finished
		for _, v := range c.ReduceTask {
			if v != Finish {
				return nil
			}
		}
		Reply.TaskType = Over
		c.OK = true
	}

	return nil
}

func (c *Coordinator) WorkerDone(args *Args, reply *TaskInfo) error {
	// take the lock: RPC handlers run concurrently and share the task tables
	c.Lock.Lock()
	defer c.Lock.Unlock()

	id := args.TaskId
	//fmt.Println("id", id)
	switch args.Tasktype {
	case Map:
		c.MapTask[id].state = Finish
		//fmt.Println(*c.MapTask[id])
	case Reduce:
		c.ReduceTask[id] = Finish
		//fmt.Println(c.ReduceTask)
	}
	return nil
}

func (c *Coordinator) Err(args *Args, reply *TaskInfo) error {
	// take the lock: RPC handlers run concurrently and share the task tables
	c.Lock.Lock()
	defer c.Lock.Unlock()

	id := args.TaskId
	switch args.Tasktype {
	case Map:
		if c.MapTask[id].state != Finish {
			c.MapTask[id].state = Idle
		}

	case Reduce:
		if c.ReduceTask[id] != Finish {
			c.ReduceTask[id] = Idle
		}
	}
	return nil
}

// an example RPC handler.
//
// the RPC argument and reply types are defined in rpc.go.

func (c *Coordinator) Example(args *ExampleArgs, reply *ExampleReply) error {
	reply.Y = args.X + 1
	return nil
}

// start a thread that listens for RPCs from worker.go

func (c *Coordinator) server() {
	rpc.Register(c)
	rpc.HandleHTTP()
	sockname := coordinatorSock()
	os.Remove(sockname)
	l, e := net.Listen("unix", sockname)
	if e != nil {
		log.Fatal("listen error:", e)
	}
	//fmt.Println("Coordinator is listening on", sockname)
	go http.Serve(l, nil)
}

// main/mrcoordinator.go calls Done() periodically to find out
// if the entire job has finished.

func (c *Coordinator) Done() bool {
	// OK is written by AssignTask under the lock, so read it under the lock too
	c.Lock.Lock()
	defer c.Lock.Unlock()

	return c.OK
}

// create a Coordinator.
// main/mrcoordinator.go calls this function.
// nReduce is the number of reduce tasks to use.
func MakeCoordinator(files []string, nReduce int) *Coordinator {
	//fmt.Println(files)
	TaskMapR = make(map[int]*Task, len(files))

	for i, file := range files {
		TaskMapR[i] = &Task{
			fileName: file,
			state:    Idle,
		}
	}

	ReduceMap := make([]int, nReduce)

	c := Coordinator{
		files:      files,
		nReduce:    nReduce,
		MapTask:    TaskMapR,
		ReduceTask: ReduceMap,
		OK:         false,
	}

	// Your code here.

	c.server()

	return &c
}

rpc.go

go
package mr

//
// RPC definitions.
//
// remember to capitalize all names.
//

import "os"
import "strconv"

//
// example to show how to declare the arguments
// and reply for an RPC.
//

type ExampleArgs struct {
	X int
}

type ExampleReply struct {
	Y int
}

// Add your RPC definitions here.

type Args struct {
	State    int
	Tasktype int
	TaskId   int
}

type TaskInfo struct {
	Status int

	TaskType int // basic task information
	TaskId   int

	NReduce int
	Nmap    int

	FileName    string
	FileContent any

	//Key    string // information needed by reduce
	//Values []string
}

// Cook up a unique-ish UNIX-domain socket name
// in /var/tmp, for the coordinator.
// Can't use the current directory since
// Athena AFS doesn't support UNIX-domain sockets.

func coordinatorSock() string {
	s := "/var/tmp/5840-mr-"
	s += strconv.Itoa(os.Getuid())
	return s
}