(Continuously updated)
These notes record problems I ran into while working with libvirt + qemu + kvm, together with an analysis of their causes.
Hugepage
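Backing qemu guest memory with huge pages reduces the size of the TDP (two-dimensional paging) page tables, which can improve performance to some extent. Huge pages can be provided through either a memfd or a file backend.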
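To get a feel for the size reduction, here is a rough back-of-the-envelope sketch. It assumes x86-64 style paging with 8-byte entries and counts only the leaf entries needed to map a 16 GiB guest (the size used in the XML example in the memfd section); upper table levels are ignored. The helper name `leaf_entries` is made up for illustration.

```python
# Rough estimate of how many leaf page-table entries are needed to map
# guest memory at different page sizes. Assumes x86-64 style paging with
# 8-byte entries; upper-level tables are ignored for simplicity.

KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB
ENTRY_BYTES = 8  # assumption: size of one page-table entry on x86-64


def leaf_entries(mem_bytes: int, page_bytes: int) -> int:
    """Number of leaf entries needed to map mem_bytes with page_bytes pages."""
    return -(-mem_bytes // page_bytes)  # ceiling division


if __name__ == "__main__":
    guest = 16 * GIB
    for label, size in (("4K", 4 * KIB), ("2M", 2 * MIB), ("1G", GIB)):
        n = leaf_entries(guest, size)
        print(f"{label}: {n} entries, {n * ENTRY_BYTES} bytes of leaf tables")
```

Under these assumptions a 16 GiB guest needs about four million leaf entries with 4K pages, but only 16 with 1G pages.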
memfd
The steps are as follows:
-
Reserve huge pages on the system (see http://t.csdnimg.cn/PPetb for the commands), for example:
echo 16 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
-
Remount /dev/hugepages so that its page size is 1G:
mount -o remount,pagesize=1g /dev/hugepages
-
Restart libvirtd:
systemctl restart libvirtd
-
Edit the VM's XML file as follows:
<memory unit='KiB'>16777216</memory>
<currentMemory unit='KiB'>16777216</currentMemory>
<memoryBacking>
  <hugepages/>
  <source type='memfd'/>
  <access mode='shared'/>
</memoryBacking>
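Before starting the VM, it may help to confirm that the reservation from step 1 actually covers the configured memory: 16777216 KiB is exactly 16 pages of 1 GiB. The following is a minimal sketch of that check, not part of libvirt or qemu; `pages_needed` and `free_hugepages` are helper names made up for illustration, and the sysfs layout is assumed to be the one used in step 1.

```python
from pathlib import Path
from typing import Optional

# Assumption: the standard sysfs location used in step 1.
SYSFS_HUGEPAGES = Path("/sys/kernel/mm/hugepages")


def pages_needed(memory_kib: int, hugepage_kib: int) -> int:
    """Huge pages required to back memory_kib of guest RAM (rounded up)."""
    if hugepage_kib <= 0:
        raise ValueError("hugepage_kib must be positive")
    return -(-memory_kib // hugepage_kib)  # ceiling division


def free_hugepages(hugepage_kib: int,
                   root: Path = SYSFS_HUGEPAGES) -> Optional[int]:
    """Read free_hugepages for one page size; None if it cannot be read."""
    counter = root / f"hugepages-{hugepage_kib}kB" / "free_hugepages"
    try:
        return int(counter.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    need = pages_needed(16777216, 1048576)  # values from the XML above
    free = free_hugepages(1048576)
    print(f"need {need} x 1G pages, free: {free}")
```

If the free count is lower than the required count, or the 1G directory is missing altogether, the guest is unlikely to get its memory.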
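After the VM starts, the memfd file is visible among the qemu process's open files, with "(deleted)" appended to its name.
The "(deleted)" suffix is a consequence of how memfd creates the file; see the kernel code: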
proc_pid_readlink()
  -> do_proc_readlink()
    -> d_path()
    ---
    if (unlikely(d_unlinked(path->dentry)))
        prepend(&b, " (deleted)", 11);
    else
        prepend(&b, "", 1);
    ---

static inline int d_unlinked(const struct dentry *dentry)
{
    return d_unhashed(dentry) && !IS_ROOT(dentry);
}

SYSCALL_DEFINE2(memfd_create)
  -> hugetlb_file_setup()
    -> alloc_file_pseudo()
    ---
    path.dentry = d_alloc_pseudo(mnt->mnt_sb, &this);
    ...
    path.mnt = mntget(mnt);
    d_instantiate(path.dentry, inode);
    ---
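This path never calls d_splice_alias() or d_add(), so the dentry stays unhashed. d_unlinked() therefore returns true and d_path() appends " (deleted)".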
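Also, my first test skipped step 2, and memory allocation failed. The cause: libvirt passed a hugetlb size of 2M to qemu, whereas I had reserved 1G pages. Tracing the libvirt code shows why: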
virQEMUDriverConfigNew()
---
/* For privileged driver, try and find hugetlbfs mounts automatically.
 * Non-privileged driver requires admin to create a dir for the
 * user, chown it, and then let user configure it manually. */
if (privileged &&
    virFileFindHugeTLBFS(&cfg->hugetlbfs, &cfg->nhugetlbfs) < 0) {
    ...
}
---
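libvirt looks up the hugetlbfs mounts on the system and uses their page size as its reference. Without step 2, /dev/hugepages was presumably still mounted with a 2M page size, so that is the size libvirt handed to qemu.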
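To see what libvirt is likely to find, the hugetlbfs mounts and their page sizes can be listed from user space. The sketch below is not libvirt's implementation; it only parses text in the /proc/mounts format and extracts the `pagesize=` option. The helper names `parse_size` and `hugetlbfs_mounts` are made up for illustration.

```python
from typing import List, Optional, Tuple

_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert strings such as '2M', '1024M' or '1g' to bytes."""
    text = text.strip().upper()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def hugetlbfs_mounts(mounts_text: str) -> List[Tuple[str, Optional[int]]]:
    """Return (mountpoint, pagesize in bytes or None) per hugetlbfs mount."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mountpoint, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] != "hugetlbfs":
            continue
        size = None
        for opt in fields[3].split(","):
            if opt.startswith("pagesize="):
                size = parse_size(opt[len("pagesize="):])
        result.append((fields[1], size))
    return result


if __name__ == "__main__":
    with open("/proc/mounts") as f:
        for mountpoint, size in hugetlbfs_mounts(f.read()):
            print(mountpoint, size)
```

Running it before and after step 2 should show the page size of /dev/hugepages change from 2 MiB to 1 GiB.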