core dump管理在linux中的前世今生

目录

[一、什么是core dump?](#一、什么是core dump?)

二、coredump是怎么来的?

三、怎么限制coredump文件的产生?

ulimit

半永久限制

永久限制

四、从源码分析如何对coredump文件的名字和路径管理

命名

管理

一些问题的答案

1、为什么新的ubuntu不能产生coredump了,需要手动管理?

2、systemd是什么时候出现在ubuntu中的?

3、为什么使用脚本管理coredump后ulimit就不能限制coredump文件的大小了。

4、为什么ulimit限制的大小并不准确?

5、使用脚本就没办法限制coredump的大小了嘛?

6、ubuntu现在对coredump文件的管理方式


一、什么是core dump?

core dump是Linux应用程序调试的一种有效方式,core dump又称为"核心转储",是该进程实际使用的物理内存的"快照"。分析core dump文件可以获取应用程序崩溃时的现场信息,如程序运行时的CPU寄存器值、堆栈指针、栈数据、函数调用栈等信息。在嵌入式系统开发中是开发人员调试所需必备材料之一。每当程序异常行为产生coredump时我们就可以利用gdb工具去查看程序崩溃的原因。

二、coredump是怎么来的?

Core dump是Linux基于信号实现的。Linux中信号是一种异步事件处理机制,每种信号都对应有默认的异常处理操作,默认操作包括忽略该信号(Ignore)、暂停进程(Stop)、终止进程(Terminate)、终止并产生core dump(Core)等。

|-----------|----------|--------|-------------------------------------------------------------------------|
| Signal | Value | Action | Comment |
| SIGHUP | 1 | Term | Hangup detected on controlling terminal or death of controlling process |
| SIGINT | 2 | Term | Interrupt from keyboard |
| SIGQUIT | 3 | Core | Quit from keyboard |
| SIGILL | 4 | Core | Illegal Instruction |
| SIGTRAP | 5 | Core | Trace/breakpoint trap |
| SIGABRT | 6 | Core | Abort signal from abort(3) |
| SIGIOT | 6 | Core | IOT trap. A synonym for SIGABRT |
| SIGEMT | 7 | Term | |
| SIGFPE | 8 | Core | Floating point exception |
| SIGKILL | 9 | Term | Kill signal, cannot be caught, blocked or ignored. |
| SIGBUS | 10,7,10 | Core | Bus error (bad memory access) |
| SIGSEGV | 11 | Core | Invalid memory reference |
| SIGPIPE | 13 | Term | Broken pipe: write to pipe with no readers |
| SIGALRM | 14 | Term | Timer signal from alarm(2) |
| SIGTERM | 15 | Term | Termination signal |
| SIGUSR1 | 30,10,16 | Term | User-defined signal 1 |
| SIGUSR2 | 31,12,17 | Term | User-defined signal 2 |
| SIGCHLD | 20,17,18 | Ign | Child stopped or terminated |
| SIGCONT | 19,18,25 | Cont | Continue if stopped |
| SIGSTOP | 17,19,23 | Stop | Stop process, cannot be caught, blocked or ignored. |
| SIGTSTP | 18,20,24 | Stop | Stop typed at terminal |
| SIGTTIN | 21,21,26 | Stop | Terminal input for background process |
| SIGTTOU | 22,22,27 | Stop | Terminal output for background process |
| SIGIO | 23,29,22 | Term | I/O now possible (4.2BSD) |
| SIGPOLL | | Term | Pollable event (Sys V). Synonym for SIGIO |
| SIGPROF | 27,27,29 | Term | Profiling timer expired |
| SIGSYS | 12,31,12 | Core | Bad argument to routine (SVr4) |
| SIGURG | 16,23,21 | Ign | Urgent condition on socket (4.2BSD) |
| SIGVTALRM | 26,26,28 | Term | Virtual alarm clock (4.2BSD) |
| SIGXCPU | 24,24,30 | Core | CPU time limit exceeded (4.2BSD) |
| SIGXFSZ | 25,25,31 | Core | File size limit exceeded (4.2BSD) |
| SIGSTKFLT | 16 | Term | Stack fault on coprocessor (unused) |
| SIGCLD | 18 | Ign | A synonym for SIGCHLD |
| SIGPWR | 29,30,19 | Term | Power failure (System V) |
| SIGINFO | 29 | | A synonym for SIGPWR, on an alpha |
| SIGLOST | 29 | Term | File lock lost (unused), on a sparc |
| SIGWINCH | 28,28,20 | Ign | Window resize signal (4.3BSD, Sun) |
| SIGUNUSED | 31 | Core | Synonymous with SIGSYS |

以下情况会出现应用程序崩溃导致产生core dump:

  1. 内存访问越界 (数组越界、字符串无\n结束符、字符串读写越界)
  2. 多线程程序中使用了线程不安全的函数,如不可重入函数
  3. 多线程读写的数据未加锁保护(临界区资源需要互斥访问)
  4. 非法指针(如空指针异常或者非法地址访问)
  5. 堆栈溢出

三、怎么限制coredump文件的产生?

ulimit

说到限制就不得不提到ulimit工具了,这个工具其实有个坑,他并不是linux自带的工具,而是shell工具,也就是说他的级别是shell层的。

当我们新开启一个终端后之前终端的全部配置都会消失,并且他还看人下菜碟,每个linux用设置的都是自己的ulimit,比如你有两个用户,在同一个终端下对当前终端配置

一个 ulimit -c 1024

一个ulimit -c 2048

他们的限制都作用在各自的范围

即便同一个目录的同一个文件,本来这个coredump可以产生4096k个字节那么a的coredump就是1024,b的就是2048当然,不会很准确的限制,后面会从源码上给大家说明为什么。

还有就是可以通过配置文件,但是这个优先级虽然高但是可以改,这个文件只在重启时被读取一次,后面其它人再来改这个限制你之前的限制也就无效了。

半永久限制

vim /etc/profile

相当于每开一个shell时,自动执行ulimit -c unlimited(表示无限制其它具体数字通常表示多少k)

永久限制

编辑/etc/security/limits.conf,需要重新登录,或者重新打开ssh客户端连接,永久生效

下面就是软限制和硬限制

四、从源码分析如何对coredump文件的名字和路径管理

命名

在/etc/sysctrl.conf中保存的是用户给内核传递的参数,这里有个core_pattern参数是告诉内核怎么管理coredump的。

后面直接写等号就表示这个coredump文件的命名

%% 单个%字符

%p 所dump进程的进程ID

%u 所dump进程的实际用户ID

%g 所dump进程的实际组ID

%s 导致本次core dump的信号

%t core dump的时间 (由1970年1月1日计起的秒数)

%h 主机名

%e 程序文件名

管理

复制代码
kernel.core_pattern=|/usr/bin/coredump_helper.sh core_%e_%I_%p_sig_%s_time_%t.gz

像上面那样配置core_pattern后内核就可以使用我们指定的脚本去管理coredump文件了。

这个路径必须是绝对路径,前面必须是|前后都不能有空格,coredump文件会作为参数传入脚本。

这里我直接将文件以压缩格式传递给了脚本

#!/bin/sh

if [ ! -d "/var/coredump" ];then
    mkdir -p /var/coredump
fi
gzip > "/var/coredump/$1"

上面的代码就是一个简单的示例可以把coredump都放到var下的coredump目录中,我们还可以添加一些遍历判断操作限制产生coredump的数量的单个coredump的大小。

cpp 复制代码
#!/bin/bash  
  
# Set the maximum number of files allowed in /var/coredump  
max_files=1000  
  
# Check if /var/coredump directory exists  
if [ ! -d "/var/coredump" ]; then  
  mkdir -p /var/coredump  
fi  
  
# Get the current number of files in /var/coredump  
current_files=$(ls -1 /var/coredump | wc -l)  
  
# Check if the number of files exceeds the maximum limit  
if [ $current_files -ge $max_files ]; then  
  # Sort the files based on creation time (oldest first)  
  sorted_files=$(ls -1t /var/coredump | head -n 1)  
    
  # Remove the oldest files to reduce the number of files to below the limit  
  rm $sorted_files  
fi  
  
# Compress the file and store it in /var/coredump using gzip  
gzip > "/var/coredump/$1"

代码很简陋正常还要有出错管理,异常情况打印到日志备份等等。

一些问题的答案

1、为什么新的ubuntu不能产生coredump了,需要手动管理?

因为现在的ubuntu使用的是systemd,在这里有些服务会对内核管理coredump的接口做配置,导致coredump文件被服务写到接口的脚本管理了。

2、systemd是什么时候出现在ubuntu中的?

systemd是在Ubuntu 15.04版本中首次出现的。在此之前,Ubuntu使用Upstart作为默认的init系统。然而,从Ubuntu 15.04开始,Ubuntu采用了systemd作为默认的init系统和服务管理器。systemd是一个用于启动、停止和管理系统进程的软件套件,它提供了更快的启动速度、更好的系统资源管理和更强大的服务控制能力。

而在Upstart之前用的才是SysVinit。

3、为什么使用脚本管理coredump后ulimit就不能限制coredump文件的大小了。

这个问题就要看源码了

cpp 复制代码
void do_coredump(const siginfo_t *siginfo)
{
	struct core_state core_state;
	struct core_name cn;
	struct mm_struct *mm = current->mm;
	struct linux_binfmt * binfmt;
	const struct cred *old_cred;
	struct cred *cred;
	int retval = 0;
	int flag = 0;
	int ispipe;
	struct files_struct *displaced;
	bool need_nonrelative = false;
	bool core_dumped = false;
	static atomic_t core_dump_count = ATOMIC_INIT(0);
	struct coredump_params cprm = {
		.siginfo = siginfo,
		.regs = signal_pt_regs(),
		.limit = rlimit(RLIMIT_CORE),
		/*
		 * We must use the same mm->flags while dumping core to avoid
		 * inconsistency of bit flags, since this flag is not protected
		 * by any locks.
		 */
		.mm_flags = mm->flags,
	};

	audit_core_dumps(siginfo->si_signo);

	binfmt = mm->binfmt;
	if (!binfmt || !binfmt->core_dump)
		goto fail;
	if (!__get_dumpable(cprm.mm_flags))
		goto fail;

	cred = prepare_creds();
	if (!cred)
		goto fail;
	/*
	 * We cannot trust fsuid as being the "true" uid of the process
	 * nor do we know its entire history. We only know it was tainted
	 * so we dump it as root in mode 2, and only into a controlled
	 * environment (pipe handler or fully qualified path).
	 */
	if (__get_dumpable(cprm.mm_flags) == SUID_DUMP_ROOT) {
		/* Setuid core dump mode */
		flag = O_EXCL;		/* Stop rewrite attacks */
		cred->fsuid = GLOBAL_ROOT_UID;	/* Dump root private */
		need_nonrelative = true;
	}

	retval = coredump_wait(siginfo->si_signo, &core_state);
	if (retval < 0)
		goto fail_creds;

	old_cred = override_creds(cred);

	ispipe = format_corename(&cn, &cprm);

	if (ispipe) {
		int dump_count;
		char **helper_argv;
		struct subprocess_info *sub_info;

		if (ispipe < 0) {
			printk(KERN_WARNING "format_corename failed\n");
			printk(KERN_WARNING "Aborting core\n");
			goto fail_unlock;
		}

		if (cprm.limit == 1) {
			/* See umh_pipe_setup() which sets RLIMIT_CORE = 1.
			 *
			 * Normally core limits are irrelevant to pipes, since
			 * we're not writing to the file system, but we use
			 * cprm.limit of 1 here as a speacial value, this is a
			 * consistent way to catch recursive crashes.
			 * We can still crash if the core_pattern binary sets
			 * RLIM_CORE = !1, but it runs as root, and can do
			 * lots of stupid things.
			 *
			 * Note that we use task_tgid_vnr here to grab the pid
			 * of the process group leader.  That way we get the
			 * right pid if a thread in a multi-threaded
			 * core_pattern process dies.
			 */
			printk(KERN_WARNING
				"Process %d(%s) has RLIMIT_CORE set to 1\n",
				task_tgid_vnr(current), current->comm);
			printk(KERN_WARNING "Aborting core\n");
			goto fail_unlock;
		}
		cprm.limit = RLIM_INFINITY;

		dump_count = atomic_inc_return(&core_dump_count);
		if (core_pipe_limit && (core_pipe_limit < dump_count)) {
			printk(KERN_WARNING "Pid %d(%s) over core_pipe_limit\n",
			       task_tgid_vnr(current), current->comm);
			printk(KERN_WARNING "Skipping core dump\n");
			goto fail_dropcount;
		}

		helper_argv = argv_split(GFP_KERNEL, cn.corename, NULL);
		if (!helper_argv) {
			printk(KERN_WARNING "%s failed to allocate memory\n",
			       __func__);
			goto fail_dropcount;
		}

		retval = -ENOMEM;
		sub_info = call_usermodehelper_setup(helper_argv[0],
						helper_argv, NULL, GFP_KERNEL,
						umh_pipe_setup, NULL, &cprm);
		if (sub_info)
			retval = call_usermodehelper_exec(sub_info,
							  UMH_WAIT_EXEC);

		argv_free(helper_argv);
		if (retval) {
			printk(KERN_INFO "Core dump to |%s pipe failed\n",
			       cn.corename);
			goto close_fail;
		}
	} else {
		struct inode *inode;

		if (cprm.limit < binfmt->min_coredump)
			goto fail_unlock;

		if (need_nonrelative && cn.corename[0] != '/') {
			printk(KERN_WARNING "Pid %d(%s) can only dump core "\
				"to fully qualified path!\n",
				task_tgid_vnr(current), current->comm);
			printk(KERN_WARNING "Skipping core dump\n");
			goto fail_unlock;
		}

		cprm.file = filp_open(cn.corename,
				 O_CREAT | 2 | O_NOFOLLOW | O_LARGEFILE | flag,
				 0600);
		if (IS_ERR(cprm.file))
			goto fail_unlock;

		inode = file_inode(cprm.file);
		if (inode->i_nlink > 1)
			goto close_fail;
		if (d_unhashed(cprm.file->f_path.dentry))
			goto close_fail;
		/*
		 * AK: actually i see no reason to not allow this for named
		 * pipes etc, but keep the previous behaviour for now.
		 */
		if (!S_ISREG(inode->i_mode))
			goto close_fail;
		/*
		 * Dont allow local users get cute and trick others to coredump
		 * into their pre-created files.
		 */
		if (!uid_eq(inode->i_uid, current_fsuid()))
			goto close_fail;
		if (!cprm.file->f_op->write)
			goto close_fail;
		if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
			goto close_fail;
	}

	/* get us an unshared descriptor table; almost always a no-op */
	retval = unshare_files(&displaced);
	if (retval)
		goto close_fail;
	if (displaced)
		put_files_struct(displaced);
	if (!dump_interrupted()) {
		file_start_write(cprm.file);
		core_dumped = binfmt->core_dump(&cprm);
		file_end_write(cprm.file);
	}
	if (ispipe && core_pipe_limit)
		wait_for_dump_helpers(cprm.file);
close_fail:
	if (cprm.file)
		filp_close(cprm.file, NULL);
fail_dropcount:
	if (ispipe)
		atomic_dec(&core_dump_count);
fail_unlock:
	kfree(cn.corename);
	coredump_finish(mm, core_dumped);
	revert_creds(old_cred);
fail_creds:
	put_cred(cred);
fail:
	return;
}

从源码中可以看到在产生coredump的源码中会对上面的core_pattern进行判断,如果传入了脚本就会走上面的分支,没传入走下面的分支,上面只判断了limit是不是1,1表示不产生coredump,如果不是1就赋值为0,0就是没有限制。

而下面有这个判断,当已经产生的coredump文件超过ulimit -c的限制时就会停止产生。有一个点需要注意,ulimit-c和这里的limit不是完全对应的,是在shell源码中有转化的,不过大体上还是对应的。

4、为什么ulimit限制的大小并不准确?

还是要看这张图,它每次写入的都是内核设计好的,就写这些。

在外面有个for循环调用do_coredump文件.

cpp 复制代码
int get_signal_to_deliver(siginfo_t *info, struct k_sigaction *return_ka,
			  struct pt_regs *regs, void *cookie)
{
	struct sighand_struct *sighand = current->sighand;
	struct signal_struct *signal = current->signal;
	int signr;

	if (unlikely(current->task_works))
		task_work_run();

	if (unlikely(uprobe_deny_signal()))
		return 0;

	/*
	 * Do this once, we can't return to user-mode if freezing() == T.
	 * do_signal_stop() and ptrace_stop() do freezable_schedule() and
	 * thus do not need another check after return.
	 */
	try_to_freeze();

relock:
	spin_lock_irq(&sighand->siglock);
	/*
	 * Every stopped thread goes here after wakeup. Check to see if
	 * we should notify the parent, prepare_signal(SIGCONT) encodes
	 * the CLD_ si_code into SIGNAL_CLD_MASK bits.
	 */
	if (unlikely(signal->flags & SIGNAL_CLD_MASK)) {
		int why;

		if (signal->flags & SIGNAL_CLD_CONTINUED)
			why = CLD_CONTINUED;
		else
			why = CLD_STOPPED;

		signal->flags &= ~SIGNAL_CLD_MASK;

		spin_unlock_irq(&sighand->siglock);

		/*
		 * Notify the parent that we're continuing.  This event is
		 * always per-process and doesn't make whole lot of sense
		 * for ptracers, who shouldn't consume the state via
		 * wait(2) either, but, for backward compatibility, notify
		 * the ptracer of the group leader too unless it's gonna be
		 * a duplicate.
		 */
		read_lock(&tasklist_lock);
		do_notify_parent_cldstop(current, false, why);

		if (ptrace_reparented(current->group_leader))
			do_notify_parent_cldstop(current->group_leader,
						true, why);
		read_unlock(&tasklist_lock);

		goto relock;
	}

	for (;;) {
		struct k_sigaction *ka;

		if (unlikely(current->jobctl & JOBCTL_STOP_PENDING) &&
		    do_signal_stop(0))
			goto relock;

		if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
			do_jobctl_trap();
			spin_unlock_irq(&sighand->siglock);
			goto relock;
		}

		signr = dequeue_signal(current, &current->blocked, info);

		if (!signr)
			break; /* will return 0 */

		if (unlikely(current->ptrace) && signr != SIGKILL) {
			signr = ptrace_signal(signr, info);
			if (!signr)
				continue;
		}

		ka = &sighand->action[signr-1];

		/* Trace actually delivered signals. */
		trace_signal_deliver(signr, info, ka);

		if (ka->sa.sa_handler == SIG_IGN) /* Do nothing.  */
			continue;
		if (ka->sa.sa_handler != SIG_DFL) {
			/* Run the handler.  */
			*return_ka = *ka;

			if (ka->sa.sa_flags & SA_ONESHOT)
				ka->sa.sa_handler = SIG_DFL;

			break; /* will return non-zero "signr" value */
		}

		/*
		 * Now we are doing the default action for this signal.
		 */
		if (sig_kernel_ignore(signr)) /* Default is nothing. */
			continue;

		/*
		 * Global init gets no signals it doesn't want.
		 * Container-init gets no signals it doesn't want from same
		 * container.
		 *
		 * Note that if global/container-init sees a sig_kernel_only()
		 * signal here, the signal must have been generated internally
		 * or must have come from an ancestor namespace. In either
		 * case, the signal cannot be dropped.
		 */
		if (unlikely(signal->flags & SIGNAL_UNKILLABLE) &&
				!sig_kernel_only(signr))
			continue;

		if (sig_kernel_stop(signr)) {
			/*
			 * The default action is to stop all threads in
			 * the thread group.  The job control signals
			 * do nothing in an orphaned pgrp, but SIGSTOP
			 * always works.  Note that siglock needs to be
			 * dropped during the call to is_orphaned_pgrp()
			 * because of lock ordering with tasklist_lock.
			 * This allows an intervening SIGCONT to be posted.
			 * We need to check for that and bail out if necessary.
			 */
			if (signr != SIGSTOP) {
				spin_unlock_irq(&sighand->siglock);

				/* signals can be posted during this window */

				if (is_current_pgrp_orphaned())
					goto relock;

				spin_lock_irq(&sighand->siglock);
			}

			if (likely(do_signal_stop(info->si_signo))) {
				/* It released the siglock.  */
				goto relock;
			}

			/*
			 * We didn't actually stop, due to a race
			 * with SIGCONT or something like that.
			 */
			continue;
		}

		spin_unlock_irq(&sighand->siglock);

		/*
		 * Anything else is fatal, maybe with a core dump.
		 */
		current->flags |= PF_SIGNALED;

		if (sig_kernel_coredump(signr)) {
			if (print_fatal_signals)
				print_fatal_signal(info->si_signo);
			proc_coredump_connector(current);
			/*
			 * If it was able to dump core, this kills all
			 * other threads in the group and synchronizes with
			 * their demise.  If we lost the race with another
			 * thread getting here, it set group_exit_code
			 * first and our do_group_exit call below will use
			 * that value and ignore the one we pass it.
			 */
			do_coredump(info);
		}

		/*
		 * Death signals, no core dump.
		 */
		do_group_exit(info->si_signo);
		/* NOTREACHED */
	}
	spin_unlock_irq(&sighand->siglock);
	return signr;
}

哪次判断超出了就会停止生成.

5、使用脚本就没办法限制coredump的大小了嘛?

可以的,但是limit这个系统调用肯定是不行了,源码中我们就看到了人家都不鸟你。但是我们可以ulimit -f,这个脚本coredump文件从内核态到用户态是通过这个传入的脚本,所以直接ulimit -f限制脚本生成的全部文件大小就ok了,但是这里要写成正常两倍大小因为是先获取在写入,我们一个字节计算两次,这次限制的大小就完全准确了,也有个坏处就是文件会损坏限制住了也没法去看这个coredump。

6、ubuntu现在对coredump文件的管理方式

由于使用systemd机制,所以现在的coredump是以服务的方式存在的,为了可以更好的保存和规范化管理,coredump内容被放到了日志中,使用coredump@和socket服务共同实现,将coredump文件从内核态转移到用户态后又用了socket机制做过滤和写入日志。完美限制大小和内容。只要日志管理的好可以想看任何时候的有用的coredump。


这是我工作中遇到的第一个做了两周的问题的一部分,当时贼痛苦各种查资料翻源码,好在最后想到了还算不错的解决方案,工作的提升是真的大,不止是技术,主要是团队写作能力,做事情的条例,规范化做事才会少出错。这些很重要。还有表达能力,不然你写的永远都是垃圾。下周有时间更新另一半,日志的管理。

相关推荐
frank00600713 分钟前
linux 使用mdadm 创建raid0 nvme 磁盘
linux·运维
绿白尼5 分钟前
进程与线程
linux
iangyu7 分钟前
linux命令之pwdx
linux·运维·服务器
C语言扫地僧27 分钟前
Docker 镜像制作(Dockerfile)
linux·服务器·docker·容器
Xinan_____35 分钟前
Linux——高流量 高并发(访问场景) 高可用(架构要求)
linux·运维·架构
周伯通*1 小时前
Windows上,使用远程桌面连接Ubuntu
linux·windows·ubuntu
Spring-wind1 小时前
【linux】 date命令
linux
钡铼技术物联网关1 小时前
Codesys 与 ARMxy ARM 工业控制器:工业控制的黄金组合
linux·运维·服务器·arm开发·硬件工程
Dola_Pan4 小时前
Linux文件IO(一)-open使用详解
java·linux·dubbo
Spring-wind4 小时前
【linux】pwd命令
linux