Linux Interrupt Management

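The Concept of Interrupts

An interrupt is an event, internal or external to the CPU or arranged in advance by the program, that makes the CPU temporarily stop the program it is running, branch to a routine that services the event, and return to resume the interrupted program once the service is complete. Linux usually distinguishes external interrupts (also called hardware interrupts) from internal interrupts (also called exceptions).

After software has configured a piece of hardware, it often has to wait for the hardware to reach some state (for example, data has arrived). There are two ways to do this. One is polling: the CPU reads the hardware status over and over. The other is to have the hardware raise an interrupt when the event completes, so that the CPU sets aside what it is doing and handles the interrupt. Because the CPU is free to do other work while it waits, the interrupt-driven approach generally improves system throughput.

When the CPU receives an interrupt request (IRQ), it runs the handler for that interrupt, the interrupt service routine (ISR). Normally there is an interrupt vector table that defines the entry point of the handler for each peripheral interrupt the CPU supports; when the interrupt fires, the CPU jumps straight to that entry and starts executing. Code reached this way runs in interrupt context. (Note: code in interrupt context must not block or sleep.)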

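Linux Interrupts: Top Half and Bottom Half

Anyone who has worked with an MCU knows that an ISR should finish its job quickly and return, because the rest of the system is held up while it runs. Yet some work has to be done inside the ISR, such as clearing the interrupt flag, reading or writing data, and accessing registers.

Linux has the same requirement: the ISR should finish as soon as possible. In practice, though, some ISRs carry a heavy workload and take a long time, which hurts responsiveness. To deal with this, Linux splits interrupt handling into two parts:

  1. Top half: runs immediately when the interrupt arrives and is under a strict time limit. It does only what is essential, such as acknowledging or resetting the hardware. This work is done with interrupts disabled.

  2. Bottom half: work that can be postponed is done here. At a suitable later time the bottom half runs with interrupts enabled. (The concrete mechanisms, softirqs, tasklets and workqueues, are analysed in later chapters.)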
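The split between the two halves can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct rx_queue`, `top_half()` and `bottom_half()` are hypothetical names invented for illustration, and a plain ring buffer stands in for whatever deferral mechanism (softirq, tasklet, workqueue) a real driver would use.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 8 /* hypothetical queue size; one slot is kept empty */

/* Hypothetical ring buffer shared by the two halves. */
struct rx_queue {
    unsigned char data[QUEUE_LEN];
    size_t head; /* next slot the top half writes */
    size_t tail; /* next slot the bottom half reads */
};

/*
 * Top half: copy the byte out of the (simulated) device register and
 * return at once. Returns 0 on success, -1 if the queue is full and
 * the byte had to be dropped.
 */
int top_half(struct rx_queue *q, unsigned char hw_byte)
{
    size_t next = (q->head + 1) % QUEUE_LEN;

    if (next == q->tail)
        return -1;
    q->data[q->head] = hw_byte;
    q->head = next;
    return 0;
}

/*
 * Bottom half: runs later, drains the queue and does the slow work
 * (here simply adding the bytes into *sum). Returns the number of
 * bytes processed.
 */
size_t bottom_half(struct rx_queue *q, unsigned long *sum)
{
    size_t n = 0;

    while (q->tail != q->head) {
        *sum += q->data[q->tail];
        q->tail = (q->tail + 1) % QUEUE_LEN;
        n++;
    }
    return n;
}
```

The point of the design is that `top_half()` does a bounded, constant amount of work no matter how expensive the later processing is; everything whose cost depends on the data is left to `bottom_half()`.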

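Interrupt Handlers

A driver can use the following interface:

static inline int __must_check request_irq(unsigned int irq, irq_handler_t handler, unsigned long flags, const char *name, void *dev)

to request and register an interrupt handler with the system. Its parameters are: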

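Parameter   Meaning
irq         The interrupt number. A CPU's interrupt numbers are generally defined in advance.
handler     The ISR to run when the interrupt fires.
flags       Interrupt flags (IRQF_DISABLED / IRQF_SAMPLE_RANDOM / IRQF_TIMER / IRQF_SHARED).
name        ASCII text naming the device behind the interrupt, for example "keyboard". The name is shown under /proc/irq and in /proc/interrupts.
dev         Used for shared interrupt lines; passes the driver's device structure. For a non-shared interrupt it can simply be NULL.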

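Meaning of the interrupt flags:

Flag                 Meaning
IRQF_DISABLED        The kernel disables other interrupts while this ISR runs (not used in most cases).
IRQF_SAMPLE_RANDOM   Interrupts from this device contribute to the kernel entropy pool.
IRQF_TIMER           Reserved for the system timer.
IRQF_SHARED          The interrupt line is shared among several handlers. Every handler registered on a given line must set this flag.

(IRQF_DISABLED and IRQF_SAMPLE_RANDOM belong to older kernels; recent kernels no longer define them.)

request_irq returns 0 on success. A common error is -EBUSY, which means the given interrupt line is already in use (or IRQF_SHARED was not specified).

Note: request_irq may sleep, so it must not be called from interrupt context or from any other code that is not allowed to sleep.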

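Releasing an interrupt:

const void *free_irq(unsigned int irq, void *dev_id) // releases the interrupt handler

Note: interrupt handlers in Linux do not need to be reentrant. While a given handler is running, its interrupt line is masked on all processors, so that no new interrupt can arrive on the same line. Normally all other interrupts stay enabled, meaning that interrupts on other lines can still be handled, but the current line is always disabled. As a result, a handler is never nested inside itself. In addition, Linux on ARM makes no use of interrupt priorities (that is, it does not use FIQ), so interrupts do not nest on ARM.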
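The sharing rule behind -EBUSY can be illustrated with a user-space model. This is a sketch, not kernel code: every `model_` name and the `MODEL_SHARED` flag are invented stand-ins for `request_irq()`, `free_irq()` and `IRQF_SHARED`, and the table sizes are arbitrary assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_IRQS 16   /* hypothetical number of lines */
#define MODEL_MAX_ACT 4    /* hypothetical handlers per line */
#define MODEL_SHARED  0x1u /* stands in for IRQF_SHARED */

struct model_action { const void *dev_id; unsigned int flags; };
struct model_line   { struct model_action act[MODEL_MAX_ACT]; int count; };

static struct model_line model_lines[MODEL_NR_IRQS];

/* Register a handler on a line; returns 0 or a negative errno value. */
int model_request_irq(unsigned int irq, unsigned int flags, const void *dev_id)
{
    struct model_line *line;

    if (irq >= MODEL_NR_IRQS)
        return -EINVAL;
    /* A shared handler needs a unique cookie so it can be freed later. */
    if ((flags & MODEL_SHARED) && dev_id == NULL)
        return -EINVAL;

    line = &model_lines[irq];
    if (line->count > 0) {
        /* Line in use: old and new handlers must both agree to share. */
        if (!(flags & MODEL_SHARED) || !(line->act[0].flags & MODEL_SHARED))
            return -EBUSY;
        if (line->count == MODEL_MAX_ACT)
            return -ENOMEM;
    }
    line->act[line->count].dev_id = dev_id;
    line->act[line->count].flags = flags;
    line->count++;
    return 0;
}

/* Remove the handler identified by dev_id; returns 0 or -ENOENT. */
int model_free_irq(unsigned int irq, const void *dev_id)
{
    struct model_line *line;
    int i, j;

    if (irq >= MODEL_NR_IRQS)
        return -EINVAL;
    line = &model_lines[irq];
    for (i = 0; i < line->count; i++) {
        if (line->act[i].dev_id != dev_id)
            continue;
        for (j = i; j < line->count - 1; j++)
            line->act[j] = line->act[j + 1];
        line->count--;
        return 0;
    }
    return -ENOENT;
}
```

The model also shows why `dev` matters on a shared line: it is the only thing that tells the handlers on that line apart when one of them is released.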

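Interrupt Context

Unlike process context, the kernel is in interrupt context while it runs an interrupt service routine. The handler does not have a stack of its own; it uses the kernel stack, whose size is limited (typically 8 KB on a 32-bit machine), so the handler must be short and lean. The handler also interrupts the normal flow of execution, which is another reason it must run quickly. Finally, sleeping or blocking is not allowed in interrupt context.

The reasons interrupt context must not sleep are:

1. No process switch should happen during interrupt handling. In interrupt context the only thing that can interrupt the current handler is a higher-priority interrupt; a process cannot. If the handler went to sleep there would be no way to wake it, because every wake_up_xxx function targets a process, and interrupt context has no notion of a process and no task_struct of its own (the same holds for softirqs and tasklets). If the handler really does sleep, for example by calling a routine that blocks, the kernel will almost certainly die.

2. When schedule() switches processes, it saves the current process context (CPU register values, process state and stack contents) so that the process can be resumed later. When an interrupt occurs, the kernel first saves the context of the interrupted process and restores it after the handler returns. Inside the handler, however, the CPU registers have already changed (most importantly the program counter PC and the stack pointer SP). If a sleeping or blocking operation called schedule() at that point, the context it saved would no longer be that of the current process. That is why schedule() must not be called from an interrupt handler.

3. schedule() itself checks on entry whether it is running in interrupt context: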

if(unlikely(in_interrupt()))

BUG();

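So forcing a call to schedule() simply triggers a kernel BUG.

4. The handler uses the kernel stack of the interrupted process but leaves it unaffected: by the time the handler finishes it has fully released the part of the stack it used, restoring the state from before the interrupt.

5. The kernel is not preemptible while in interrupt context, so if the handler sleeps the kernel is certain to hang.

Interrupt Handling Flow

When an interrupt occurs, the CPU executes the code at the exception vector vector_irq, that is, the IRQ entry in the exception vector table. That entry is a branch instruction that jumps to the real interrupt handling code, and vector_irq eventually calls the common entry function for interrupt handling.

On ARM64, exception levels 1, 2 and 3 each have their own exception vector table, whose starting virtual address is held in the register VBAR_ELn (Vector Base Address Register). Each table has 16 entries in 4 groups of 4; each entry is 128 bytes long, enough for 32 instructions. The vector table for exception level n is shown below.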

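Exception vector table for exception level n

Address             Exception type   Description
VBAR_ELn + 0x000    Synchronous      Exception generated at the current exception level, using the exception level 0 stack pointer SP_EL0
         + 0x080    IRQ
         + 0x100    FIQ
         + 0x180    SError
         + 0x200    Synchronous      Exception generated at the current exception level, using the current level's stack pointer SP_ELn
         + 0x280    IRQ
         + 0x300    FIQ
         + 0x380    SError
         + 0x400    Synchronous      Exception generated by a 64-bit application at exception level (n-1)
         + 0x480    IRQ
         + 0x500    FIQ
         + 0x580    SError
         + 0x600    Synchronous      Exception generated by a 32-bit application at exception level (n-1)
         + 0x680    IRQ
         + 0x700    FIQ
         + 0x780    SError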
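The layout of the table is regular enough to compute. The sketch below derives an entry's address from the group and the exception type; the enum and function names are made up for illustration and do not appear in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* The four groups of the table, in address order (hypothetical names). */
enum vec_source { CUR_EL_SP0 = 0, CUR_EL_SPX = 1, LOWER_EL_A64 = 2, LOWER_EL_A32 = 3 };

/* The four exception types inside each group, in address order. */
enum vec_type { VEC_SYNC = 0, VEC_IRQ = 1, VEC_FIQ = 2, VEC_SERROR = 3 };

#define VEC_ENTRY_SIZE 0x80u /* 128 bytes, i.e. 32 four-byte instructions */
#define VEC_PER_GROUP  4u

/* Address of one vector, given the table base held in VBAR_ELn. */
uint64_t vector_address(uint64_t vbar, enum vec_source src, enum vec_type type)
{
    return vbar + ((uint64_t)src * VEC_PER_GROUP + (uint64_t)type) * VEC_ENTRY_SIZE;
}
```

For example, `vector_address(0, CUR_EL_SPX, VEC_IRQ)` gives 0x280, the entry taken for an IRQ that arrives while the kernel runs on SP_ELn. The whole table occupies 16 x 128 = 2048 bytes, which matches the 2 KB alignment that `.align 11` requests in the kernel source.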

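The exception vector table defined by the ARM64 kernel is as follows:

This material was covered in the system call chapter of "Interaction Between the Linux Application Layer and the Kernel"; only the interrupt-related parts are listed here.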

arch/arm64/kernel/entry.S:

/*
 * Exception vectors.
 */
.pushsection ".entry.text", "ax"
.align 11
ENTRY(vectors)
kernel_ventry 1, sync_invalid // synchronous exception generated at EL1, using SP_EL0
kernel_ventry 1, irq_invalid // IRQ generated at EL1, using SP_EL0
kernel_ventry 1, fiq_invalid // FIQ generated at EL1, using SP_EL0
kernel_ventry 1, error_invalid // SError generated at EL1, using SP_EL0

kernel_ventry 1, sync // synchronous exception generated at EL1, using SP_EL1
kernel_ventry 1, irq // IRQ generated at EL1, using SP_EL1
kernel_ventry 1, fiq_invalid // FIQ generated at EL1, using SP_EL1
kernel_ventry 1, error_invalid // SError generated at EL1, using SP_EL1

kernel_ventry 0, sync // synchronous exception generated by a 64-bit application at EL0
kernel_ventry 0, irq // IRQ generated by a 64-bit application at EL0
kernel_ventry 0, fiq_invalid // FIQ generated by a 64-bit application at EL0
kernel_ventry 0, error_invalid // SError generated by a 64-bit application at EL0

#ifdef CONFIG_COMPAT
kernel_ventry 0, sync_compat, 32 // synchronous exception generated by a 32-bit application at EL0
kernel_ventry 0, irq_compat, 32 // IRQ generated by a 32-bit application at EL0
kernel_ventry 0, fiq_invalid_compat, 32 // FIQ generated by a 32-bit application at EL0
kernel_ventry 0, error_invalid_compat, 32 // SError generated by a 32-bit application at EL0
#else
kernel_ventry 0, sync_invalid, 32 // synchronous exception generated by a 32-bit application at EL0
kernel_ventry 0, irq_invalid, 32 // IRQ generated by a 32-bit application at EL0
kernel_ventry 0, fiq_invalid, 32 // FIQ generated by a 32-bit application at EL0
kernel_ventry 0, error_invalid, 32 // SError generated by a 32-bit application at EL0
#endif
END(vectors)

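kernel_ventry is a macro whose label argument is the branch target, i.e. the label of the exception handler. The macro is defined as follows (arch/arm64/kernel/entry.S):

.macro kernel_ventry, el, label, regsize = 64
.align 7
sub sp, sp, #S_FRAME_SIZE // reserve one frame on the stack; the size is sizeof(struct pt_regs)
#ifdef CONFIG_VMAP_STACK
.... (stack overflow check omitted here)
#endif
b el\()\el\()_\label // branch to the handler for this level; kernel_ventry 1, irq becomes el1_irq
.endm

".align 7" aligns the address of the next instruction to 2^7, i.e. 128 bytes. For the vector table entry kernel_ventry 1, irq, the instruction b el\()\el\()_\label branches to el1_irq. The 1 tells which exception level the exception came from: user to kernel is 0, kernel to kernel is 1.

Each CPU sets the vector table address during its initialization.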

arch/arm64/kernel/head.S

__primary_switched:

adrp x4, init_thread_union

add sp, x4, #THREAD_SIZE

adr_l x5, init_task

msr sp_el0, x5 // Save thread_info

adr_l x8, vectors // load VBAR_EL1 with virtual

msr vbar_el1, x8 // vector table address

isb

stp xzr, x30, [sp, #-16]!

mov x29, sp

str_l x21, __fdt_pointer, x5 // Save FDT pointer

ldr_l x4, kimage_vaddr // Save the offset between

sub x4, x4, x0 // the kernel virtual and

str_l x4, kimage_voffset, x5 // physical mappings

// Clear BSS

adr_l x0, __bss_start

mov x1, xzr

adr_l x2, __bss_stop

sub x2, x2, x0

bl __pi_memset

dsb ishst // Make zero page visible to PTW

#ifdef CONFIG_KASAN

bl kasan_early_init

#endif

#ifdef CONFIG_RANDOMIZE_BASE

tst x23, ~(MIN_KIMG_ALIGN - 1) // already running randomized?

b.ne 0f

mov x0, x21 // pass FDT address in x0

bl kaslr_early_init // parse FDT for KASLR options

cbz x0, 0f // KASLR disabled? just proceed

orr x23, x23, x0 // record KASLR offset

ldp x29, x30, [sp], #16 // we must enable KASLR, return

ret // to __primary_switch()

0:

#endif

add sp, sp, #16

mov x29, #0

mov x30, #0

b start_kernel

ENDPROC(__primary_switched)

__secondary_switched:

adr_l x5, vectors // set the interrupt vector table address

msr vbar_el1, x5

isb

adr_l x0, secondary_data

ldr x1, [x0, #CPU_BOOT_STACK] // get secondary_data.stack

mov sp, x1

ldr x2, [x0, #CPU_BOOT_TASK]

msr sp_el0, x2

mov x29, #0

mov x30, #0

b secondary_start_kernel

ENDPROC(__secondary_switched)

When an interrupt is raised, the GIC signals the target CPU. The CPU detects the signal and, by way of the vector table, jumps to el1_irq.

arch/arm64/kernel/entry.S

el1_irq:

kernel_entry 1

enable_dbg

#ifdef CONFIG_TRACE_IRQFLAGS

bl trace_hardirqs_off

#endif

irq_handler

#ifdef CONFIG_PREEMPT

get_thread_info tsk

ldr w24, [tsk, #TI_PREEMPT] // get preempt count

cbnz w24, 1f // preempt count != 0

ldr x0, [tsk, #TI_FLAGS] // get flags

tbz x0, #TIF_NEED_RESCHED, 1f // needs rescheduling?

bl el1_preempt

1:

#endif

#ifdef CONFIG_TRACE_IRQFLAGS

bl trace_hardirqs_on

#endif

kernel_exit 1

ENDPROC(el1_irq)

/*

* Interrupt handling.

*/

.macro irq_handler

#ifdef CONFIG_STRICT_MEMORY_RWX

ldr x1, =handle_arch_irq

ldr x1, [x1]

#else

ldr x1, handle_arch_irq

#endif

mov x0, sp

blr x1

.endm

.text

arch/arm64/kernel/irq.c

void __init set_handle_irq(void (*handle_irq)(struct pt_regs *))

{

if (handle_arch_irq)

return;

handle_arch_irq = handle_irq;

}

When the GICv2 interrupt controller is initialized, it calls set_handle_irq(gic_handle_irq);

dtb:

gic: interrupt-controller@1400000 {

compatible = "arm,gic-400";

#interrupt-cells = <3>;

interrupt-controller;

reg = <0x0 0x1401000 0 0x1000>, /* GICD */

<0x0 0x1402000 0 0x2000>, /* GICC */

<0x0 0x1404000 0 0x2000>, /* GICH */

<0x0 0x1406000 0 0x2000>; /* GICV */

interrupts = <1 9 0xf08>;

};

IRQCHIP_DECLARE(gic_400, "arm,gic-400", gic_of_init);

The code path that installs it: gic_of_init()->__gic_init_bases()->set_handle_irq(gic_handle_irq);

static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)

{

u32 irqstat, irqnr;

struct gic_chip_data *gic = &gic_data[0];

void __iomem *cpu_base = gic_data_cpu_base(gic);

do {

irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);

irqnr = irqstat & GICC_IAR_INT_ID_MASK;

if (likely(irqnr > 15 && irqnr < 1020)) {

if (static_key_true(&supports_deactivate))

writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);

isb();

handle_domain_irq(gic->domain, irqnr, regs); // call the corresponding interrupt handler

continue;

}

if (irqnr < 16) {

writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);

if (static_key_true(&supports_deactivate))

writel_relaxed(irqstat, cpu_base + GIC_CPU_DEACTIVATE);

#ifdef CONFIG_SMP

/*

* Ensure any shared data written by the CPU sending

* the IPI is read after we've read the ACK register

* on the GIC.

*

* Pairs with the write barrier in gic_raise_softirq

*/

smp_rmb();

handle_IPI(irqnr, regs); // SMP inter-processor interrupt

#endif

continue;

}

break;

} while (1);

}
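The dispatch decision in gic_handle_irq() depends only on the interrupt ID field of the acknowledge register. A minimal sketch of that classification follows; the enum and function names are hypothetical, the 0x3ff mask is assumed to match GICC_IAR_INT_ID_MASK, and the split between IDs 16-31 and 32 upward follows the `hw < 32` test that gic_irq_domain_map() applies further down.

```c
#include <assert.h>
#include <stdint.h>

#define IAR_INT_ID_MASK 0x3ffu /* assumed value of GICC_IAR_INT_ID_MASK */

enum irq_class {
    IRQ_CLASS_SGI,      /* 0..15: inter-processor interrupts, go to handle_IPI() */
    IRQ_CLASS_PPI,      /* 16..31: per-CPU peripheral interrupts */
    IRQ_CLASS_SPI,      /* 32..1019: shared peripheral interrupts */
    IRQ_CLASS_SPURIOUS  /* 1020..1023: nothing to handle, the loop ends */
};

/* Classify a raw value read from the interrupt acknowledge register. */
enum irq_class classify_iar(uint32_t irqstat)
{
    uint32_t irqnr = irqstat & IAR_INT_ID_MASK;

    if (irqnr < 16)
        return IRQ_CLASS_SGI;
    if (irqnr < 32)
        return IRQ_CLASS_PPI;
    if (irqnr < 1020)
        return IRQ_CLASS_SPI;
    return IRQ_CLASS_SPURIOUS;
}
```

Both the PPI and the SPI class take the handle_domain_irq() path in the driver. The bits above the ID field are masked off first, which is why the driver keeps the unmasked `irqstat` for the EOI write and uses `irqnr` only for dispatch.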

gic_handle_irq()->handle_domain_irq()->__handle_domain_irq()

static inline int handle_domain_irq(struct irq_domain *domain,

unsigned int hwirq, struct pt_regs *regs)

{

return __handle_domain_irq(domain, hwirq, true, regs);

}

/**

* __handle_domain_irq - Invoke the handler for a HW irq belonging to a domain

* @domain: The domain where to perform the lookup

* @hwirq: The HW irq number to convert to a logical one

* @lookup: Whether to perform the domain lookup or not

* @regs: Register file coming from the low-level handling code

*

* Returns: 0 on success, or -EINVAL if conversion has failed

*/

int __handle_domain_irq(struct irq_domain *domain, unsigned int hwirq,

bool lookup, struct pt_regs *regs)

{

struct pt_regs *old_regs = set_irq_regs(regs);

unsigned int irq = hwirq;

int ret = 0;

irq_enter();

#ifdef CONFIG_IRQ_DOMAIN

if (lookup)

irq = irq_find_mapping(domain, hwirq);

#endif

/*

* Some hardware gives randomly wrong interrupts. Rather

* than crashing, do something sensible.

*/

if (unlikely(!irq || irq >= nr_irqs)) {

ack_bad_irq(irq);

ret = -EINVAL;

} else {

generic_handle_irq(irq);

}

irq_exit();

set_irq_regs(old_regs);

return ret;

}

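Note the following here:

irq_enter is called first, to mark entry into hardware interrupt handling:

irq_enter updates some system statistics, and the __irq_enter macro disables preemption. When an IRQ is taken, ARM automatically sets the I bit in CPSR, blocking new IRQ requests until control reaches the flow-control layer, which re-enables them with local_irq_enable(). Why disable preemption as well? Because of interrupt nesting. If the flow-control layer or the driver enables IRQs with local_irq_enable() before the current interrupt has been fully handled and a new IRQ arrives, the code enters irq_enter again. When this nested interrupt returns, the kernel does not want to reschedule; it wants to wait until the outermost interrupt has finished. Disabling preemption is what guarantees this.

generic_handle_irq() is called next, and finally irq_exit removes the hardware-interrupt mark.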
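The nesting argument can be reduced to a counter. The sketch below is a deliberately simplified user-space model: `struct model_cpu` and the `model_` functions are hypothetical, and the real kernel keeps this state in preempt_count and makes the rescheduling decision on the exception return path rather than inside irq_exit().

```c
#include <assert.h>

/* Hypothetical per-CPU state for the model. */
struct model_cpu {
    unsigned int hardirq_depth; /* stands in for the hardirq bits of preempt_count */
    int need_resched;           /* set when a handler wakes a higher-priority task */
    unsigned int reschedules;   /* how many times the model "called the scheduler" */
};

/* Entering a (possibly nested) hardware interrupt. */
void model_irq_enter(struct model_cpu *cpu)
{
    cpu->hardirq_depth++;
}

/* Leaving an interrupt: only the outermost exit may reschedule. */
void model_irq_exit(struct model_cpu *cpu)
{
    assert(cpu->hardirq_depth > 0);
    cpu->hardirq_depth--;
    if (cpu->hardirq_depth == 0 && cpu->need_resched) {
        cpu->need_resched = 0;
        cpu->reschedules++;
    }
}
```

With two nested interrupts, the inner `model_irq_exit()` leaves `hardirq_depth` at 1 and does nothing else; the pending reschedule is honoured only when the outer exit brings the depth back to 0.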

gic_handle_irq()->handle_domain_irq()->__handle_domain_irq()->generic_handle_irq()

/**

* generic_handle_irq - Invoke the handler for a particular irq

* @irq: The irq number to handle

*

*/

int generic_handle_irq(unsigned int irq)

{

struct irq_desc *desc = irq_to_desc(irq);

if (!desc)

return -EINVAL;

generic_handle_irq_desc(desc);

return 0;

}

irq_to_desc first looks up the irq_desc interrupt descriptor from the number of the interrupt that fired, and then generic_handle_irq_desc is called:

gic_handle_irq()->handle_domain_irq()->__handle_domain_irq()->generic_handle_irq()->generic_handle_irq_desc()

/*

* Architectures call this to let the generic IRQ layer

* handle an interrupt.

*/

static inline void generic_handle_irq_desc(struct irq_desc *desc)

{

desc->handle_irq(desc);

}

This calls the handle_irq function. To complete the picture of the flow above, we still need to look at irq_to_desc:

struct irq_desc *irq_to_desc(unsigned int irq)

{

return (irq < NR_IRQS) ? irq_desc + irq : NULL;

}

NR_IRQS is the total number of interrupts supported, and irq must of course be less than it. In that case the function returns irq_desc + irq.

irq_desc is a global array:

struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = {

[0 ... NR_IRQS-1] = {

.handle_irq = handle_bad_irq,

.depth = 1,

.lock = __RAW_SPIN_LOCK_UNLOCKED(irq_desc->lock),

}

};

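This is where the array is initialized. Every handle_irq pointer starts out as handle_bad_irq.

Attentive readers may have noticed that desc->handle_irq(desc) is not the handler we registered: the two function prototypes differ. handle_irq has type irq_flow_handler_t, whereas the service routine we registered has type irq_handler_t. These are clearly not the same thing, so the analysis has to continue.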
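The relationship between the two function types can be sketched in user space: a flow handler receives the descriptor and is the code that calls the registered handlers hanging off it. All `model_` names below are invented for this illustration, and the structures are stripped down to the fields the argument needs.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 } model_irqreturn_t;

/* What a driver registers (plays the role of irq_handler_t). */
typedef model_irqreturn_t (*model_irq_handler_t)(int irq, void *dev_id);

struct model_irq_desc;
/* What the descriptor calls first (plays the role of irq_flow_handler_t). */
typedef void (*model_flow_handler_t)(struct model_irq_desc *desc);

struct model_irqaction {
    model_irq_handler_t handler;
    void *dev_id;
    struct model_irqaction *next; /* further handlers on a shared line */
};

struct model_irq_desc {
    int irq;
    model_flow_handler_t handle_irq;
    struct model_irqaction *action;
    unsigned int handled;
    unsigned int unhandled;
};

/* Default flow handler: nothing is registered, just count the event. */
void model_handle_bad_irq(struct model_irq_desc *desc)
{
    desc->unhandled++;
}

/* A simple flow handler: walk the action chain and call every handler. */
void model_handle_simple_irq(struct model_irq_desc *desc)
{
    struct model_irqaction *act;
    int claimed = 0;

    for (act = desc->action; act != NULL; act = act->next)
        if (act->handler(desc->irq, act->dev_id) == MODEL_IRQ_HANDLED)
            claimed = 1;
    if (claimed)
        desc->handled++;
    else
        desc->unhandled++;
}

/* Example driver handler: dev_id points at a hit counter. */
model_irqreturn_t model_count_handler(int irq, void *dev_id)
{
    (void)irq;
    ++*(int *)dev_id;
    return MODEL_IRQ_HANDLED;
}
```

The two-level design lets the flow handler own the controller-facing protocol (ack, mask, EOI), while the driver's handler only has to deal with its own device.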

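1.5.1 Interrupt-related data structures

Linux has 3 interrupt-related data structures:

Structure   Purpose
irq_desc    Software-level description of an IRQ's resources
irqaction   Generic operations for an IRQ
irq_chip    The chip-specific implementation

1.5.1.1 struct irq_desc

The irq_desc structure is as follows: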

/**

* struct irq_desc - interrupt descriptor

* @irq_common_data: per irq and chip data passed down to chip functions

* @kstat_irqs: irq stats per cpu

* @handle_irq: highlevel irq-events handler

* @preflow_handler: handler called before the flow handler (currently used by sparc)

* @action: the irq action chain

* @status: status information

* @core_internal_state__do_not_mess_with_it: core internal status information

* @depth: disable-depth, for nested irq_disable() calls

* @wake_depth: enable depth, for multiple irq_set_irq_wake() callers

* @irq_count: stats field to detect stalled irqs

* @last_unhandled: aging timer for unhandled count

* @irqs_unhandled: stats field for spurious unhandled interrupts

* @threads_handled: stats field for deferred spurious detection of threaded handlers

* @threads_handled_last: comparator field for deferred spurious detection of theraded handlers

* @lock: locking for SMP

* @affinity_hint: hint to user space for preferred irq affinity

* @affinity_notify: context for notification of affinity changes

* @pending_mask: pending rebalanced interrupts

* @threads_oneshot: bitfield to handle shared oneshot threads

* @threads_active: number of irqaction threads currently running

* @wait_for_threads: wait queue for sync_irq to wait for threaded handlers

* @nr_actions: number of installed actions on this descriptor

* @no_suspend_depth: number of irqactions on a irq descriptor with

* IRQF_NO_SUSPEND set

* @force_resume_depth: number of irqactions on a irq descriptor with

* IRQF_FORCE_RESUME set

* @rcu: rcu head for delayed free

* @kobj: kobject used to represent this struct in sysfs

* @request_mutex: mutex to protect request/free before locking desc->lock

* @dir: /proc/irq/ procfs entry

* @debugfs_file: dentry for the debugfs file

* @name: flow handler name for /proc/interrupts output

*/

struct irq_desc {

struct irq_common_data irq_common_data;

struct irq_data irq_data;

unsigned int __percpu *kstat_irqs;

irq_flow_handler_t handle_irq;

#ifdef CONFIG_IRQ_PREFLOW_FASTEOI

irq_preflow_handler_t preflow_handler;

#endif

struct irqaction *action; /* IRQ action list */

unsigned int status_use_accessors;

unsigned int core_internal_state__do_not_mess_with_it;

unsigned int depth; /* nested irq disables */

unsigned int wake_depth; /* nested wake enables */

unsigned int irq_count; /* For detecting broken IRQs */

unsigned long last_unhandled; /* Aging timer for unhandled count */

unsigned int irqs_unhandled;

atomic_t threads_handled;

int threads_handled_last;

raw_spinlock_t lock;

struct cpumask *percpu_enabled;

const struct cpumask *percpu_affinity;

#ifdef CONFIG_SMP

const struct cpumask *affinity_hint;

struct irq_affinity_notify *affinity_notify;

#ifdef CONFIG_GENERIC_PENDING_IRQ

cpumask_var_t pending_mask;

#endif

#endif

unsigned long threads_oneshot;

atomic_t threads_active;

wait_queue_head_t wait_for_threads;

#ifdef CONFIG_PM_SLEEP

unsigned int nr_actions;

unsigned int no_suspend_depth;

unsigned int cond_suspend_depth;

unsigned int force_resume_depth;

#endif

#ifdef CONFIG_PROC_FS

struct proc_dir_entry *dir;

#endif

#ifdef CONFIG_GENERIC_IRQ_DEBUGFS

struct dentry *debugfs_file;

#endif

#ifdef CONFIG_SPARSE_IRQ

struct rcu_head rcu;

struct kobject kobj;

#endif

struct mutex request_mutex;

int parent_irq;

struct module *owner;

const char *name;

} ____cacheline_internodealigned_in_smp;

1.5.1.2 struct irqaction

The irqaction structure is as follows:

/**

* struct irqaction - per interrupt action descriptor

* @handler: interrupt handler function

* @name: name of the device

* @dev_id: cookie to identify the device

* @percpu_dev_id: cookie to identify the device

* @next: pointer to the next irqaction for shared interrupts

* @irq: interrupt number

* @flags: flags (see IRQF_* above)

* @thread_fn: interrupt handler function for threaded interrupts

* @thread: thread pointer for threaded interrupts

* @secondary: pointer to secondary irqaction (force threading)

* @thread_flags: flags related to @thread

* @thread_mask: bitmask for keeping track of @thread activity

* @dir: pointer to the proc/irq/NN/name entry

*/

struct irqaction {

irq_handler_t handler;

void *dev_id;

void __percpu *percpu_dev_id;

struct irqaction *next;

irq_handler_t thread_fn;

struct task_struct *thread;

struct irqaction *secondary;

unsigned int irq;

unsigned int flags;

unsigned long thread_flags;

unsigned long thread_mask;

const char *name;

struct proc_dir_entry *dir;

} ____cacheline_internodealigned_in_smp;

1.5.1.3 struct irq_chip

irq_chip is described as follows:

/**

* struct irq_chip - hardware interrupt chip descriptor

*

* @parent_device: pointer to parent device for irqchip

* @name: name for /proc/interrupts

* @irq_startup: start up the interrupt (defaults to ->enable if NULL)

* @irq_shutdown: shut down the interrupt (defaults to ->disable if NULL)

* @irq_enable: enable the interrupt (defaults to chip->unmask if NULL)

* @irq_disable: disable the interrupt

* @irq_ack: start of a new interrupt

* @irq_mask: mask an interrupt source

* @irq_mask_ack: ack and mask an interrupt source

* @irq_unmask: unmask an interrupt source

* @irq_eoi: end of interrupt

* @irq_set_affinity: Set the CPU affinity on SMP machines. If the force

* argument is true, it tells the driver to

* unconditionally apply the affinity setting. Sanity

* checks against the supplied affinity mask are not

* required. This is used for CPU hotplug where the

* target CPU is not yet set in the cpu_online_mask.

* @irq_retrigger: resend an IRQ to the CPU

* @irq_set_type: set the flow type (IRQ_TYPE_LEVEL/etc.) of an IRQ

* @irq_set_wake: enable/disable power-management wake-on of an IRQ

* @irq_bus_lock: function to lock access to slow bus (i2c) chips

* @irq_bus_sync_unlock:function to sync and unlock slow bus (i2c) chips

* @irq_cpu_online: configure an interrupt source for a secondary CPU

* @irq_cpu_offline: un-configure an interrupt source for a secondary CPU

* @irq_suspend: function called from core code on suspend once per

* chip, when one or more interrupts are installed

* @irq_resume: function called from core code on resume once per chip,

* when one ore more interrupts are installed

* @irq_pm_shutdown: function called from core code on shutdown once per chip

* @irq_calc_mask: Optional function to set irq_data.mask for special cases

* @irq_print_chip: optional to print special chip info in show_interrupts

* @irq_request_resources: optional to request resources before calling

* any other callback related to this irq

* @irq_release_resources: optional to release resources acquired with

* irq_request_resources

* @irq_compose_msi_msg: optional to compose message content for MSI

* @irq_write_msi_msg: optional to write message content for MSI

* @irq_get_irqchip_state: return the internal state of an interrupt

* @irq_set_irqchip_state: set the internal state of a interrupt

* @irq_set_vcpu_affinity: optional to target a vCPU in a virtual machine

* @ipi_send_single: send a single IPI to destination cpus

* @ipi_send_mask: send an IPI to destination cpus in cpumask

* @flags: chip specific flags

*/

struct irq_chip {

struct device *parent_device;

const char *name;

unsigned int (*irq_startup)(struct irq_data *data);

void (*irq_shutdown)(struct irq_data *data);

void (*irq_enable)(struct irq_data *data);

void (*irq_disable)(struct irq_data *data);

void (*irq_ack)(struct irq_data *data);

void (*irq_mask)(struct irq_data *data);

void (*irq_mask_ack)(struct irq_data *data);

void (*irq_unmask)(struct irq_data *data);

void (*irq_eoi)(struct irq_data *data);

int (*irq_set_affinity)(struct irq_data *data, const struct cpumask *dest, bool force);

int (*irq_retrigger)(struct irq_data *data);

int (*irq_set_type)(struct irq_data *data, unsigned int flow_type);

int (*irq_set_wake)(struct irq_data *data, unsigned int on);

void (*irq_bus_lock)(struct irq_data *data);

void (*irq_bus_sync_unlock)(struct irq_data *data);

void (*irq_cpu_online)(struct irq_data *data);

void (*irq_cpu_offline)(struct irq_data *data);

void (*irq_suspend)(struct irq_data *data);

void (*irq_resume)(struct irq_data *data);

void (*irq_pm_shutdown)(struct irq_data *data);

void (*irq_calc_mask)(struct irq_data *data);

void (*irq_print_chip)(struct irq_data *data, struct seq_file *p);

int (*irq_request_resources)(struct irq_data *data);

void (*irq_release_resources)(struct irq_data *data);

void (*irq_compose_msi_msg)(struct irq_data *data, struct msi_msg *msg);

void (*irq_write_msi_msg)(struct irq_data *data, struct msi_msg *msg);

int (*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);

int (*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);

int (*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);

void (*ipi_send_single)(struct irq_data *data, unsigned int cpu);

void (*ipi_send_mask)(struct irq_data *data, const struct cpumask *dest);

unsigned long flags;

};

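irq_chip is a set of chip-specific function pointers. The definition is very thorough: essentially every operation that could arise for an IRQ is included. Each chip's code initializes this structure as appropriate, and the structure is then plugged into the generic IRQ handling code, which keeps the generic software logic completely separate from the chip logic.

Let us move on.

1.5.2 Initializing chip-related IRQs

As is well known, C code starts at start_kernel during boot, and from there the machine-specific IRQ initialization init_IRQ() is called: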

1.5.2.1 init_IRQ()

asmlinkage __visible void __init start_kernel(void)

{

char *command_line;

char *after_dashes;

.....

early_irq_init();

init_IRQ();

.....

}

1.5.2.1.1 irqchip_init()

init_IRQ calls irqchip_init():

void __init init_IRQ(void)

{

init_irq_stacks();

irqchip_init();

if (!handle_arch_irq)

panic("No interrupt controller found.");

}

void __init irqchip_init(void)

{

of_irq_init(__irqchip_of_table);

acpi_probe_device_table(irqchip);

}

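__irqchip_of_table is the start address of the kernel's irq chip table, which holds the ID information of every interrupt controller the kernel supports (used for matching against device nodes). By the time of_irq_init runs, the system has already initialized the device tree, so all device nodes form a tree in which each node is the device node of one device. of_irq_init searches all device nodes for interrupt controller nodes and arranges them into a tree. (A system may have several interrupt controllers; they are organized as a tree so that their drivers are initialized in a defined order.) Then, starting from the root interrupt controller, it scans the irq chip table for each interrupt controller's device node. On a match, it calls that controller's initialization function and passes the controller's device node and its parent controller's device node to the irq chip driver. The matching code itself belongs to the Device Tree module; see the Device Tree code analysis document for details.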
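The parents-first ordering can be sketched with plain arrays. This is an illustrative model, not the kernel's list-based code: `model_init_order()` and `MODEL_NO_PARENT` are hypothetical names, and controllers are identified by array index instead of device nodes.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NO_PARENT (-1) /* marks a root interrupt controller */

/*
 * parent[i] is the index of controller i's interrupt parent, or
 * MODEL_NO_PARENT for a root. order[] (room for n entries) receives
 * the controller indices in initialization order. Returns how many
 * controllers were initialized; controllers whose parent never shows
 * up are left out.
 */
size_t model_init_order(const int *parent, size_t n, size_t *order)
{
    size_t done = 0; /* entries written to order[] */
    size_t scan = 0; /* next initialized controller to use as parent */
    int current = MODEL_NO_PARENT;

    for (;;) {
        /* Initialize every controller whose parent is 'current'. */
        for (size_t i = 0; i < n; i++)
            if (parent[i] == current)
                order[done++] = i;
        if (scan == done)
            break; /* no initialized parent left to expand */
        current = (int)order[scan++];
    }
    return done;
}
```

For a root controller 0 with children 1 and 2, and a controller 3 cascaded behind 1, the resulting order is 0, 1, 2, 3: every controller is set up only after the one it feeds into. A return value smaller than n corresponds to the "children remain, but no parents" case that of_irq_init() reports.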

1.5.2.1.1.1 of_irq_init()

/**

* of_irq_init - Scan and init matching interrupt controllers in DT

* @matches: 0 terminated array of nodes to match and init function to call

*

* This function scans the device tree for matching interrupt controller nodes,

* and calls their initialization functions in order with parents first.

*/

void __init of_irq_init(const struct of_device_id *matches)

{

const struct of_device_id *match;

struct device_node *np, *parent = NULL;

struct of_intc_desc *desc, *temp_desc;

struct list_head intc_desc_list, intc_parent_list;

INIT_LIST_HEAD(&intc_desc_list);

INIT_LIST_HEAD(&intc_parent_list);

for_each_matching_node_and_match(np, matches, &match) {

if (!of_property_read_bool(np, "interrupt-controller") ||

!of_device_is_available(np))

continue;

if (WARN(!match->data, "of_irq_init: no init function for %s\n",

match->compatible))

continue;

/*

* Here, we allocate and populate an of_intc_desc with the node

* pointer, interrupt-parent device_node etc.

*/

desc = kzalloc(sizeof(*desc), GFP_KERNEL);

if (WARN_ON(!desc)) {

of_node_put(np);

goto err;

}

desc->irq_init_cb = match->data;

desc->dev = of_node_get(np);

desc->interrupt_parent = of_irq_find_parent(np);

if (desc->interrupt_parent == np)

desc->interrupt_parent = NULL;

list_add_tail(&desc->list, &intc_desc_list);

}

/*

* The root irq controller is the one without an interrupt-parent.

* That one goes first, followed by the controllers that reference it,

* followed by the ones that reference the 2nd level controllers, etc.

*/

while (!list_empty(&intc_desc_list)) {

/*

* Process all controllers with the current 'parent'.

* First pass will be looking for NULL as the parent.

* The assumption is that NULL parent means a root controller.

*/

list_for_each_entry_safe(desc, temp_desc, &intc_desc_list, list) {

int ret;

if (desc->interrupt_parent != parent)

continue;

list_del(&desc->list);

of_node_set_flag(desc->dev, OF_POPULATED);

pr_debug("of_irq_init: init %pOF (%p), parent %p\n",

desc->dev,

desc->dev, desc->interrupt_parent);

ret = desc->irq_init_cb(desc->dev,

desc->interrupt_parent);

if (ret) {

of_node_clear_flag(desc->dev, OF_POPULATED);

kfree(desc);

continue;

}

/*

* This one is now set up; add it to the parent list so

* its children can get processed in a subsequent pass.

*/

list_add_tail(&desc->list, &intc_parent_list);

}

/* Get the next pending parent that might have children */

desc = list_first_entry_or_null(&intc_parent_list,

typeof(*desc), list);

if (!desc) {

pr_err("of_irq_init: children remain, but no parents\n");

break;

}

list_del(&desc->list);

parent = desc->dev;

kfree(desc);

}

list_for_each_entry_safe(desc, temp_desc, &intc_parent_list, list) {

list_del(&desc->list);

kfree(desc);

}

err:

list_for_each_entry_safe(desc, temp_desc, &intc_desc_list, list) {

list_del(&desc->list);

of_node_put(desc->dev);

kfree(desc);

}

}

dtb:

gic: interrupt-controller@1400000 {

compatible = "arm,gic-400";

#interrupt-cells = <3>;

interrupt-controller;

reg = <0x0 0x1401000 0 0x1000>, /* GICD */

<0x0 0x1402000 0 0x2000>, /* GICC */

<0x0 0x1404000 0 0x2000>, /* GICH */

<0x0 0x1406000 0 0x2000>; /* GICV */

interrupts = <1 9 0xf08>;

};

IRQCHIP_DECLARE(gic_400, "arm,gic-400", gic_of_init);

IRQCHIP_DECLARE(arm11mp_gic, "arm,arm11mp-gic", gic_of_init);

IRQCHIP_DECLARE(arm1176jzf_dc_gic, "arm,arm1176jzf-devchip-gic", gic_of_init);

IRQCHIP_DECLARE(cortex_a15_gic, "arm,cortex-a15-gic", gic_of_init);

IRQCHIP_DECLARE(cortex_a9_gic, "arm,cortex-a9-gic", gic_of_init);

IRQCHIP_DECLARE(cortex_a7_gic, "arm,cortex-a7-gic", gic_of_init);

IRQCHIP_DECLARE(msm_8660_qgic, "qcom,msm-8660-qgic", gic_of_init);

IRQCHIP_DECLARE(msm_qgic2, "qcom,msm-qgic2", gic_of_init);

IRQCHIP_DECLARE(pl390, "arm,pl390", gic_of_init);

#define IRQCHIP_DECLARE(name, compat, fn) OF_DECLARE_2(irqchip, name, compat, fn)

#define OF_DECLARE_2(table, name, compat, fn) \

_OF_DECLARE(table, name, compat, fn, of_init_fn_2)

#define _OF_DECLARE(table, name, compat, fn, fn_type) \

static const struct of_device_id __of_table_##name \

__used __section(__##table##_of_table) \

= { .compatible = compat, \

.data = (fn == (fn_type)NULL) ? fn : fn }

Analysis of the GIC driver initialization code:

1.5.2.1.1.1.1 gic_of_init()

int __init

gic_of_init(struct device_node *node, struct device_node *parent)

{

struct gic_chip_data *gic;

int irq, ret;

if (WARN_ON(!node))

return -ENODEV;

if (WARN_ON(gic_cnt >= CONFIG_ARM_GIC_MAX_NR))

return -EINVAL;

gic = &gic_data[gic_cnt];

ret = gic_of_setup(gic, node);

if (ret)

return ret;

/*

* Disable split EOI/Deactivate if either HYP is not available

* or the CPU interface is too small.

*/

if (gic_cnt == 0 && !gic_check_eoimode(node, &gic->raw_cpu_base))

static_key_slow_dec(&supports_deactivate);

ret = __gic_init_bases(gic, -1, &node->fwnode);

if (ret) {

gic_teardown(gic);

return ret;

}

if (!gic_cnt) {

gic_init_physaddr(node);

gic_of_setup_kvm_info(node);

}

if (parent) {

irq = irq_of_parse_and_map(node, 0);

gic_cascade_irq(gic_cnt, irq);

}

if (IS_ENABLED(CONFIG_ARM_GIC_V2M))

gicv2m_init(&node->fwnode, gic_data[gic_cnt].domain);

gic_cnt++;

return 0;

}

1.5.2.1.1.1.1.1 gic_init_bases()

__gic_init_bases()->gic_init_bases()

static int gic_init_bases(struct gic_chip_data *gic, int irq_start,

struct fwnode_handle *handle)

{

irq_hw_number_t hwirq_base;

int gic_irqs, irq_base, ret;

if (IS_ENABLED(CONFIG_GIC_NON_BANKED) && gic->percpu_offset) {

/* Frankein-GIC without banked registers... */

unsigned int cpu;

gic->dist_base.percpu_base = alloc_percpu(void __iomem *);

gic->cpu_base.percpu_base = alloc_percpu(void __iomem *);

if (WARN_ON(!gic->dist_base.percpu_base ||

!gic->cpu_base.percpu_base)) {

ret = -ENOMEM;

goto error;

}

for_each_possible_cpu(cpu) {

u32 mpidr = cpu_logical_map(cpu);

u32 core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0);

unsigned long offset = gic->percpu_offset * core_id;

*per_cpu_ptr(gic->dist_base.percpu_base, cpu) =

gic->raw_dist_base + offset;

*per_cpu_ptr(gic->cpu_base.percpu_base, cpu) =

gic->raw_cpu_base + offset;

}

gic_set_base_accessor(gic, gic_get_percpu_base);

} else {

/* Normal, sane GIC... */

WARN(gic->percpu_offset,

"GIC_NON_BANKED not enabled, ignoring %08x offset!",

gic->percpu_offset);

gic->dist_base.common_base = gic->raw_dist_base;

gic->cpu_base.common_base = gic->raw_cpu_base;

gic_set_base_accessor(gic, gic_get_common_base);

}

/*

* Find out how many interrupts are supported.

* The GIC only supports up to 1020 interrupt sources.

*/

gic_irqs = readl_relaxed(gic_data_dist_base(gic) + GIC_DIST_CTR) & 0x1f;

gic_irqs = (gic_irqs + 1) * 32;

if (gic_irqs > 1020)

gic_irqs = 1020;

gic->gic_irqs = gic_irqs;

if (handle) { /* DT/ACPI */

gic->domain = irq_domain_create_linear(handle, gic_irqs,

&gic_irq_domain_hierarchy_ops,

gic);

} else { /* Legacy support */

/*

* For primary GICs, skip over SGIs.

* For secondary GICs, skip over PPIs, too.

*/

if (gic == &gic_data[0] && (irq_start & 31) > 0) {

hwirq_base = 16;

if (irq_start != -1)

irq_start = (irq_start & ~31) + 16;

} else {

hwirq_base = 32;

}

gic_irqs -= hwirq_base; /* calculate # of irqs to allocate */

irq_base = irq_alloc_descs(irq_start, 16, gic_irqs,

numa_node_id());

if (irq_base < 0) {

WARN(1, "Cannot allocate irq_descs @ IRQ%d, assuming pre-allocated\n",

irq_start);

irq_base = irq_start;

}

gic->domain = irq_domain_add_legacy(NULL, gic_irqs, irq_base,

hwirq_base, &gic_irq_domain_ops, gic);

}

if (WARN_ON(!gic->domain)) {

ret = -ENODEV;

goto error;

}

gic_dist_init(gic);

ret = gic_cpu_init(gic);

if (ret)

goto error;

ret = gic_pm_init(gic);

if (ret)

goto error;

return 0;

error:

if (IS_ENABLED(CONFIG_GIC_NON_BANKED) && gic->percpu_offset) {

free_percpu(gic->dist_base.percpu_base);

free_percpu(gic->cpu_base.percpu_base);

}

return ret;

}

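This code mainly registers an irq domain data structure with the system. Why is struct irq_domain needed? From the Linux kernel's point of view, an interrupt from any external device is an asynchronous event that the kernel has to identify. The kernel uses an IRQ number to identify a particular interrupt request of a particular device, and with the IRQ number it can locate the interrupt's descriptor (struct irq_desc). The interrupt controller, however, knows nothing about IRQ numbers; it knows only the hardware interrupt number (the controller encodes each interrupt source it supports, and this code is called the hardware interrupt number). Because different software modules identify an interrupt source by different IDs, a mapping is required. How is a hardware interrupt number mapped to an IRQ number? This takes a translation object, which the kernel defines as struct irq_domain.

Each interrupt controller forms an irq domain, responsible for resolving the interrupt sources downstream of it. When interrupt controllers are cascaded, a non-root interrupt controller is also an ordinary interrupt source in its parent's irq domain. struct irq_domain is defined as follows: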

struct irq_domain {

......

const struct irq_domain_ops *ops;

void *host_data;

......

};
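The translation an irq domain performs can be modelled as a lookup table indexed by hardware interrupt number. The sketch below is a simplified stand-in for a linear domain: all `model_` names and the table size are assumptions, and virtual IRQ numbers are simply handed out in order, with 0 meaning "no mapping".

```c
#include <assert.h>
#include <string.h>

#define MODEL_DOMAIN_SIZE 64u /* hypothetical number of hwirqs in the domain */

struct model_irq_domain {
    unsigned int revmap[MODEL_DOMAIN_SIZE]; /* hwirq -> IRQ number, 0 = unmapped */
    unsigned int next_virq;                 /* next free IRQ number */
};

void model_domain_init(struct model_irq_domain *d)
{
    memset(d->revmap, 0, sizeof(d->revmap));
    d->next_virq = 1; /* IRQ number 0 is reserved for "no mapping" */
}

/* Create (or return the existing) mapping for a hardware interrupt number. */
unsigned int model_create_mapping(struct model_irq_domain *d, unsigned int hwirq)
{
    if (hwirq >= MODEL_DOMAIN_SIZE)
        return 0;
    if (d->revmap[hwirq] == 0)
        d->revmap[hwirq] = d->next_virq++;
    return d->revmap[hwirq];
}

/* Look up the IRQ number for a hardware interrupt number; 0 if unmapped. */
unsigned int model_find_mapping(const struct model_irq_domain *d, unsigned int hwirq)
{
    return hwirq < MODEL_DOMAIN_SIZE ? d->revmap[hwirq] : 0;
}
```

In this model `model_find_mapping()` plays the part that irq_find_mapping() plays in __handle_domain_irq(): the controller reports a hardware number, and the domain turns it into the IRQ number that indexes the descriptors. A result of 0 is what makes the `!irq` check there take the ack_bad_irq() branch.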

Registering the GIC's irq domain involves another important data structure, gic_irq_domain_ops, of type struct irq_domain_ops. For the GIC, the irq domain operations are gic_irq_domain_ops, defined as follows:

static const struct irq_domain_ops gic_irq_domain_ops = {

.map = gic_irq_domain_map,

.unmap = gic_irq_domain_unmap,

};

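The irq domain is a concept of the generic interrupt subsystem.

Analysis of the irq domain callbacks: gic_irq_domain_map is the callback invoked when a mapping between an IRQ number and a GIC hardware interrupt ID is created. The code is as follows: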

static int gic_irq_domain_map(struct irq_domain *d, unsigned int irq,

irq_hw_number_t hw)

{

struct gic_chip_data *gic = d->host_data;

if (hw < 32) {

irq_set_percpu_devid(irq);

irq_domain_set_info(d, irq, hw, &gic->chip, d->host_data,

handle_percpu_devid_irq, NULL, NULL);

irq_set_status_flags(irq, IRQ_NOAUTOEN);

} else {

irq_domain_set_info(d, irq, hw, &gic->chip, d->host_data,

handle_fasteoi_irq, NULL, NULL);

irq_set_probe(irq);

irqd_set_single_target(irq_desc_get_irq_data(irq_to_desc(irq)));

}

return 0;

}

This is where desc->handle_irq(desc) gets its value: it is set to handle_percpu_devid_irq or handle_fasteoi_irq. Taking handle_percpu_devid_irq as an example:

/**

* handle_percpu_devid_irq - Per CPU local irq handler with per cpu dev ids

* @desc: the interrupt description structure for this irq

*

* Per CPU interrupts on SMP machines without locking requirements. Same as

* handle_percpu_irq() above but with the following extras:

*

* action->percpu_dev_id is a pointer to percpu variables which

* contain the real device id for the cpu on which this handler is

* called

*/

void handle_percpu_devid_irq(struct irq_desc *desc)

{

struct irq_chip *chip = irq_desc_get_chip(desc);

struct irqaction *action = desc->action;

unsigned int irq = irq_desc_get_irq(desc);

irqreturn_t res;

kstat_incr_irqs_this_cpu(desc);

if (chip->irq_ack)

chip->irq_ack(&desc->irq_data);

if (likely(action)) {

trace_irq_handler_entry(irq, action);

res = action->handler(irq, raw_cpu_ptr(action->percpu_dev_id));

trace_irq_handler_exit(irq, action, res);

} else {

unsigned int cpu = smp_processor_id();

bool enabled = cpumask_test_cpu(cpu, desc->percpu_enabled);

if (enabled)

irq_percpu_disable(desc, cpu);

pr_err_once("Spurious%s percpu IRQ%u on CPU%u\n",

enabled ? " and unmasked" : "", irq, cpu);

}

if (chip->irq_eoi)

chip->irq_eoi(&desc->irq_data);

}

In the end, this calls the service routine we registered.
