Sched_ext Callbacks in Depth (8): running, When a Task Starts Executing (6.18.26)

Based on Linux 6.18.26, with a line-by-line walk through the kernel source

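For the framework background and an overview of all callbacks, see the preface; this article focuses on the running callback.

1. What running is

running is one of the hooks defined in struct sched_ext_ops (kernel/sched/ext_internal.h:387):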

/**
 * @running: A task is starting to run on its associated CPU
 * @p: task starting to run
 *
 * Note that this callback may be called from a CPU other than the
 * one the task is going to run on. This can happen when a task
 * property is changed (i.e., affinity), since scx_next_task_scx(),
 * which triggers this callback, may run on a CPU different from
 * the task's assigned CPU.
 *
 * Therefore, always use scx_bpf_task_cpu(@p) to determine the
 * target CPU the task is going to use.
 *
 * See ->runnable() for explanation on the task state notifiers.
 */
void (*running)(struct task_struct *p);

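Its job is narrow: it notifies the BPF scheduler that a task is about to start executing on a CPU.

Unlike init_task (registration) and enable (going on duty), both analyzed earlier, running is not a one-shot callback: it is invoked every time the task is scheduled onto a CPU. It pairs with stopping: running marks the start of execution and stopping marks the end.

For the full definition of ext_sched_class and where running/stopping hook in, see section 6 of the preface.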


2. The full call chain of running

The running callback fires on the kernel's core scheduling path. The full call chain from __schedule() to running is:

__schedule()                                          // kernel/sched/core.c:6789
    └─ pick_next_task(rq, rq->donor, &rf)             // kernel/sched/core.c:6875
        └─ __pick_next_task(rq, prev, rf)              // kernel/sched/core.c:5955
            ├─ class->pick_task(rq)                    // pick_task_scx for ext_sched_class
            └─ put_prev_set_next_task(rq, prev, next)  // kernel/sched/sched.h:2499
                ├─ prev->sched_class->put_prev_task()  // → put_prev_task_scx → stopping callback
                └─ next->sched_class->set_next_task()  // → set_next_task_scx → running callback

The overview below traces the full path from a timer interrupt to the running callback (flowchart rendered as text):

Trigger sources: timer interrupt / voluntary yield / wakeup preemption
    └─ __schedule()
        └─ pick_next_task()
            └─ __pick_next_task()
                ├─ prev_balance()
                ├─ for_each_active_class()
                │   └─ pick_task_scx()                // picks next from the local DSQ
                └─ put_prev_set_next_task()
                    ├─ put_prev_task_scx()  → ops.stopping(prev)
                    └─ set_next_task_scx()  → ops.running(next)

Notes:

balance_scx() is responsible for moving tasks from the global DSQ to the local DSQ.

pick_task_scx() only picks from the local DSQ; it does not access the global DSQ directly.

The following sections go through each layer's key logic in turn.


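3. __schedule: the main scheduling entry point (layer 1)

__schedule() is the core scheduling function (kernel/sched/core.c:6789); every scheduling switch goes through it. Only the parts relevant to running are shown here: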

static void __sched notrace __schedule(int sched_mode)
{
    struct task_struct *prev, *next;
    // ...

    cpu = smp_processor_id();
    rq = cpu_rq(cpu);
    prev = rq->curr;                               // the task currently running

    rq_lock(rq, &rf);
    smp_mb__after_spinlock();
    update_rq_clock(rq);

    // ... handle prev's state (blocking, preemption, etc.) ...

pick_again:
    next = pick_next_task(rq, rq->donor, &rf);     // ← pick the next task to run
    rq_set_donor(rq, next);

    // ...

    if (likely(is_switch)) {
        RCU_INIT_POINTER(rq->curr, next);          // ← switch the curr pointer
        // ... the context switch follows ...
    }
}

Note that 6.18.26 passes rq->donor (rather than rq->curr, as earlier versions did) to pick_next_task; donor is a concept introduced by the proxy execution mechanism.

The core decision path of __schedule(), with the flowchart rendered as text:

__schedule() entry
    cpu = smp_processor_id(); rq = cpu_rq(cpu); prev = rq->curr
    rq_lock(rq, &rf); update_rq_clock(rq)
    prev state check
    ├─ SM_IDLE mode: rq->nr_running == 0 and !scx_enabled()?
    │   ├─ yes → next = prev, goto picked
    │   └─ no  → pick_again
    ├─ !preempt and prev_state → try_to_block_task() → pick_again
    └─ preempt or running → pick_again
    pick_again: next = pick_next_task()
    prev != next?
    ├─ yes → RCU_INIT_POINTER(rq->curr, next), context switch
    └─ no  → return directly


4. __pick_next_task: choosing the next task (layer 2)

__pick_next_task() picks the next task to run from across all scheduling classes (kernel/sched/core.c:5955):

static inline struct task_struct *
__pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
    const struct sched_class *class;
    struct task_struct *p;

    rq->dl_server = NULL;

    if (scx_enabled())
        goto restart;                              // ← with scx enabled, skip the CFS fast path

    // CFS fast-path optimization (skipped when scx is enabled)...

restart:
    prev_balance(rq, prev, rf);                    // ← calls each scheduling class's balance()

    for_each_active_class(class) {
        if (class->pick_next_task) {
            p = class->pick_next_task(rq, prev);
            if (p)
                return p;
        } else {
            p = class->pick_task(rq);              // ← pick_task_scx() for ext_sched_class
            if (p) {
                put_prev_set_next_task(rq, prev, p);  // ← key step: completes the prev/next switch
                return p;
            }
        }
    }

    BUG(); /* The idle class should always have a runnable task. */
}

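When scx_enabled() is true, the CFS fast path is skipped and execution jumps straight to the restart label. So even if the tasks on a runqueue are all in the fair class (the case the fast path is meant for), once an scx scheduler is loaded the full scheduling-class walk is always taken.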

The decision logic of __pick_next_task(), with the flowchart rendered as text:

__pick_next_task()
    scx_enabled()?
    ├─ yes → goto restart
    └─ no  → are all tasks in the fair class?
        ├─ no  → restart
        └─ yes → pick_next_task_fair()
            ├─ RETRY_TASK → restart
            ├─ task picked → return p
            └─ nothing picked → pick_task_idle() → put_prev_set_next_task() → return p
    restart: prev_balance() → for_each_active_class()
        classes shown: dl_sched_class, rt_sched_class, fair_sched_class, ext_sched_class, idle_sched_class
        ext_sched_class → pick_task_scx() → task picked?
            ├─ yes → put_prev_set_next_task() → return p
            └─ no  → continue with the next class

Inside the for_each_active_class loop, ext_sched_class has no pick_next_task method (only pick_task), so it takes the else branch:

  1. Call pick_task_scx() to pick the next task
  2. Call put_prev_set_next_task() to complete the prev/next switch
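
The two-branch loop can be modelled in plain user-space C. The following is a minimal sketch, not kernel code: the model_ types, the three stand-in classes, and the set_next_calls counter are all hypothetical names invented for illustration. It only shows that a class exposing pick_next_task returns the task directly, while a class exposing only pick_task leaves the prev/next switch to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these types exist in the kernel. */
struct model_task { int id; };

struct model_rq {
    struct model_task *queued[2];  /* [0] fair-like slot, [1] ext-like slot; NULL = empty */
    int set_next_calls;            /* times the caller had to do the prev/next switch */
};

struct model_class {
    /* Optional: picks and performs the switch itself. */
    struct model_task *(*pick_next_task)(struct model_rq *rq);
    /* Otherwise: only picks; the caller performs the switch. */
    struct model_task *(*pick_task)(struct model_rq *rq);
};

static struct model_task model_idle_task = { 0 };

static struct model_task *pick_next_fair_like(struct model_rq *rq) { return rq->queued[0]; }
static struct model_task *pick_ext_like(struct model_rq *rq) { return rq->queued[1]; }
static struct model_task *pick_idle_like(struct model_rq *rq) { (void)rq; return &model_idle_task; }

static const struct model_class model_classes[] = {
    { .pick_next_task = pick_next_fair_like },
    { .pick_task = pick_ext_like },
    { .pick_task = pick_idle_like },
};

/* Walk the classes in order, mirroring the if/else shape of the loop above. */
static struct model_task *model_pick(struct model_rq *rq)
{
    assert(rq);

    for (size_t i = 0; i < sizeof(model_classes) / sizeof(model_classes[0]); i++) {
        const struct model_class *class = &model_classes[i];
        struct model_task *p;

        if (class->pick_next_task) {
            p = class->pick_next_task(rq);
            if (p)
                return p;
        } else {
            p = class->pick_task(rq);
            if (p) {
                rq->set_next_calls++;  /* stands in for put_prev_set_next_task() */
                return p;
            }
        }
    }
    return NULL;  /* not reached in this model: the idle-like class always picks */
}
```

With only the ext-like slot filled, model_pick() returns that task and bumps set_next_calls once, which is the branch that eventually leads to running.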

5. put_prev_set_next_task: the prev/next switch hub (layer 3)

put_prev_set_next_task() is the hub that connects put_prev and set_next (kernel/sched/sched.h:2499):

static inline void put_prev_set_next_task(struct rq *rq,
                                          struct task_struct *prev,
                                          struct task_struct *next)
{
    WARN_ON_ONCE(rq->donor != prev);

    __put_prev_set_next_dl_server(rq, prev, next);

    if (next == prev)                              // ← same task, nothing to switch
        return;

    prev->sched_class->put_prev_task(rq, prev, next);  // ← prev goes off duty (triggers stopping)
    next->sched_class->set_next_task(rq, next, true);   // ← next goes on duty (triggers running)
}

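Two points are worth noting here:

Detail 1: when next == prev, the function returns immediately. If pick_task picks the task that is already running (for example, its time slice has not run out and it keeps going), no callback fires. So running is not called on every tick; it is called each time a task switch actually happens.

Detail 2: put_prev comes before set_next. On this path stopping always runs before running. For any one CPU the order is always: the old task's stopping first, then the new task's running.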
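
Both details can be checked with a small user-space model. This is a sketch under stated assumptions rather than kernel code: model_task, model_log, and model_put_prev_set_next() are hypothetical names, and the log strings merely stand in for the real stopping and running callbacks.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space model; these names do not exist in the kernel. */
struct model_task {
    const char *name;
};

struct model_log {
    char buf[128];  /* callback order, e.g. "stopping(A) running(B) " */
};

/* Append "what(name) " to the log. */
static void model_log_event(struct model_log *log, const char *what,
                            const struct model_task *t)
{
    size_t used = strlen(log->buf);

    snprintf(log->buf + used, sizeof(log->buf) - used, "%s(%s) ", what, t->name);
}

/* Same shape as put_prev_set_next_task(): early return, then put before set. */
static void model_put_prev_set_next(struct model_log *log,
                                    struct model_task *prev,
                                    struct model_task *next)
{
    assert(log && prev && next);

    if (next == prev)
        return;                              /* same task keeps running: no callbacks */

    model_log_event(log, "stopping", prev);  /* stands in for put_prev_task_scx() */
    model_log_event(log, "running", next);   /* stands in for set_next_task_scx() */
}
```

Calling the model with prev == next leaves the log empty; calling it with two different tasks records the stop of the old task before the start of the new one.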


6. set_next_task_scx: where the running callback executes (layer 4)

set_next_task_scx() is ext_sched_class's set_next_task implementation (kernel/sched/ext.c:2268), and it is where the running callback is actually invoked:

static void set_next_task_scx(struct rq *rq, struct task_struct *p, bool first)
{
    struct scx_sched *sch = scx_root;

    // ① If the task is still queued (core-sched forcing early execution), dequeue it first
    if (p->scx.flags & SCX_TASK_QUEUED) {
        /*
         * Core-sched might decide to execute @p before it is
         * dispatched. Call ops_dequeue() to notify the BPF scheduler.
         */
        ops_dequeue(rq, p, SCX_DEQ_CORE_SCHED_EXEC);
        dispatch_dequeue(rq, p);
    }

    // ② Record the execution start timestamp
    p->se.exec_start = rq_clock_task(rq);

    // ③ Invoke the running callback
    /* see dequeue_task_scx() on why we skip when !QUEUED */
    if (SCX_HAS_OP(sch, running) && (p->scx.flags & SCX_TASK_QUEUED))
        SCX_CALL_OP_TASK(sch, SCX_KF_REST, running, rq, p);

    // ④ Clear the runnable state
    clr_task_runnable(p, true);

    // ⑤ Refresh the tick dependency
    if ((p->scx.slice == SCX_SLICE_INF) !=
        (bool)(rq->scx.flags & SCX_RQ_CAN_STOP_TICK)) {
        if (p->scx.slice == SCX_SLICE_INF)
            rq->scx.flags |= SCX_RQ_CAN_STOP_TICK;
        else
            rq->scx.flags &= ~SCX_RQ_CAN_STOP_TICK;

        sched_update_tick_dependency(rq);
        update_other_load_avgs(rq);
    }
}

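This function does quite a lot; the parts relevant to running are analyzed below.

6.1 Handling early execution by core-sched

Normally, by the time a task is scheduled it has already been removed from its DSQ (dispatch queue). But core-sched (the kernel's core scheduling mechanism, which coordinates SMT sibling threads) may force a task to run early, even though it has not yet been dispatched to the local DSQ.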

The two paths into set_next_task_scx(), with the diagram rendered as text:

Normal path:
    ops.dispatch() → task enters the local DSQ → pick_task_scx() takes it out (dispatch_dequeue() already done) → set_next_task_scx()

core-sched forced path:
    task still in a DSQ → core-sched forcibly picks the task → set_next_task_scx() → ops_dequeue(DEQ_CORE_SCHED_EXEC) → dispatch_dequeue()

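At this point p->scx.flags & SCX_TASK_QUEUED is still true, and two things need to happen:

  1. ops_dequeue(rq, p, SCX_DEQ_CORE_SCHED_EXEC): notifies the BPF scheduler that core-sched is forcibly dequeuing this task. The BPF dequeue callback is invoked with deq_flags containing SCX_DEQ_CORE_SCHED_EXEC (1LLU << 32).

  2. dispatch_dequeue(rq, p): physically removes the task from the DSQ and cleans up fields such as p->scx.dsq and p->scx.holding_cpu.

SCX_DEQ_CORE_SCHED_EXEC is defined as follows (kernel/sched/ext_internal.h:959):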

/*
 * The generic core-sched layer decided to execute the task even though
 * it hasn't been dispatched yet. Dequeue from the BPF side.
 */
SCX_DEQ_CORE_SCHED_EXEC = 1LLU << 32,
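
Because the flag sits at bit 32, it only survives if deq_flags is handled as a 64-bit value all the way through. The sketch below illustrates this in plain C; the MODEL_ macro and both helper functions are hypothetical names for illustration, and only the bit position is taken from the definition above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit position taken from the enum quoted above; the MODEL_ name is local to this sketch. */
#define MODEL_DEQ_CORE_SCHED_EXEC (1ULL << 32)

/* Correct: keep all 64 bits of deq_flags when testing the flag. */
static bool deq_is_core_sched_exec(uint64_t deq_flags)
{
    return (deq_flags & MODEL_DEQ_CORE_SCHED_EXEC) != 0;
}

/* Buggy on purpose: narrowing to 32 bits drops bit 32, so the flag is lost. */
static bool deq_is_core_sched_exec_truncated(uint64_t deq_flags)
{
    uint32_t low = (uint32_t)deq_flags;

    return ((uint64_t)low & MODEL_DEQ_CORE_SCHED_EXEC) != 0;
}
```

A dequeue handler that stores deq_flags in a 32-bit variable would behave like the second helper and never see the core-sched case.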

6.2 Invoking the running callback

if (SCX_HAS_OP(sch, running) && (p->scx.flags & SCX_TASK_QUEUED))
    SCX_CALL_OP_TASK(sch, SCX_KF_REST, running, rq, p);

Two conditions must both hold:

Step ③: about to call running
    SCX_HAS_OP(running)?
    ├─ false (BPF did not register it) → skip, no call
    └─ true (BPF registered it) → SCX_TASK_QUEUED?
        ├─ false (task not queued) → skip, no call
        └─ true (task is queued) → SCX_CALL_OP_TASK(running) → the BPF scheduler's running()

Condition 1: SCX_HAS_OP(sch, running)

The BPF scheduler must have registered a running callback. SCX_HAS_OP is a bit-test macro:

#define SCX_HAS_OP(sch, op) test_bit(SCX_OP_IDX(op), (sch)->has_op)

If the BPF program does not implement running, this check returns false and the call is skipped.
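
As a rough illustration of what this bit test does, here is a minimal userspace sketch of a has_op bitmap. The names demo_sched, demo_set_op, demo_has_op and the two op indices are hypothetical stand-ins; in the kernel the index comes from SCX_OP_IDX(op) and the test itself is test_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical op indices; the kernel derives them from struct sched_ext_ops. */
enum demo_op_idx { DEMO_OP_RUNNING, DEMO_OP_STOPPING, DEMO_OP_NR };

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NR_LONGS \
    (((size_t)DEMO_OP_NR + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

struct demo_sched {
    unsigned long has_op[DEMO_NR_LONGS]; /* one bit per implemented op */
};

/* Record that the scheduler implements op @idx. */
void demo_set_op(struct demo_sched *sch, size_t idx)
{
    assert(idx < (size_t)DEMO_OP_NR);
    sch->has_op[idx / DEMO_BITS_PER_LONG] |= 1UL << (idx % DEMO_BITS_PER_LONG);
}

/* Userspace stand-in for test_bit(): is op @idx implemented? */
bool demo_has_op(const struct demo_sched *sch, size_t idx)
{
    assert(idx < (size_t)DEMO_OP_NR);
    return (sch->has_op[idx / DEMO_BITS_PER_LONG] >>
            (idx % DEMO_BITS_PER_LONG)) & 1UL;
}
```

In this model, a scheduler that only sets DEMO_OP_RUNNING answers true for running and false for stopping, which is the same yes/no answer SCX_HAS_OP gives at the call site.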

Condition 2: p->scx.flags & SCX_TASK_QUEUED

The task must still be on the scx runqueue. The flag is defined in include/linux/sched/ext.h:

/* scx_entity.flags */
enum scx_ent_flags {
    SCX_TASK_QUEUED = 1 << 0, /* on ext runqueue */
    // ...
};

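The comment /* see dequeue_task_scx() on why we skip when !QUEUED */ points at an important detail. Step ① (dispatch_dequeue) clears p->scx.dsq, and step ④ (clr_task_runnable) removes the task from runnable_list, but the SCX_TASK_QUEUED flag is cleared only by dequeue_task_scx(), and dequeue_task_scx() is never called on the set_next_task_scx path.

Therefore, in both the core-sched early-execution case and the normal scheduling case, the flag is still set when step ③ checks SCX_TASK_QUEUED. On this path, the running callback is invoked whenever the BPF scheduler has registered one.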
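
The flag lifecycle described here can be modeled in a few lines of plain C. This is a simplified sketch, not kernel code: demo_task and the demo_* functions are hypothetical, and the model assumes that enqueueing sets the flag and that only the dequeue step clears it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_TASK_QUEUED (1u << 0) /* mirrors SCX_TASK_QUEUED = 1 << 0 */

struct demo_task {
    unsigned int flags;
    bool on_dsq;      /* stands in for p->scx.dsq being set */
    bool on_runnable; /* stands in for membership in runnable_list */
};

/* Assumed behavior: enqueueing marks the task as queued. */
void demo_enqueue(struct demo_task *p)
{
    assert(p != NULL);
    p->flags |= DEMO_TASK_QUEUED;
    p->on_dsq = true;
    p->on_runnable = true;
}

/* Returns true when the running callback would be invoked. */
bool demo_set_next_task(struct demo_task *p, bool has_running_op)
{
    bool call_running;

    assert(p != NULL);
    p->on_dsq = false;                               /* step 1 */
    call_running = has_running_op &&
                   (p->flags & DEMO_TASK_QUEUED);    /* step 3 */
    p->on_runnable = false;                          /* step 4 */
    return call_running;
}

/* The only place in this model where the queued flag is cleared. */
void demo_dequeue(struct demo_task *p)
{
    assert(p != NULL);
    p->flags &= ~DEMO_TASK_QUEUED;
}
```

Steps 1 and 4 change the DSQ and runnable-list state but leave the flag alone, so the check in step 3 only fails after an explicit dequeue.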

6.3 Clearing the runnable state

clr_task_runnable(p, true);

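The task is about to start executing, so it is no longer in the "runnable" state. Note that reset_runnable_at is true here, which means the runnable_at timestamp is reset the next time the task is enqueued.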
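
The deferred reset can be pictured with a small userspace model. This is a sketch under assumptions: the flag name DEMO_RESET_RUNNABLE_AT, its bit position, and the demo_* helpers are hypothetical, and the model assumes the reset request is recorded when the task is cleared and consumed the next time it is marked runnable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_RESET_RUNNABLE_AT (1u << 1) /* hypothetical bit position */

struct demo_task {
    unsigned int flags;
    unsigned long runnable_at; /* when the task last became runnable */
    bool on_runnable_list;
};

/* Model of clr_task_runnable(p, reset_runnable_at). */
void demo_clr_task_runnable(struct demo_task *p, bool reset_runnable_at)
{
    assert(p != NULL);
    p->on_runnable_list = false;
    if (reset_runnable_at)
        p->flags |= DEMO_RESET_RUNNABLE_AT; /* remember the request */
}

/* Model of the matching "mark runnable" step; @now stands in for the clock. */
void demo_set_task_runnable(struct demo_task *p, unsigned long now)
{
    assert(p != NULL);
    if (p->flags & DEMO_RESET_RUNNABLE_AT) {
        p->runnable_at = now;               /* timestamp is refreshed here */
        p->flags &= ~DEMO_RESET_RUNNABLE_AT;
    }
    p->on_runnable_list = true;
}
```

The timestamp does not change at the moment the task starts running; it changes when the task next becomes runnable, and only if the reset was requested.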


7. running's counterpart: stopping

running and stopping form a symmetric pair of callbacks: running is called from set_next_task_scx, and stopping is called from put_prev_task_scx. The two are tied together in put_prev_set_next_task:

put_prev_set_next_task(rq, prev, next)
    ├─ prev->sched_class->put_prev_task(rq, prev, next)
    │   └─ put_prev_task_scx()        → ops.stopping(prev, runnable=true)
    └─ next->sched_class->set_next_task(rq, next, true)
        └─ set_next_task_scx()         → ops.running(next)
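
To make the ordering concrete, the following userspace sketch records which notification would fire first. It is only a model: demo_trace and the demo_* functions are hypothetical, tasks are represented by integer ids, and both tasks are assumed to be queued sched_ext tasks with both callbacks registered.

```c
#include <assert.h>
#include <stddef.h>

enum demo_event { DEMO_EV_STOPPING, DEMO_EV_RUNNING };

struct demo_trace {
    enum demo_event ev[4];
    int task[4];
    size_t n;
};

static void demo_record(struct demo_trace *tr, enum demo_event ev, int task)
{
    assert(tr != NULL && tr->n < 4);
    tr->ev[tr->n] = ev;
    tr->task[tr->n] = task;
    tr->n++;
}

/* Model of put_prev_set_next_task(): nothing fires when next == prev. */
void demo_put_prev_set_next(struct demo_trace *tr, int prev, int next)
{
    if (next == prev)
        return;
    demo_record(tr, DEMO_EV_STOPPING, prev); /* put_prev_task_scx() */
    demo_record(tr, DEMO_EV_RUNNING, next);  /* set_next_task_scx() */
}
```

Switching from task 1 to task 2 records stopping for 1 followed by running for 2, while "switching" from task 1 to itself records nothing.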

7.1 Definition of the stopping callback

/**
 * @stopping: A task is stopping execution
 * @p: task stopping to run
 * @runnable: is task @p still runnable?
 *
 * Note that this callback may be called from a CPU other than the
 * one the task was running on. This can happen when a task
 * property is changed (i.e., affinity), since dequeue_task_scx(),
 * which triggers this callback, may run on a CPU different from
 * the task's assigned CPU.
 *
 * Therefore, always use scx_bpf_task_cpu(@p) to retrieve the CPU
 * the task was running on.
 *
 * See ->runnable() for explanation on the task state notifiers. If
 * !@runnable, ->quiescent() will be invoked after this operation
 * returns.
 */
void (*stopping)(struct task_struct *p, bool runnable);

7.2 Source walkthrough of put_prev_task_scx

put_prev_task_scx() is the put_prev_task implementation of ext_sched_class (kernel/sched/ext.c:2364):

static void put_prev_task_scx(struct rq *rq, struct task_struct *p,
                              struct task_struct *next)
{
    struct scx_sched *sch = scx_root;

    /* see kick_cpus_irq_workfn() */
    smp_store_release(&rq->scx.pnt_seq, rq->scx.pnt_seq + 1);

    update_curr_scx(rq);                           // ← account the slice consumed so far

    /* see dequeue_task_scx() on why we skip when !QUEUED */
    if (SCX_HAS_OP(sch, stopping) && (p->scx.flags & SCX_TASK_QUEUED))
        SCX_CALL_OP_TASK(sch, SCX_KF_REST, stopping, rq, p, true);  // ← stopping callback

    if (p->scx.flags & SCX_TASK_QUEUED) {
        set_task_runnable(rq, p);

        // If there is slice left and we are not in bypass mode, put it back at the head of the local DSQ
        if (p->scx.slice && !scx_rq_bypassing(rq)) {
            dispatch_enqueue(sch, &rq->scx.local_dsq, p, SCX_ENQ_HEAD);
            goto switch_class;
        }

        // Otherwise re-enqueue
        if (sched_class_above(&ext_sched_class, next->sched_class)) {
            WARN_ON_ONCE(!(sch->ops.flags & SCX_OPS_ENQ_LAST));
            do_enqueue_task(rq, p, SCX_ENQ_LAST, -1);
        } else {
            do_enqueue_task(rq, p, 0, -1);
        }
    }

switch_class:
    if (next && next->sched_class != &ext_sched_class)
        switch_class(rq, next);                   // ← notify that the CPU is being preempted by a higher class
}

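In put_prev_task_scx, the stopping call likewise requires the SCX_TASK_QUEUED flag. After that, depending on whether the task still has slice left, it is either put back at the head of the local DSQ or re-enqueued.

update_curr_scx() is where the slice is decremented: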
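
The branch structure of that requeue decision can be summarized as a pure function. This is a sketch for illustration only: the enum, the struct, and demo_put_prev_decision are hypothetical names, and next_below_ext stands in for the sched_class_above(&ext_sched_class, next->sched_class) test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_requeue {
    DEMO_NOT_QUEUED,     /* task is no longer queued, nothing to requeue */
    DEMO_LOCAL_DSQ_HEAD, /* dispatch_enqueue(..., SCX_ENQ_HEAD) */
    DEMO_ENQUEUE_LAST,   /* do_enqueue_task(..., SCX_ENQ_LAST, -1) */
    DEMO_ENQUEUE_PLAIN,  /* do_enqueue_task(..., 0, -1) */
};

struct demo_prev {
    bool queued;    /* SCX_TASK_QUEUED is set */
    uint64_t slice; /* remaining time slice */
};

/* Which requeue path does the previous task take? */
enum demo_requeue demo_put_prev_decision(const struct demo_prev *p,
                                         bool bypassing, bool next_below_ext)
{
    assert(p != NULL);
    if (!p->queued)
        return DEMO_NOT_QUEUED;
    if (p->slice && !bypassing)
        return DEMO_LOCAL_DSQ_HEAD;
    return next_below_ext ? DEMO_ENQUEUE_LAST : DEMO_ENQUEUE_PLAIN;
}
```

A task with slice left goes back to the head of the local DSQ unless the runqueue is bypassing; once the slice is used up, the class of the next task decides whether SCX_ENQ_LAST is passed.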

static void update_curr_scx(struct rq *rq)
{
    struct task_struct *curr = rq->curr;
    s64 delta_exec;

    delta_exec = update_curr_common(rq);
    if (unlikely(delta_exec <= 0))
        return;

    if (curr->scx.slice != SCX_SLICE_INF) {
        curr->scx.slice -= min_t(u64, curr->scx.slice, delta_exec);
        if (!curr->scx.slice)
            touch_core_sched(rq, curr);
    }
}
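
The clamped subtraction is easy to check in isolation. The sketch below is a userspace model with hypothetical names; DEMO_SLICE_INF is assumed to be the maximum 64-bit value, standing in for SCX_SLICE_INF.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SLICE_INF UINT64_MAX /* assumption: stands in for SCX_SLICE_INF */

/* Returns the slice that remains after charging @delta_exec nanoseconds. */
uint64_t demo_charge_slice(uint64_t slice, int64_t delta_exec)
{
    uint64_t delta, remaining;

    /* Nothing to charge, or an infinite slice that is never decremented. */
    if (delta_exec <= 0 || slice == DEMO_SLICE_INF)
        return slice;

    delta = (uint64_t)delta_exec;
    remaining = slice - (delta < slice ? delta : slice);
    assert(remaining <= slice); /* clamping means the slice never wraps */
    return remaining;
}
```

Charging 5 ns against a 20 ns slice leaves 15 ns, and charging 50 ns leaves 0 rather than wrapping around to a huge value.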

7.3 Timing relationship between running and stopping

[Sequence diagram; participants: __schedule, put_prev_task_scx, set_next_task_scx, BPF Scheduler]
1. __schedule calls put_prev_set_next_task(prev, next).
2. put_prev_task_scx: update_curr_scx() decrements prev's slice, ops.stopping(prev, runnable=true) is delivered to the BPF scheduler, then prev is put back on a DSQ or re-enqueued.
3. set_next_task_scx(next): ops_dequeue may run (core-sched), the exec_start timestamp is set, ops.running(next) is delivered to the BPF scheduler, then clr_task_runnable() runs and the tick dependency is refreshed.


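8. Important questions

8.1 The CPU executing the running callback may differ from the CPU the task is about to run on

The comment on the running callback in 6.18.26 stresses this point (the 6.15.7 comment does not mention it):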

Note that this callback may be called from a CPU other than the one the task is going to run on.

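In other words, the CPU that executes the BPF running function is not necessarily the CPU the task is about to run on. The typical case is a change to the task's CPU affinity, which causes set_next_task_scx() to execute on CPU A while the task will actually run on CPU B.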
[Flowchart: cross-CPU invocation]
CPU A (the CPU doing the scheduling): set_next_task_scx() → BPF running() executes on CPU A
  → the task is about to be migrated →
CPU B (the task's target CPU): the task actually runs on CPU B

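This means that inside the running callback you cannot use bpf_get_smp_processor_id() to find the CPU the task is about to run on: it returns CPU A (the CPU currently executing the BPF program), not CPU B (the task's target CPU).

The correct approach is to call scx_bpf_task_cpu(p) to get the task's target CPU: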

void BPF_STRUCT_OPS(my_running, struct task_struct *p)
{
    s32 cpu = scx_bpf_task_cpu(p);  // ← correct: the task's target CPU
    // Do not use bpf_get_smp_processor_id(); it may report a different CPU
}
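
Why the distinction matters shows up as soon as the callback keeps per-CPU state. The following userspace sketch is a hypothetical model, not BPF code: exec_cpu plays the role of the value bpf_get_smp_processor_id() would report, task_cpu plays the role of scx_bpf_task_cpu(p), and DEMO_NR_CPUS is an arbitrary size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4 /* arbitrary size for the model */

struct demo_stats {
    uint64_t starts[DEMO_NR_CPUS]; /* task starts accounted per CPU */
};

/*
 * @exec_cpu: CPU executing the callback
 * @task_cpu: CPU the task is going to run on
 */
void demo_on_running(struct demo_stats *st, int exec_cpu, int task_cpu)
{
    (void)exec_cpu; /* deliberately ignored: it may not be the task's CPU */
    assert(st != NULL);
    assert(task_cpu >= 0 && task_cpu < DEMO_NR_CPUS);
    st->starts[task_cpu]++;
}
```

If the callback executes on CPU 0 for a task bound for CPU 1, the start is charged to CPU 1; indexing by the executing CPU would have charged CPU 0 instead.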

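8.2 How often running is called

running is called in the following scenarios:

| Scenario | Description |
| --- | --- |
| Normal scheduling switch | __schedule() → pick_next_task() → put_prev_set_next_task() → set_next_task_scx() |
| Core-sched forced scheduling | The core-sched mechanism forces a switch of the task on an SMT sibling |
| Task affinity change | When the task is migrated to a new CPU, triggered in the new CPU's set_next_task_scx |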

running is not called in the following cases:

  • next == prev (the same task keeps running)
  • the task does not have SCX_TASK_QUEUED set
  • the BPF scheduler has not registered a running callback

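8.3 No blocking operations in running

The running callback runs with the rq lock held (SCX_KF_REST). At that point:

  • the runqueue lock (rq->lock) is held
  • interrupts are disabled
  • the kernel is in atomic context

Under these constraints, any operation that may sleep is forbidden. In practice:

  • bpf_ktime_get_ns() is safe (it does not block)
  • bpf_printk() is safe
  • anything involving memory allocation, lock contention, or I/O may lead to a deadlock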

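8.4 Can running modify the task's scheduling parameters?

Yes, within limits. Common practices include:

  • updating per-task scheduling metadata (such as p->scx.dsq_vtime)
  • updating statistics in BPF maps
  • adjusting custom scheduling-policy parameters

Fields that drive core scheduler behavior need more care. Adjusting p->scx.slice (the time slice) in running is safe, but p->scx.weight should not be written directly: weight changes should be handled through ops.set_weight().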
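
A common use of the running/stopping pair for per-task statistics is on-CPU time accounting. The sketch below models that in userspace under stated assumptions: demo_task_ctx and both functions are hypothetical, and the timestamp is passed in as a parameter, whereas a BPF scheduler would read it with bpf_ktime_get_ns() and keep the context in a BPF map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_task_ctx {
    uint64_t started_at; /* timestamp captured when running fired */
    uint64_t total_ns;   /* accumulated on-CPU time */
    int on_cpu;          /* 1 between running and stopping */
};

/* What a running callback would do: remember when execution began. */
void demo_running(struct demo_task_ctx *c, uint64_t now_ns)
{
    assert(c != NULL && !c->on_cpu);
    c->started_at = now_ns;
    c->on_cpu = 1;
}

/* What a stopping callback would do: charge the elapsed time. */
void demo_stopping(struct demo_task_ctx *c, uint64_t now_ns)
{
    assert(c != NULL && c->on_cpu);
    c->total_ns += now_ns - c->started_at;
    c->on_cpu = 0;
}
```

Both functions only read a clock and update a few integers, which fits the no-blocking constraint from section 8.3.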


9. In practice: running in scx_simple

scx_simple is the example sched_ext scheduler shipped with the Linux kernel (tools/sched_ext/scx_simple.bpf.c). It implements a global weighted vtime scheduler.

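Its running callback advances the global vtime clock. When a task starts executing, if its vtime is ahead of the current global clock, the global clock is pushed forward to that vtime. The clock therefore only moves forward, and tasks with a smaller vtime (those owed more CPU time) are still scheduled first.

static u64 vtime_now;  // global vtime clock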

void BPF_STRUCT_OPS(simple_running, struct task_struct *p)
{
    if (fifo_sched)
        return;

    /*
     * Global vtime always progresses forward as tasks start executing. The
     * test and update can be performed concurrently from multiple CPUs and
     * thus racy. Any error should be contained and temporary. Let's just
     * live with it.
     */
    if (time_before(vtime_now, p->scx.dsq_vtime))
        vtime_now = p->scx.dsq_vtime;
}
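
The comparison has to survive 64-bit wraparound, which is why a plain `<` is not used. Here is a minimal userspace sketch of the idea; demo_time_before and demo_advance_vtime are hypothetical names, and the signed-difference comparison is an assumption about how a helper like time_before is typically written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is before b" for 64-bit virtual time. */
bool demo_time_before(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b) < 0;
}

/* Model of the update in simple_running(): returns the new global clock. */
uint64_t demo_advance_vtime(uint64_t vtime_now, uint64_t task_vtime)
{
    uint64_t next = demo_time_before(vtime_now, task_vtime) ? task_vtime
                                                            : vtime_now;

    assert(!demo_time_before(next, vtime_now)); /* never moves backwards */
    return next;
}
```

A task vtime of 150 moves a clock at 100 forward to 150, a task vtime of 50 leaves it at 100, and a task vtime that has just wrapped past zero still counts as "later" than a clock sitting near the top of the 64-bit range.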

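10. running in the full callback sequence

The following is the complete callback sequence for an scx task, from creation to being scheduled and executed: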
[Sequence diagram; participants: fork path, scx framework, BPF scheduler]
New task created:
  scx_fork() → ops.init_task(p, fork=true)                      state: NONE → INIT → READY
  scx_post_fork() → ops.enable(p), ops.set_weight(p, weight)    state: READY → ENABLED
Task is woken up:
  ops.select_cpu(p, prev_cpu, wake_flags) → ops.runnable(p, enq_flags) → ops.enqueue(p, enq_flags)
Scheduling loop (repeats on every scheduling switch):
  ops.dispatch(cpu, prev) → ops.stopping(prev, runnable) → ops.running(next)
  next executes on the CPU until its slice runs out or it is preempted
Task exits or blocks:
  ops.stopping(p, runnable=false) → ops.quiescent(p, deq_flags)
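
The state progression on the fork path can be written down as a tiny state machine. This sketch is a deliberately narrow model with hypothetical names: it covers only the forward steps NONE → INIT → READY → ENABLED shown in the diagram and ignores every other transition a real task can make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Fork-path states, in the order the diagram lists them. */
enum demo_task_state { DEMO_NONE, DEMO_INIT, DEMO_READY, DEMO_ENABLED };

/* Accept only a single forward step; anything else is rejected. */
bool demo_advance(enum demo_task_state *st, enum demo_task_state to)
{
    assert(st != NULL);
    if ((int)to != (int)*st + 1)
        return false;
    *st = to;
    return true;
}
```

In this model a task cannot reach ENABLED, and therefore cannot enter the runnable/running cycle, without first passing through INIT and READY.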

12. Summary: a checklist for eBPF scheduler developers

[Mind map: running key points]
  • Trigger timing: fires on every scheduling switch; paired with stopping, not one-off
  • Execution order: stopping for the old task first, then running for the new task
  • Cross-CPU trap: may not execute on the target CPU; the CPU must be obtained with scx_bpf_task_cpu
  • Context constraints: rq lock held, no blocking; only SCX_KF_REST calls are allowed
  • Typical use: advancing the global vtime clock; implementing vtime scheduling together with stopping
  • core-sched: also called on forced scheduling; triggers SCX_DEQ_CORE_SCHED_EXEC

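| Concern | Key point |
| --- | --- |
| Trigger timing | Each time a task is scheduled onto a CPU to execute; not one-off |
| Call frequency | Paired with stopping; called once per scheduling switch |
| Execution order | stopping (old task) first, then running (new task) |
| Cross-CPU invocation | running may be invoked on a CPU other than the task's target CPU; use scx_bpf_task_cpu(p) |
| Context constraints | The rq lock is held; no blocking operations |
| Typical use | Advancing the global clock |
| core-sched | running is also called on core-sched forced scheduling (via the SCX_DEQ_CORE_SCHED_EXEC flag) |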

References

  • Linux 6.18.26 kernel source, kernel/sched/
  • scx_simple scheduler source, tools/sched_ext/scx_simple.bpf.c