内核日志分析:__spi_pump_messages的Caller_Optimization和KWorker_Thread

SPI测试内核日志分析

900.082769 SPICORE_TRACE __spi_sync: = => Enter.

900.088086 SPICORE_TRACE __spi_sync: the master of this device is: spi0 (Bus Number: 0).

900.096874 SPICORE_TRACE __spi_sync: master->transfer == spi_queued_transfer, Queueing message (modern path).

900.107067 SPICORE_TRACE __spi_sync: kthread_queue_work in __spi_queued_transfer which bindings the __spi_pump_messages has initialized.

900.121213 SPICORE_TRACE __spi_sync: calling for __spi_pump_messages to processing queue...

900.130043 SPICORE_TRACE __spi_sync: it is trying run in the caller context in current thread (such as ioctl/spi sync).

900.141692 SPICORE_TRACE __spi_pump_messages: => Enter. Context: Caller_Optimization.

900.151260 SPICORE_TRACE __spi_pump_messages: Dequeued message 88659e98. Queue length decremented.

900.160983 SPICORE_TRACE __spi_pump_messages: Acquired I/O Mutex: master->io_mutex.

900.169078 SPICORE_TRACE __spi_pump_messages: Hardware was IDLE. Waking up the transfer hardware.

900.178718 SPICORE_TRACE __spi_pump_messages: Calling master->prepare_message for msg 88659e98

900.188040 SPICORE_TRACE __spi_pump_messages: Mapping DMA buffers (spi_map_msg)...

900.196309\] \[SPICORE_TRACE\] __spi_pump_messages: master-\>transfer_one_message points to \[spi_transfer_one_message+0x0/0x558

900.207931 SPICORE_TRACE __spi_pump_messages: <== Exit. Handover to Master Driver (transfer_one_message).

900.219352 SPICORE_TRACE __spi_pump_messages: => Enter. Context: KWorker_Thread.

root@100ask:\~# 900.227785 SPICORE_TRACE __spi_sync: master->cur_msg is not NULL. Going to sleep... wait_for_completion.

900.227796 SPICORE_TRACE __spi_sync: Woke up! Transfer finished. Status=0

900.227800 SPICORE_TRACE __spi_sync: <== Exit.

900.227804 SPICORE_TRACE spi_sync: __spi_sync returned Success.

900.227807 SPICORE_TRACE spi_sync: <== Exit.

900.227811 SPIDEV_TRACE spidev_sync: The hardware action has been completed.

900.227817 SPIDEV_TRACE spidev_sync: <== Exit. Final status=2, Actual Length=2 bytes

900.227821 SPIDEV_TRACE spidev_message: Sync done! Copying RX data back to user.

900.227828 SPIDEV_TRACE spidev_message: <== Success and Exit.

900.227831 SPIDEV_TRACE spidev_ioctl: Transfer Complete. Total Bytes=2

900.227837 SPIDEV_TRACE spidev_ioctl: <== Exit.

900.227876 SPIDEV_TRACE spidev_release: => Enter. (pid=341)

900.227882 SPIDEV_TRACE spidev_release: current users=0, last user closed.

900.227885 SPIDEV_TRACE spidev_release: Cleaning up resources: kfree for tx_buffer and rx_buffer.

900.227890 SPIDEV_TRACE spidev_release: Device is still present. Keeping spidev_data object alive.

900.227894 SPIDEV_TRACE spidev_release: <== Exit.

900.350416 SPICORE_TRACE __spi_pump_messages: Queue is idle. Exiting.

900.359811 SPICORE_TRACE __spi_pump_messages: kthread_queue_work. Exiting.

问题分析:第二次进入__spi_pump_messages后静默返回

一、问题现象

复制代码
[ 1016.073778] __spi_pump_messages: <== Exit. Handover to Master Driver
[ 1016.084287] __spi_pump_messages: ==> Enter. Context: KWorker_Thread.
[root@100ask:~]# [ 1016.092134] __spi_sync: master->cur_msg is not NULL...
                  ↑ 这里出现了shell提示符!
关键观察
  1. 第二次进入后只打印了"Enter"就没有后续输出
  2. 出现了[root@100ask:~]#提示符(说明用户态程序已返回)
  3. 但内核日志仍在继续(说明内核线程仍在运行)
  4. 在日志最后才打印了

二、返回位置

c 复制代码
if (list_empty(&master->queue) || !master->running) {
    printk("[SPICORE_TRACE] __spi_pump_messages: Queue is idle. Exiting.\n");
    
    if (!master->busy) {
        // ❌ 不是这里
        return;
    }
    
    /* Only do teardown in the thread */
    if (!in_kthread) {
        kthread_queue_work(&master->kworker, &master->pump_messages);
        // ❌ 也不是这里(因为in_kthread=true)
        return;
    }
    
    // ✅ 是这里!继续执行teardown逻辑
    master->busy = false;
    master->idling = true;
    // ... teardown代码 ...
    printk("[SPICORE_TRACE] __spi_pump_messages: kthread_queue_work. Exiting.\n");
    return;  // ← 在这里返回!
}
```

三、为什么日志被"吞掉"了?

原因:printk缓冲区刷新时机

关键事实

c 复制代码
printk(KERN_INFO "[SPICORE_TRACE] __spi_pump_messages: Controller BUSY...\n");

这行代码已经执行了 ,但日志没有立即显示

printk的缓冲机制
复制代码
printk() 
    ↓
写入内核日志缓冲区 (log buffer)
    ↓
等待刷新时机:
├─ 缓冲区满
├─ 遇到换行符 \n(通常会立即刷新)
├─ 主动调用 console_unlock()
└─ 系统调度点 (调度器触发)
    ↓
显示到控制台/dmesg

四、完整执行流程重建

时间线分析
复制代码
T0: 用户调用 ioctl(fd, SPI_IOC_MESSAGE, xfer)
    ↓
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
第一次进入 __spi_pump_messages (in_kthread=false)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

T1: [900.207931] <== Exit. Handover to Master Driver
    ↓
    检查:
    - master->cur_msg = NULL? ❌ (有消息)
    - master->idling = true? ❌
    - queue empty? ❌ (刚加入的消息)
    ↓
    提取消息 → master->cur_msg = 8858be98
    master->busy = false → true (硬件从IDLE变为BUSY)
    ↓
    唤醒硬件 (prepare_transfer_hardware)
    准备消息 (prepare_message)
    映射DMA (spi_map_msg)
    ↓
    调用 master->transfer_one_message()
    ↓
    传输启动(DMA/中断方式)
    ↓
    函数返回 → 但传输在后台继续!

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
第二次进入 __spi_pump_messages (in_kthread=true)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

T2: [900.219352] ==> Enter. Context: KWorker_Thread.
    ↓
    检查1:master->cur_msg != NULL?
    ↓
    ⚠️ 此时的关键状态:
    - master->cur_msg = 8858be98 (第一次的消息还在传输中)
    - master->queue = EMPTY (只有一个消息,已被提取)
    - master->busy = true (硬件正在工作)
    ↓
    进入分支:list_empty(&master->queue) = TRUE
    ↓
    打印:"Queue is idle. Exiting."
    ↓
    检查2:!master->busy? ❌ (busy=true,硬件还在干活)
    ↓
    检查3:!in_kthread? ❌ (in_kthread=true,是kworker上下文)
    ↓
    执行 TEARDOWN 逻辑:
    - master->busy = true → false
    - master->idling = true
    - 释放 dummy_rx/dummy_tx
    - 调用 unprepare_transfer_hardware (关闭硬件)
    - pm_runtime_put_autosuspend (电源管理)
    - master->idling = false
    ↓
    打印:"kthread_queue_work. Exiting."
    返回
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
与此同时,用户线程在等待
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

T3: [900.227785] __spi_sync: Going to sleep... wait_for_completion.
    ↓
    用户线程阻塞在:wait_for_completion(&done)
    等待传输完成通知
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
传输完成(中断触发)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

T4: DMA/中断完成
    ↓
    spi_finalize_current_message()
    - master->cur_msg = NULL
    - complete(msg->context) → 唤醒 wait_for_completion
    ↓
    [900.227796] __spi_sync: Woke up! Transfer finished.
    ↓
    用户线程继续执行 → 拷贝RX数据 → ioctl返回
    ↓
    [900.227876] spidev_release: close(fd)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Kworker再次被调度(延迟的teardown日志)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

T5: [900.350416] Queue is idle. Exiting.
    [900.359811] kthread_queue_work. Exiting.
    ↑ 这是 T2 时刻执行的 teardown 的延迟日志!

五、关键问题:为什么日志延迟了?

答案:printk的异步输出 + kworker的低优先级
c 复制代码
/* T2 时刻的代码执行 */
master->busy = false;
master->idling = true;
spin_unlock_irqrestore(&master->queue_lock, flags);

kfree(master->dummy_rx);  // ← 耗时操作
kfree(master->dummy_tx);  // ← 耗时操作

if (master->unprepare_transfer_hardware)
    master->unprepare_transfer_hardware(master);  // ← 关闭硬件,可能很慢

if (master->auto_runtime_pm) {
    pm_runtime_mark_last_busy(master->dev.parent);
    pm_runtime_put_autosuspend(master->dev.parent);  // ← 电源管理操作
}

trace_spi_master_idle(master);

spin_lock_irqsave(&master->queue_lock, flags);
master->idling = false;
spin_unlock_irqrestore(&master->queue_lock, flags);

printk("[SPICORE_TRACE] kthread_queue_work. Exiting.\n");  // ← 最后才打印
return;

时间开销

  • kfree(): ~微秒级
  • unprepare_transfer_hardware(): 可能涉及寄存器操作,~毫秒级
  • pm_runtime_put_autosuspend(): 可能触发电源域关闭,~毫秒级

总时间 :T2 → T5 = 900.350 - 900.219 = 131ms


六、为什么会有Teardown?

设计哲学:省电优化
复制代码
传输完成后的选择:
┌─────────────────────────┐
│ 选项A:保持硬件开启       │
│ - 优点:下次传输快        │
│ - 缺点:持续耗电          │
└─────────────────────────┘

┌─────────────────────────┐
│ 选项B:关闭硬件省电       │ ← Linux SPI选择这个
│ - 优点:节能             │
│ - 缺点:下次需要重新初始化 │
└─────────────────────────┘
Teardown的触发条件
c 复制代码
if (list_empty(&master->queue) || !master->running) {
    // 队列空了,没有新任务
    
    if (!master->busy) {
        // 硬件已经是关闭状态,直接返回
        return;
    }
    
    if (!in_kthread) {
        // 不能在用户上下文关闭硬件(可能很慢)
        // 交给kworker异步处理
        kthread_queue_work(&master->kworker, &master->pump_messages);
        return;
    }
    
    // 在kworker上下文中执行teardown
    master->busy = false;
    // 关闭DMA、时钟、电源域...
}

七、修正后的日志解读

正确的事件顺序
复制代码
[900.207931] <== Exit. Handover to Master Driver
              ↑ 第一次调用返回,传输已启动

[900.219352] ==> Enter. Context: KWorker_Thread
              ↑ kworker被唤醒,检查是否有新任务

[900.227785] __spi_sync: Going to sleep...
              ↑ 用户线程等待传输完成

[900.227796] __spi_sync: Woke up!
              ↑ 中断完成,用户线程被唤醒

[900.227876] spidev_release
              ↑ 用户进程关闭设备文件

[900.350416] Queue is idle. Exiting.
[900.359811] kthread_queue_work. Exiting.
              ↑ kworker的teardown日志(延迟输出)
延迟的原因

不是日志缓冲! 而是真实的执行延迟

复制代码
T2: 900.219 - kworker开始teardown
    ↓
    执行 unprepare_transfer_hardware()
    执行 pm_runtime_put_autosuspend()
    ↓
T5: 900.350 - teardown完成,打印日志

131ms的延迟来自

  • 硬件关闭操作
  • 电源域切换
  • 可能的延迟工作队列

七、为什么在Queue Empty时还要teardown?

完整的状态检查逻辑
c 复制代码
if (list_empty(&master->queue) || !master->running) {
    /* 情况1:队列空 & 硬件已关 → 什么都不做 */
    if (!master->busy) {
        return;
    }
    
    /* 情况2:队列空 & 硬件还开着 → 需要关闭省电 */
    if (in_kthread) {
        // 执行 teardown
    } else {
        // 延迟到 kworker 执行
    }
}
实际情况
复制代码
第二次进入时的状态:
- list_empty(&master->queue) = TRUE  ← 唯一的消息已被提取
- master->cur_msg = 8858be98         ← 但还在传输中
- master->busy = TRUE                ← 硬件正在工作
- in_kthread = TRUE                  ← 在kworker上下文

结论:
需要等传输完成后关闭硬件(teardown)

八、优化建议

如果你想减少延迟,可以:
c 复制代码
/* 在 teardown 前后添加时间戳 */
if (!in_kthread) {
    // ...
} else {
    unsigned long start = jiffies;
    
    master->busy = false;
    master->idling = true;
    spin_unlock_irqrestore(&master->queue_lock, flags);

    printk("[SPICORE_TRACE] Teardown START at jiffies=%lu\n", jiffies);
    
    kfree(master->dummy_rx);
    kfree(master->dummy_tx);
    
    if (master->unprepare_transfer_hardware) {
        unsigned long hw_start = jiffies;
        master->unprepare_transfer_hardware(master);
        printk("[SPICORE_TRACE] unprepare_transfer_hardware took %lu ms\n",
               jiffies_to_msecs(jiffies - hw_start));
    }
    
    if (master->auto_runtime_pm) {
        unsigned long pm_start = jiffies;
        pm_runtime_mark_last_busy(master->dev.parent);
        pm_runtime_put_autosuspend(master->dev.parent);
        printk("[SPICORE_TRACE] pm_runtime ops took %lu ms\n",
               jiffies_to_msecs(jiffies - pm_start));
    }
    
    printk("[SPICORE_TRACE] Teardown DONE, total %lu ms\n",
           jiffies_to_msecs(jiffies - start));
}

九、总结

第二次进入的完整真相
  1. 进入原因:第一次调用结束后,kworker被调度检查队列
  2. 返回位置if (list_empty(&queue)) { ... teardown ... }
  3. 执行内容:关闭硬件、释放资源、电源管理
  4. 延迟原因:硬件操作本身就慢(131ms)
  5. 是否正常:✅ 完全正常,这是设计的省电机制
关键理解
复制代码
用户视角:
ioctl → 等待 → 完成 → close
        ↑___131ms___↑

内核视角:
启动传输 → kworker检查 → 传输完成 → teardown
           ↑_______________________↑
              (后台异步执行)
相关推荐
m0_377108144 小时前
STM32-adc
stm32·单片机·嵌入式硬件
SmartRadio6 小时前
STM32WLE5 LoRa Smart TDMA 完整协议栈实现(工程级可直接编译)-【1】
javascript·stm32·单片机·嵌入式硬件·lora·自组网·smart tdma
Deitymoon10 小时前
FreeRTOS——中断实验
stm32·单片机
yugi98783810 小时前
STM32 串口计算器实现
stm32·单片机·嵌入式硬件
科芯创展11 小时前
XZ4115B工作电压6-40V 输出电流1.2A 降压恒流LED驱动芯片
stm32·单片机·嵌入式硬件
涂山苏苏⁠12 小时前
stm32之SPI
stm32
MC_J13 小时前
STM32H7 串口 UART/USART从原理到实战
stm32·单片机·嵌入式硬件
学不懂飞行器14 小时前
电赛保姆级教程】从炸管到国一:电赛电源类(DC-DC/单相逆变)硬核避坑与拓扑全指南
stm32·单片机·嵌入式硬件·电赛·fft
JNX_SEMI15 小时前
EG1160:600V半桥驱动,2.5A强驱带保护
stm32·单片机·嵌入式硬件
qq_4294995715 小时前
STM32串口中断接收
stm32·单片机·嵌入式硬件