SPI测试内核日志分析
900.082769 SPICORE_TRACE __spi_sync: = => Enter.
900.088086 SPICORE_TRACE __spi_sync: the master of this device is: spi0 (Bus Number: 0).
900.096874 SPICORE_TRACE __spi_sync: master->transfer == spi_queued_transfer, Queueing message (modern path).
900.107067 SPICORE_TRACE __spi_sync: kthread_queue_work in __spi_queued_transfer which bindings the __spi_pump_messages has initialized.
900.121213 SPICORE_TRACE __spi_sync: calling for __spi_pump_messages to processing queue...
900.130043 SPICORE_TRACE __spi_sync: it is trying run in the caller context in current thread (such as ioctl/spi sync).
900.141692 SPICORE_TRACE __spi_pump_messages: => Enter. Context: Caller_Optimization.
900.151260 SPICORE_TRACE __spi_pump_messages: Dequeued message 88659e98. Queue length decremented.
900.160983 SPICORE_TRACE __spi_pump_messages: Acquired I/O Mutex: master->io_mutex.
900.169078 SPICORE_TRACE __spi_pump_messages: Hardware was IDLE. Waking up the transfer hardware.
900.178718 SPICORE_TRACE __spi_pump_messages: Calling master->prepare_message for msg 88659e98
900.188040 SPICORE_TRACE __spi_pump_messages: Mapping DMA buffers (spi_map_msg)...
900.196309\] \[SPICORE_TRACE\] __spi_pump_messages: master-\>transfer_one_message points to \[spi_transfer_one_message+0x0/0x558
900.207931 SPICORE_TRACE __spi_pump_messages: <== Exit. Handover to Master Driver (transfer_one_message).
900.219352 SPICORE_TRACE __spi_pump_messages: => Enter. Context: KWorker_Thread.
root@100ask:\~# 900.227785 SPICORE_TRACE __spi_sync: master->cur_msg is not NULL. Going to sleep... wait_for_completion.
900.227796 SPICORE_TRACE __spi_sync: Woke up! Transfer finished. Status=0
900.227800 SPICORE_TRACE __spi_sync: <== Exit.
900.227804 SPICORE_TRACE spi_sync: __spi_sync returned Success.
900.227807 SPICORE_TRACE spi_sync: <== Exit.
900.227811 SPIDEV_TRACE spidev_sync: The hardware action has been completed.
900.227817 SPIDEV_TRACE spidev_sync: <== Exit. Final status=2, Actual Length=2 bytes
900.227821 SPIDEV_TRACE spidev_message: Sync done! Copying RX data back to user.
900.227828 SPIDEV_TRACE spidev_message: <== Success and Exit.
900.227831 SPIDEV_TRACE spidev_ioctl: Transfer Complete. Total Bytes=2
900.227837 SPIDEV_TRACE spidev_ioctl: <== Exit.
900.227876 SPIDEV_TRACE spidev_release: => Enter. (pid=341)
900.227882 SPIDEV_TRACE spidev_release: current users=0, last user closed.
900.227885 SPIDEV_TRACE spidev_release: Cleaning up resources: kfree for tx_buffer and rx_buffer.
900.227890 SPIDEV_TRACE spidev_release: Device is still present. Keeping spidev_data object alive.
900.227894 SPIDEV_TRACE spidev_release: <== Exit.
900.350416 SPICORE_TRACE __spi_pump_messages: Queue is idle. Exiting.
900.359811 SPICORE_TRACE __spi_pump_messages: kthread_queue_work. Exiting.
问题分析:第二次进入__spi_pump_messages后静默返回
一、问题现象
[ 1016.073778] __spi_pump_messages: <== Exit. Handover to Master Driver
[ 1016.084287] __spi_pump_messages: ==> Enter. Context: KWorker_Thread.
[root@100ask:~]# [ 1016.092134] __spi_sync: master->cur_msg is not NULL...
↑ 这里出现了shell提示符!
关键观察:
- 第二次进入后只打印了"Enter"就没有后续输出
- 出现了
[root@100ask:~]#提示符(说明用户态程序已返回) - 但内核日志仍在继续(说明内核线程仍在运行)
- 在日志最后才打印了
二、返回位置
c
if (list_empty(&master->queue) || !master->running) {
printk("[SPICORE_TRACE] __spi_pump_messages: Queue is idle. Exiting.\n");
if (!master->busy) {
// ❌ 不是这里
return;
}
/* Only do teardown in the thread */
if (!in_kthread) {
kthread_queue_work(&master->kworker, &master->pump_messages);
// ❌ 也不是这里(因为in_kthread=true)
return;
}
// ✅ 是这里!继续执行teardown逻辑
master->busy = false;
master->idling = true;
// ... teardown代码 ...
printk("[SPICORE_TRACE] __spi_pump_messages: kthread_queue_work. Exiting.\n");
return; // ← 在这里返回!
}
```
三、为什么日志被"吞掉"了?
原因:printk缓冲区刷新时机
关键事实:
c
printk(KERN_INFO "[SPICORE_TRACE] __spi_pump_messages: Controller BUSY...\n");
这行代码已经执行了 ,但日志没有立即显示!
printk的缓冲机制
printk()
↓
写入内核日志缓冲区 (log buffer)
↓
等待刷新时机:
├─ 缓冲区满
├─ 遇到换行符 \n(通常会立即刷新)
├─ 主动调用 console_unlock()
└─ 系统调度点 (调度器触发)
↓
显示到控制台/dmesg
四、完整执行流程重建
时间线分析
T0: 用户调用 ioctl(fd, SPI_IOC_MESSAGE, xfer)
↓
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
第一次进入 __spi_pump_messages (in_kthread=false)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
T1: [900.207931] <== Exit. Handover to Master Driver
↓
检查:
- master->cur_msg = NULL? ❌ (有消息)
- master->idling = true? ❌
- queue empty? ❌ (刚加入的消息)
↓
提取消息 → master->cur_msg = 8858be98
master->busy = false → true (硬件从IDLE变为BUSY)
↓
唤醒硬件 (prepare_transfer_hardware)
准备消息 (prepare_message)
映射DMA (spi_map_msg)
↓
调用 master->transfer_one_message()
↓
传输启动(DMA/中断方式)
↓
函数返回 → 但传输在后台继续!
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
第二次进入 __spi_pump_messages (in_kthread=true)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
T2: [900.219352] ==> Enter. Context: KWorker_Thread.
↓
检查1:master->cur_msg != NULL?
↓
⚠️ 此时的关键状态:
- master->cur_msg = 8858be98 (第一次的消息还在传输中)
- master->queue = EMPTY (只有一个消息,已被提取)
- master->busy = true (硬件正在工作)
↓
进入分支:list_empty(&master->queue) = TRUE
↓
打印:"Queue is idle. Exiting."
↓
检查2:!master->busy? ❌ (busy=true,硬件还在干活)
↓
检查3:!in_kthread? ❌ (in_kthread=true,是kworker上下文)
↓
执行 TEARDOWN 逻辑:
- master->busy = true → false
- master->idling = true
- 释放 dummy_rx/dummy_tx
- 调用 unprepare_transfer_hardware (关闭硬件)
- pm_runtime_put_autosuspend (电源管理)
- master->idling = false
↓
打印:"kthread_queue_work. Exiting."
返回
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
与此同时,用户线程在等待
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
T3: [900.227785] __spi_sync: Going to sleep... wait_for_completion.
↓
用户线程阻塞在:wait_for_completion(&done)
等待传输完成通知
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
传输完成(中断触发)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
T4: DMA/中断完成
↓
spi_finalize_current_message()
- master->cur_msg = NULL
- complete(msg->context) → 唤醒 wait_for_completion
↓
[900.227796] __spi_sync: Woke up! Transfer finished.
↓
用户线程继续执行 → 拷贝RX数据 → ioctl返回
↓
[900.227876] spidev_release: close(fd)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Kworker再次被调度(延迟的teardown日志)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
T5: [900.350416] Queue is idle. Exiting.
[900.359811] kthread_queue_work. Exiting.
↑ 这是 T2 时刻执行的 teardown 的延迟日志!
五、关键问题:为什么日志延迟了?
答案:printk的异步输出 + kworker的低优先级
c
/* T2 时刻的代码执行 */
master->busy = false;
master->idling = true;
spin_unlock_irqrestore(&master->queue_lock, flags);
kfree(master->dummy_rx); // ← 耗时操作
kfree(master->dummy_tx); // ← 耗时操作
if (master->unprepare_transfer_hardware)
master->unprepare_transfer_hardware(master); // ← 关闭硬件,可能很慢
if (master->auto_runtime_pm) {
pm_runtime_mark_last_busy(master->dev.parent);
pm_runtime_put_autosuspend(master->dev.parent); // ← 电源管理操作
}
trace_spi_master_idle(master);
spin_lock_irqsave(&master->queue_lock, flags);
master->idling = false;
spin_unlock_irqrestore(&master->queue_lock, flags);
printk("[SPICORE_TRACE] kthread_queue_work. Exiting.\n"); // ← 最后才打印
return;
时间开销:
kfree(): ~微秒级unprepare_transfer_hardware(): 可能涉及寄存器操作,~毫秒级pm_runtime_put_autosuspend(): 可能触发电源域关闭,~毫秒级
总时间 :T2 → T5 = 900.350 - 900.219 = 131ms
六、为什么会有Teardown?
设计哲学:省电优化
传输完成后的选择:
┌─────────────────────────┐
│ 选项A:保持硬件开启 │
│ - 优点:下次传输快 │
│ - 缺点:持续耗电 │
└─────────────────────────┘
┌─────────────────────────┐
│ 选项B:关闭硬件省电 │ ← Linux SPI选择这个
│ - 优点:节能 │
│ - 缺点:下次需要重新初始化 │
└─────────────────────────┘
Teardown的触发条件
c
if (list_empty(&master->queue) || !master->running) {
// 队列空了,没有新任务
if (!master->busy) {
// 硬件已经是关闭状态,直接返回
return;
}
if (!in_kthread) {
// 不能在用户上下文关闭硬件(可能很慢)
// 交给kworker异步处理
kthread_queue_work(&master->kworker, &master->pump_messages);
return;
}
// 在kworker上下文中执行teardown
master->busy = false;
// 关闭DMA、时钟、电源域...
}
七、修正后的日志解读
正确的事件顺序
[900.207931] <== Exit. Handover to Master Driver
↑ 第一次调用返回,传输已启动
[900.219352] ==> Enter. Context: KWorker_Thread
↑ kworker被唤醒,检查是否有新任务
[900.227785] __spi_sync: Going to sleep...
↑ 用户线程等待传输完成
[900.227796] __spi_sync: Woke up!
↑ 中断完成,用户线程被唤醒
[900.227876] spidev_release
↑ 用户进程关闭设备文件
[900.350416] Queue is idle. Exiting.
[900.359811] kthread_queue_work. Exiting.
↑ kworker的teardown日志(延迟输出)
延迟的原因
不是日志缓冲! 而是真实的执行延迟:
T2: 900.219 - kworker开始teardown
↓
执行 unprepare_transfer_hardware()
执行 pm_runtime_put_autosuspend()
↓
T5: 900.350 - teardown完成,打印日志
131ms的延迟来自:
- 硬件关闭操作
- 电源域切换
- 可能的延迟工作队列
七、为什么在Queue Empty时还要teardown?
完整的状态检查逻辑
c
if (list_empty(&master->queue) || !master->running) {
/* 情况1:队列空 & 硬件已关 → 什么都不做 */
if (!master->busy) {
return;
}
/* 情况2:队列空 & 硬件还开着 → 需要关闭省电 */
if (in_kthread) {
// 执行 teardown
} else {
// 延迟到 kworker 执行
}
}
实际情况
第二次进入时的状态:
- list_empty(&master->queue) = TRUE ← 唯一的消息已被提取
- master->cur_msg = 8858be98 ← 但还在传输中
- master->busy = TRUE ← 硬件正在工作
- in_kthread = TRUE ← 在kworker上下文
结论:
需要等传输完成后关闭硬件(teardown)
八、优化建议
如果你想减少延迟,可以:
c
/* 在 teardown 前后添加时间戳 */
if (!in_kthread) {
// ...
} else {
unsigned long start = jiffies;
master->busy = false;
master->idling = true;
spin_unlock_irqrestore(&master->queue_lock, flags);
printk("[SPICORE_TRACE] Teardown START at jiffies=%lu\n", jiffies);
kfree(master->dummy_rx);
kfree(master->dummy_tx);
if (master->unprepare_transfer_hardware) {
unsigned long hw_start = jiffies;
master->unprepare_transfer_hardware(master);
printk("[SPICORE_TRACE] unprepare_transfer_hardware took %lu ms\n",
jiffies_to_msecs(jiffies - hw_start));
}
if (master->auto_runtime_pm) {
unsigned long pm_start = jiffies;
pm_runtime_mark_last_busy(master->dev.parent);
pm_runtime_put_autosuspend(master->dev.parent);
printk("[SPICORE_TRACE] pm_runtime ops took %lu ms\n",
jiffies_to_msecs(jiffies - pm_start));
}
printk("[SPICORE_TRACE] Teardown DONE, total %lu ms\n",
jiffies_to_msecs(jiffies - start));
}
九、总结
第二次进入的完整真相
- 进入原因:第一次调用结束后,kworker被调度检查队列
- 返回位置 :
if (list_empty(&queue)) { ... teardown ... } - 执行内容:关闭硬件、释放资源、电源管理
- 延迟原因:硬件操作本身就慢(131ms)
- 是否正常:✅ 完全正常,这是设计的省电机制
关键理解
用户视角:
ioctl → 等待 → 完成 → close
↑___131ms___↑
内核视角:
启动传输 → kworker检查 → 传输完成 → teardown
↑_______________________↑
(后台异步执行)