Android SWT Reboot Issues

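Another SWT reboot? A full teardown of the Android system Watchdog mechanism: why the system mysteriously reboots itself, and how to catch the real culprit (with hands-on log analysis)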

Contents

  • 1. What Is SWT
  • 2. Watchdog's Monitoring Mechanism
  • 3. How Watchdog Decides a Thread Is "Hung"
  • 4. The Complete SWT Reboot Flow
  • 5. Common SWT Trigger Scenarios
  • 6. Hands-On: SWT Log Analysis
  • 7. Hands-On: Reproducing and Locating SWT Problems
  • 8. Hands-On: Fixing SWT Problems
  • 9. A Walk Through the Watchdog Source
  • 10. Common Pitfalls
  • 11. Summary

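1. What Is SWT

SWT stands for Software Watchdog Timer. In practice it is the Watchdog thread inside the SystemServer process, whose only job is to check whether the threads of the core system services have hung.
If your phone suddenly reboots on its own and the sysdump log you pull after boot contains "Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS ***", that was SWT at work.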
[Flowchart] Watchdog thread (checks every 30 seconds) → have the core service threads answered the heartbeat within 60 seconds?
  • Responded normally → nothing happens, monitoring continues
  • Timed out → ★ SWT is triggered → SystemServer is killed → the phone reboots


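2. Watchdog's Monitoring Mechanism

Watchdog is created during SystemServer startup and begins monitoring at that point. It does not watch every thread, only a set of core Handlers registered with it: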
[Diagram] The Watchdog singleton monitors:
  • MainHandler: main thread (AMS / WMS / ...)
  • FgThread: foreground thread
  • IoThread: IO thread
  • DisplayThread: display thread
  • AnimationThread: animation thread
  • BinderThread: Binder call thread
  • ... other monitored threads

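Each monitored thread has a corresponding HandlerChecker:

| Monitored target | Type | What a timeout indicates |
|---|---|---|
| MainHandler | Handler | Main thread deadlocked or hung (the most common case) |
| AMS | Monitor | ActivityManagerService not responding |
| WMS | Monitor | WindowManagerService not responding |
| InputManagerService | Monitor | Input system not responding |
| NetworkManagementService | Monitor | Network management service not responding |
| Binder thread pool | Monitor | All of SystemServer's Binder threads are stuck |

If any monitored thread stays blocked for more than 60 seconds while handling a message, typically because a lock is held too long, Watchdog declares it hung and triggers an SWT reboot.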
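The registration idea can be sketched in plain Java. This is a simplified model rather than the AOSP source: `Monitor`, `LockedService`, and `MiniChecker` are illustrative names, and the only idea carried over is that a monitor check just acquires and releases the service's lock, so it returns promptly unless some other thread is sitting on that lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Simplified model of monitor registration; all names are illustrative, not AOSP code. */
class MonitorRegistrySketch {

    /** A service implements this so the checker can probe its lock. */
    interface Monitor {
        void monitor();
    }

    /** Stand-in for a system service that guards its state with its own object lock. */
    static class LockedService implements Monitor {
        private final String name;

        LockedService(String name) { this.name = name; }

        @Override
        public void monitor() {
            synchronized (this) {
                // Nothing to do: acquiring the lock at all shows the service is not stuck.
            }
        }

        @Override
        public String toString() { return name; }
    }

    /** Holds the registered monitors and probes them one by one. */
    static class MiniChecker {
        private final List<Monitor> monitors = new ArrayList<>();

        void addMonitor(Monitor m) { monitors.add(m); }

        /** Returns the name of the first monitor that did not answer in time, or null if all did. */
        String firstBlocked(long timeoutMs) {
            for (Monitor m : monitors) {
                CountDownLatch done = new CountDownLatch(1);
                Thread probe = new Thread(() -> { m.monitor(); done.countDown(); }, "probe");
                probe.setDaemon(true);   // a stuck probe must not keep the JVM alive
                probe.start();
                try {
                    if (!done.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                        return m.toString();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return m.toString();
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        MiniChecker checker = new MiniChecker();
        checker.addMonitor(new LockedService("ams"));
        checker.addMonitor(new LockedService("wms"));
        System.out.println("blocked: " + checker.firstBlocked(200));
    }
}
```

Holding a service's lock on another thread while calling `firstBlocked` makes that service show up as blocked, which is the situation the table describes for AMS and WMS.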


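3. How Watchdog Decides a Thread Is "Hung"

Watchdog's detection logic is not complicated: it periodically posts an empty message to every monitored Handler and checks whether the message is processed within the timeout: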
[Flowchart] Watchdog.run() (infinite loop):
  • Wait 30 seconds (CHECK_INTERVAL)
  • Record the current time
  • Post an empty message to every monitored Handler
  • Wait 60 seconds (DEFAULT_TIMEOUT)
  • Have all Handlers finished? Yes → back to the start of the loop
  • Otherwise: ★ record the stack of the hung thread → dump all thread stacks to /data/anr/ → send SIGQUIT so that SystemServer kills itself → init detects that SystemServer has died → init reboots the system as configured in init.rc

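Key parameters:

| Parameter | Default | Meaning |
|---|---|---|
| CHECK_INTERVAL | 30 s | A check round runs every 30 seconds |
| DEFAULT_TIMEOUT | 60 s | A message not processed within 60 seconds counts as a timeout |
| Hang to reboot | up to 90 s | 30 s check interval + 60 s timeout |

Note: SWT is not real-time detection. From the moment a thread actually hangs until Watchdog triggers the reboot, up to 90 seconds can pass.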
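The heartbeat check and the 90-second worst case can be modelled in a few lines of plain Java. This is a rough sketch under stated assumptions: a single-thread `ExecutorService` stands in for a monitored Handler, and `HeartbeatSketch`, `respondsWithin`, and `worstCaseLatencySeconds` are made-up names, not part of Android.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Plain-Java model of the heartbeat check; it does not use Android's Handler or Watchdog. */
class HeartbeatSketch {
    // Values quoted in the parameter table.
    static final long CHECK_INTERVAL_S = 30;
    static final long DEFAULT_TIMEOUT_S = 60;

    /** Worst case: the thread hangs right after a check, so a full interval passes before the next probe. */
    static long worstCaseLatencySeconds() {
        return CHECK_INTERVAL_S + DEFAULT_TIMEOUT_S;
    }

    /** Posts an empty task to the worker's queue and reports whether it ran within the timeout. */
    static boolean respondsWithin(ExecutorService worker, long timeoutMs) {
        Future<?> heartbeat = worker.submit(() -> { });
        try {
            heartbeat.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            heartbeat.cancel(false);   // the empty task is still queued behind the stuck one
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        System.out.println("healthy: " + respondsWithin(worker, 500));
        System.out.println("worst case (s): " + worstCaseLatencySeconds());
        worker.shutdownNow();
    }
}
```

The empty task carries no work of its own; it only proves that the queue in front of it is draining, which is why a single stuck message is enough to fail the check.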


4. The Complete SWT Reboot Flow

[Flowchart] A thread holds a lock for more than 60 s → Watchdog detects the HandlerChecker timeout → stacks are collected via ActivityManager.dumpStackTraces() → written to /data/anr/anr_xxx → Process.killProcess kills the SystemServer PID → SystemServer crashes → the init process handles SIGCHLD → how is init.rc configured?
  • Default → restart Zygote / SystemServer (effectively a soft reboot)
  • critical is set (4 times within 4 minutes) → enter recovery mode
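The "4 times within 4 minutes" branch is a sliding-window count. A minimal sketch of that policy follows; it is an illustrative model, not init's actual implementation, and `CrashWindowSketch` and `recordCrash` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative model of a "too many crashes inside a time window" policy. */
class CrashWindowSketch {
    private final int maxCrashes;
    private final long windowMs;
    private final Deque<Long> crashTimes = new ArrayDeque<>();

    CrashWindowSketch(int maxCrashes, long windowMs) {
        this.maxCrashes = maxCrashes;
        this.windowMs = windowMs;
    }

    /** Records a crash at nowMs; returns true once the limit is reached inside the window. */
    boolean recordCrash(long nowMs) {
        crashTimes.addLast(nowMs);
        // Drop crashes that have fallen out of the sliding window.
        while (!crashTimes.isEmpty() && nowMs - crashTimes.peekFirst() > windowMs) {
            crashTimes.removeFirst();
        }
        return crashTimes.size() >= maxCrashes;
    }

    public static void main(String[] args) {
        CrashWindowSketch policy = new CrashWindowSketch(4, 4 * 60_000L);
        long[] crashesAtMs = {0, 50_000, 100_000, 150_000};
        for (long t : crashesAtMs) {
            String action = policy.recordCrash(t) ? "escalate" : "restart service";
            System.out.println(t + " ms: " + action);
        }
    }
}
```

Crashes spaced far enough apart never accumulate to the limit, which is why an occasional SWT ends in a soft reboot rather than recovery.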

The log at the moment of the crash looks like this:

```
01-01 12:00:00.000  1000  1234  5678 W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in monitor com.android.server.am.ActivityManagerService
01-01 12:00:00.000  1000  1234  5678 W Watchdog: foreground thread stack trace:
01-01 12:00:00.000  1000  1234  5678 W Watchdog:     at com.android.server.am.ActivityManagerService.monitor(ActivityManagerService.java:xxxxx)
01-01 12:00:00.000  1000  1234  5678 W Watchdog:     - waiting to lock <0x0a1b2c3d> (a com.android.server.am.ActivityManagerService)
01-01 12:00:00.000  1000  1234  5678 W Watchdog:     held by thread 42
01-01 12:00:00.000  1000  1234  5678 W Watchdog: main thread stack trace:
01-01 12:00:00.000  1000  1234  5678 W Watchdog:     at com.android.server.wm.WindowManagerService.relayoutWindow(...)
01-01 12:00:00.000  1000  1234  5678 W Watchdog:     - waiting to lock <0x0a1b2c3d> (a com.android.server.am.ActivityManagerService)
01-01 12:00:00.000  1000  1234  5678 W Watchdog:     held by thread 42
01-01 12:00:00.000  1000  1234  5678 I Process : Sending signal. PID: 1234 SIG: 3
01-01 12:00:01.000  1000  1234  5678 I Process : Sending signal. PID: 1234 SIG: 9
```

Keywords: WATCHDOG KILLING SYSTEM PROCESS + Blocked in monitor + waiting to lock. When all three appear together, the reboot was almost certainly an SWT triggered by a deadlock.
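Checking for the three keywords is easy to automate. The sketch below scans log lines for them and pulls out the contended lock address and the holder's thread id. The class and method names are made up for illustration, and the regular expressions assume the line format shown in the sample log.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Scans logcat lines for the three SWT markers; names are illustrative. */
class SwtLogScanSketch {
    private static final Pattern WAITING =
            Pattern.compile("waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern HELD_BY = Pattern.compile("held by thread (\\d+)");

    /** True when all three keywords appear somewhere in the given lines. */
    static boolean looksLikeSwtDeadlock(List<String> lines) {
        boolean killing = false;
        boolean blocked = false;
        boolean waiting = false;
        for (String line : lines) {
            killing |= line.contains("WATCHDOG KILLING SYSTEM PROCESS");
            blocked |= line.contains("Blocked in monitor");
            waiting |= line.contains("waiting to lock");
        }
        return killing && blocked && waiting;
    }

    /** Returns the first contended lock address, or null if none is found. */
    static String firstLockAddress(List<String> lines) {
        for (String line : lines) {
            Matcher m = WAITING.matcher(line);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    /** Returns the id of the first reported lock holder, or -1 if none is found. */
    static int firstHolderThread(List<String> lines) {
        for (String line : lines) {
            Matcher m = HELD_BY.matcher(line);
            if (m.find()) {
                return Integer.parseInt(m.group(1));
            }
        }
        return -1;
    }
}
```

Once the holder's thread id is known, the next step is to find that thread's stack in the same dump and see what it is doing while holding the lock.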


5. Common SWT Trigger Scenarios

Scenario 1: Main-thread deadlock (the most common)
[Diagram]
  • Thread A: holds Lock1, waits for Lock2
  • Thread B: holds Lock2, waits for Lock1
  • ★ Deadlock: each thread waits for the other to release its lock, which never happens
  • The message Watchdog posted to the Handler is also waiting on these two locks → 60 s timeout → SWT reboot

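Code example:

```java
// A classic lock-ordering deadlock
public class DeadlockDemo {
    private static final Object lockA = new Object();
    private static final Object lockB = new Object();

    public static void main(String[] args) {
        // Thread A: takes lockA first, then lockB
        new Thread(() -> {
            synchronized (lockA) {
                pause(100);
                synchronized (lockB) {   // waits for lockB, which thread B holds
                    System.out.println("thread A got both locks");
                }
            }
        }).start();

        // Thread B: takes lockB first, then lockA
        new Thread(() -> {
            synchronized (lockB) {
                pause(100);
                synchronized (lockA) {   // waits for lockA, which thread A holds
                    System.out.println("thread B got both locks");
                }
            }
        }).start();

        // The two threads wait on each other forever; inside SystemServer,
        // Watchdog would trigger SWT after 60 s
    }

    // Thread.sleep throws a checked exception, so wrap it for use inside the lambdas
    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```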
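On a desktop JVM, a deadlock like this can be confirmed programmatically with `ThreadMXBean.findMonitorDeadlockedThreads()`. This is a sketch for experimenting off-device, since `java.lang.management` is not part of the Android framework. The class name and helper methods are illustrative, and the threads are daemons so the process can still exit.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Desktop-JVM demo: creates a two-thread deadlock on purpose and asks the JVM to report it. */
class DeadlockDetectSketch {

    /** Returns how many threads the JVM reports as monitor-deadlocked (0 if none were found in time). */
    static int createAndCountDeadlockedThreads() {
        final Object lockA = new Object();
        final Object lockB = new Object();
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        startDaemon("demo-A", lockA, lockB, bothHoldFirstLock);
        startDaemon("demo-B", lockB, lockA, bothHoldFirstLock);

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Poll for up to about 5 s; the deadlock forms right after both threads hold their first lock.
        for (int i = 0; i < 100; i++) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    private static void startDaemon(String name, Object first, Object second, CountDownLatch ready) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                ready.countDown();
                try {
                    ready.await();       // wait until the other thread also holds its first lock
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {  // never acquired: the other thread holds it
                    System.out.println(name + " got both locks");
                }
            }
        }, name);
        t.setDaemon(true);               // let the JVM exit even though the threads stay stuck
        t.start();
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + createAndCountDeadlockedThreads());
    }
}
```

The latch makes the deadlock deterministic: neither thread asks for its second lock until both already hold their first one.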

Scenario 2: Blocked Binder calls
#mermaid-svg-jTyHbwTV9xPa1aw6{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;fill:#333;}@keyframes edge-animation-frame{from{stroke-dashoffset:0;}}@keyframes dash{to{stroke-dashoffset:0;}}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-animation-slow{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 50s linear infinite;stroke-linecap:round;}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-animation-fast{stroke-dasharray:9,5!important;stroke-dashoffset:900;animation:dash 20s linear infinite;stroke-linecap:round;}#mermaid-svg-jTyHbwTV9xPa1aw6 .error-icon{fill:#552222;}#mermaid-svg-jTyHbwTV9xPa1aw6 .error-text{fill:#552222;stroke:#552222;}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-thickness-normal{stroke-width:1px;}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-thickness-thick{stroke-width:3.5px;}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-pattern-solid{stroke-dasharray:0;}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-thickness-invisible{stroke-width:0;fill:none;}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-pattern-dashed{stroke-dasharray:3;}#mermaid-svg-jTyHbwTV9xPa1aw6 .edge-pattern-dotted{stroke-dasharray:2;}#mermaid-svg-jTyHbwTV9xPa1aw6 .marker{fill:#333333;stroke:#333333;}#mermaid-svg-jTyHbwTV9xPa1aw6 .marker.cross{stroke:#333333;}#mermaid-svg-jTyHbwTV9xPa1aw6 svg{font-family:"trebuchet ms",verdana,arial,sans-serif;font-size:16px;}#mermaid-svg-jTyHbwTV9xPa1aw6 p{margin:0;}#mermaid-svg-jTyHbwTV9xPa1aw6 .label{font-family:"trebuchet ms",verdana,arial,sans-serif;color:#333;}#mermaid-svg-jTyHbwTV9xPa1aw6 .cluster-label text{fill:#333;}#mermaid-svg-jTyHbwTV9xPa1aw6 .cluster-label span{color:#333;}#mermaid-svg-jTyHbwTV9xPa1aw6 .cluster-label span p{background-color:transparent;}#mermaid-svg-jTyHbwTV9xPa1aw6 .label text,#mermaid-svg-jTyHbwTV9xPa1aw6 span{fill:#333;color:#333;}#mermaid-svg-jTyHbwTV9xPa1aw6 .node rect,#mermaid-svg-jTyHbwTV9xPa1aw6 .node circle,#mermaid-svg-jTyHbwTV9xPa1aw6 .node ellipse,#mermaid-svg-jTyHbwTV9xPa1aw6 .node polygon,#mermaid-svg-jTyHbwTV9xPa1aw6 .node 
Diagram: how a stuck HAL call escalates into an SWT

  1. An app process makes a Binder call into AMS (for example, startActivity).
  2. The AMS main thread receives the Binder request.
  3. While handling it, AMS calls down into the HAL layer.
  4. The HAL layer hangs.
  5. The AMS main thread is blocked, and the Binder thread pool fills up as well.
  6. Every further Binder request is stuck in the queue.
  7. Watchdog sees that neither the Binder threads nor the main thread respond → SWT.
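
The starvation step in this scenario (every Binder thread parked behind one slow call) can be modeled in plain Java. This is only a toy sketch: a fixed-size `ExecutorService` stands in for the Binder thread pool and a `CountDownLatch` stands in for the HAL call that never returns. The class and method names are made up for the illustration and are not framework APIs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model: a fixed-size pool plays the role of the Binder thread pool. */
final class BinderPoolStarvationDemo {

    /**
     * Parks every pool thread on a "HAL call" that does not return, then reports
     * whether one more request gets handled within waitMillis.
     */
    static boolean extraRequestCompletes(int poolSize, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch halReturns = new CountDownLatch(1); // the stuck HAL call
        try {
            for (int i = 0; i < poolSize; i++) {
                pool.submit(() -> {
                    try {
                        halReturns.await(); // each worker waits on the HAL
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // All workers are busy, so this request can only sit in the queue.
            Future<String> extra = pool.submit(() -> "handled");
            try {
                extra.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // still queued: the pool is starved
            } catch (Exception e) {
                return false;
            }
        } finally {
            halReturns.countDown(); // release the workers so the JVM can exit
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // prints "extra request completed: false"
        System.out.println("extra request completed: " + extraRequestCompletes(4, 200));
    }
}
```

The extra request is perfectly cheap, yet it never runs. That is what the callers of a starved system_server observe.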

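Scenario 3: I/O blocks the main thread

Synchronous file reads or database writes on the AMS / WMS main thread are a classic trigger: if the storage chip is failing or the file system stalls, the main thread stalls with it.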

java
// ❌ Never read files synchronously on the AMS / WMS main thread.
// If the eMMC/UFS misbehaves, this single line can block for a minute.
byte[] data = Files.readAllBytes(Paths.get("/data/system/some_file.xml"));
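
One way to keep such a read off the critical thread is to hand it to a dedicated I/O thread and consume the result asynchronously. The following is a minimal sketch in plain Java, not framework code: `OffMainThreadRead`, `readAsync` and the `io-worker` thread are names invented for the example, and it reads a temp file rather than anything under /data/system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class OffMainThreadRead {
    // Hypothetical dedicated I/O thread (daemon, so it never keeps the JVM alive).
    private static final ExecutorService IO = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "io-worker");
        t.setDaemon(true);
        return t;
    });

    /** Starts the read on the I/O thread and returns immediately. */
    static CompletableFuture<byte[]> readAsync(Path file) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return Files.readAllBytes(file);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, IO);
    }

    /** Round trip through a temp file; returns the text read back, or null on failure. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("swt-demo", ".xml");
            try {
                Files.write(tmp, "<config/>".getBytes(StandardCharsets.UTF_8));
                // The demo waits (with a bound) only so it can return the result.
                // A real caller would chain thenAccept(...) and not wait at all.
                byte[] data = readAsync(tmp).get(5, TimeUnit.SECONDS);
                return new String(data, StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the storage stalls, only `io-worker` is stuck. The thread that Watchdog monitors keeps draining its message queue.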

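Scenario 4: Memory pressure causes long GC pauses

When the SystemServer heap is badly short of memory, it falls into full GCs, and a single collection can stall for as long as 30 seconds, possibly while a lock is held. All Watchdog sees is a main thread that does not respond.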


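6. Hands-on: Analyzing SWT Logs

After an SWT, collect the following:

bash
# 1. Pull the dump files written at the time of the SWT (usually needs root or a userdebug build)
adb pull /data/anr/ .

# 2. Find the exact time of the SWT
adb logcat -b all -d | grep "WATCHDOG KILLING"

# 3. Look at the thread stacks it dumped
adb logcat -b all -d | grep -A 50 "Blocked in monitor"

# 4. List the full ANR traces on the device (an SWT writes one as well)
adb shell ls -la /data/anr/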
A real SWT log, analyzed:

Below is an SWT log from a real device (sanitized):

W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in monitor com.android.server.am.ActivityManagerService on foreground thread (fg)
W Watchdog: foreground thread stack trace:
W Watchdog:     at com.android.server.am.ActivityManagerService.monitor(ActivityManagerService.java:14562)
W Watchdog:     at com.android.server.Watchdog$HandlerChecker.run(Watchdog.java:248)
W Watchdog:     - waiting to lock <0x0b2f5c3a> (a com.android.server.am.ActivityManagerService)
W Watchdog:     held by thread 16
W Watchdog: 
W Watchdog: main thread stack trace:
W Watchdog:     at com.android.server.wm.WindowManagerService.removeWindow(WindowManagerService.java:3421)
W Watchdog:     at com.android.server.am.ActivityStack.removeActivityFromHistoryLocked(ActivityStack.java:4567)
W Watchdog:     at com.android.server.am.ActivityManagerService.activityDestroyed(ActivityManagerService.java:5678)
W Watchdog:     - waiting to lock <0x0b2f5c3a> (a com.android.server.am.ActivityManagerService)
W Watchdog:     held by thread 16
W Watchdog:
W Watchdog: thread 16 stack trace:
W Watchdog:     at com.android.server.am.ActivityManagerService.killAllBackgroundProcesses(ActivityManagerService.java:8901)
W Watchdog:     at android.database.ContentObserver.dispatchChange(ContentObserver.java:235)
W Watchdog:     - locked <0x0b2f5c3a> (a com.android.server.am.ActivityManagerService)
W Watchdog:     at android.os.Handler.dispatchMessage(Handler.java:102)

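How to read it:

  1. Start with "Blocked in monitor xxx": the AMS monitor timed out on the foreground thread (fg).
  2. The object address after "waiting to lock", 0x0b2f5c3a, identifies the lock: it is the AMS object itself.
  3. "held by thread 16": thread 16 owns that lock.
  4. Jump to the stack of thread 16: what is it doing inside killAllBackgroundProcesses while holding the lock?
  5. Diagnosis: on thread 16, a ContentObserver.dispatchChange callback entered killAllBackgroundProcesses and took the AMS lock. That work is probably waiting on yet another synchronous call, so the lock is never released. The main thread and the fg thread both wait on the same lock → 60 s timeout → SWT.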
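
The first three steps are mechanical enough to script. Here is a rough sketch that pulls the blocked monitor, the lock address and the lock holder out of a Watchdog log. The three regular expressions are based only on the sample log above and are an assumption about the format; logs from other Android versions or vendors may need different patterns.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SwtLogSummary {
    // Patterns derived from the sample log in this section (assumed format).
    private static final Pattern MONITOR = Pattern.compile("Blocked in monitor (\\S+)");
    private static final Pattern LOCK = Pattern.compile("waiting to lock <(0x[0-9a-fA-F]+)>");
    private static final Pattern HOLDER = Pattern.compile("held by thread (\\d+)");

    /** Returns the first match of pattern in line, or previous if one was already found. */
    private static String firstMatch(Pattern pattern, String line, String previous) {
        if (previous != null) {
            return previous;
        }
        Matcher m = pattern.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** Summarizes who is blocked, on which lock, and which thread holds it. */
    static String summarize(String log) {
        String monitor = null;
        String lock = null;
        String holder = null;
        for (String line : log.split("\n")) {
            monitor = firstMatch(MONITOR, line, monitor);
            lock = firstMatch(LOCK, line, lock);
            holder = firstMatch(HOLDER, line, holder);
        }
        return "monitor=" + monitor + " lock=" + lock + " holder=thread " + holder;
    }

    public static void main(String[] args) {
        String log = String.join("\n",
                "W Watchdog: *** WATCHDOG KILLING SYSTEM PROCESS: Blocked in monitor "
                        + "com.android.server.am.ActivityManagerService on foreground thread (fg)",
                "W Watchdog:     - waiting to lock <0x0b2f5c3a> "
                        + "(a com.android.server.am.ActivityManagerService)",
                "W Watchdog:     held by thread 16");
        System.out.println(summarize(log));
    }
}
```

The summary tells you which thread's stack to open next. Steps 4 and 5 still need a human to read that stack.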

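7. Hands-on: Reproducing and Locating an SWT

Step 1: Inspect the Watchdog state

bash
# Dump the Watchdog state: which threads are monitored and when each last responded.
# This only reads state; it does not change any log level.
# A "watchdog" dumpsys service is not present on every build; check `adb shell dumpsys -l`.
adb shell dumpsys watchdog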

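Step 2: Trigger a stack dump by hand

bash
# If you suspect a hang, make system_server dump the stacks of all its threads.
# Signal 3 (SIGQUIT) asks the runtime to write the traces; it does not kill the process.
adb shell kill -3 <system_server_pid>

# Or
adb shell am dumpstack
# Note: as far as I know, `am dumpstack` is not a subcommand of stock AOSP `am`.
# It may be vendor-specific, so check `adb shell am help` on your build.
# The dump files are written to /data/anr/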

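Step 3: Capture a timeline with systrace

bash
# SWT problems usually need a long capture (90 s or more)
python systrace.py -t 120 -a system_server -b 32768 sched freq idle am wm gfx view binder_driver
# -t 120: capture 120 seconds, enough to cover one full Watchdog detection cycle

In the trace, look at the system_server main thread: if it sits in the same function for 60 seconds straight, that function is the culprit behind the SWT.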

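Step 4: After reproducing, analyze a bugreport

bash
# Capture the bugreport as soon as possible after the SWT
adb bugreport bugreport_swt.zip

# After unzipping, look at these three entries:
# 1. bugreport_xxx.txt → the main report, system logs included; search for "WATCHDOG"
# 2. FS/data/anr/anr_xxx → thread stacks
# 3. main_entry.txt → only records the name of the main report file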

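Step 5: Narrow down the reproduction scenario

An SWT is usually not reproducible on demand; it tends to be a deadlock that needs specific timing. Add logs around the suspected operation and raise the call rate:

bash
# For example, if a particular Binder call is the suspect, hammer it
adb shell "while true; do am start -n com.example.app/.SomeActivity; sleep 0.5; done"
# and see whether that forces the SWT out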

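8. Hands-on: Fixing SWT Problems

Fix 1: Split the lock to reduce lock granularity

java
// ❌ One big lock held across everything
public class SomeService {
    private final Object mLock = new Object();

    public void methodA() {
        synchronized (mLock) {
            doSlowIo();        // I/O
            updateDatabase();  // database work
            notifyObservers(); // callbacks, which may call back into this service
        }
    }
}

// ✓ Split into smaller locks (the same class, rewritten)
public class SomeService {
    private final Object mDataLock = new Object();
    private final Object mCallbackLock = new Object();

    public void methodA() {
        doSlowIo();            // slow I/O runs with no lock held
        synchronized (mDataLock) {
            updateDatabase();  // only the shared state is guarded
        }
        // Callbacks run outside the data lock, so a callback that re-enters
        // the service cannot deadlock on mDataLock
        synchronized (mCallbackLock) {
            notifyObservers();
        }
    }
}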
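
A common variation on this fix goes one step further: copy the listener list while holding the lock and run the callbacks with no lock held at all. The sketch below is self-contained plain Java under that assumption. `SnapshotNotifyService`, `Listener` and `lockHeldByCurrentThread` are invented for the illustration and do not come from the framework.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotNotifyService {
    interface Listener {
        void onChanged(int value);
    }

    private final Object mLock = new Object();
    private final List<Listener> mListeners = new ArrayList<>();
    private int mValue;

    void addListener(Listener listener) {
        synchronized (mLock) {
            mListeners.add(listener);
        }
    }

    int getValue() {
        synchronized (mLock) {
            return mValue;
        }
    }

    /** Test hook: lets a callback check whether it runs under mLock. */
    boolean lockHeldByCurrentThread() {
        return Thread.holdsLock(mLock);
    }

    void setValue(int value) {
        final List<Listener> snapshot;
        synchronized (mLock) {
            mValue = value;
            snapshot = new ArrayList<>(mListeners); // copy while holding the lock
        }
        // No lock is held here: a slow or re-entrant callback cannot
        // block other threads that need mLock.
        for (Listener listener : snapshot) {
            listener.onChanged(value);
        }
    }

    public static void main(String[] args) {
        SnapshotNotifyService service = new SnapshotNotifyService();
        service.addListener(v -> System.out.println(
                "value=" + v + " lockHeld=" + service.lockHeldByCurrentThread()));
        service.setValue(7); // prints "value=7 lockHeld=false"
    }
}
```

The trade-off is that a listener may be called shortly after it was removed, because the snapshot can be slightly stale. Whether that is acceptable depends on the service.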

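Fix 2: Put a timeout on Binder calls

java
// ❌ Synchronous Binder call with no timeout
IBinder service = ServiceManager.getService("xxx");
// A synchronous call on this binder blocks the caller until the remote side
// returns. If the remote side hangs, the call never returns → SWT

// ✓ Call with a timeout
// (executor and binderService are assumed to exist; executor must run on another thread)
Future<Boolean> future = executor.submit(() -> binderService.doSomeWork());
try {
    // Option 1: Future + timeout
    Boolean result = future.get(30, TimeUnit.SECONDS);  // 30-second timeout
} catch (TimeoutException e) {
    // Handle the timeout instead of waiting forever. The worker thread may
    // still be stuck in the call, but the caller is free again.
    future.cancel(true);
    Log.w(TAG, "Binder call timeout", e);
} catch (Exception e) {
    Log.e(TAG, "Binder call failed", e);
}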
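
The same pattern can be tried outside the framework. Below is a small runnable sketch in which a sleeping `Callable` plays the part of the hung remote call. `TimedCall` and `callWithTimeout` are names chosen for this example, not existing APIs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedCall {
    // Daemon workers, so a call that never returns cannot keep the JVM alive.
    private static final ExecutorService WORKERS = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "timed-call-worker");
        t.setDaemon(true);
        return t;
    });

    /** Runs call on a worker thread; returns fallback if it fails or exceeds timeoutMillis. */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMillis, T fallback) {
        Future<T> future = WORKERS.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker. A call that ignores interrupts stays stuck,
            // but the caller has already moved on.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        String slow = callWithTimeout(() -> {
            Thread.sleep(5_000); // simulated hung remote call
            return "done";
        }, 100, "timed out");
        String fast = callWithTimeout(() -> "done", 1_000, "timed out");
        System.out.println(slow + " / " + fast); // prints "timed out / done"
    }
}
```

Note what the timeout does and does not buy: the caller is unblocked, but the worker thread is only released if the underlying call reacts to the interrupt.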

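Fix 3: Register a legitimately slow thread with a longer Watchdog timeout

If a thread really does have to hold a lock for a long time (hardware operations, for example), you can register its Handler with Watchdog under a timeout longer than the default. This is not an "I'm still alive" heartbeat sent by the thread: Watchdog still posts a check message to the Handler and waits for it to run, only with more patience.

java
// Register the thread with its own timeout so that a known-slow operation
// is not misjudged as a hang.
// In the AOSP sources I have checked, addThread takes (Handler) or
// (Handler, long timeoutMillis); there is no thread-name argument.
Watchdog.getInstance().addThread(handler, timeoutMillis);

But be careful: this is only for the "I know it is slow, but it has to be done" case. It must not be used to paper over a real deadlock.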

Fix 4: Check the call chain for cycles

java
// A calls B, B calls C, and C calls back into A: an endless loop
// A.method() → B.method() → C.method() → A.method() → ...

// ✓ Add a counter or flag to block re-entry
// (a plain field like this only guards re-entry on a single thread)
private boolean isInMethod = false;

public void method() {
    if (isInMethod) {
        Log.w(TAG, "Re-entrant call detected, skipping");
        return;
    }
    isInMethod = true;
    try {
        doWork();
    } finally {
        isInMethod = false;
    }
}
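
When the method can be entered from several threads, the plain boolean above is racy. One possible variant, shown here as a sketch rather than a framework pattern, keeps the flag in a `ThreadLocal`, so each thread tracks only its own re-entry. `ReentrancyGuard` and `runOnce` are hypothetical names.

```java
final class ReentrancyGuard {
    // One flag per thread: re-entry is detected per call chain, not globally.
    private final ThreadLocal<Boolean> mInside = ThreadLocal.withInitial(() -> false);

    /** Runs work unless the current thread is already inside; returns whether it ran. */
    boolean runOnce(Runnable work) {
        if (mInside.get()) {
            return false; // re-entrant call on this thread: skip it
        }
        mInside.set(true);
        try {
            work.run();
            return true;
        } finally {
            mInside.set(false); // always reset, even if work throws
        }
    }

    public static void main(String[] args) {
        ReentrancyGuard guard = new ReentrancyGuard();
        boolean[] innerRan = {true};
        boolean outerRan = guard.runOnce(() -> innerRan[0] = guard.runOnce(() -> { }));
        // prints "outer=true inner=false"
        System.out.println("outer=" + outerRan + " inner=" + innerRan[0]);
    }
}
```

This only breaks the cycle. It is still worth finding out why C calls back into A in the first place.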

9. Walking Through the Watchdog Source

When Watchdog is created (during SystemServer startup):

java
// frameworks/base/services/java/com/android/server/SystemServer.java
private void startBootstrapServices() {
    // ... earlier services

    // ★ Create the Watchdog singleton
    final Watchdog watchdog = Watchdog.getInstance();
    watchdog.start();
}

The core Watchdog run method (simplified; details vary between Android versions):

java
// frameworks/base/services/core/java/com/android/server/Watchdog.java
public void run() {
    boolean waitedHalf = false;
    while (true) {
        final ArrayList<HandlerChecker> blockedCheckers;
        synchronized (this) {
            // 1. Post a check message to every monitored Handler
            for (HandlerChecker hc : mHandlerCheckers) {
                hc.scheduleCheckLocked();
            }

            // 2. Sleep for one check interval (30 s, half of DEFAULT_TIMEOUT)
            try {
                wait(CHECK_INTERVAL);  // CHECK_INTERVAL = DEFAULT_TIMEOUT / 2
            } catch (InterruptedException e) {
                Log.wtf(TAG, e);
            }

            // 3. See which checkers still have not processed their message
            blockedCheckers = getBlockedCheckersLocked();
            if (blockedCheckers.isEmpty()) {
                waitedHalf = false;  // everyone answered, start over
                continue;
            }
            if (!waitedHalf) {
                // First miss (about 30 s): wait one more interval.
                // The real code also dumps stacks at this point.
                waitedHalf = true;
                continue;
            }
        }
        // ★ Second miss (about 60 s): the thread really is stuck
        Slog.w(TAG, "*** WATCHDOG KILLING SYSTEM PROCESS: "
                + blockedCheckers.get(0).getName());
        // Dump the stacks of all threads
        ActivityManagerService.dumpStackTraces(...);
        // Kill SystemServer
        Process.killProcess(Process.myPid());
        System.exit(10);
    }
}

The logic is simple: post a check every 30 s → after the first missed interval, wait another 30 s → after the second, kill the process.
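
To get a feel for this loop without building a ROM, here is a miniature model in plain Java. It is a sketch, not the AOSP implementation: a single-thread `ExecutorService` stands in for a monitored Handler thread, the four states follow the COMPLETED / WAITING / WAITED_HALF / OVERDUE idea, and the timeout is shrunk to 200 ms. All class and method names are invented for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class MiniWatchdog {
    enum State { COMPLETED, WAITING, WAITED_HALF, OVERDUE }

    /** Same idea as a HandlerChecker: post a no-op and see whether it ever runs. */
    static final class Checker implements Runnable {
        private final ExecutorService mThread; // stands in for a Handler's thread
        private final long mTimeoutMillis;
        private boolean mCompleted = true;
        private long mStartNanos;

        Checker(ExecutorService thread, long timeoutMillis) {
            mThread = thread;
            mTimeoutMillis = timeoutMillis;
        }

        synchronized void scheduleCheck() {
            if (!mCompleted) {
                return; // the previous check is still pending
            }
            mCompleted = false;
            mStartNanos = System.nanoTime();
            mThread.execute(this); // queued behind whatever the thread is doing
        }

        @Override
        public synchronized void run() {
            mCompleted = true; // the monitored thread got around to our message
        }

        synchronized State state() {
            if (mCompleted) {
                return State.COMPLETED;
            }
            long waitedMillis = (System.nanoTime() - mStartNanos) / 1_000_000;
            if (waitedMillis < mTimeoutMillis / 2) {
                return State.WAITING;
            }
            return waitedMillis < mTimeoutMillis ? State.WAITED_HALF : State.OVERDUE;
        }
    }

    /** Polls every timeout/2 for the given number of rounds; true if a hang is declared. */
    static boolean detectsHang(ExecutorService thread, long timeoutMillis, int rounds) {
        Checker checker = new Checker(thread, timeoutMillis);
        try {
            for (int i = 0; i < rounds; i++) {
                checker.scheduleCheck();
                Thread.sleep(timeoutMillis / 2); // the check interval
                if (checker.state() == State.OVERDUE) {
                    return true; // the real Watchdog would dump stacks and kill here
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    /** Runs the watchdog against a healthy or a deliberately blocked thread. */
    static boolean demo(boolean blockThread) {
        ExecutorService thread = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        try {
            if (blockThread) {
                thread.execute(() -> {
                    try {
                        release.await(); // simulate a thread stuck on a lock
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return detectsHang(thread, 200, 4);
        } finally {
            release.countDown();
            thread.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy thread flagged: " + demo(false));
        System.out.println("blocked thread flagged: " + demo(true));
    }
}
```

The check message is just an ordinary task in the monitored thread's queue. If the thread is blocked, the task never runs, and elapsed time alone moves the state to OVERDUE.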


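10. Common Pitfalls

Pitfall 1: Confusing SWT with ANR

| | SWT | ANR |
| --- | --- | --- |
| Target | Internal threads of SystemServer | Main thread of an app process |
| Timeout | 60 s | 5 s (input) / 10 s (foreground broadcast) / 20 s (foreground service) |
| Consequence | The phone restarts | An ANR dialog is shown (or not) |
| Log keyword | WATCHDOG KILLING SYSTEM PROCESS | ANR in ... |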

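Pitfall 2: Looking only at the threads dumped in the SWT log and ignoring the surrounding context

The SWT dump shows the stacks at the moment the hang was declared, which is not necessarily the moment that caused it. Read it together with the preceding 60 seconds of logs to find the origin.

bash
# Show the logs from the 60 seconds before the SWT
# (here the SWT is assumed to have fired at 01-01 12:00:00)
adb logcat -b all -d -t "01-01 11:59:00.000" | grep -E "Watchdog|ActivityManager|WindowManager"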
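
The start of that window is simple arithmetic on the SWT timestamp, and a small helper avoids off-by-a-minute mistakes. This sketch assumes logcat's default "MM-dd HH:mm:ss.SSS" timestamps, which carry no year, so a year is assumed purely to make the value parseable. `LogcatWindow` and `windowStart` are names invented for the example.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

final class LogcatWindow {
    // logcat's default timestamps have no year. 2024 is an arbitrary assumption
    // used for parsing only; it is never printed.
    private static final DateTimeFormatter FORMAT = new DateTimeFormatterBuilder()
            .appendPattern("MM-dd HH:mm:ss.SSS")
            .parseDefaulting(ChronoField.YEAR, 2024)
            .toFormatter();

    /** Returns the timestamp secondsBefore seconds ahead of the SWT, in logcat's format. */
    static String windowStart(String swtTimestamp, long secondsBefore) {
        LocalDateTime swtTime = LocalDateTime.parse(swtTimestamp, FORMAT);
        return swtTime.minusSeconds(secondsBefore).format(FORMAT);
    }

    public static void main(String[] args) {
        // prints: adb logcat -b all -d -t "01-01 11:59:00.000"
        System.out.println("adb logcat -b all -d -t \""
                + windowStart("01-01 12:00:00.000", 60) + "\"");
    }
}
```

Feed it the timestamp of the "WATCHDOG KILLING" line and paste the result into the `-t` argument.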

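Pitfall 3: Blaming a service when the HAL layer is what died

A blocked AMS main thread looks like an AMS problem. Trace further back and you may find that AMS called SensorService, SensorService called the HAL, and the HAL hung while talking to the hardware. The root cause is in the HAL.

bash
# Check the kernel log to confirm whether the hardware driver layer is at fault
adb shell dmesg | tail -200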

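Pitfall 4: The system restarts too fast to capture a dump

Once the SWT fires, the system restarts in under a second, and the dump in /data/anr/ is often incomplete.

bash
# One workaround: lengthen the Watchdog timeout so there is more time to attach
# and collect logs while the hang is still in progress.
# Whether this settings key is honored depends on the build; verify it against
# the Watchdog.java in your tree before relying on it.
adb shell settings put global activity_manager_constants watchdog_timeout_millis=120000
# Or change DEFAULT_TIMEOUT in the Watchdog source on a development board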

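Pitfall 5: Adding a Watchdog "heartbeat" while the thread is actually stuck or spinning

java
// ❌ This kind of "heartbeat" cannot fool Watchdog.
// Watchdog posts a message to the monitored thread and checks that it gets
// processed; it does not look at a flag you update yourself.
// (In the AOSP sources I have checked, addThread has no thread-name argument.)
Watchdog.getInstance().addThread(handler, timeout);

// If the message queue behind handler is blocked, the check message is never
// handled and the heartbeat is still missed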

11. Summary

The SWT trigger chain at a glance:

  1. Some thread holds a lock for more than 60 s.
  2. The message that Watchdog's HandlerChecker posted to that thread is not processed within 60 s.
  3. The second check also times out (with the default 60 s timeout and 30 s check interval, roughly 60 to 90 s after the thread first blocked).
  4. The stacks of all threads are dumped → /data/anr/anr_xxx
  5. Process.killProcess kills SystemServer.
  6. init detects that SystemServer is gone and restarts it according to the rc configuration.
  7. ★ The user sees the phone restart.

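SWT analysis cheat sheet:

| What you are looking for | Where to look |
| --- | --- |
| When the SWT happened | `adb logcat` … |
| Which monitor is blocked | Search for "Blocked in monitor" |
| Who holds the lock | Search for "held by thread" |
| What the lock holder is doing | Jump to that thread's stack |
| What happened before the hang | Scroll back 60 s in the logs |
| Full thread stacks | /data/anr/anr_xxx |
| Whether it is a hardware problem | adb shell dmesg |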

Key source locations:

| File | Path |
| --- | --- |
| Watchdog.java | frameworks/base/services/core/java/com/android/server/Watchdog.java |
| SystemServer (creates Watchdog) | frameworks/base/services/java/com/android/server/SystemServer.java |

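In one sentence: an SWT means that some thread in SystemServer held a lock for more than 60 s without letting go, Watchdog noticed, and it killed the whole SystemServer process, which restarts the phone. The investigation path: find "Blocked in monitor" → find "held by thread" → see what that thread is doing → follow the lock chain back to the root cause.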
