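A Detailed Analysis of Linux Power Management Architecture and Driver Implementation
1. System Architecture Layers
1.1 Architecture Overview
Linux power management is layered, from user space down to the hardware abstraction layer:
User space
↓
Kernel interfaces (mainly sysfs, plus system calls)
↓
Kernel core subsystems (CPUFreq, CPUIdle, Runtime PM, Suspend)
↓
Device driver frameworks (dev_pm_ops, genpd, regulator)
↓
Hardware abstraction layer (ACPI, DT, arch-specific)
↓
Hardware platform (SoC, devices, clocks, power domains)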
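1.2 User-Space Components
1.2.1 Power management in systemd-logind
systemd-logind is the systemd login and session manager. It also handles power and sleep keys, lid switches, and inhibitor locks, which makes it the coordinator for user-level power management:
Core state: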
c
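// Simplified excerpt of the Manager struct in systemd-logind.
// The field list is abridged and differs between systemd versions.
struct Manager {
    Hashmap *devices;
    Hashmap *seats;
    Hashmap *sessions;
    Hashmap *users;
    Hashmap *inhibitors; // key: inhibitor locks that block or delay sleep and shutdown
    Hashmap *buttons;
    int inhibit_fd;
    sd_event *event;
    // Configured action per key or switch (ignore, suspend, hibernate, poweroff, ...)
    HandleAction handle_power_key;
    HandleAction handle_suspend_key;
    HandleAction handle_hibernate_key;
    HandleAction handle_lid_switch;
};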
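Power event handling flow:
Hardware event (power key / sleep key / lid close)
↓
Kernel input subsystem
↓
udev rules tag the input device (power-switch)
↓
systemd-logind receives the event
↓
Inhibitor locks are checked
↓
The configured action runs (suspend/hibernate/poweroff)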
1.2.2 The pm-utils tool suite
pm-utils is the traditional, pre-systemd collection of power management scripts:
Main components:
bash
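# Core pm-utils scripts (install paths vary by distribution)
/usr/lib/pm-utils/bin/pm-action # core execution script
/usr/lib/pm-utils/bin/pm-is-supported # capability detection
/usr/lib/pm-utils/bin/pm-powersave # power-save mode control
# Configuration directories
/etc/pm/config.d/ # global configuration
/etc/pm/sleep.d/ # suspend/resume hook scripts
/etc/pm/power.d/ # hooks run on power source changes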
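How it runs:
c
// Pseudocode for the pm-action flow. The real pm-action is a shell script;
// load_config(), run_hooks() and write_state() are illustrative helper names.
int main(int argc, char **argv) {
    if (argc < 2)
        return 1;
    const char *action = argv[1];
    // 1. Load the configuration
    load_config();
    // 2. Run the hooks before sleeping
    run_hooks("before", action);
    // 3. Enter the sleep state through the kernel interface
    if (strcmp(action, "suspend") == 0)
        write_state("/sys/power/state", "mem");
    else if (strcmp(action, "hibernate") == 0)
        write_state("/sys/power/state", "disk");
    // 4. Run the hooks after wake-up
    run_hooks("after", action);
    return 0;
}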
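1.2.3 The upower daemon
upower monitors power devices such as batteries and AC adapters and exposes them over D-Bus:
Architecture:
upowerd (system daemon)
├── UpClient (D-Bus client library)
├── UpDevice (device abstraction)
├── UpDaemon (core service)
└── UpWakeups (wakeup source monitoring, older releases only)
Key data (a simplified view of the per-device properties, not the exact struct layout in the upower source):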
c
struct UpDevice {
gchar *object_path;
gchar *native_path;
gchar *vendor;
gchar *model;
gchar *serial;
guint64 update_time; /* seconds since the Epoch */
UpDeviceKind kind;
UpDeviceState state;
gdouble energy;
gdouble energy_empty;
gdouble energy_full;
gdouble energy_rate;
gint64 time_to_empty;
gint64 time_to_full;
gdouble percentage;
gboolean is_present;
gboolean is_rechargeable;
gboolean power_supply;
};
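1.3 How the Kernel Subsystems Cooperate
1.3.1 The CPUFreq subsystem
CPUFreq scales the CPU frequency, and through the driver usually the voltage as well, at run time:
Core architecture:
CPUFreq Core
├── Governors (scaling policies)
│ ├── performance
│ ├── powersave
│ ├── userspace
│ ├── ondemand
│ ├── conservative
│ └── schedutil
├── Drivers (hardware drivers)
│ ├── intel_pstate
│ ├── acpi-cpufreq
│ ├── arm_big_little
│ └── cpufreq-dt
└── Stats (statistics)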
Key data structure (abridged from include/linux/cpufreq.h; the exact field set varies by kernel version):
c
struct cpufreq_policy {
    /* Core policy information */
    cpumask_var_t cpus;          /* online CPUs sharing this policy */
    cpumask_var_t related_cpus;  /* online and offline CPUs sharing it */
    unsigned int min;            /* minimum frequency (kHz) */
    unsigned int max;            /* maximum frequency (kHz) */
    unsigned int cur;            /* current frequency (kHz) */
    /* Governor in use; the driver is the single global cpufreq_driver */
    struct cpufreq_governor *governor;
    /* Statistics */
    struct cpufreq_stats *stats;
    /* Frequency QoS constraints */
    struct freq_constraints constraints;
    /* Deferred policy update */
    struct work_struct update;
};
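1.3.2 The CPUIdle subsystem
CPUIdle decides which idle state a CPU enters when it has nothing to run:
State hierarchy:
CPUIdle Core
├── Governors (idle state selection policies)
│ ├── ladder
│ ├── menu
│ └── teo
├── Drivers (hardware drivers)
│ ├── intel_idle
│ ├── acpi_idle
│ ├── arm_idle
│ └── psci_idle
└── States (ACPI C-state naming; deeper states save more power but take longer to leave)
├── C0 (running; not an idle state)
├── C1 (shallow, fast wake-up)
├── C2 (deeper sleep)
└── C3 (deepest sleep)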
Core data structures (abridged from include/linux/cpuidle.h; fields vary by kernel version):
c
struct cpuidle_device {
    unsigned int registered:1;
    unsigned int enabled:1;
    unsigned int poll_time_limit:1;
    unsigned int cpu;
    ktime_t next_hrtimer;
    int last_state_idx;
    u64 last_residency_ns;
    /* Per-state usage statistics */
    struct cpuidle_state_usage states_usage[CPUIDLE_STATE_MAX];
};
struct cpuidle_state {
    char name[CPUIDLE_NAME_LEN];
    char desc[CPUIDLE_DESC_LEN];
    unsigned int flags;
    unsigned int exit_latency;     /* exit latency (us) */
    unsigned int target_residency; /* minimum worthwhile residency (us) */
    int power_usage;               /* power draw (mW) */
    int (*enter)(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index);
};
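The two latency fields drive state selection. The following is a minimal user-space sketch of that idea, assuming a much simpler rule than the real menu or teo governors use: pick the deepest state whose target residency fits the predicted idle time and whose exit latency fits the latency limit. `struct idle_state`, `select_idle_state()` and the numbers in `demo_states` are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified mirror of the two cpuidle_state fields used here. */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;      /* worst-case wake-up latency */
    unsigned int target_residency_us;  /* minimum stay for a net energy win */
};

/* Example table, ordered shallow to deep; the numbers are illustrative only. */
const struct idle_state demo_states[] = {
    { "WFI",     1,    1 },
    { "cpu-off", 100,  500 },
    { "cluster", 1000, 5000 },
};

/*
 * Return the index of the deepest state that satisfies both limits.
 * States must be ordered shallow to deep. Index 0 is the fallback
 * when nothing deeper qualifies.
 */
int select_idle_state(const struct idle_state *states, size_t count,
                      unsigned int predicted_idle_us,
                      unsigned int latency_limit_us)
{
    int best = 0;

    assert(states != NULL && count > 0);
    for (size_t i = 1; i < count; i++) {
        if (states[i].target_residency_us > predicted_idle_us)
            break;  /* would not stay long enough to pay off */
        if (states[i].exit_latency_us > latency_limit_us)
            break;  /* would wake up too slowly */
        best = (int)i;
    }
    return best;
}
```

With the demo table, a predicted idle period of 800 us and a generous latency limit selects "cpu-off"; tightening the latency limit to 200 us keeps the CPU out of "cluster" even for long idle periods.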
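1.3.3 The Runtime PM subsystem
Runtime PM powers individual devices down while the system as a whole keeps running:
State machine:
RPM_ACTIVE → RPM_SUSPENDING → RPM_SUSPENDED → RPM_RESUMING → RPM_ACTIVE
The cycle repeats for as long as the device is bound to a driver.
If ->runtime_suspend() fails, the device goes from RPM_SUSPENDING straight back to RPM_ACTIVE.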
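The state machine is driven by a usage counter: the device resumes when the first user takes a reference and may suspend when the last one drops it. Below is a toy user-space model of that rule, assuming synchronous transitions and no autosuspend delay. The status names mirror `enum rpm_status` from the kernel; `struct toy_dev`, `toy_get()` and `toy_put()` are hypothetical stand-ins for the device and for the `pm_runtime_get_sync()`/`pm_runtime_put_sync()` style of API.

```c
#include <assert.h>

/* Status names mirror the kernel's enum rpm_status; the rest is a toy model. */
enum rpm_status { RPM_ACTIVE, RPM_RESUMING, RPM_SUSPENDED, RPM_SUSPENDING };

struct toy_dev {
    enum rpm_status status;
    int usage_count;  /* number of outstanding "get" references */
};

/* Take a reference and resume the device if it was suspended. */
void toy_get(struct toy_dev *d)
{
    d->usage_count++;
    if (d->status == RPM_SUSPENDED) {
        d->status = RPM_RESUMING;  /* ->runtime_resume() would run here */
        d->status = RPM_ACTIVE;
    }
}

/* Drop a reference; suspend once nobody is using the device. */
void toy_put(struct toy_dev *d)
{
    assert(d->usage_count > 0);
    if (--d->usage_count == 0 && d->status == RPM_ACTIVE) {
        d->status = RPM_SUSPENDING;  /* ->runtime_suspend() would run here */
        d->status = RPM_SUSPENDED;
    }
}
```

The real core adds locking, asynchronous requests, parent/child accounting and the autosuspend timer on top of this counter.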
Core implementation (the runtime PM part of struct dev_pm_info, abridged from include/linux/pm.h; fields vary by kernel version):
c
struct dev_pm_info {
    spinlock_t lock;
    struct hrtimer suspend_timer;       /* autosuspend timer */
    u64 timer_expires;
    struct work_struct work;            /* asynchronous requests */
    wait_queue_head_t wait_queue;
    atomic_t usage_count;               /* active users of the device */
    atomic_t child_count;               /* active children */
    unsigned int disable_depth:3;       /* nesting of pm_runtime_disable() */
    unsigned int request_pending:1;
    unsigned int deferred_resume:1;
    unsigned int ignore_children:1;
    unsigned int no_callbacks:1;
    unsigned int irq_safe:1;
    unsigned int use_autosuspend:1;
    unsigned int timer_autosuspends:1;
    unsigned int memalloc_noio:1;
    enum rpm_status runtime_status;
    int runtime_error;
    int autosuspend_delay;              /* ms */
    u64 last_busy;
    struct pm_subsys_data *subsys_data;
};
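1.3.4 The Suspend/Hibernate subsystem
The suspend subsystem manages system-wide sleep states (hibernation is handled by separate code in kernel/power/hibernate.c):
Sleep state constants:
PM_SUSPEND_ON (system running)
PM_SUSPEND_TO_IDLE (suspend-to-idle, "freeze" in /sys/power/state; called PM_SUSPEND_FREEZE in older kernels)
PM_SUSPEND_STANDBY (standby)
PM_SUSPEND_MEM (suspend-to-RAM)
PM_SUSPEND_MAX (number of states; a sentinel, not a sleep state)
Core data structures: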
c
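/* Illustrative only: this struct is not part of the mainline kernel.
 * The platform hooks actually live in struct platform_suspend_ops below. */
struct pm_sleep_state {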
const char *name;
unsigned int flags;
int (*enter)(suspend_state_t state);
int (*prepare)(void);
int (*prepare_late)(void);
int (*wake)(void);
int (*finish)(void);
};
struct platform_suspend_ops {
int (*valid)(suspend_state_t state);
int (*begin)(suspend_state_t state);
int (*prepare)(void);
int (*prepare_late)(void);
int (*enter)(suspend_state_t state);
void (*wake)(void);
void (*finish)(void);
void (*end)(void);
void (*recover)(void);
};
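1.4 Hardware Abstraction Layer Interaction
1.4.1 ACPI integration
ACPI provides a standardized power management interface:
Core components:
ACPICA Core
├── ACPI Tables (FADT, DSDT, SSDT)
├── ACPI Hardware (register interface)
├── ACPI Events (GPE, Fixed Events)
└── ACPI Methods (AML interpreter)
Power management integration (simplified, illustrative structures; see include/acpi/acpi_bus.h for the real definitions):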
c
struct acpi_device_power_state {
u8 power_state;
u8 order;
struct list_head resources;
struct list_head children;
};
struct acpi_device {
struct acpi_device_ops *ops;
struct acpi_driver *driver;
struct acpi_device_power_state *power;
int power_state;
int state;
struct list_head children;
struct acpi_device *parent;
struct list_head node;
struct list_head wakeup;
struct work_struct work;
struct mutex physical_node_lock;
struct list_head physical_node_list;
};
1.4.2 Device Tree power management
The device tree describes the power management configuration declaratively:
Power management nodes:
dts
/* Example: CPU power management configuration */
cpu0: cpu@0 {
    device_type = "cpu";
    compatible = "arm,cortex-a53";
    reg = <0x0>;
    /* CPUFreq configuration */
    operating-points-v2 = <&cpu_opp_table>;
    /* CPUIdle configuration */
    cpu-idle-states = <&CPU_SLEEP_0 &CPU_SLEEP_1>;
    /* Power domain */
    power-domains = <&pd_cpu0>;
    /* Clock source */
    clocks = <&clk_cpu0>;
    clock-names = "cpu";
};
/* Operating performance point (OPP) table */
cpu_opp_table: opp-table {
    compatible = "operating-points-v2";
    opp-shared;
    opp-500000000 {
        opp-hz = /bits/ 64 <500000000>;
        opp-microvolt = <900000>;
        /* Only honoured when the platform driver registers its supported hardware versions */
        opp-supported-hw = <0x1>;
    };
    opp-1000000000 {
        opp-hz = /bits/ 64 <1000000000>;
        opp-microvolt = <1100000>;
        opp-supported-hw = <0x1>;
    };
};
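A frequency request is resolved against this table by picking the slowest operating point that still meets it, then applying that entry's voltage. The sketch below shows the lookup in user-space C, loosely modelled on what `dev_pm_opp_find_freq_ceil()` does in the kernel. `struct opp`, `demo_opps` and `opp_find_ceil()` are hypothetical names; the two entries copy the values from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* One operating point: frequency in Hz and supply voltage in microvolts. */
struct opp {
    unsigned long long freq_hz;
    unsigned long microvolt;
};

/* The two entries of the device tree table above, sorted by frequency. */
const struct opp demo_opps[] = {
    {  500000000ULL,  900000 },
    { 1000000000ULL, 1100000 },
};

/*
 * Return the slowest OPP whose frequency is >= target_hz, or NULL when the
 * request exceeds the fastest entry. The table must be sorted ascending.
 */
const struct opp *opp_find_ceil(const struct opp *table, size_t count,
                                unsigned long long target_hz)
{
    assert(table != NULL);
    for (size_t i = 0; i < count; i++) {
        if (table[i].freq_hz >= target_hz)
            return &table[i];
    }
    return NULL;
}
```

A request for 600 MHz therefore lands on the 1 GHz entry at 1.1 V, because 500 MHz would be too slow. When scaling up, the voltage is raised before the clock; when scaling down, the clock is lowered first.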
2. Driver Development Essentials
2.1 Device Power State Management
2.1.1 struct dev_pm_ops in detail
dev_pm_ops is the core callback interface for device power management:
c
struct dev_pm_ops {
    /* System sleep: main phase */
    int (*prepare)(struct device *dev);
    void (*complete)(struct device *dev);
    int (*suspend)(struct device *dev);
    int (*resume)(struct device *dev);
    /* Hibernation: main phase */
    int (*freeze)(struct device *dev);
    int (*thaw)(struct device *dev);
    int (*poweroff)(struct device *dev);
    int (*restore)(struct device *dev);
    /* Late/early phase (runs after runtime PM has been disabled for the device) */
    int (*suspend_late)(struct device *dev);
    int (*resume_early)(struct device *dev);
    int (*freeze_late)(struct device *dev);
    int (*thaw_early)(struct device *dev);
    int (*poweroff_late)(struct device *dev);
    int (*restore_early)(struct device *dev);
    /* noirq phase (device interrupt handlers are disabled) */
    int (*suspend_noirq)(struct device *dev);
    int (*resume_noirq)(struct device *dev);
    int (*freeze_noirq)(struct device *dev);
    int (*thaw_noirq)(struct device *dev);
    int (*poweroff_noirq)(struct device *dev);
    int (*restore_noirq)(struct device *dev);
    /* Runtime PM */
    int (*runtime_suspend)(struct device *dev);
    int (*runtime_resume)(struct device *dev);
    int (*runtime_idle)(struct device *dev);
    /* Wakeup is configured through device_init_wakeup() and
     * enable_irq_wake(), not through callbacks in this struct. */
};
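2.1.2 Implementing the callbacks
Example runtime_suspend implementation (the my_device_* helpers are driver-specific placeholders):
c
static int my_device_runtime_suspend(struct device *dev)
{
    struct my_device *my_dev = dev_get_drvdata(dev);
    int ret;

    /* 1. Save the device context */
    my_dev->context = my_device_save_context(my_dev);
    /* 2. Stop DMA transfers */
    ret = my_device_stop_dma(my_dev);
    if (ret)
        return ret; /* a non-zero return leaves the device RPM_ACTIVE */
    /* 3. Disable the device interrupt */
    my_device_disable_irq(my_dev);
    /* 4. Gate the device clock */
    clk_disable_unprepare(my_dev->clk);
    /* 5. Optional: lower the supply voltage */
    if (my_dev->regulator)
        regulator_set_voltage(my_dev->regulator,
                              my_dev->suspend_voltage,
                              my_dev->suspend_voltage);
    /*
     * The PM core marks the device RPM_SUSPENDED once this callback
     * returns 0, so pm_runtime_set_suspended() must not be called here.
     */
    dev_dbg(dev, "runtime suspend completed\n");
    return 0;
}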
Example runtime_resume implementation:
c
static int my_device_runtime_resume(struct device *dev)
{
    struct my_device *my_dev = dev_get_drvdata(dev);
    int ret;

    /* 1. Restore the supply voltage */
    if (my_dev->regulator) {
        ret = regulator_set_voltage(my_dev->regulator,
                                    my_dev->resume_voltage,
                                    my_dev->resume_voltage);
        if (ret)
            return ret;
    }
    /* 2. Enable the clock */
    ret = clk_prepare_enable(my_dev->clk);
    if (ret)
        return ret;
    /* 3. Restore the device context */
    my_device_restore_context(my_dev, my_dev->context);
    /* 4. Enable the device interrupt */
    my_device_enable_irq(my_dev);
    /* 5. Restart DMA */
    ret = my_device_resume_dma(my_dev);
    if (ret)
        goto err_dma;
    /*
     * The PM core marks the device RPM_ACTIVE once this callback
     * returns 0, so pm_runtime_set_active() must not be called here.
     */
    dev_dbg(dev, "runtime resume completed\n");
    return 0;
err_dma:
    my_device_disable_irq(my_dev);
    clk_disable_unprepare(my_dev->clk);
    return ret;
}
2.1.3 Driver registration and initialization
Complete driver example (my_device_hw_init(), my_device_of_match and the callbacks not shown here are assumed to be defined elsewhere in the driver):
c
#include <linux/clk.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/pm.h>
#include <linux/pm_runtime.h>
#include <linux/regulator/consumer.h>

static int my_device_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct my_device *my_dev;
    int ret;

    /* Allocate the driver state */
    my_dev = devm_kzalloc(dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    platform_set_drvdata(pdev, my_dev);
    /* Get the clock */
    my_dev->clk = devm_clk_get(dev, NULL);
    if (IS_ERR(my_dev->clk))
        return PTR_ERR(my_dev->clk);
    /* Get the optional supply; only "not present" counts as absent,
     * so that -EPROBE_DEFER and real errors are still propagated */
    my_dev->regulator = devm_regulator_get_optional(dev, "vcc");
    if (IS_ERR(my_dev->regulator)) {
        if (PTR_ERR(my_dev->regulator) != -ENODEV)
            return PTR_ERR(my_dev->regulator);
        my_dev->regulator = NULL;
    }
    /* Initialize the hardware (assumed to leave the device powered up) */
    ret = my_device_hw_init(my_dev);
    if (ret)
        return ret;
    /* Report the active state to the PM core before enabling runtime PM */
    pm_runtime_set_active(dev);
    pm_runtime_set_autosuspend_delay(dev, 1000); /* 1 s autosuspend delay */
    pm_runtime_use_autosuspend(dev);
    pm_runtime_mark_last_busy(dev);
    pm_runtime_enable(dev);
    dev_info(dev, "probe completed\n");
    return 0;
}

/* Note: recent kernels declare platform_driver.remove as returning void */
static int my_device_remove(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;

    /* Disable runtime PM and leave a consistent status behind */
    pm_runtime_dont_use_autosuspend(dev);
    pm_runtime_disable(dev);
    pm_runtime_set_suspended(dev);
    return 0;
}

static const struct dev_pm_ops my_device_pm_ops = {
    .runtime_suspend = my_device_runtime_suspend,
    .runtime_resume = my_device_runtime_resume,
    .runtime_idle = my_device_runtime_idle,
    .suspend = my_device_system_suspend,
    .resume = my_device_system_resume,
    .prepare = my_device_prepare,
    .complete = my_device_complete,
};

static struct platform_driver my_device_driver = {
    .probe = my_device_probe,
    .remove = my_device_remove,
    .driver = {
        .name = "my-device",
        .pm = &my_device_pm_ops,
        .of_match_table = my_device_of_match,
    },
};
module_platform_driver(my_device_driver);
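2.2 Power Domain Management
2.2.1 The genpd framework
genpd (Generic Power Domain) is the common framework for managing power domains:
genpd Core
├── Power Domain Tree (hierarchy)
├── Performance States (performance levels)
├── Power State Callbacks (power_on/power_off)
├── Device Attachments (devices bound to a domain)
└── Power State Coordination (parent/child and device state aggregation)
2.2.2 Defining and configuring a power domain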
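Simple power domain example (struct my_domain and pd_to_my_domain() are driver-specific placeholders):
c
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/pm_domain.h>
#include <linux/regulator/consumer.h>

static int my_pd_power_on(struct generic_pm_domain *domain)
{
    struct my_domain *my_pd = pd_to_my_domain(domain);
    int ret;

    /* Set the supply voltage and switch the supply on */
    if (my_pd->regulator) {
        ret = regulator_set_voltage(my_pd->regulator,
                                    my_pd->on_voltage, my_pd->on_voltage);
        if (ret)
            return ret;
        ret = regulator_enable(my_pd->regulator);
        if (ret)
            return ret;
    }
    /* Enable the domain clock */
    ret = clk_prepare_enable(my_pd->clk);
    if (ret) {
        if (my_pd->regulator)
            regulator_disable(my_pd->regulator);
        return ret;
    }
    /* Optional: wait for the rail to settle (delay in ms) */
    msleep(my_pd->power_on_delay);
    return 0;
}

static int my_pd_power_off(struct generic_pm_domain *domain)
{
    struct my_domain *my_pd = pd_to_my_domain(domain);

    /* Gate the clock, then switch the supply off */
    clk_disable_unprepare(my_pd->clk);
    if (my_pd->regulator)
        regulator_disable(my_pd->regulator);
    return 0;
}

static struct generic_pm_domain my_pd = {
    .name = "my-power-domain",
    .power_on = my_pd_power_on,
    .power_off = my_pd_power_off,
    /* GENPD_FLAG_ALWAYS_ON would keep the domain powered and bypass power_off */
    .flags = GENPD_FLAG_PM_CLK,
};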
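Hierarchical power domains (a sketch built on pm_genpd_init(), pm_genpd_add_subdomain() and of_genpd_add_provider_onecell(); error unwinding is omitted for brevity):
c
/* Parent and child domains; names and callbacks are filled in elsewhere */
static struct generic_pm_domain parent_pd;
static struct generic_pm_domain child_pd1;
static struct generic_pm_domain child_pd2;

static struct generic_pm_domain *my_domains[] = {
    &parent_pd, &child_pd1, &child_pd2,
};
static struct genpd_onecell_data my_pd_data = {
    .domains = my_domains,
    .num_domains = ARRAY_SIZE(my_domains),
};

static int my_pd_probe(struct platform_device *pdev)
{
    int i, ret;

    /* Register every domain, initially powered off */
    for (i = 0; i < ARRAY_SIZE(my_domains); i++) {
        ret = pm_genpd_init(my_domains[i], NULL, true);
        if (ret)
            return ret;
    }
    /* Link the children to the parent */
    ret = pm_genpd_add_subdomain(&parent_pd, &child_pd1);
    if (ret)
        return ret;
    ret = pm_genpd_add_subdomain(&parent_pd, &child_pd2);
    if (ret)
        return ret;
    /* Expose the domains to device tree consumers (power-domains = <&pd N>) */
    return of_genpd_add_provider_onecell(pdev->dev.of_node, &my_pd_data);
}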
2.2.3 Domain idle states
Idle state definitions (struct genpd_power_state describes the low-power states a domain can enter while off, not performance levels):
c
static struct genpd_power_state my_pd_states[] = {
    {
        .power_on_latency_ns = 100000,   /* 100 us */
        .power_off_latency_ns = 200000,  /* 200 us */
        .residency_ns = 1000000,         /* 1 ms */
    },
    {
        .power_on_latency_ns = 500000,   /* 500 us */
        .power_off_latency_ns = 1000000, /* 1 ms */
        .residency_ns = 5000000,         /* 5 ms */
    },
};
static struct generic_pm_domain my_pd = {
    .name = "my-pd",
    .power_on = my_pd_power_on,
    .power_off = my_pd_power_off,
    .states = my_pd_states,
    .state_count = ARRAY_SIZE(my_pd_states),
    .flags = GENPD_FLAG_PM_CLK | GENPD_FLAG_CPU_DOMAIN,
};
2.3 Clock and Voltage Scaling
2.3.1 cpufreq-dt integration
Device tree configuration:
dts
/* CPU frequency scaling configuration */
cpu0: cpu@0 {
    operating-points-v2 = <&cpu_opp_table>;
    /* Voltage regulator */
    cpu-supply = <&vdd_cpu>;
};
/* Operating performance point (OPP) table */
cpu_opp_table: opp-table {
    compatible = "operating-points-v2";
    opp-shared;
    opp-500000000 {
        opp-hz = /bits/ 64 <500000000>;
        /* <target min max> in microvolts */
        opp-microvolt = <900000 875000 925000>;
        clock-latency-ns = <100000>; /* 100 us */
    };
    opp-1000000000 {
        opp-hz = /bits/ 64 <1000000000>;
        opp-microvolt = <1100000 1075000 1125000>;
        clock-latency-ns = <100000>; /* 100 us */
    };
};
Driver integration (a sketch; struct cpufreq_dt and cpufreq_dt_driver are placeholders here, not the in-tree cpufreq-dt code):
c
static int my_cpufreq_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct cpufreq_dt *cdt;
    int ret;

    cdt = devm_kzalloc(dev, sizeof(*cdt), GFP_KERNEL);
    if (!cdt)
        return -ENOMEM;
    /* Get the regulator first, so that a probe deferral needs no cleanup */
    cdt->reg = devm_regulator_get(dev, "cpu");
    if (IS_ERR(cdt->reg))
        return PTR_ERR(cdt->reg);
    /* Parse the operating points from the device tree */
    ret = dev_pm_opp_of_add_table(dev);
    if (ret)
        return ret;
    platform_set_drvdata(pdev, cdt);
    /* Register the cpufreq driver */
    ret = cpufreq_register_driver(&cpufreq_dt_driver);
    if (ret)
        goto err_opp;
    return 0;
err_opp:
    dev_pm_opp_of_remove_table(dev);
    return ret;
}
2.3.2 Regulator framework integration
Regulator description (the register macros and the my_regulator_* callbacks are placeholders for chip-specific definitions):
c
/* Defined before the descriptor that points to it */
static const struct regulator_ops my_regulator_ops = {
    .list_voltage = regulator_list_voltage_linear,
    .set_voltage_sel = my_regulator_set_voltage_sel,
    .get_voltage_sel = my_regulator_get_voltage_sel,
    .enable = my_regulator_enable,
    .disable = my_regulator_disable,
    .is_enabled = my_regulator_is_enabled,
};
static const struct regulator_desc my_regulator_desc = {
    .name = "vdd_cpu",
    .type = REGULATOR_VOLTAGE,
    .owner = THIS_MODULE,
    .n_voltages = 32,                           /* selectors 0..31 */
    .min_uV = 800000,                           /* selector 0 = 0.8 V */
    .uV_step = 25000,                           /* 25 mV per selector */
    .vsel_reg = REGULATOR_VOLTAGE_SELECTION,    /* placeholder register address */
    .vsel_mask = 0x1F,
    .enable_reg = REGULATOR_ENABLE_REGISTER,    /* placeholder register address */
    .enable_mask = 0x01,
    .enable_time = 500,                         /* 500 us */
    .ramp_delay = 12500,                        /* 12.5 mV/us, given in uV/us */
    .ops = &my_regulator_ops,
};
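With a linear regulator description, the voltage for a selector is min_uV + selector * uV_step, which is the mapping `regulator_list_voltage_linear()` implements. The user-space sketch below works through that arithmetic for the values in the descriptor above. `sel_to_uv()` and `uv_to_sel()` are hypothetical helper names, and the reverse mapping is a simplified take on how a voltage window is turned into a selector.

```c
#include <assert.h>

/* Values copied from the regulator_desc above. */
#define MIN_UV      800000
#define UV_STEP     25000
#define N_VOLTAGES  32

/* Selector to microvolts, or -1 for an out-of-range selector. */
int sel_to_uv(unsigned int sel)
{
    if (sel >= N_VOLTAGES)
        return -1;
    return MIN_UV + (int)sel * UV_STEP;
}

/*
 * Smallest selector whose voltage lies inside [min_uv, max_uv],
 * or -1 when no step falls inside the requested window.
 */
int uv_to_sel(int min_uv, int max_uv)
{
    int sel;

    assert(min_uv <= max_uv);
    if (min_uv <= MIN_UV)
        sel = 0;
    else
        sel = (min_uv - MIN_UV + UV_STEP - 1) / UV_STEP;  /* round up */

    if (sel >= N_VOLTAGES || sel_to_uv((unsigned int)sel) > max_uv)
        return -1;
    return sel;
}
```

The 32 selectors cover 0.8 V to 1.575 V. A request for exactly 900000 uV maps to selector 4, while a window of 910000 to 920000 uV has no matching step and is rejected.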
Clock rate-change notifier integration (the my_device_*_freq_change() helpers are driver-specific placeholders):
c
static int my_device_clk_notify(struct notifier_block *nb,
                                unsigned long action, void *data)
{
    struct clk_notifier_data *clk_data = data;
    struct my_device *my_dev = container_of(nb, struct my_device, clk_nb);

    switch (action) {
    case PRE_RATE_CHANGE:
        /* The rate is about to change: prepare the device */
        my_device_prepare_freq_change(my_dev, clk_data->new_rate);
        break;
    case POST_RATE_CHANGE:
        /* The rate has changed: finish the transition */
        my_device_complete_freq_change(my_dev, clk_data->new_rate);
        break;
    case ABORT_RATE_CHANGE:
        /* The change was cancelled: return to the old rate */
        my_device_abort_freq_change(my_dev, clk_data->old_rate);
        break;
    }
    return NOTIFY_OK;
}

static int my_device_probe(struct platform_device *pdev)
{
    struct my_device *my_dev;
    int ret;

    my_dev = devm_kzalloc(&pdev->dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    /* Get the clock */
    my_dev->clk = devm_clk_get(&pdev->dev, NULL);
    if (IS_ERR(my_dev->clk))
        return PTR_ERR(my_dev->clk);
    /* Register the clock notifier */
    my_dev->clk_nb.notifier_call = my_device_clk_notify;
    ret = clk_notifier_register(my_dev->clk, &my_dev->clk_nb);
    if (ret)
        return ret;
    /* Enable the clock */
    ret = clk_prepare_enable(my_dev->clk);
    if (ret)
        goto err_clk;
    platform_set_drvdata(pdev, my_dev);
    return 0;
err_clk:
    clk_notifier_unregister(my_dev->clk, &my_dev->clk_nb);
    return ret;
}
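3. Key Implementation Mechanisms
3.1 The Suspend Execution Flow
3.1.1 Suspend-to-RAM
Execution flow (function names follow recent mainline kernels in kernel/power/suspend.c and drivers/base/power/main.c; details differ between versions):
User space requests suspend
↓
"mem" is written to /sys/power/state
↓
pm_suspend() → enter_state()
├── suspend_prepare()
│ ├── prepare the console
│ ├── call the PM_SUSPEND_PREPARE notifiers
│ └── freeze user space processes, then freezable kernel threads
├── suspend_devices_and_enter()
│ ├── platform_suspend_begin()
│ ├── suspend_console()
│ ├── dpm_suspend_start(): ->prepare() and then ->suspend() for every device
│ ├── suspend_enter()
│ │ ├── dpm_suspend_late(): ->suspend_late()
│ │ ├── dpm_suspend_noirq(): ->suspend_noirq()
│ │ ├── take the non-boot CPUs offline
│ │ ├── disable interrupts on the boot CPU, syscore_suspend()
│ │ ├── suspend_ops->enter(): the system sleeps here
│ │ └── on wake-up: syscore_resume(), interrupts on, CPUs online,
│ │   dpm_resume_noirq(), dpm_resume_early()
│ ├── dpm_resume_end(): ->resume() and then ->complete()
│ └── resume_console()
└── suspend_finish()
├── thaw processes
├── call the PM_POST_SUSPEND notifiers
└── restore the console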
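The device callbacks in a suspend cycle run in four phases, and resume walks the same phases in reverse. If a phase fails, only the phases that already completed are undone. The following user-space sketch models that ordering for a single device, under the simplifying assumption that each phase either succeeds or fails as a whole. `run_cycle()`, `log_phase()` and the phase tables are hypothetical names; the strings are the `dev_pm_ops` callback names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NPHASES 4

/* Suspend-side callbacks in call order, each paired with its resume-side twin. */
const char *const suspend_phase[NPHASES] = {
    "prepare", "suspend", "suspend_late", "suspend_noirq"
};
const char *const resume_phase[NPHASES] = {
    "complete", "resume", "resume_early", "resume_noirq"
};

/* Append a phase name plus a trailing space, truncating if the log is full. */
void log_phase(char *log, size_t log_len, const char *name)
{
    strncat(log, name, log_len - strlen(log) - 1);
    strncat(log, " ", log_len - strlen(log) - 1);
}

/*
 * Walk the phases. fail_at is the index of a phase that reports an error
 * (-1 for none). Every callback that runs is appended to log.
 * Returns 0 for a full suspend/resume cycle, -1 if suspend was aborted.
 */
int run_cycle(int fail_at, char *log, size_t log_len)
{
    int done, ret = 0;

    assert(log != NULL && log_len > 0);
    log[0] = '\0';

    for (done = 0; done < NPHASES; done++) {
        log_phase(log, log_len, suspend_phase[done]);
        if (done == fail_at) {
            ret = -1;  /* this phase reported an error */
            break;
        }
    }
    /* Undo only the phases that completed, deepest first. */
    for (int i = done - 1; i >= 0; i--)
        log_phase(log, log_len, resume_phase[i]);
    return ret;
}
```

A failure in `suspend_late` (index 2) therefore produces `prepare suspend suspend_late resume complete`: the noirq phase never runs, and `resume_early` is skipped because its counterpart did not succeed.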
3.1.2 Key steps in detail
Process freezing (a simplified sketch of the kernel logic, not verbatim kernel code):
c
static inline int suspend_freeze_processes(void)
{
    int error;

    /* 1. Freeze user space processes (this also disables usermode helpers) */
    error = freeze_processes();
    if (error)
        return error;
    /* 2. Freeze the freezable kernel threads and workqueues */
    error = freeze_kernel_threads();
    if (error)
        thaw_processes(); /* roll back step 1 */
    return error;
}

static int suspend_prepare(suspend_state_t state)
{
    int error;

    /* 1. Tell PM notifier subscribers that a suspend is starting */
    error = pm_notifier_call_chain_robust(PM_SUSPEND_PREPARE, PM_POST_SUSPEND);
    if (error)
        return error;
    /* 2. Freeze tasks; user space goes first so that kernel threads
     *    can still service it while it is being frozen */
    error = suspend_freeze_processes();
    if (error)
        pm_notifier_call_chain(PM_POST_SUSPEND);
    return error;
}
Device suspend sequence (a simplified sketch of the per-device ->suspend() step, not verbatim kernel code):
c
static int device_suspend_sketch(struct device *dev)
{
    const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
    int error = 0;

    /*
     * 1. ->prepare() is not called here: dpm_prepare() has already run it
     *    for every device before the first ->suspend() starts.
     * 2. The real code picks the callback from the PM domain, device type,
     *    class or bus first and falls back to the driver, as shown here.
     */
    /* 3. Suspend the device */
    if (pm && pm->suspend)
        error = pm->suspend(dev);
    /* 4. Record the result so that resume only touches suspended devices */
    if (!error)
        dev->power.is_suspended = true;
    return error;
}
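3.2 The PM QoS Framework
3.2.1 PM QoS architecture
PM QoS lets drivers and user space register latency and performance requirements, which the core aggregates into one effective constraint per class:
PM QoS Core
├── CPU latency QoS (cpu_latency_qos_*)
├── Frequency QoS (freq_qos_*)
├── Device PM QoS (dev_pm_qos_*)
└── Legacy global classes (network latency, network throughput, memory bandwidth)
  removed in Linux 5.7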
3.2.2 CPU latency QoS implementation
Core data structures (abridged from include/linux/pm_qos.h):
c
struct pm_qos_constraints {
    struct plist_head list;       /* requests, sorted by value */
    s32 target_value;             /* aggregated target value */
    s32 default_value;            /* default value */
    s32 no_constraint_value;      /* value used when the list is empty */
    enum pm_qos_type type;        /* aggregation type (min or max) */
    /* Notification */
    struct blocking_notifier_head *notifiers;
};
struct pm_qos_request {
    struct plist_node node;       /* node.prio holds the requested value */
    struct pm_qos_constraints *qos;
};
Constraint aggregation (a simplified sketch modelled on pm_qos_get_value() and pm_qos_update_target(); locking and list updates are omitted, and PM_QOS_SUM exists only in older kernels):
c
static s32 pm_qos_aggregate(struct pm_qos_constraints *c)
{
    struct plist_node *node;
    s32 total = 0;

    /* No requests: fall back to the default value */
    if (plist_head_empty(&c->list))
        return c->default_value;

    switch (c->type) {
    case PM_QOS_MIN:
        /* The list is sorted, so the first node holds the minimum */
        return plist_first(&c->list)->prio;
    case PM_QOS_MAX:
        /* ... and the last node holds the maximum */
        return plist_last(&c->list)->prio;
    case PM_QOS_SUM:
        /* Add up all requests */
        plist_for_each(node, &c->list)
            total += node->prio;
        return total;
    default:
        return c->default_value;
    }
}

static void pm_qos_refresh_target(struct pm_qos_constraints *c)
{
    s32 prev = c->target_value;
    s32 curr = pm_qos_aggregate(c);

    /* Publish the new target and notify listeners only when it changed */
    if (curr != prev) {
        c->target_value = curr;
        if (c->notifiers)
            blocking_notifier_call_chain(c->notifiers,
                                         (unsigned long)curr, NULL);
    }
}
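The aggregation rule itself is independent of the kernel's list machinery. Here is a self-contained user-space sketch of it, assuming the requests are held in a plain array instead of a sorted plist. `enum qos_type` and `qos_aggregate()` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum qos_type { QOS_MIN, QOS_MAX, QOS_SUM };

/*
 * Combine all active request values into one effective constraint.
 * With no requests, the class default applies.
 */
int qos_aggregate(enum qos_type type, const int *requests, size_t count,
                  int default_value)
{
    int result;

    if (count == 0)
        return default_value;
    assert(requests != NULL);

    result = requests[0];
    for (size_t i = 1; i < count; i++) {
        switch (type) {
        case QOS_MIN:  /* latency-style limit: the strictest request wins */
            if (requests[i] < result)
                result = requests[i];
            break;
        case QOS_MAX:  /* throughput-style floor: the largest request wins */
            if (requests[i] > result)
                result = requests[i];
            break;
        case QOS_SUM:  /* bandwidth-style budget: requests add up */
            result += requests[i];
            break;
        }
    }
    return result;
}
```

For CPU latency the type is "min": with requests of 1000, 100 and 250 us, the effective limit is 100 us, so the idle governor avoids any state whose exit latency exceeds that.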
3.2.3 Using PM QoS from a driver
Driver integration (uses the cpu_latency_qos_* API of Linux 5.7 and later; older kernels used pm_qos_add_request() with PM_QOS_CPU_DMA_LATENCY):
c
#include <linux/pm_qos.h>

static int my_device_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct my_device *my_dev;
    int ret;

    my_dev = devm_kzalloc(dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    /* 1. Register a CPU latency request; the default value adds no constraint */
    cpu_latency_qos_add_request(&my_dev->pm_qos_req, PM_QOS_DEFAULT_VALUE);
    /* 2. Initialize the device */
    ret = my_device_hw_init(my_dev);
    if (ret)
        goto err_hw;
    platform_set_drvdata(pdev, my_dev);
    return 0;
err_hw:
    cpu_latency_qos_remove_request(&my_dev->pm_qos_req);
    return ret;
}

/* Update the QoS requirement at run time */
static void my_device_set_performance(struct my_device *my_dev,
                                      enum my_device_perf perf)
{
    switch (perf) {
    case PERFORMANCE_LOW:
        /* Low performance: tolerate high wake-up latency */
        cpu_latency_qos_update_request(&my_dev->pm_qos_req, 1000); /* 1 ms */
        break;
    case PERFORMANCE_NORMAL:
        /* Normal performance: moderate latency */
        cpu_latency_qos_update_request(&my_dev->pm_qos_req, 100); /* 100 us */
        break;
    case PERFORMANCE_HIGH:
        /* High performance: 0 us keeps CPUs out of all but the shallowest idle state */
        cpu_latency_qos_update_request(&my_dev->pm_qos_req, 0);
        break;
    }
}
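3.3 The Wakeup Source Mechanism
3.3.1 Wakeup source architecture
Wakeup sources track the events that must wake the system or keep it from suspending:
Wakeup Source Core
├── Active Wakeup Sources (sources currently blocking suspend)
├── Wakeup Events (event counters)
├── Wakeup IRQs (interrupts armed for wake-up)
└── Wakeup Statistics (timing and count statistics)
3.3.2 Registering and managing wakeup sources
The wakeup source structure (abridged from include/linux/pm_wakeup.h):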
c
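struct wakeup_source {
    const char *name;
    int id;
    struct list_head entry;
    /* Protects the fields below */
    spinlock_t lock;
    /* Timeout handling for timed wakeup events */
    struct timer_list timer;
    unsigned long timer_expires;
    /* Time statistics */
    ktime_t total_time;          /* total time spent active */
    ktime_t max_time;            /* longest single active period */
    ktime_t last_time;           /* last activation or deactivation */
    ktime_t prevent_sleep_time;  /* time this source blocked autosleep */
    /* Event counters */
    unsigned long event_count;   /* events signalled */
    unsigned long active_count;  /* activations */
    unsigned long relax_count;   /* deactivations */
    unsigned long expire_count;  /* timeouts that expired */
    unsigned long wakeup_count;  /* suspends aborted by this source */
    /* Associated device */
    struct device *dev;
    bool active:1;
    bool autosleep_enabled:1;
};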
Driver example (my_device_handle_interrupt() is a driver-specific placeholder):
c
/* Interrupt handler; defined before probe, which refers to it */
static irqreturn_t my_device_irq_handler(int irq, void *data)
{
    struct my_device *my_dev = data;

    /* Report a wakeup event and hold off suspend while it is processed */
    __pm_stay_awake(my_dev->ws);
    /* Handle the interrupt */
    my_device_handle_interrupt(my_dev);
    /* Processing is done: allow the system to suspend again */
    __pm_relax(my_dev->ws);
    return IRQ_HANDLED;
}

static int my_device_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct my_device *my_dev;
    int ret;

    my_dev = devm_kzalloc(dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    /* 1. Look up the interrupt */
    my_dev->irq = platform_get_irq(pdev, 0);
    if (my_dev->irq < 0)
        return my_dev->irq;
    /* 2. Create the wakeup source before the handler can run */
    my_dev->ws = wakeup_source_register(dev, "my-device");
    if (!my_dev->ws)
        return -ENOMEM;
    /* 3. Request the interrupt */
    ret = devm_request_irq(dev, my_dev->irq, my_device_irq_handler,
                           IRQF_TRIGGER_RISING, "my-device", my_dev);
    if (ret)
        goto err_ws;
    /* 4. Mark the device wakeup-capable and arm the IRQ as a wake source */
    device_init_wakeup(dev, true);
    enable_irq_wake(my_dev->irq);
    platform_set_drvdata(pdev, my_dev);
    return 0;
err_ws:
    wakeup_source_unregister(my_dev->ws);
    return ret;
}
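The stay-awake/relax pair works like a per-source flag feeding a global count of events in progress, and suspend may only proceed while that count is zero. The toy user-space model below illustrates this accounting. `struct toy_ws`, `toy_stay_awake()`, `toy_relax()` and `toy_suspend_allowed()` are hypothetical names, and the model leaves out the timers, locking and statistics of the real implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy wakeup source: an "active" flag plus an event counter. */
struct toy_ws {
    bool active;
    unsigned long event_count;
};

/* Number of sources currently holding off suspend. */
int events_in_progress;

/* Signal an event; activating an already active source is not counted twice. */
void toy_stay_awake(struct toy_ws *ws)
{
    ws->event_count++;
    if (!ws->active) {
        ws->active = true;
        events_in_progress++;
    }
}

/* Mark event processing as finished for this source. */
void toy_relax(struct toy_ws *ws)
{
    if (ws->active) {
        ws->active = false;
        assert(events_in_progress > 0);
        events_in_progress--;
    }
}

/* Suspend may proceed only when no source is active. */
bool toy_suspend_allowed(void)
{
    return events_in_progress == 0;
}
```

Two sources that both signal an event keep the system awake until each of them has relaxed, regardless of the order in which they finish.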
3.3.3 Wakeup event statistics
Reading the statistics (a driver-level sketch; the kernel already exports the same counters in /sys/kernel/debug/wakeup_sources and /sys/class/wakeup/):
c
static ssize_t wakeup_stats_show(struct device *dev,
                                 struct device_attribute *attr, char *buf)
{
    struct my_device *my_dev = dev_get_drvdata(dev);
    struct wakeup_source *ws = my_dev->ws;
    ktime_t total_time, max_time, active_time = 0;
    unsigned long flags;

    spin_lock_irqsave(&ws->lock, flags);
    total_time = ws->total_time;
    max_time = ws->max_time;
    if (ws->active) {
        /* Account for the active period that is still in progress */
        active_time = ktime_sub(ktime_get(), ws->last_time);
        total_time = ktime_add(total_time, active_time);
        if (active_time > max_time)
            max_time = active_time;
    }
    spin_unlock_irqrestore(&ws->lock, flags);

    return sysfs_emit(buf,
        "name: %s\n"
        "active_count: %lu\n"
        "event_count: %lu\n"
        "wakeup_count: %lu\n"
        "expire_count: %lu\n"
        "active_since: %lld\n"
        "total_time: %lld\n"
        "max_time: %lld\n"
        "last_change: %lld\n",
        ws->name,
        ws->active_count,
        ws->event_count,
        ws->wakeup_count,
        ws->expire_count,
        ktime_to_ms(active_time),
        ktime_to_ms(total_time),
        ktime_to_ms(max_time),
        ktime_to_ms(ws->last_time));
}
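4. Debugging and Optimization
4.1 Debug Interfaces
4.1.1 The /sys/power/ interface
Core debug files:
bash
# Power state information
cat /sys/power/state # supported sleep states
cat /sys/power/pm_trace # PM trace setting (only with PM tracing enabled in the kernel config)
cat /sys/power/pm_test # PM test mode (only with PM debugging enabled in the kernel config)
cat /sys/power/reserved_size # memory reserved for drivers during hibernation (bytes)
cat /sys/power/image_size # target hibernation image size (bytes)
cat /sys/power/wakeup_count # wakeup event count
# Control
echo 1 > /sys/power/pm_trace # enable PM tracing
echo mem > /sys/power/state # suspend to RAM
echo disk > /sys/power/state # hibernate (suspend to disk)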
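Before writing a state name to /sys/power/state, a tool should check that the kernel lists it, since the write fails for unsupported states. The sketch below shows the check as a small C helper that works on the text already read from the file. `sleep_state_supported()` is a hypothetical name, and reading the file itself is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `wanted` appears as a whole word in `states`, the
 * space-separated list read from /sys/power/state
 * (for example "freeze mem disk\n").
 */
bool sleep_state_supported(const char *states, const char *wanted)
{
    size_t wlen;
    const char *p;

    assert(states != NULL && wanted != NULL);
    wlen = strlen(wanted);
    if (wlen == 0)
        return false;

    p = states;
    while (*p) {
        size_t len = strcspn(p, " \n");  /* length of the current word */

        if (len == wlen && strncmp(p, wanted, wlen) == 0)
            return true;
        p += len;
        p += strspn(p, " \n");           /* skip the separators */
    }
    return false;
}
```

Matching whole words matters here: a plain substring search would wrongly accept "me" on a system that lists "mem".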
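4.1.2 Debugging device power states
Per-device power information:
bash
# Show the runtime PM control setting (auto/on) of every device
find /sys/devices -path '*/power/control' | while read -r f; do
    echo "Device: $(dirname "$(dirname "$f")"): $(cat "$f")"
done
# Show runtime PM statistics (runtime_usage needs CONFIG_PM_ADVANCED_DEBUG)
find /sys/devices -path '*/power/runtime_status' | while read -r f; do
    d=$(dirname "$f")
    echo "Device: $(dirname "$d")"
    echo " Status: $(cat "$f")"
    echo " Usage: $(cat "$d/runtime_usage" 2>/dev/null)"
    echo " Active: $(cat "$d/runtime_active_time") ms"
    echo " Suspended: $(cat "$d/runtime_suspended_time") ms"
done
# List the wakeup sources (needs debugfs mounted)
cat /sys/kernel/debug/wakeup_sources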
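4.2 Power Analysis with Ftrace
4.2.1 Power-related tracepoints
Key trace events:
bash
# Enable all events in the power group
echo 1 > /sys/kernel/debug/tracing/events/power/enable
# List the available power tracepoints
cat /sys/kernel/debug/tracing/available_events | grep power
# Tracepoints of particular interest:
#   cpu_frequency            CPU frequency changes
#   cpu_idle                 CPU idle state changes
#   machine_suspend          system suspend events
#   device_pm_callback_start start of a device PM callback
#   device_pm_callback_end   end of a device PM callback
#   suspend_resume           suspend/resume steps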
4.2.2 Analysis script
Automated analysis script:
bash
#!/bin/bash
# pm_analyze.sh - trace one suspend/resume cycle and summarize it (run as root)
TRACE_DIR="/sys/kernel/debug/tracing"
RESULT_DIR="/tmp/pm_analysis"
mkdir -p "$RESULT_DIR"
# 1. Enable tracing; the power event group includes cpu_frequency and cpu_idle
echo 1 > "$TRACE_DIR/events/power/enable"
# 2. Clear the trace buffer
echo > "$TRACE_DIR/trace"
# 3. Run the test (here: suspend to RAM, then wake the machine)
echo mem > /sys/power/state
# 4. Save the trace
cat "$TRACE_DIR/trace" > "$RESULT_DIR/trace.txt"
# 5. Analyze. The awk field numbers assume the default ftrace text format
#    (task, cpu, flags, timestamp, event, args); adjust them if yours differs.
{
    echo "=== Power analysis results ==="
    echo
    echo "Suspend events:"
    grep "machine_suspend" "$RESULT_DIR/trace.txt" | awk '{print $4, $5, $6}'
    echo
    echo "Device callback events:"
    grep "device_pm_callback" "$RESULT_DIR/trace.txt" | awk '{print $4, $5, $6, $7}'
    echo
    echo "Frequency changes:"
    grep "cpu_frequency" "$RESULT_DIR/trace.txt" | awk '{print $4, $6}'
} > "$RESULT_DIR/analysis.txt"
echo "Analysis complete; results saved in $RESULT_DIR"
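4.3 Performance Analysis Tools
4.3.1 Analysis with pm_print_times
Enabling the timing output:
bash
# Enable per-device PM callback timing (needs CONFIG_PM_SLEEP_DEBUG)
echo 1 > /sys/power/pm_print_times
# Read the kernel log
dmesg | grep "PM:"
Typical output (illustrative; the exact messages vary by kernel version):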
PM: suspend entry 2023-12-01 10:30:45.123456789
PM: Preparing system for mem sleep
PM: Freezing user space processes ... (elapsed 0.001 seconds) done.
PM: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Suspending system devices ...
PM: suspend of devices complete after 123.456 msecs
PM: suspend of devices late complete after 45.789 msecs
PM: suspend of devices noirq complete after 12.345 msecs
PM: Suspending cpus ...
PM: resume of devices noirq complete after 8.765 msecs
PM: resume of devices early complete after 34.567 msecs
PM: resume of devices complete after 156.789 msecs
PM: Finishing wakeup.
PM: suspend exit 2023-12-01 10:30:45.789012345
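To compare suspend timings across runs, the per-phase durations can be pulled out of these log lines. The sketch below is a small C helper for that, assuming the "complete after N msecs" wording shown in the sample output above. `parse_pm_phase_msecs()` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Extract the duration from a line such as
 *   "PM: suspend of devices complete after 123.456 msecs"
 * Returns 1 and stores the milliseconds in *msecs on success, 0 otherwise.
 */
int parse_pm_phase_msecs(const char *line, double *msecs)
{
    const char *key = "complete after ";
    const char *p;

    assert(line != NULL && msecs != NULL);
    p = strstr(line, key);
    if (p == NULL)
        return 0;  /* not a phase-timing line */
    return sscanf(p + strlen(key), "%lf", msecs) == 1;
}
```

Summing the three suspend-side values from the sample (123.456 + 45.789 + 12.345 ms) gives roughly 181.6 ms spent in device suspend callbacks, which is usually the first number to look at when suspend feels slow.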
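4.3.2 Analysis with PowerTOP
Using PowerTOP:
bash
# Install PowerTOP (Debian/Ubuntu)
apt-get install powertop
# Calibrate (run on battery power)
powertop --calibrate
# Generate an HTML report
powertop --html=powertop_report.html
# Inspect wakeups: run the interactive UI and open the Overview tab
powertop
# Apply all suggested tunings
powertop --auto-tune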
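4.3.3 Monitoring with turbostat
CPU power monitoring (Intel/AMD x86 only):
bash
# Monitor package power, temperature and CPU activity once per second
turbostat --show PkgWatt,PkgTmp,Busy%,Bzy_MHz,IRQ --interval 1
# Measure while a command runs (here: sleep 5)
turbostat --show PkgWatt,PkgTmp --interval 0.1 sleep 5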
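5. Best Practices
5.1 Driver development
- Implement the dev_pm_ops callbacks the device needs, and keep each suspend/resume pair symmetric
- Handle error paths and roll back partially completed transitions
- Use autosuspend to avoid suspend/resume thrashing
- Register wakeup sources only for events that must wake the system
- Express latency requirements through PM QoS
5.2 System tuning
- Select governors that match the workload
- Configure the available idle states sensibly
- Reduce device suspend and resume times
- Monitor and analyze power data
- Re-tune after kernel or firmware updates
5.3 Debugging
- Use ftrace for detailed analysis
- Enable pm_print_times
- Watch the /sys/power/ interface
- Use dedicated tools (PowerTOP, turbostat)
- Automate suspend/resume testing
This document has walked through the Linux power management architecture from user space down to the hardware abstraction layer, with implementation details intended as a guide for driver development and system tuning.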