[Linux Driver Development] Linux Power Management: A Detailed Analysis of System Architecture and Driver Implementation

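1. System Architecture Layers

1.1 Architecture Overview

Linux power management is organized in layers that run from user space down to the hardware abstraction layer:

User space
    ↓
System call interface (syscalls)
    ↓
Kernel core subsystems (CPUFreq, CPUIdle, Runtime PM, Suspend)
    ↓
Device driver framework (dev_pm_ops, genpd, regulator)
    ↓
Hardware abstraction layer (ACPI, DT, arch-specific)
    ↓
Hardware platform (SoC, devices, clocks, power domains)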

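1.2 User-Space Components

1.2.1 Power management in systemd-logind

systemd-logind is systemd's login and session manager; it coordinates user-level power management operations:

Core state:

// Key structure in the systemd source (simplified for illustration;
// the real Manager has more fields and some different types)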
struct Manager {
    Hashmap *devices;
    Hashmap *seats;
    Hashmap *sessions;
    Hashmap *users;
    Hashmap *inhibitors;  // key field: inhibitor locks that can block shutdown or sleep
    Hashmap *buttons;
    
    int inhibit_fd;
    sd_event *event;
    
    bool handle_power_key;
    bool handle_suspend_key;
    bool handle_hibernate_key;
    bool handle_lid_switch;
};

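Power event handling flow:

Hardware event (power key / sleep key / lid close)
    ↓
Kernel input subsystem
    ↓
udev rule matching
    ↓
systemd-logind receives the event
    ↓
Check the inhibitor list
    ↓
Perform the action (suspend/hibernate/poweroff)

1.2.2 The pm-utils tool suite

pm-utils is the traditional (pre-systemd) power management tool set:

Main components:

# Core pm-utils scripts
/usr/lib/pm-utils/bin/pm-action        # core execution script
/usr/lib/pm-utils/bin/pm-is-supported  # capability detection
/usr/lib/pm-utils/bin/pm-powersave     # power-save mode control

# Configuration directories
/etc/pm/config.d/     # global configuration
/etc/pm/sleep.d/      # suspend/resume hook scripts
/etc/pm/power.d/      # power state change hooks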

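Execution mechanism:

// Core logic of pm-action, written as C-style pseudocode.
// The real pm-action is a shell script; the helper names are illustrative.
int main(int argc, char **argv) {
    if (argc < 2)
        return 1;
    char *action = argv[1];
    
    // 1. Load the configuration
    load_config();
    
    // 2. Run the "before" hooks
    run_hooks("before", action);
    
    // 3. Invoke the kernel interface
    if (strcmp(action, "suspend") == 0)
        write_state("/sys/power/state", "mem");
    else if (strcmp(action, "hibernate") == 0)
        write_state("/sys/power/state", "disk");
    
    // 4. Run the "after" hooks
    run_hooks("after", action);
    
    return 0;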
}
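
The pseudocode above leans on a write_state() helper. A minimal userspace sketch of such a helper could look like the following; the function name comes from the pseudocode, while the error-handling convention (negative errno values) is an assumption made here. Writing "mem" to the real /sys/power/state requires root and suspends the machine immediately.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Write a state keyword (for example "mem" or "disk") to a sysfs-style file.
 * Returns 0 on success or a negative errno value on failure. */
int write_state(const char *path, const char *state)
{
    FILE *f = fopen(path, "w");
    int ret = 0;

    if (!f)
        return -errno;

    if (fputs(state, f) == EOF)
        ret = -EIO;

    /* Buffered writes reach the file on flush, so fclose() must be checked too. */
    if (fclose(f) == EOF && ret == 0)
        ret = -errno;

    return ret;
}
```

Calling write_state("/sys/power/state", "mem") from a privileged process would then request suspend-to-RAM; pointing it at a scratch file is a safe way to try the helper.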
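
1.2.3 The upower daemon

upower monitors the power state of devices and exposes it through an abstraction layer:

Architecture:

upowerd (system daemon)
    ├── UPowerClient (D-Bus client)
    ├── UPowerDevice (device abstraction)
    ├── UPowerDaemon (core service)
    └── UPowerWakeups (wakeup source monitoring)

Key data structure (a simplified view of the per-device properties):
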
struct UpDevice {
    gchar *object_path;
    gchar *native_path;
    gchar *vendor;
    gchar *model;
    gchar *serial;
    gchar *update_time;
    
    UpDeviceKind kind;
    UpDeviceState state;
    
    gdouble energy;
    gdouble energy_empty;
    gdouble energy_full;
    gdouble energy_rate;
    
    gint64 time_to_empty;
    gint64 time_to_full;
    gdouble percentage;
    
    gboolean is_present;
    gboolean is_rechargeable;
    gboolean power_supply;
};

1.3 How the Kernel Subsystems Cooperate

1.3.1 The CPUFreq subsystem

CPUFreq scales the CPU frequency dynamically:

Core architecture:

CPUFreq Core
    ├── Governors (scaling policies)
    │   ├── performance
    │   ├── powersave
    │   ├── userspace
    │   ├── ondemand
    │   ├── conservative
    │   └── schedutil
    ├── Drivers (hardware drivers)
    │   ├── intel_pstate
    │   ├── acpi-cpufreq
    │   ├── arm_big_little
    │   └── cpufreq-dt
    └── Stats (statistics)

Key data structure (a simplified, partly illustrative view; see include/linux/cpufreq.h for the real definition):

struct cpufreq_policy {
    /* Core policy information */
    cpumask_var_t cpus;          /* online CPUs covered by this policy */
    cpumask_var_t related_cpus;  /* online and offline CPUs sharing the policy */
    
    unsigned int min;            /* minimum frequency (kHz) */
    unsigned int max;            /* maximum frequency (kHz) */
    unsigned int cur;            /* current frequency (kHz) */
    
    /* Governor and driver hooks */
    struct cpufreq_governor *governor;
    struct cpufreq_driver *driver;
    
    /* Statistics */
    struct cpufreq_stats *stats;
    
    /* Power management */
    struct device *cpu_dev;
    struct freq_constraints constraints;
    
    /* Deferred work */
    struct work_struct update;
    struct delayed_work work;
};
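
1.3.2 The CPUIdle subsystem

CPUIdle manages the CPU idle states:

State hierarchy:

CPUIdle Core
    ├── Governors (idle policies)
    │   ├── ladder
    │   ├── menu
    │   └── teo
    ├── Drivers (hardware drivers)
    │   ├── intel_idle
    │   ├── acpi_idle
    │   ├── cpuidle-arm
    │   └── cpuidle-psci
    └── States (idle states, x86/ACPI naming)
        ├── C0 (running)
        ├── C1 (shallow, fast wakeup)
        ├── C2 (deeper sleep)
        └── C3 (deepest sleep in this example)

Core data structures (simplified; see include/linux/cpuidle.h for the real definitions):
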
struct cpuidle_device {
    unsigned int registered:1;
    unsigned int enabled:1;
    unsigned int poll_time_limit:1;
    
    int last_state_idx;
    int last_residency;
    
    u64 next_hrtimer;
    
    struct cpuidle_state_usage states_usage[CPUIDLE_STATE_MAX];
    struct cpuidle_state *states;
    
    struct cpuidle_driver *driver;
    struct device dev;
    
    /* Statistics */
    struct cpuidle_device_kstats *kstats;
};

struct cpuidle_state {
    char name[CPUIDLE_NAME_LEN];
    char desc[CPUIDLE_DESC_LEN];
    
    unsigned int flags;
    unsigned int exit_latency;     /* exit latency (us) */
    unsigned int target_residency; /* target residency (us) */
    unsigned int power_usage;      /* power consumption (mW) */
    
    int (*enter)(struct cpuidle_device *dev, struct cpuidle_driver *drv, int index);
};
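
To make the roles of exit_latency and target_residency concrete, here is a small userspace sketch of how a governor might choose a state. It is not the menu or teo algorithm; struct idle_state and select_idle_state() are hypothetical names used only for this illustration, and the states are assumed to be ordered from shallow to deep.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct cpuidle_state (illustrative only). */
struct idle_state {
    const char *name;
    unsigned int exit_latency_us;      /* time needed to leave the state */
    unsigned int target_residency_us;  /* minimum stay for the state to pay off */
};

/*
 * Pick the deepest state whose target residency fits the predicted idle
 * period and whose exit latency respects the latency limit.
 * Index 0 (the shallowest state) is the fallback.
 */
int select_idle_state(const struct idle_state *states, size_t count,
                      unsigned int predicted_idle_us,
                      unsigned int latency_limit_us)
{
    int best = 0;

    for (size_t i = 1; i < count; i++) {
        if (states[i].target_residency_us > predicted_idle_us)
            break;  /* would not stay long enough to save energy */
        if (states[i].exit_latency_us > latency_limit_us)
            break;  /* waking up would take too long */
        best = (int)i;
    }
    return best;
}
```

A tighter latency limit (for example one imposed through PM QoS) pushes the choice toward shallower states even when the CPU is expected to stay idle for a long time.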

1.3.3 The Runtime PM subsystem

Runtime PM manages the power state of individual devices while the system is running:

State machine:

RPM_ACTIVE → RPM_SUSPENDING → RPM_SUSPENDED → RPM_RESUMING → RPM_ACTIVE
     ↑                                              ↓
     └──────────────────────────────────────────────┘

Core state (the runtime PM part of struct dev_pm_info, simplified; field names and types vary between kernel versions):

struct dev_pm_info {
    spinlock_t lock;
    
    struct hrtimer suspend_timer;      /* autosuspend timer */
    u64 timer_expires;
    
    struct work_struct work;
    wait_queue_head_t wait_queue;
    
    atomic_t usage_count;
    atomic_t child_count;
    unsigned int disable_depth:3;
    
    unsigned int idle_notification:1;
    unsigned int request_pending:1;
    unsigned int deferred_resume:1;
    unsigned int ignore_children:1;
    unsigned int no_callbacks:1;
    unsigned int irq_safe:1;
    unsigned int use_autosuspend:1;
    unsigned int timer_autosuspends:1;
    unsigned int memalloc_noio:1;
    
    enum rpm_request request;
    enum rpm_status runtime_status;
    int runtime_error;
    
    int autosuspend_delay;
    u64 last_busy;
    
    struct pm_subsys_data *subsys_data;
};
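
The interplay between usage_count and the runtime status can be modeled in a few lines of userspace C. This is only a toy model of the synchronous get/put behavior, assuming no autosuspend, no children and no errors; the names rpm_model, model_get() and model_put() are invented for the example.

```c
/* Two stable states of the runtime PM state machine (transitional states omitted). */
enum rpm_model_status { ST_ACTIVE, ST_SUSPENDED };

struct rpm_model {
    int usage_count;               /* number of outstanding "get" references */
    enum rpm_model_status status;
    int resume_calls;              /* how often ->runtime_resume() would have run */
    int suspend_calls;             /* how often ->runtime_suspend() would have run */
};

/* Rough model of pm_runtime_get_sync(): take a reference, resume if suspended. */
void model_get(struct rpm_model *m)
{
    m->usage_count++;
    if (m->status == ST_SUSPENDED) {
        m->resume_calls++;
        m->status = ST_ACTIVE;
    }
}

/* Rough model of pm_runtime_put_sync(): drop a reference, suspend at zero. */
void model_put(struct rpm_model *m)
{
    if (m->usage_count > 0)
        m->usage_count--;
    if (m->usage_count == 0 && m->status == ST_ACTIVE) {
        m->suspend_calls++;
        m->status = ST_SUSPENDED;
    }
}
```

The point of the model is that the callbacks run only on the 0 → 1 and 1 → 0 transitions of the counter, so nested get/put pairs are cheap.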
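
1.3.4 The Suspend/Hibernate subsystem

The suspend subsystem manages system-wide sleep states:

Sleep state hierarchy:

PM_SUSPEND_ON        (normal operation)
PM_SUSPEND_TO_IDLE   (suspend-to-idle: processes frozen, CPUs idle; named PM_SUSPEND_FREEZE in older kernels)
PM_SUSPEND_STANDBY   (standby)
PM_SUSPEND_MEM       (suspend-to-RAM)
PM_SUSPEND_MAX       (sentinel giving the number of states, not a sleep state itself)

Core data structures (struct pm_sleep_state is an illustrative sketch rather than a mainline kernel type; struct platform_suspend_ops is declared in include/linux/suspend.h):
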
struct pm_sleep_state {
    const char *name;
    unsigned int flags;
    
    int (*enter)(suspend_state_t state);
    int (*prepare)(void);
    int (*prepare_late)(void);
    int (*wake)(void);
    int (*finish)(void);
};

struct platform_suspend_ops {
    int (*valid)(suspend_state_t state);
    int (*begin)(suspend_state_t state);
    int (*prepare)(void);
    int (*prepare_late)(void);
    int (*enter)(suspend_state_t state);
    void (*wake)(void);
    void (*finish)(void);
    void (*end)(void);
    void (*recover)(void);
};
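
User space selects one of the sleep states listed earlier in this subsection by writing a keyword to /sys/power/state. The sketch below shows one way to map those keywords ("freeze", "standby", "mem") to state values in a userspace tool; the enum and function names are hypothetical, and "disk" is deliberately left out because hibernation goes through a separate code path.

```c
#include <string.h>

/* Userspace mirror of the suspend states (names chosen for this example). */
enum sleep_state {
    SLEEP_ON = 0,
    SLEEP_TO_IDLE,
    SLEEP_STANDBY,
    SLEEP_MEM,
    SLEEP_MAX       /* sentinel: number of states */
};

/* Map a /sys/power/state keyword to a state; SLEEP_MAX means "unknown". */
enum sleep_state parse_sleep_state(const char *label)
{
    static const char *const labels[SLEEP_MAX] = {
        [SLEEP_TO_IDLE] = "freeze",
        [SLEEP_STANDBY] = "standby",
        [SLEEP_MEM]     = "mem",
    };

    for (int s = SLEEP_TO_IDLE; s < SLEEP_MAX; s++)
        if (strcmp(label, labels[s]) == 0)
            return (enum sleep_state)s;

    return SLEEP_MAX;
}
```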

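1.4 Interaction with the Hardware Abstraction Layer

1.4.1 ACPI integration

ACPI provides a standardized power management interface:

Core components:

ACPICA Core
    ├── ACPI Tables (FADT, DSDT, SSDT)
    ├── ACPI Hardware (register interface)
    ├── ACPI Events (GPE, Fixed Events)
    └── ACPI Methods (AML interpreter)

Power management integration (simplified, illustrative structures; the real definitions in include/acpi/acpi_bus.h differ):
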
struct acpi_device_power_state {
    u8 power_state;
    u8 order;
    struct list_head resources;
    struct list_head children;
};

struct acpi_device {
    struct acpi_device_ops *ops;
    struct acpi_driver *driver;
    struct acpi_device_power_state *power;
    
    int power_state;
    int state;
    
    struct list_head children;
    struct acpi_device *parent;
    struct list_head node;
    struct list_head wakeup;
    
    struct work_struct work;
    struct mutex physical_node_lock;
    struct list_head physical_node_list;
};
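
1.4.2 Device Tree power management

The Device Tree (DT) describes the power management configuration declaratively:

Power management nodes:

/* Example: CPU power management configuration */
cpu0: cpu@0 {
    device_type = "cpu";
    compatible = "arm,cortex-a53";
    reg = <0x0>;
    
    /* CPUFreq configuration */
    operating-points-v2 = <&cpu_opp_table>;
    
    /* CPUIdle configuration */
    cpu-idle-states = <&CPU_SLEEP_0 &CPU_SLEEP_1>;
    
    /* Power domain */
    power-domains = <&pd_cpu0>;
    
    /* Clock source */
    clocks = <&clk_cpu0>;
    clock-names = "cpu";
};

/* Operating performance point (OPP) table */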
cpu_opp_table: opp-table {
    compatible = "operating-points-v2";
    opp-shared;
    
    opp-500000000 {
        opp-hz = /bits/ 64 <500000000>;
        opp-microvolt = <900000>;
        opp-supported-hw = <0x1>;
    };
    
    opp-1000000000 {
        opp-hz = /bits/ 64 <1000000000>;
        opp-microvolt = <1100000>;
        opp-supported-hw = <0x1>;
    };
};
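
Once the OPP table is parsed, a frequency request is typically rounded to one of its entries. The sketch below models the "ceiling" lookup (the same idea as the kernel's dev_pm_opp_find_freq_ceil()) in plain C with the two operating points from the table above; struct opp and opp_find_ceil() are names made up for this example, and the table is assumed to be sorted by ascending frequency.

```c
#include <stddef.h>
#include <stdint.h>

/* One operating performance point: a frequency and the voltage it needs. */
struct opp {
    uint64_t freq_hz;
    uint32_t microvolt;
};

/* Return the lowest OPP whose frequency is >= target_hz, or NULL if none exists. */
const struct opp *opp_find_ceil(const struct opp *table, size_t count,
                                uint64_t target_hz)
{
    for (size_t i = 0; i < count; i++)
        if (table[i].freq_hz >= target_hz)
            return &table[i];
    return NULL;
}
```

With the table above, a request for 600 MHz resolves to the 1 GHz entry at 1.1 V, while a request above 1 GHz has no matching OPP.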

2. Driver Development Essentials

2.1 Device Power State Management

2.1.1 struct dev_pm_ops in detail

dev_pm_ops is the core callback interface for device power management:

struct dev_pm_ops {
    /* System sleep: suspend/resume and hibernation */
    int (*prepare)(struct device *dev);
    void (*complete)(struct device *dev);
    int (*suspend)(struct device *dev);
    int (*resume)(struct device *dev);
    int (*freeze)(struct device *dev);
    int (*thaw)(struct device *dev);
    int (*poweroff)(struct device *dev);
    int (*restore)(struct device *dev);
    
    /* Runtime PM */
    int (*runtime_suspend)(struct device *dev);
    int (*runtime_resume)(struct device *dev);
    int (*runtime_idle)(struct device *dev);
    
    /* Late suspend / early resume phases */
    int (*suspend_late)(struct device *dev);
    int (*resume_early)(struct device *dev);
    int (*freeze_late)(struct device *dev);
    int (*thaw_early)(struct device *dev);
    int (*poweroff_late)(struct device *dev);
    int (*restore_early)(struct device *dev);
    
    /* noirq phases: run while the device's interrupt handlers are disabled */
    int (*suspend_noirq)(struct device *dev);
    int (*resume_noirq)(struct device *dev);
    int (*freeze_noirq)(struct device *dev);
    int (*thaw_noirq)(struct device *dev);
    int (*poweroff_noirq)(struct device *dev);
    int (*restore_noirq)(struct device *dev);
    
    /* There are no wakeup callbacks here: wakeup is configured with
     * device_init_wakeup() and related helpers (see section 3.3). */
};

2.1.2 Callback implementation guidelines

Example runtime_suspend implementation (the my_device_* helpers are placeholders for driver-specific code):

static int my_device_runtime_suspend(struct device *dev)
{
    struct my_device *my_dev = dev_get_drvdata(dev);
    int ret;
    
    /* 1. Save the device context */
    my_dev->context = my_device_save_context(my_dev);
    
    /* 2. Stop DMA transfers */
    ret = my_device_stop_dma(my_dev);
    if (ret)
        return ret;
    
    /* 3. Disable interrupts */
    my_device_disable_irq(my_dev);
    
    /* 4. Gate the device clock */
    clk_disable_unprepare(my_dev->clk);
    
    /* 5. Optional: lower the supply voltage */
    if (my_dev->regulator)
        regulator_set_voltage(my_dev->regulator, 
                             my_dev->suspend_voltage, 
                             my_dev->suspend_voltage);
    
    /*
     * Do not call pm_runtime_set_suspended() here: the runtime PM core
     * updates the status itself once this callback returns 0.
     */
    dev_dbg(dev, "runtime suspend completed\n");
    return 0;
}

Example runtime_resume implementation:

static int my_device_runtime_resume(struct device *dev)
{
    struct my_device *my_dev = dev_get_drvdata(dev);
    int ret;
    
    /* 1. Restore the supply voltage */
    if (my_dev->regulator) {
        ret = regulator_set_voltage(my_dev->regulator,
                                  my_dev->resume_voltage,
                                  my_dev->resume_voltage);
        if (ret)
            return ret;
    }
    
    /* 2. Enable the clock */
    ret = clk_prepare_enable(my_dev->clk);
    if (ret)
        return ret;
    
    /* 3. Restore the device context */
    my_device_restore_context(my_dev, my_dev->context);
    
    /* 4. Enable interrupts */
    my_device_enable_irq(my_dev);
    
    /* 5. Resume DMA */
    ret = my_device_resume_dma(my_dev);
    if (ret)
        goto err_dma;
    
    /*
     * No pm_runtime_set_active() here either: the core marks the device
     * active when this callback returns 0.
     */
    dev_dbg(dev, "runtime resume completed\n");
    return 0;
    
err_dma:
    my_device_disable_irq(my_dev);
    clk_disable_unprepare(my_dev->clk);
    return ret;
}

2.1.3 Driver registration and initialization

Complete driver example (callbacks not shown here are assumed to be defined earlier in the file):

#include <linux/clk.h>
#include <linux/module.h>
#include <linux/platform_device.h>
#include <linux/pm.h>
#include <linux/pm_runtime.h>
#include <linux/regulator/consumer.h>

static const struct dev_pm_ops my_device_pm_ops = {
    .runtime_suspend = my_device_runtime_suspend,
    .runtime_resume = my_device_runtime_resume,
    .runtime_idle = my_device_runtime_idle,
    .suspend = my_device_system_suspend,
    .resume = my_device_system_resume,
    .prepare = my_device_prepare,
    .complete = my_device_complete,
};

static int my_device_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct my_device *my_dev;
    int ret;
    
    /* Allocate the device structure */
    my_dev = devm_kzalloc(dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    
    platform_set_drvdata(pdev, my_dev);
    
    /* Get the clock */
    my_dev->clk = devm_clk_get(dev, NULL);
    if (IS_ERR(my_dev->clk))
        return PTR_ERR(my_dev->clk);
    
    /* Get the regulator; the supply is optional, so "absent" is not an error */
    my_dev->regulator = devm_regulator_get_optional(dev, "vcc");
    if (IS_ERR(my_dev->regulator)) {
        if (PTR_ERR(my_dev->regulator) != -ENODEV)
            return PTR_ERR(my_dev->regulator);
        my_dev->regulator = NULL;
    }
    
    /* Hardware init (assumed to leave the device powered and clocked) */
    ret = my_device_hw_init(my_dev);
    if (ret)
        return ret;
    
    /*
     * Set up runtime PM. The status must be set to active while runtime PM
     * is still disabled, so pm_runtime_enable() comes last.
     */
    pm_runtime_set_autosuspend_delay(dev, 1000); /* 1 s autosuspend delay */
    pm_runtime_use_autosuspend(dev);
    pm_runtime_set_active(dev);
    pm_runtime_mark_last_busy(dev);
    pm_runtime_enable(dev);
    
    dev_info(dev, "probe completed\n");
    return 0;
}

/* Note: on recent kernels the platform .remove callback returns void. */
static int my_device_remove(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    
    /* Disable runtime PM and leave the status consistent */
    pm_runtime_disable(dev);
    pm_runtime_dont_use_autosuspend(dev);
    pm_runtime_set_suspended(dev);
    
    return 0;
}

static struct platform_driver my_device_driver = {
    .probe = my_device_probe,
    .remove = my_device_remove,
    .driver = {
        .name = "my-device",
        .pm = &my_device_pm_ops,
        .of_match_table = my_device_of_match,
    },
};
module_platform_driver(my_device_driver);

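2.2 Power Domain Management

2.2.1 The genpd framework

genpd (Generic Power Domain) is the generic framework for managing power domains:

genpd Core
    ├── Power Domain Tree (hierarchy)
    ├── Performance States
    ├── Power State Callbacks
    ├── Device Attachments
    └── Power State Coordination

2.2.2 Defining and configuring a power domain

Simple power domain example (struct my_domain and pd_to_my_domain() are driver-specific placeholders):

#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/pm_domain.h>
#include <linux/regulator/consumer.h>

static int my_pd_power_on(struct generic_pm_domain *domain)
{
    struct my_domain *my_pd = pd_to_my_domain(domain);
    int ret;
    
    /* Enable the power domain clock */
    ret = clk_prepare_enable(my_pd->clk);
    if (ret)
        return ret;
    
    /* Set the supply voltage and switch the regulator on */
    if (my_pd->regulator) {
        ret = regulator_set_voltage(my_pd->regulator, 
                            my_pd->on_voltage, my_pd->on_voltage);
        if (!ret)
            ret = regulator_enable(my_pd->regulator);
        if (ret) {
            clk_disable_unprepare(my_pd->clk);
            return ret;
        }
    }
    
    /* Optional: wait for the rail to stabilize */
    msleep(my_pd->power_on_delay);
    
    return 0;
}

static int my_pd_power_off(struct generic_pm_domain *domain)
{
    struct my_domain *my_pd = pd_to_my_domain(domain);
    
    /* Switch the regulator off (requesting 0 V is not how a supply is disabled) */
    if (my_pd->regulator)
        regulator_disable(my_pd->regulator);
    
    /* Gate the clock */
    clk_disable_unprepare(my_pd->clk);
    
    return 0;
}

static struct generic_pm_domain my_pd = {
    .name = "my-power-domain",
    .power_on = my_pd_power_on,
    .power_off = my_pd_power_off,
    /* Set .flags only when needed, e.g. GENPD_FLAG_PM_CLK. Note that
     * GENPD_FLAG_ALWAYS_ON would keep power_off from ever being called. */
};

/* Register the domain with pm_genpd_init(&my_pd, NULL, false) before use. */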

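Hierarchical power domains (a sketch using pm_genpd_init(), pm_genpd_add_subdomain() and of_genpd_add_provider_onecell(); error unwinding is omitted for brevity):

/* Parent and child domains; the callbacks are filled in as in the example above */
static struct generic_pm_domain parent_pd = { .name = "parent" };
static struct generic_pm_domain child_pd1 = { .name = "domain1" };
static struct generic_pm_domain child_pd2 = { .name = "domain2" };

static struct generic_pm_domain *my_domains[] = {
    &parent_pd, &child_pd1, &child_pd2,
};

static struct genpd_onecell_data my_pd_data = {
    .domains = my_domains,
    .num_domains = ARRAY_SIZE(my_domains),
};

static int my_pd_probe(struct platform_device *pdev)
{
    int i, ret;
    
    /* Initialize every domain; "true" means it starts powered off */
    for (i = 0; i < ARRAY_SIZE(my_domains); i++) {
        ret = pm_genpd_init(my_domains[i], NULL, true);
        if (ret)
            return ret;
    }
    
    /* Attach the children to the parent domain */
    ret = pm_genpd_add_subdomain(&parent_pd, &child_pd1);
    if (ret)
        return ret;
    
    ret = pm_genpd_add_subdomain(&parent_pd, &child_pd2);
    if (ret)
        return ret;
    
    /* Expose the domains to Device Tree consumers (power-domains = <&pd N>) */
    return of_genpd_add_provider_onecell(pdev->dev.of_node, &my_pd_data);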
}
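
The effect of a parent/child hierarchy is that a parent domain can only power off once all of its subdomains are off. The toy model below illustrates that reference-counting rule in userspace C; it is not genpd's implementation, and pd_model, pd_get() and pd_put() are names invented for the illustration.

```c
#include <stddef.h>

/* Toy model of a power domain node (not the kernel's struct generic_pm_domain). */
struct pd_model {
    struct pd_model *parent;  /* NULL for a top-level domain */
    int users;                /* active devices plus powered-on subdomains */
    int is_on;
};

/* Add one user; the first user powers the parent on before this domain. */
void pd_get(struct pd_model *pd)
{
    if (pd->users++ == 0) {
        if (pd->parent)
            pd_get(pd->parent);
        pd->is_on = 1;
    }
}

/* Drop one user; the last one out powers the domain off, then releases the parent. */
void pd_put(struct pd_model *pd)
{
    if (pd->users == 0)
        return;               /* unbalanced put: ignore */
    if (--pd->users == 0) {
        pd->is_on = 0;
        if (pd->parent)
            pd_put(pd->parent);
    }
}
```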

2.2.3 Domain state management

struct genpd_power_state describes the idle (low-power) states a domain can enter, with their latencies and break-even residency. Performance states are a separate mechanism, handled through the domain's set_performance_state callback.

Idle state definition:

static struct genpd_power_state my_pd_states[] = {
    {
        .power_on_latency_ns = 100000,   /* 100 us */
        .power_off_latency_ns = 200000,  /* 200 us */
        .residency_ns = 1000000,         /* 1 ms */
    },
    {
        .power_on_latency_ns = 500000,   /* 500 us */
        .power_off_latency_ns = 1000000, /* 1 ms */
        .residency_ns = 5000000,         /* 5 ms */
    },
};

static struct generic_pm_domain my_pd = {
    .name = "my-pd",
    .power_on = my_pd_power_on,
    .power_off = my_pd_power_off,
    .states = my_pd_states,
    .state_count = ARRAY_SIZE(my_pd_states),
    .flags = GENPD_FLAG_PM_CLK | GENPD_FLAG_CPU_DOMAIN,
};

2.3 Clock and Voltage Scaling

2.3.1 cpufreq-dt integration

Device Tree configuration:

/* CPU frequency scaling configuration */
cpu0: cpu@0 {
    operating-points-v2 = <&cpu_opp_table>;
    clock-latency = <100000>; /* 100 us, expressed in ns */
    
    /* Voltage regulator */
    cpu-supply = <&vdd_cpu>;
};

/* Operating performance point (OPP) table */
cpu_opp_table: opp-table {
    compatible = "operating-points-v2";
    opp-shared;
    
    opp-500000000 {
        opp-hz = /bits/ 64 <500000000>;
        /* <target min max> in microvolts */
        opp-microvolt = <900000 875000 925000>;
    };
    
    opp-1000000000 {
        opp-hz = /bits/ 64 <1000000000>;
        opp-microvolt = <1100000 1075000 1125000>;
    };
};

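Driver integration (a sketch: struct cpufreq_dt and cpufreq_dt_driver are assumed to be defined elsewhere in the driver, and in practice the OPP table and the "cpu" supply belong to the CPU device returned by get_cpu_device() rather than to the platform device):

static int my_cpufreq_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct cpufreq_dt *cdt;
    int ret;
    
    cdt = devm_kzalloc(dev, sizeof(*cdt), GFP_KERNEL);
    if (!cdt)
        return -ENOMEM;
    
    /* Get the regulator behind the cpu-supply property */
    cdt->reg = devm_regulator_get(dev, "cpu");
    if (IS_ERR(cdt->reg))
        return PTR_ERR(cdt->reg);
    
    /* Parse the operating points from the Device Tree */
    ret = dev_pm_opp_of_add_table(dev);
    if (ret)
        return ret;
    
    platform_set_drvdata(pdev, cdt);
    
    /* Register the cpufreq driver */
    ret = cpufreq_register_driver(&cpufreq_dt_driver);
    if (ret)
        goto err_opp;
    
    return 0;
    
err_opp:
    dev_pm_opp_of_remove_table(dev);
    return ret;
}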
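
2.3.2 Regulator framework integration

Regulator configuration (REGULATOR_VOLTAGE_SELECTION and REGULATOR_ENABLE_REGISTER stand for device-specific register addresses):

static const struct regulator_ops my_regulator_ops; /* defined below */
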
static const struct regulator_desc my_regulator_desc = {
    .name = "vdd_cpu",
    .type = REGULATOR_VOLTAGE,
    .owner = THIS_MODULE,
    
    .n_voltages = 32,
    .min_uV = 800000,
    .uV_step = 25000,
    
    .vsel_reg = REGULATOR_VOLTAGE_SELECTION,
    .vsel_mask = 0x1F,
    
    .enable_reg = REGULATOR_ENABLE_REGISTER,
    .enable_mask = 0x01,
    
    .enable_time = 500,  /* 500 us */
    .ramp_delay = 12500, /* 12500 uV/us = 12.5 mV/us */
    
    .ops = &my_regulator_ops,
};

static const struct regulator_ops my_regulator_ops = {
    .list_voltage = regulator_list_voltage_linear,
    .set_voltage_sel = my_regulator_set_voltage_sel,
    .get_voltage_sel = my_regulator_get_voltage_sel,
    .enable = my_regulator_enable,
    .disable = my_regulator_disable,
    .is_enabled = my_regulator_is_enabled,
};
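
With n_voltages = 32, min_uV = 800000 and uV_step = 25000, the selector-to-voltage mapping used by regulator_list_voltage_linear() is simply min_uV + selector × uV_step. The userspace sketch below reproduces that arithmetic and adds the reverse lookup (rounding up to the next step); both function names are hypothetical helpers written for this example, not regulator framework APIs.

```c
/* Voltage in uV for a selector of a linear range, or -1 if the selector is invalid. */
int linear_sel_to_uV(unsigned int sel, unsigned int n_voltages,
                     int min_uV, int uV_step)
{
    if (sel >= n_voltages)
        return -1;
    return min_uV + (int)sel * uV_step;
}

/* Smallest selector whose voltage is >= uV, or -1 if uV is above the range.
 * Assumes uV_step > 0. */
int linear_uV_to_sel(int uV, unsigned int n_voltages, int min_uV, int uV_step)
{
    unsigned int sel;

    if (uV <= min_uV)
        return 0;

    sel = (unsigned int)((uV - min_uV + uV_step - 1) / uV_step);
    return sel < n_voltages ? (int)sel : -1;
}
```

For the descriptor above, selector 31 (the 5-bit mask 0x1F) corresponds to 1.575 V, and a request for 0.91 V rounds up to selector 5 (0.925 V).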

Clock rate-change notifier integration (remember to call clk_notifier_unregister() in the remove path as well):

static int my_device_clk_notify(struct notifier_block *nb,
                               unsigned long action, void *data)
{
    struct clk_notifier_data *clk_data = data;
    struct my_device *my_dev = container_of(nb, struct my_device, clk_nb);
    
    switch (action) {
    case PRE_RATE_CHANGE:
        /* Prepare for the rate change */
        my_device_prepare_freq_change(my_dev, clk_data->new_rate);
        break;
        
    case POST_RATE_CHANGE:
        /* Rate change completed */
        my_device_complete_freq_change(my_dev, clk_data->new_rate);
        break;
        
    case ABORT_RATE_CHANGE:
        /* Rate change aborted */
        my_device_abort_freq_change(my_dev, clk_data->old_rate);
        break;
    }
    
    return NOTIFY_OK;
}

static int my_device_probe(struct platform_device *pdev)
{
    struct my_device *my_dev;
    int ret;
    
    my_dev = devm_kzalloc(&pdev->dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    
    /* Get the clock */
    my_dev->clk = devm_clk_get(&pdev->dev, NULL);
    if (IS_ERR(my_dev->clk))
        return PTR_ERR(my_dev->clk);
    
    /* Register the clock notifier */
    my_dev->clk_nb.notifier_call = my_device_clk_notify;
    ret = clk_notifier_register(my_dev->clk, &my_dev->clk_nb);
    if (ret)
        return ret;
    
    /* Enable the clock */
    ret = clk_prepare_enable(my_dev->clk);
    if (ret)
        goto err_clk;
    
    platform_set_drvdata(pdev, my_dev);
    return 0;
    
err_clk:
    clk_notifier_unregister(my_dev->clk, &my_dev->clk_nb);
    return ret;
}

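3. Key Implementation Mechanisms

3.1 The Suspend Flow in Detail

3.1.1 Suspend-to-RAM

Full execution flow (function names follow recent mainline kernels and may differ between versions):

User space requests suspend
    ↓
"mem" is written to /sys/power/state
    ↓
pm_suspend() → enter_state() [kernel/power/suspend.c]
    ├── suspend_prepare()
    │   ├── prepare the suspend console
    │   ├── run the PM_SUSPEND_PREPARE notifiers
    │   └── freeze user-space processes, then freezable kernel threads
    ├── suspend_devices_and_enter()
    │   ├── platform_suspend_begin()
    │   ├── suspend_console()
    │   ├── dpm_suspend_start()   (->prepare, then ->suspend callbacks)
    │   ├── suspend_enter()
    │   │   ├── dpm_suspend_late()    (->suspend_late callbacks)
    │   │   ├── dpm_suspend_noirq()   (->suspend_noirq callbacks)
    │   │   ├── disable non-boot CPUs
    │   │   ├── disable local interrupts, syscore_suspend()
    │   │   ├── suspend_ops->enter()
    │   │   │   └── the platform enters the sleep state
    │   │   ├── after wakeup: syscore_resume(), enable interrupts, re-enable non-boot CPUs
    │   │   └── dpm_resume_noirq(), dpm_resume_early()
    │   ├── dpm_resume_end()      (->resume, then ->complete callbacks)
    │   └── resume_console()
    └── suspend_finish()
        ├── thaw processes
        ├── run the PM_POST_SUSPEND notifiers
        └── restore the console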

3.1.2 Key steps in detail

Process freezing (simplified pseudocode; the real code in kernel/power/suspend.c and kernel/power/process.c is structured differently):

static int suspend_prepare(void)
{
    int error;
    
    /* 1. Freeze user space and freezable kernel threads */
    error = suspend_freeze_processes();
    if (error)
        goto out;
    
    /* 2. Disable the usermode helper */
    usermodehelper_disable();
    
    /* 3. Tell devices to prepare for suspend */
    error = device_prepare();
    if (error)
        goto thaw;
    
    /* 4. Suspend the console */
    suspend_console();
    
    return 0;
    
thaw:
    usermodehelper_enable();
    thaw_processes();
out:
    return error;
}

/* User-space processes are frozen first, then freezable kernel threads */
int suspend_freeze_processes(void)
{
    int error;
    
    /* 1. Freeze user-space processes */
    error = freeze_processes();
    if (error)
        return error;
    
    /* 2. Freeze the freezable kernel threads and workqueues */
    error = freeze_kernel_threads();
    if (error)
        thaw_processes();   /* undo step 1 on failure */
    
    return error;
}

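Device suspend sequence (simplified pseudocode; in the real PM core the ->prepare() pass over all devices finishes before any ->suspend() callback runs, and the per-device status field shown here is illustrative):
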
static int device_suspend(struct device *dev)
{
    const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
    int error = 0;
    
    /* 1. Check the device state */
    if (dev->power.status != DPM_ON)
        return 0;
    
    /* 2. Call the prepare callback */
    if (pm && pm->prepare) {
        error = pm->prepare(dev);
        if (error)
            goto err_out;
    }
    
    /* 3. Suspend the device */
    if (pm && pm->suspend) {
        error = pm->suspend(dev);
        if (error)
            goto err_finish;
    }
    
    /* 4. Update the device state */
    dev->power.status = DPM_SUSPENDED;
    
    return 0;
    
err_finish:
    if (pm && pm->complete)
        pm->complete(dev);
err_out:
    dev->power.status = DPM_ON;
    return error;
}

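3.2 Coordination through the PM QoS Framework

3.2.1 PM QoS architecture

PM QoS lets drivers and user space express latency and performance requirements that the power management code must respect:

PM QoS Core
    ├── CPU latency constraint (cpu_latency_qos)
    ├── Per-device constraints (dev_pm_qos: resume latency, latency tolerance, flags)
    └── Frequency constraints (freq_qos, used by cpufreq)

Older kernels also had global network latency, network throughput and memory bandwidth classes; these have been removed from mainline.

3.2.2 CPU latency QoS implementation

Core data structures (simplified; see include/linux/pm_qos.h for the real definitions):

struct pm_qos_constraints {
    struct plist_head list;
    s32 target_value;      /* aggregated target value */
    s32 default_value;     /* default value */
    enum pm_qos_type type; /* constraint type */
    char *name;
    
    /* Notification */
    struct blocking_notifier_head *notifiers;
    
    /* Synchronization */
    spinlock_t lock;
    
    /* Debug information */
    struct pm_qos_object *parent;
};

struct pm_qos_request {
    struct plist_node node;
    s32 value;
    enum pm_qos_type type;
    struct pm_qos_constraints *constraints;
    char *name;
};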

Constraint aggregation (simplified pseudocode; pm_qos_recompute_target is an illustrative name, and PM_QOS_SUM exists only in older kernels):

static void pm_qos_recompute_target(struct pm_qos_constraints *c)
{
    struct plist_node *node;
    s32 prev_value = c->target_value;
    s32 aggregate_value;
    
    /* 1. Compute the aggregate over all active requests */
    if (plist_head_empty(&c->list)) {
        aggregate_value = c->default_value;
    } else {
        switch (c->type) {
        case PM_QOS_MIN:
            /* The list is sorted, so the first node holds the minimum */
            aggregate_value = plist_first(&c->list)->prio;
            break;
            
        case PM_QOS_MAX:
            /* The last node holds the maximum */
            aggregate_value = plist_last(&c->list)->prio;
            break;
            
        case PM_QOS_SUM:
            /* Sum of all requests */
            aggregate_value = 0;
            plist_for_each(node, &c->list)
                aggregate_value += node->prio;
            break;
            
        default:
            aggregate_value = c->default_value;
            break;
        }
    }
    
    /* 2. Publish the new target and notify listeners only when it changed */
    if (aggregate_value != prev_value) {
        c->target_value = aggregate_value;
        blocking_notifier_call_chain(c->notifiers,
                                     (unsigned long)aggregate_value, NULL);
    }
}
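
The aggregation rule is easy to try out in user space. The sketch below applies the same min/max/sum logic to a plain array of request values; it is a model of the idea, not kernel code, and qos_type and qos_aggregate() are names chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

enum qos_type { QOS_MIN, QOS_MAX, QOS_SUM };

/* Combine all request values into one effective constraint.
 * With no requests, the default value applies. */
int32_t qos_aggregate(enum qos_type type, const int32_t *requests,
                      size_t count, int32_t default_value)
{
    int32_t result;

    if (count == 0)
        return default_value;

    result = requests[0];
    for (size_t i = 1; i < count; i++) {
        switch (type) {
        case QOS_MIN:
            if (requests[i] < result)
                result = requests[i];
            break;
        case QOS_MAX:
            if (requests[i] > result)
                result = requests[i];
            break;
        case QOS_SUM:
            result += requests[i];
            break;
        }
    }
    return result;
}
```

For a latency constraint the MIN rule applies: if three drivers ask for 1000 us, 100 us and 20 us, the effective limit is 20 us, because the strictest request must be honored.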

3.2.3 Using PM QoS from a device driver

Driver integration (uses the cpu_latency_qos_* API of recent kernels; older kernels used pm_qos_add_request() with PM_QOS_CPU_DMA_LATENCY instead):

static int my_device_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct my_device *my_dev;
    int ret;
    
    my_dev = devm_kzalloc(dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    
    /* 1. Register a CPU latency request with no constraint yet */
    cpu_latency_qos_add_request(&my_dev->pm_qos_req,
                                PM_QOS_DEFAULT_VALUE);
    
    /* 2. Initialize the device */
    ret = my_device_hw_init(my_dev);
    if (ret)
        goto err_hw;
    
    platform_set_drvdata(pdev, my_dev);
    return 0;
    
err_hw:
    cpu_latency_qos_remove_request(&my_dev->pm_qos_req);
    return ret;
}

/* Update the QoS requirement at run time */
static void my_device_set_performance(struct my_device *my_dev, 
                                    enum my_device_perf perf)
{
    switch (perf) {
    case PERFORMANCE_LOW:
        /* Low performance: tolerate high latency */
        cpu_latency_qos_update_request(&my_dev->pm_qos_req, 1000); /* 1 ms */
        break;
        
    case PERFORMANCE_NORMAL:
        /* Normal performance: moderate latency */
        cpu_latency_qos_update_request(&my_dev->pm_qos_req, 100); /* 100 us */
        break;
        
    case PERFORMANCE_HIGH:
        /* High performance: 0 us keeps the CPUs out of deeper idle states */
        cpu_latency_qos_update_request(&my_dev->pm_qos_req, 0);
        break;
    }
}

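3.3 The Wakeup Source Mechanism

3.3.1 Wakeup source architecture

Wakeup sources track the events that wake the system or keep it from suspending:

Wakeup Source Core
    ├── Active Wakeup Sources
    ├── Wakeup Events (event counters)
    ├── Wakeup IRQs
    └── Wakeup Statistics

3.3.2 Registering and managing wakeup sources

The wakeup source structure (as in include/linux/pm_wakeup.h of recent kernels; fields vary between versions):

struct wakeup_source {
    const char *name;
    int id;
    struct list_head entry;
    
    /* Protects the fields below */
    spinlock_t lock;
    
    struct wake_irq *wakeirq;
    struct timer_list timer;
    unsigned long timer_expires;
    
    /* Time statistics */
    ktime_t total_time;
    ktime_t max_time;
    ktime_t last_time;
    ktime_t start_prevent_time;
    ktime_t prevent_sleep_time;
    
    /* Event counters */
    unsigned long event_count;
    unsigned long active_count;
    unsigned long relax_count;
    unsigned long expire_count;
    unsigned long wakeup_count;
    
    /* Associated device */
    struct device *dev;
    
    bool active:1;
    bool autosleep_enabled:1;
};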

Driver implementation example:

/* Interrupt handler */
static irqreturn_t my_device_irq_handler(int irq, void *data)
{
    struct my_device *my_dev = data;
    
    /* Report a wakeup event and keep the system awake while handling it */
    __pm_stay_awake(my_dev->ws);
    
    /* Handle the interrupt */
    my_device_handle_interrupt(my_dev);
    
    /* Release the wakeup source. __pm_relax() takes a wakeup_source;
     * pm_relax() would expect a struct device instead. */
    __pm_relax(my_dev->ws);
    
    return IRQ_HANDLED;
}

static int my_device_probe(struct platform_device *pdev)
{
    struct device *dev = &pdev->dev;
    struct my_device *my_dev;
    int ret;
    
    my_dev = devm_kzalloc(dev, sizeof(*my_dev), GFP_KERNEL);
    if (!my_dev)
        return -ENOMEM;
    
    /* 1. Look up the interrupt */
    my_dev->irq = platform_get_irq(pdev, 0);
    if (my_dev->irq < 0)
        return my_dev->irq;
    
    /* 2. Create the wakeup source */
    my_dev->ws = wakeup_source_register(dev, "my-device");
    if (!my_dev->ws)
        return -ENOMEM;
    
    /* IRQF_ONESHOT is only meaningful for threaded handlers, so it is not used */
    ret = devm_request_irq(dev, my_dev->irq, my_device_irq_handler,
                          IRQF_TRIGGER_RISING, "my-device", my_dev);
    if (ret)
        goto err_ws;
    
    /* 3. Enable wakeup capability */
    device_init_wakeup(dev, true);
    ret = enable_irq_wake(my_dev->irq);
    if (ret)
        goto err_wakeup;
    
    platform_set_drvdata(pdev, my_dev);
    return 0;
    
err_wakeup:
    device_init_wakeup(dev, false);
err_ws:
    wakeup_source_unregister(my_dev->ws);
    return ret;
}

3.3.3 Wakeup event statistics

Reading the statistics (a sketch of a sysfs show callback; the driver data is assumed to be the struct my_device from the previous example):

static ssize_t wakeup_stats_show(struct device *dev,
                                struct device_attribute *attr, char *buf)
{
    struct my_device *my_dev = dev_get_drvdata(dev);
    struct wakeup_source *ws = my_dev->ws;
    ktime_t total_time;
    ktime_t max_time;
    ktime_t active_time = 0;
    unsigned long flags;
    
    spin_lock_irqsave(&ws->lock, flags);
    
    total_time = ws->total_time;
    max_time = ws->max_time;
    
    /* Include the period that is still in progress */
    if (ws->active) {
        active_time = ktime_sub(ktime_get(), ws->last_time);
        total_time = ktime_add(total_time, active_time);
        if (active_time > max_time)
            max_time = active_time;
    }
    
    spin_unlock_irqrestore(&ws->lock, flags);
    
    return sysfs_emit(buf, 
                   "name: %s\n"
                   "active_count: %lu\n"
                   "event_count: %lu\n"
                   "wakeup_count: %lu\n"
                   "expire_count: %lu\n"
                   "active_since: %lld\n"
                   "total_time: %lld\n"
                   "max_time: %lld\n"
                   "last_change: %lld\n",
                   ws->name,
                   ws->active_count,
                   ws->event_count,
                   ws->wakeup_count,
                   ws->expire_count,
                   ktime_to_ms(active_time),
                   ktime_to_ms(total_time),
                   ktime_to_ms(max_time),
                   ktime_to_ms(ws->last_time));
}
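
How total_time and max_time accumulate across activate/relax cycles can be seen in a small userspace model. The sketch below takes timestamps as plain millisecond values instead of ktime_t and ignores locking and timers; ws_model, ws_activate() and ws_relax() are names invented for this illustration.

```c
#include <stdint.h>

/* Toy model of the bookkeeping done for one wakeup source. */
struct ws_model {
    int active;
    unsigned long active_count;
    unsigned long relax_count;
    int64_t last_ms;    /* time of the last state change */
    int64_t total_ms;   /* accumulated active time */
    int64_t max_ms;     /* longest single active period */
};

/* Mark the source active (no effect if it already is). */
void ws_activate(struct ws_model *ws, int64_t now_ms)
{
    if (ws->active)
        return;
    ws->active = 1;
    ws->active_count++;
    ws->last_ms = now_ms;
}

/* Mark the source inactive and account for the period that just ended. */
void ws_relax(struct ws_model *ws, int64_t now_ms)
{
    int64_t duration;

    if (!ws->active)
        return;

    duration = now_ms - ws->last_ms;
    ws->total_ms += duration;
    if (duration > ws->max_ms)
        ws->max_ms = duration;

    ws->active = 0;
    ws->relax_count++;
    ws->last_ms = now_ms;
}
```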

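4. Debugging and Optimization

4.1 Debug Interfaces

4.1.1 The /sys/power/ interface

Core debug files (some of them exist only when the corresponding kernel debug options are enabled):

# Power state information
cat /sys/power/state              # supported sleep states
cat /sys/power/pm_trace           # PM trace status
cat /sys/power/pm_test            # PM test mode
cat /sys/power/reserved_size      # reserved memory size
cat /sys/power/image_size         # hibernation image size
cat /sys/power/wakeup_count       # wakeup event count

# Debug control
echo 1 > /sys/power/pm_trace      # enable PM tracing
echo mem > /sys/power/state       # suspend to RAM
echo disk > /sys/power/state      # suspend to disk (hibernate)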

4.1.2 Debugging device power states

Device power information:

# Show the runtime PM control setting of each device
for dev in /sys/devices/*/power/control; do
    echo "Device: $dev"
    cat "$dev"
done

# Show runtime PM statistics
for dev in /sys/devices/*/power/runtime_status; do
    dir=$(dirname "$dev")
    echo "Device: $dir"
    echo "  Status: $(cat "$dev")"
    echo "  Usage: $(cat "$dir/runtime_usage")"
    echo "  Active: $(cat "$dir/runtime_active_time")"
    echo "  Suspended: $(cat "$dir/runtime_suspended_time")"
done

# Show the wakeup sources
cat /sys/kernel/debug/wakeup_sources

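4.2 Power Analysis with Ftrace

4.2.1 Power-related tracepoints

Key trace events:

# Enable the PM-related trace events
echo 1 > /sys/kernel/debug/tracing/events/power/enable

# List the available tracepoints
cat /sys/kernel/debug/tracing/available_events | grep power

# Example tracepoints (availability depends on kernel version and configuration):
#   cpu_frequency             CPU frequency changes
#   cpu_idle                  CPU idle state changes
#   machine_suspend           system suspend events
#   device_pm_callback_start  start of a device PM callback
#   device_pm_callback_end    end of a device PM callback
#   suspend_resume            suspend/resume phases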

4.2.2 Power analysis script

Automated analysis script:

#!/bin/bash
# pm_analyze.sh - power management analysis script (run as root)

TRACE_DIR="/sys/kernel/debug/tracing"
RESULT_DIR="/tmp/pm_analysis"

mkdir -p "$RESULT_DIR"

# 1. Enable tracing (cpu_frequency and cpu_idle belong to the "power" event group)
echo 1 > "$TRACE_DIR/events/power/enable"

# 2. Clear the buffer
echo > "$TRACE_DIR/trace"

# 3. Run the test (for example suspend/resume)
echo mem > /sys/power/state

# 4. Save the trace
cat "$TRACE_DIR/trace" > "$RESULT_DIR/trace.txt"

# 5. Analyze the results
# The awk field numbers assume the default trace line layout
# (task-pid, [cpu], flags, timestamp, event, args); adjust them if yours differs.
echo "=== Power analysis results ===" > "$RESULT_DIR/analysis.txt"
echo "" >> "$RESULT_DIR/analysis.txt"

# Suspend timing
echo "Suspend timing:" >> "$RESULT_DIR/analysis.txt"
grep -E "machine_suspend|suspend_resume" "$RESULT_DIR/trace.txt" | \
    awk '{print $4, $5, $6, $7}' >> "$RESULT_DIR/analysis.txt"

# Device callback timing (naive pairing; interleaved async callbacks skew it)
echo "" >> "$RESULT_DIR/analysis.txt"
echo "Device callback timing:" >> "$RESULT_DIR/analysis.txt"
grep "device_pm_callback" "$RESULT_DIR/trace.txt" | \
    awk '$5 == "device_pm_callback_start:" {start = $4 + 0}
         $5 == "device_pm_callback_end:"   {print $6, $7, ($4 + 0) - start}' >> "$RESULT_DIR/analysis.txt"

# Frequency changes
echo "" >> "$RESULT_DIR/analysis.txt"
echo "Frequency changes:" >> "$RESULT_DIR/analysis.txt"
grep "cpu_frequency" "$RESULT_DIR/trace.txt" | \
    awk '{print $4, $6}' >> "$RESULT_DIR/analysis.txt"

echo "Analysis complete; results saved in $RESULT_DIR"

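4.3 Performance Analysis Tools

4.3.1 Analysis with pm_print_times

Enabling the timing output:

# Enable PM timing statistics (requires CONFIG_PM_SLEEP_DEBUG)
echo 1 > /sys/power/pm_print_times

# Inspect the kernel log
dmesg | grep "PM:"

Typical output (illustrative; the exact messages vary between kernel versions):
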
PM: suspend entry 2023-12-01 10:30:45.123456789
PM: Preparing system for mem sleep
PM: Freezing user space processes ... (elapsed 0.001 seconds) done.
PM: Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: Suspending system devices ...
PM: suspend of devices complete after 123.456 msecs
PM: suspend of devices late complete after 45.789 msecs
PM: suspend of devices noirq complete after 12.345 msecs
PM: Suspending cpus ...
PM: resume of devices noirq complete after 8.765 msecs
PM: resume of devices early complete after 34.567 msecs
PM: resume of devices complete after 156.789 msecs
PM: Finishing wakeup.
PM: suspend exit 2023-12-01 10:30:45.789012345
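
Log lines like the ones above are easy to post-process. The following userspace sketch extracts the duration from a "... complete after N msecs" line with sscanf(); the line format is taken from the sample output above and may need adjusting for a given kernel, and parse_pm_phase_msecs() is a helper name invented for this example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the duration from a line such as
 *   "PM: suspend of devices complete after 123.456 msecs".
 * Returns 1 and stores the value in *msecs on a match, 0 otherwise.
 */
int parse_pm_phase_msecs(const char *line, double *msecs)
{
    const char *p = strstr(line, "complete after ");

    if (!p)
        return 0;

    return sscanf(p, "complete after %lf msecs", msecs) == 1;
}
```

Summing the values for the suspend and resume phases gives a quick estimate of how much of a suspend cycle is spent in device callbacks.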
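
4.3.2 Analysis with PowerTOP

Using PowerTOP:

# Install PowerTOP
apt-get install powertop

# Calibrate (run on battery power)
powertop --calibrate

# Generate a report
powertop --html=powertop_report.html

# Inspect wakeups: start the interactive UI and look at the Overview tab
powertop

# Apply the suggested power-saving settings
powertop --auto-tune

4.3.3 Monitoring with turbostat

CPU power monitoring:

# Monitor CPU power and performance
turbostat --show PkgWatt,PkgTmp,Busy%,Bzy_MHz,IRQ --interval 1

# Measure while a command runs (here a 5-second sleep)
turbostat --show PkgWatt,PkgTmp --interval 0.1 sleep 5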
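5. Best Practices

5.1 Driver development

  1. Implement the dev_pm_ops callbacks your device needs, for both system sleep and runtime PM
  2. Handle error paths and recovery correctly
  3. Use autosuspend to reduce power consumption
  4. Configure wakeup sources sensibly
  5. Integrate PM QoS requirements

5.2 System optimization

  1. Enable appropriate governors
  2. Configure sensible idle states
  3. Reduce device suspend time
  4. Monitor and analyze power data
  5. Update and tune regularly

5.3 Debugging tips

  1. Use ftrace for detailed analysis
  2. Enable pm_print_times
  3. Watch the /sys/power/ interface
  4. Use dedicated tools (PowerTOP, turbostat)
  5. Build an automated test workflow

This document has walked through the Linux power management architecture from user space down to the hardware abstraction layer, with implementation details intended as a practical guide for driver development and system optimization.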

linux·嵌入式·arm·smarc