【AI Testing Full Stack: Quality】40. The Road to Data Equity: Hands-On Full-Stack Bias Detection and Fairness Testing with Python, Java, and Vue



A bank's AI credit system came under public fire and regulatory investigation for "discriminatory" interest-rate decisions against applicants from certain regions; yet only six months earlier, every fairness metric had "passed" in internal testing. The episode illustrates how complex and serious fairness testing for modern AI systems has become.

01 The Era of the Algorithmic Fairness Crisis

In 2023, the number of lawsuits worldwide involving data privacy and algorithmic fairness grew by 167%, and total fines imposed on companies for algorithmic discrimination exceeded $2 billion. And it is not just finance: AI decision systems in hiring, healthcare, criminal justice, and other critical sectors face unprecedented questions about their fairness.

The EU AI Act makes fairness audits of "high-risk AI systems" a legal requirement, and China's 《互联网信息服务算法推荐管理规定》 (Provisions on the Administration of Algorithm Recommendation in Internet Information Services) likewise explicitly requires algorithm providers to prevent algorithmic discrimination. Against this backdrop, developers can no longer focus on model accuracy alone; they must build comprehensive bias detection and fairness assurance mechanisms.

Why are algorithmic bias problems so insidious? One important reason is that **passing the metrics does not equal actual fairness**. A system can "game" statistical indicators through a variety of technical means without resolving the deeper structural bias.
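To make this failure mode concrete, here is a minimal sketch (all numbers hypothetical) of a Simpson's-paradox pattern: overall approval rates are identical for two groups, yet within every segment group A fares worse, so a single aggregate parity metric would "pass" while the system is unfair at every decision point.

```python
import pandas as pd

# Hypothetical loan decisions: aggregate approval rates are equal by group,
# but group A applies mostly to the easy segment and group B to the hard one
df = pd.DataFrame({
    "group":    ["A"] * 100 + ["B"] * 100,
    "segment":  ["easy"] * 80 + ["hard"] * 20 + ["easy"] * 20 + ["hard"] * 80,
    "approved": [1] * 64 + [0] * 16 + [1] * 2  + [0] * 18    # group A: 66% overall
              + [1] * 18 + [0] * 2  + [1] * 48 + [0] * 32,   # group B: 66% overall
})

print(df.groupby("group")["approved"].mean())               # identical: 0.66 vs 0.66
print(df.groupby(["segment", "group"])["approved"].mean())  # A trails B in every segment
```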
The landscape at a glance:

- Three types of data bias
  - Selection bias: unrepresentative samples; uneven coverage of groups
  - Annotation bias: annotator subjectivity; inconsistent labeling standards
  - Historical bias: historically discriminatory data; structural inequality
- Three dimensions of fairness testing
  - Statistical fairness: demographic parity, equal opportunity, disparate impact
  - Causal fairness: causal graph modeling, counterfactual analysis
  - Business fairness: business rule checks, user feedback analysis

02 Three Types of Data Bias and Their Testing Scenarios

Selection Bias: When the Sample No Longer Represents the Population

Selection bias, the most common type of data bias, occurs when training data fails to represent the target population. In a recruiting AI system, if 85% of the resumes in the historical hiring data are from men and only 15% from women, the model will "learn" a preference for male candidates from the data.

```python
# Python implementation: detecting selection bias in recruiting data
import pandas as pd
import numpy as np
from scipy import stats

class SelectionBiasDetector:
    def __init__(self, data_path=None):
        if data_path:
            self.data = pd.read_csv(data_path)
        else:
            # Simulated recruiting data
            np.random.seed(42)
            n_samples = 5000
            
            # Deliberately inject gender bias: 80% of samples are male
            genders = ['Male'] * int(n_samples * 0.8) + ['Female'] * int(n_samples * 0.2)
            np.random.shuffle(genders)
            
            self.data = pd.DataFrame({
                'gender': genders,
                'age': np.random.randint(22, 55, n_samples),
                'education_years': np.random.choice([12, 16, 19], n_samples, p=[0.3, 0.6, 0.1]),
                'work_experience': np.random.exponential(10, n_samples),
                'skill_score': np.random.normal(70, 15, n_samples),
                'interview_score': np.random.normal(75, 10, n_samples),
                'hired': np.random.choice([0, 1], n_samples, p=[0.7, 0.3])
            })
            
            # Add a correlation between gender and hiring (the bias)
            male_mask = self.data['gender'] == 'Male'
            self.data.loc[male_mask, 'hired'] = np.random.choice(
                [0, 1], sum(male_mask), p=[0.6, 0.4]
            )
            self.data.loc[~male_mask, 'hired'] = np.random.choice(
                [0, 1], sum(~male_mask), p=[0.85, 0.15]
            )
    
    def analyze_sample_representation(self, sensitive_attribute='gender'):
        """Analyze how well the sensitive attribute is represented in the sample."""
        analysis = {}
        
        # Proportion of each group
        group_counts = self.data[sensitive_attribute].value_counts()
        group_proportions = group_counts / len(self.data)
        
        analysis['group_counts'] = group_counts.to_dict()
        analysis['group_proportions'] = group_proportions.to_dict()
        
        # Gap versus an ideal distribution (e.g., census data);
        # here we assume the ideal split is 50% male / 50% female
        ideal_distribution = {'Male': 0.5, 'Female': 0.5}
        
        distribution_gap = {}
        for group in group_proportions.index:
            if group in ideal_distribution:
                gap = group_proportions[group] - ideal_distribution[group]
                distribution_gap[group] = gap
        
        analysis['distribution_gap'] = distribution_gap
        
        # Chi-square test: is the sample distribution significantly
        # different from the ideal distribution?
        observed = list(group_counts)
        # Expected frequencies under the ideal distribution
        expected = [ideal_distribution.get(g, 0.5) * len(self.data) 
                   for g in group_counts.index]
        
        chi2, p_value = stats.chisquare(observed, f_exp=expected)
        analysis['chi2_test'] = {
            'chi2_statistic': chi2,
            'p_value': p_value,
            'significant_bias': p_value < 0.05
        }
        
        return analysis
    
    def analyze_outcome_disparity(self, sensitive_attribute='gender', outcome='hired'):
        """Analyze outcome differences between groups."""
        disparity_analysis = {}
        
        # Group by the sensitive attribute and compute outcomes
        groups = self.data[sensitive_attribute].unique()
        
        for group in groups:
            group_data = self.data[self.data[sensitive_attribute] == group]
            
            # Pass (hiring) rate
            hire_rate = group_data[outcome].mean()
            
            # Confidence interval for the rate
            n = len(group_data)
            p = hire_rate
            
            if n > 0 and 0 < p < 1:
                # 95% confidence interval via the Wald method
                z = 1.96  # z-value for the 95% confidence level
                se = np.sqrt(p * (1 - p) / n)
                ci_lower = max(0, p - z * se)
                ci_upper = min(1, p + z * se)
            else:
                ci_lower, ci_upper = 0, 0
            
            disparity_analysis[group] = {
                'sample_size': len(group_data),
                'hire_rate': hire_rate,
                'hire_count': group_data[outcome].sum(),
                'confidence_interval': (ci_lower, ci_upper)
            }
        
        # Differences between groups
        if len(groups) >= 2:
            groups_list = list(groups)
            reference_group = groups_list[0]
            
            for group in groups_list[1:]:
                rate_diff = (disparity_analysis[group]['hire_rate'] - 
                           disparity_analysis[reference_group]['hire_rate'])
                
                # Statistical significance of the difference
                n1 = disparity_analysis[reference_group]['sample_size']
                n2 = disparity_analysis[group]['sample_size']
                p1 = disparity_analysis[reference_group]['hire_rate']
                p2 = disparity_analysis[group]['hire_rate']
                
                if n1 > 0 and n2 > 0:
                    # z-test for the difference between two proportions
                    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
                    se_pool = np.sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))
                    
                    if se_pool > 0:
                        z_score = rate_diff / se_pool
                        p_value = 2 * (1 - stats.norm.cdf(abs(z_score)))
                    else:
                        z_score, p_value = 0, 1
                else:
                    z_score, p_value = 0, 1
                
                disparity_analysis[f'{group}_vs_{reference_group}'] = {
                    'rate_difference': rate_diff,
                    'z_score': z_score,
                    'p_value': p_value,
                    'significant': p_value < 0.05
                }
        
        return disparity_analysis

# Usage example
def detect_selection_bias():
    detector = SelectionBiasDetector()
    
    # Analyze sample representativeness
    representation = detector.analyze_sample_representation('gender')
    print("Sample representation analysis:")
    print(f"  Gender distribution: {representation['group_proportions']}")
    print(f"  Gap vs. ideal distribution: {representation['distribution_gap']}")
    print(f"  Chi-square p-value: {representation['chi2_test']['p_value']:.4f}")
    print(f"  Significant bias present: {representation['chi2_test']['significant_bias']}")
    
    # Analyze outcome disparity
    disparity = detector.analyze_outcome_disparity('gender', 'hired')
    print("\nHiring outcome disparity analysis:")
    for group in ['Male', 'Female']:
        info = disparity[group]
        print(f"  {group}: hire rate={info['hire_rate']:.3f}, sample size={info['sample_size']}")
    
    if 'Female_vs_Male' in disparity:
        diff_info = disparity['Female_vs_Male']
        print(f"  Female vs. male difference: {diff_info['rate_difference']:.3f}")
        print(f"  Significance p-value: {diff_info['p_value']:.4f}")
    
    return representation, disparity
```

Annotation Bias: The Risk of Algorithmizing Subjective Judgment

Annotation bias stems from human subjectivity in the labeling process. In a sentiment analysis task, different annotators may assign different polarities to the term "打工人" ("wage worker"): some label it neutral, others negative, depending on the annotator's socioeconomic background and values.
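The Java detector below computes Cohen's kappa for pairs of annotators; its `BiasAnalysisResult` also reserves a `fleissKappa` field for the multi-annotator case, which this section never implements. As a reference, here is a minimal Python sketch of Fleiss' kappa; the function name and the 3-category example matrix are hypothetical:

```python
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """counts[i][j] = number of annotators assigning category j to item i."""
    n = counts.sum(axis=1)[0]                    # annotators per item (assumed constant)
    p_j = counts.sum(axis=0) / counts.sum()      # overall category proportions
    p_i = ((counts ** 2).sum(axis=1) - n) / (n * (n - 1))  # per-item agreement
    p_bar, p_e = p_i.mean(), (p_j ** 2).sum()    # mean observed vs. chance agreement
    return (p_bar - p_e) / (1 - p_e)

# 4 items, 5 annotators, 3 sentiment categories (negative / neutral / positive)
counts = np.array([[5, 0, 0], [2, 3, 0], [1, 3, 1], [0, 0, 5]])
print(f"Fleiss' kappa: {fleiss_kappa(counts):.3f}")
```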

```java
// Java implementation: annotation consistency analysis and bias detection
import org.apache.commons.math3.distribution.ChiSquaredDistribution;
import java.util.*;
import java.util.stream.Collectors;

public class AnnotationBiasDetector {
    
    public static class AnnotationResult {
        private String itemId;
        private String annotatorId;
        private String label;
        private Map<String, Object> metadata;
        
        // constructor, getters, and setters omitted
    }
    
    public static class BiasAnalysisResult {
        private double cohensKappa;
        private double fleissKappa;
        private Map<String, Double> annotatorBiasScores;
        private Map<String, Map<String, Double>> labelDistributionByAnnotator;
        private List<String> problematicItems;
        
        // constructor, getters, and setters omitted
    }
    
    /**
     * Cohen's kappa: agreement between two annotators.
     */
    public double calculateCohensKappa(int[][] confusionMatrix) {
        int n = confusionMatrix.length;
        int total = 0;
        
        // Observed agreement vs. agreement expected by chance
        double po = 0.0; // observed agreement
        double pe = 0.0; // chance agreement
        
        int[] rowSums = new int[n];
        int[] colSums = new int[n];
        
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                total += confusionMatrix[i][j];
                rowSums[i] += confusionMatrix[i][j];
                colSums[j] += confusionMatrix[i][j];
                
                if (i == j) {
                    po += confusionMatrix[i][j];
                }
            }
        }
        
        po /= total;
        
        for (int i = 0; i < n; i++) {
            pe += (rowSums[i] * colSums[i]);
        }
        
        pe /= (total * total);
        
        // Cohen's kappa
        if (pe == 1) {
            return 1.0; // degenerate case: chance agreement is already perfect
        }
        
        return (po - pe) / (1 - pe);
    }
    
    /**
     * Detect systematic per-annotator bias.
     */
    public Map<String, Double> detectAnnotatorBias(
            List<AnnotationResult> annotations,
            List<String> labelOrder) {
        
        // Group annotations by annotator
        Map<String, List<AnnotationResult>> byAnnotator = annotations.stream()
            .collect(Collectors.groupingBy(AnnotationResult::getAnnotatorId));
        
        // Overall label distribution
        Map<String, Long> overallDistribution = annotations.stream()
            .collect(Collectors.groupingBy(
                AnnotationResult::getLabel,
                Collectors.counting()
            ));
        
        long totalAnnotations = annotations.size();
        
        Map<String, Double> biasScores = new HashMap<>();
        
        // Compare each annotator's label distribution with the overall one
        for (Map.Entry<String, List<AnnotationResult>> entry : byAnnotator.entrySet()) {
            String annotatorId = entry.getKey();
            List<AnnotationResult> annotatorAnnotations = entry.getValue();
            long annotatorTotal = annotatorAnnotations.size();
            
            // This annotator's label distribution
            Map<String, Long> annotatorDistribution = annotatorAnnotations.stream()
                .collect(Collectors.groupingBy(
                    AnnotationResult::getLabel,
                    Collectors.counting()
                ));
            
            // Chi-square statistic against the overall distribution
            double chiSquare = 0.0;
            int df = labelOrder.size() - 1;
            
            for (String label : labelOrder) {
                long observed = annotatorDistribution.getOrDefault(label, 0L);
                double expectedProp = overallDistribution.getOrDefault(label, 0L) / (double) totalAnnotations;
                double expected = expectedProp * annotatorTotal;
                
                if (expected > 0) {
                    chiSquare += Math.pow(observed - expected, 2) / expected;
                }
            }
            
            // Convert the chi-square statistic into a bias score via the
            // p-value of the chi-squared distribution
            ChiSquaredDistribution chiSquaredDist = new ChiSquaredDistribution(df);
            double pValue = 1 - chiSquaredDist.cumulativeProbability(chiSquare);
            
            // Bias score: the smaller the p-value, the larger the bias
            double biasScore = 1 - pValue;
            biasScores.put(annotatorId, biasScore);
        }
        
        return biasScores;
    }
    
    /**
     * Detect annotation bias for specific groups of terms.
     */
    public Map<String, Map<String, Double>> detectTermBias(
            List<AnnotationResult> annotations,
            Map<String, List<String>> termGroups) {
        // termGroups: e.g. {"occupation terms": ["打工人", "程序员", "公务员", ...], ...}
        
        Map<String, Map<String, Double>> biasResults = new HashMap<>();
        
        for (Map.Entry<String, List<String>> entry : termGroups.entrySet()) {
            String groupName = entry.getKey();
            List<String> terms = entry.getValue();
            
            // Keep only annotations whose text contains one of the terms
            List<AnnotationResult> groupAnnotations = annotations.stream()
                .filter(ann -> {
                    // Assumes metadata carries the text content
                    String text = (String) ann.getMetadata().get("text");
                    return text != null && terms.stream().anyMatch(text::contains);
                })
                .collect(Collectors.toList());
            
            // Label distribution over annotations containing these terms
            Map<String, Long> groupLabelDist = groupAnnotations.stream()
                .collect(Collectors.groupingBy(
                    AnnotationResult::getLabel,
                    Collectors.counting()
                ));
            
            // Label distribution over all annotations
            Map<String, Long> overallLabelDist = annotations.stream()
                .collect(Collectors.groupingBy(
                    AnnotationResult::getLabel,
                    Collectors.counting()
                ));
            
            // Per-label differences
            Map<String, Double> differences = new HashMap<>();
            Set<String> allLabels = new HashSet<>();
            allLabels.addAll(groupLabelDist.keySet());
            allLabels.addAll(overallLabelDist.keySet());
            
            long groupTotal = groupAnnotations.size();
            long overallTotal = annotations.size();
            
            for (String label : allLabels) {
                double groupProp = groupTotal > 0 ? 
                    groupLabelDist.getOrDefault(label, 0L) / (double) groupTotal : 0;
                double overallProp = overallTotal > 0 ? 
                    overallLabelDist.getOrDefault(label, 0L) / (double) overallTotal : 0;
                
                double diff = groupProp - overallProp;
                differences.put(label, diff);
            }
            
            biasResults.put(groupName, differences);
        }
        
        return biasResults;
    }
}
```
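Since this series implements the same metrics in several languages, it is worth cross-checking the hand-rolled `calculateCohensKappa` above against a reference implementation. A minimal sketch using scikit-learn; the two annotators' label lists are made up for illustration:

```python
from sklearn.metrics import cohen_kappa_score

# Two annotators' labels for the same 10 items
a1 = ["neg", "neg", "neu", "pos", "pos", "neu", "neg", "pos", "neu", "neg"]
a2 = ["neg", "neu", "neu", "pos", "pos", "neg", "neg", "pos", "neu", "neu"]

# Reference value to compare against the Java calculateCohensKappa output
print(f"Cohen's kappa: {cohen_kappa_score(a1, a2):.3f}")
```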

Historical Bias: When Data Entrenches Social Injustice

Historical bias is the most challenging type because it is rooted in social inequalities already baked into historical data. In lending data, for example, women or minorities historically had less access to credit and paid higher rates; machine learning models learn and amplify that historical discrimination.

```python
# Python implementation: detecting and mitigating historical bias
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
import warnings
warnings.filterwarnings('ignore')

class HistoricalBiasAnalyzer:
    def __init__(self):
        self.bias_metrics = {}
    
    def generate_historical_loan_data(self):
        """Generate simulated historical loan data that embeds historical bias."""
        np.random.seed(42)
        n_samples = 10000
        
        # Demographic features
        data = pd.DataFrame({
            'age': np.random.normal(40, 10, n_samples).clip(18, 70),
            'income': np.random.lognormal(10.5, 0.8, n_samples),
            'credit_score': np.random.normal(650, 50, n_samples).clip(300, 850),
            'employment_years': np.random.exponential(5, n_samples).clip(0, 40),
            'debt_to_income': np.random.beta(2, 5, n_samples) * 0.8 + 0.1,
            'loan_amount': np.random.lognormal(9, 1.2, n_samples)
        })
        
        # Add the sensitive attribute (the historically discriminated group);
        # assume group A suffered systematic discrimination
        data['historical_group'] = np.random.choice(['A', 'B'], n_samples, p=[0.3, 0.7])
        
        # Inject historical bias: group A has a lower approval rate
        base_approval_prob = 1 / (1 + np.exp(
            -(-2 + 
              0.05 * (data['credit_score'] - 650) / 50 +
              0.1 * (data['income'] - np.log(30000)) / 0.5 -
              0.3 * data['debt_to_income'])
        ))
        
        # Apply the historical bias to group A
        bias_factor = np.where(data['historical_group'] == 'A', 0.6, 1.0)
        approval_prob = base_approval_prob * bias_factor
        
        # Draw loan approval outcomes
        data['loan_approved'] = (np.random.random(n_samples) < approval_prob).astype(int)
        
        # Historical rate discrimination: group A is charged a higher interest rate
        base_rate = 0.05 + 0.1 * (1 - data['credit_score'] / 850) + 0.02 * data['debt_to_income']
        data['interest_rate'] = base_rate + np.where(
            data['historical_group'] == 'A', 0.015, 0
        )
        
        return data
    
    def analyze_historical_bias(self, data):
        """Quantify whether and how strongly historical bias is present."""
        
        analysis = {}
        
        # Approval rates by historical group
        group_approval = data.groupby('historical_group')['loan_approved'].agg(['mean', 'count'])
        analysis['approval_by_group'] = group_approval.to_dict()
        
        # Disparate impact, conventionally defined as:
        # disadvantaged group's approval rate / advantaged group's approval rate
        approval_rates = group_approval['mean']
        groups = sorted(approval_rates.index)
        
        if len(groups) >= 2:
            # Assume group B is the historically advantaged group
            di_ratio = approval_rates['A'] / approval_rates['B']
            analysis['disparate_impact'] = {
                'ratio': di_ratio,
                'threshold_violation': di_ratio < 0.8  # the 80% rule
            }
        
        # Interest-rate differences
        group_interest = data.groupby('historical_group')['interest_rate'].mean()
        analysis['interest_rate_by_group'] = group_interest.to_dict()
        
        # Check the group effect while controlling for other variables,
        # using logistic regression on credit score, income, etc.
        X = data[['credit_score', 'income', 'debt_to_income', 'employment_years']]
        X = (X - X.mean()) / X.std()  # standardize
        X['group_A'] = (data['historical_group'] == 'A').astype(int)
        y = data['loan_approved']
        
        model = LogisticRegression(max_iter=1000)
        model.fit(X, y)
        
        # Coefficient on the group variable after controlling for other factors
        coef_df = pd.DataFrame({
            'feature': X.columns,
            'coefficient': model.coef_[0],
            'abs_effect': np.abs(model.coef_[0])
        })
        
        analysis['controlled_analysis'] = {
            'group_coefficient': coef_df[coef_df['feature'] == 'group_A']['coefficient'].values[0],
            'group_significant': np.abs(coef_df[coef_df['feature'] == 'group_A']['coefficient'].values[0]) > 0.1
        }
        
        # Predict and evaluate per-group model performance
        predictions = model.predict(X)
        data['predicted'] = predictions
        
        # Per-group prediction accuracy and error-rate gaps
        group_metrics = {}
        for group in data['historical_group'].unique():
            group_data = data[data['historical_group'] == group]
            accuracy = accuracy_score(group_data['loan_approved'], group_data['predicted'])
            
            # False positive and false negative rates
            from sklearn.metrics import confusion_matrix
            tn, fp, fn, tp = confusion_matrix(
                group_data['loan_approved'], group_data['predicted']
            ).ravel()
            
            group_metrics[group] = {
                'accuracy': accuracy,
                'false_positive_rate': fp / (fp + tn) if (fp + tn) > 0 else 0,
                'false_negative_rate': fn / (fn + tp) if (fn + tp) > 0 else 0,
                'sample_size': len(group_data)
            }
        
        analysis['prediction_metrics_by_group'] = group_metrics
        
        return analysis
    
    def mitigate_historical_bias(self, data, method='reweighting'):
        """Mitigation strategies for historical bias."""
        
        mitigated_data = data.copy()
        
        if method == 'reweighting':
            # Reweighting: give samples from the historically disadvantaged group higher weight
            group_counts = data['historical_group'].value_counts()
            total_samples = len(data)
            
            # Choose weights so every group carries equal total weight
            weights = {}
            for group in group_counts.index:
                weights[group] = total_samples / (len(group_counts) * group_counts[group])
            
            mitigated_data['sample_weight'] = mitigated_data['historical_group'].map(weights)
            
        elif method == 'resampling':
            # Oversample the disadvantaged group
            group_a_data = data[data['historical_group'] == 'A']
            group_b_data = data[data['historical_group'] == 'B']
            
            # Oversample group A until the groups are balanced
            n_to_sample = len(group_b_data) - len(group_a_data)
            if n_to_sample > 0:
                oversampled = group_a_data.sample(n=n_to_sample, replace=True, random_state=42)
                mitigated_data = pd.concat([data, oversampled], ignore_index=True)
        
        elif method == 'disparate_impact_remover':
            # Preprocessing approach that reduces disparate impact
            from aif360.algorithms.preprocessing import DisparateImpactRemover
            
            # Simplified here to adjusting feature values directly;
            # a real implementation would apply aif360's DisparateImpactRemover
            for feature in ['credit_score', 'income']:
                # Standardize within each group to shrink between-group gaps
                for group in ['A', 'B']:
                    group_mask = mitigated_data['historical_group'] == group
                    mean_val = mitigated_data.loc[group_mask, feature].mean()
                    std_val = mitigated_data.loc[group_mask, feature].std()
                    
                    if std_val > 0:
                        mitigated_data.loc[group_mask, f'{feature}_adjusted'] = (
                            mitigated_data.loc[group_mask, feature] - mean_val
                        ) / std_val
        
        return mitigated_data

# Usage example
def analyze_historical_bias_example():
    analyzer = HistoricalBiasAnalyzer()
    
    # Generate data containing historical bias
    loan_data = analyzer.generate_historical_loan_data()
    print("Data statistics:")
    print(f"  Total samples: {len(loan_data)}")
    print(f"  Share of group A: {(loan_data['historical_group'] == 'A').mean():.2%}")
    print(f"  Overall approval rate: {loan_data['loan_approved'].mean():.2%}")
    
    # Analyze historical bias
    bias_analysis = analyzer.analyze_historical_bias(loan_data)
    
    print("\nHistorical bias analysis:")
    print("  Approval rate by group:")
    for group, metrics in bias_analysis['approval_by_group']['mean'].items():
        print(f"    Group {group}: {metrics:.2%}")
    
    if 'disparate_impact' in bias_analysis:
        di = bias_analysis['disparate_impact']
        print(f"  Disparate impact ratio: {di['ratio']:.3f}")
        print(f"  Violates the 80% rule: {di['threshold_violation']}")
    
    print("\n  Group effect after controlling for covariates:")
    controlled = bias_analysis['controlled_analysis']
    print(f"    Group A coefficient: {controlled['group_coefficient']:.4f}")
    print(f"    Significant: {controlled['group_significant']}")
    
    # Try to mitigate the bias
    print("\nMitigating historical bias...")
    mitigated_data = analyzer.mitigate_historical_bias(loan_data, method='reweighting')
    
    # Re-analyze the mitigated data
    post_mitigation = analyzer.analyze_historical_bias(mitigated_data)
    
    if 'disparate_impact' in post_mitigation:
        di_post = post_mitigation['disparate_impact']
        print(f"  Post-mitigation disparate impact ratio: {di_post['ratio']:.3f}")
        print(f"  Still violates the 80% rule: {di_post['threshold_violation']}")
    
    return loan_data, bias_analysis, mitigated_data
```
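Note that the `reweighting` branch of `mitigate_historical_bias` only attaches a `sample_weight` column; the weights change nothing unless they reach the estimator. A minimal sketch of how that might look with scikit-learn, assuming the `mitigated_data` frame produced above:

```python
from sklearn.linear_model import LogisticRegression

features = ['credit_score', 'income', 'debt_to_income', 'employment_years']
X = mitigated_data[features]
y = mitigated_data['loan_approved']

# The reweighting only takes effect if the weights are passed into training
model = LogisticRegression(max_iter=1000)
model.fit(X, y, sample_weight=mitigated_data['sample_weight'])
```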

03 Statistical Fairness Metrics and a Full-Stack Implementation

Demographic Parity: Pass-Rate Differences Across Groups

Demographic parity requires that different groups receive favorable outcomes at the same rate. In a hiring scenario, this means the resume pass rates of men and women should be roughly equal.

```python
# Python implementation: demographic parity with Fairlearn
import pandas as pd
import numpy as np
from fairlearn.metrics import demographic_parity_difference
from fairlearn.metrics import demographic_parity_ratio

def calculate_demographic_parity(y_true, y_pred, sensitive_features):
    """
    Compute demographic parity metrics.
    
    Parameters:
    y_true: ground-truth labels (optional; used by some metric variants)
    y_pred: predicted labels
    sensitive_features: array of sensitive attribute values
    
    Returns:
    Demographic parity difference and ratio
    """
    
    # Demographic parity difference:
    # range [0, 1]; 0 means perfect parity, 1 maximal disparity
    dp_difference = demographic_parity_difference(
        y_true=y_true,
        y_pred=y_pred,
        sensitive_features=sensitive_features
    )
    
    # Demographic parity ratio:
    # range (0, 1]; 1 means perfect parity
    dp_ratio = demographic_parity_ratio(
        y_true=y_true,
        y_pred=y_pred,
        sensitive_features=sensitive_features
    )
    
    # Detailed per-group breakdown
    groups = np.unique(sensitive_features)
    group_metrics = {}
    
    for group in groups:
        group_mask = sensitive_features == group
        group_size = np.sum(group_mask)
        
        if group_size > 0:
            group_positive_rate = np.mean(y_pred[group_mask])
            group_metrics[group] = {
                'sample_size': group_size,
                'positive_rate': group_positive_rate,
                'positive_count': np.sum(y_pred[group_mask])
            }
    
    return {
        'demographic_parity_difference': dp_difference,
        'demographic_parity_ratio': dp_ratio,
        'group_metrics': group_metrics,
        'interpretation': interpret_dp_metrics(dp_difference, dp_ratio)
    }

def interpret_dp_metrics(dp_difference, dp_ratio):
    """Interpret demographic parity metrics."""
    
    interpretations = []
    
    # Interpretation based on the difference
    if dp_difference < 0.05:
        interpretations.append("Good demographic parity: pass rates differ very little across groups")
    elif dp_difference < 0.1:
        interpretations.append("Moderate demographic parity: some group differences exist")
    elif dp_difference < 0.2:
        interpretations.append("Demographic parity problem: group differences are substantial")
    else:
        interpretations.append("Severe demographic disparity: very large differences between groups")
    
    # Interpretation based on the ratio (80% rule)
    if dp_ratio >= 0.8:
        interpretations.append("Satisfies the 80% rule: the disadvantaged group's rate is at least 80% of the advantaged group's")
    else:
        interpretations.append("Violates the 80% rule: potential discrimination risk")
    
    return interpretations
```
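A quick usage sketch for the function above on synthetic data (the group labels and rates are made up for illustration; requires fairlearn):

```python
import numpy as np

rng = np.random.default_rng(0)
sensitive = rng.choice(["M", "F"], size=1000, p=[0.7, 0.3])
# Biased predictions: positives are more likely for group M
y_pred = np.where(sensitive == "M",
                  rng.random(1000) < 0.42,
                  rng.random(1000) < 0.30).astype(int)
y_true = rng.integers(0, 2, size=1000)  # placeholder ground truth

result = calculate_demographic_parity(y_true, y_pred, sensitive)
print(result['demographic_parity_difference'], result['demographic_parity_ratio'])
```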

```java
// Java implementation: a custom fairness-metrics utility class
public class FairnessMetricsCalculator {
    
    /**
     * Compute the demographic parity difference.
     * @param predictions array of predictions (0 or 1)
     * @param sensitiveAttributes array of sensitive attribute values
     * @param sensitiveValue the sensitive value of interest (e.g. "female")
     * @return DemographicParityMetrics
     */
    public static DemographicParityMetrics calculateDemographicParity(
            int[] predictions, 
            String[] sensitiveAttributes,
            String sensitiveValue) {
        
        if (predictions.length != sensitiveAttributes.length) {
            throw new IllegalArgumentException("predictions and sensitive attributes must have the same length");
        }
        
        int totalSamples = predictions.length;
        
        // Tally the sensitive and non-sensitive groups
        int sensitiveGroupCount = 0;
        int nonSensitiveGroupCount = 0;
        int sensitiveGroupPositives = 0;
        int nonSensitiveGroupPositives = 0;
        
        for (int i = 0; i < totalSamples; i++) {
            boolean isSensitive = sensitiveAttributes[i].equals(sensitiveValue);
            
            if (isSensitive) {
                sensitiveGroupCount++;
                if (predictions[i] == 1) {
                    sensitiveGroupPositives++;
                }
            } else {
                nonSensitiveGroupCount++;
                if (predictions[i] == 1) {
                    nonSensitiveGroupPositives++;
                }
            }
        }
        
        // Pass rates
        double sensitiveGroupRate = sensitiveGroupCount > 0 ? 
            (double) sensitiveGroupPositives / sensitiveGroupCount : 0;
        double nonSensitiveGroupRate = nonSensitiveGroupCount > 0 ? 
            (double) nonSensitiveGroupPositives / nonSensitiveGroupCount : 0;
        
        // Demographic parity difference
        double demographicParityDifference = Math.abs(sensitiveGroupRate - nonSensitiveGroupRate);
        
        // Demographic parity ratio (disparate impact)
        double disparateImpactRatio = 0;
        if (nonSensitiveGroupRate > 0) {
            disparateImpactRatio = sensitiveGroupRate / nonSensitiveGroupRate;
        }
        
        // Confidence intervals
        ConfidenceInterval sensitiveCI = calculateProportionCI(
            sensitiveGroupPositives, sensitiveGroupCount, 0.95);
        ConfidenceInterval nonSensitiveCI = calculateProportionCI(
            nonSensitiveGroupPositives, nonSensitiveGroupCount, 0.95);
        
        return new DemographicParityMetrics(
            sensitiveGroupRate,
            nonSensitiveGroupRate,
            demographicParityDifference,
            disparateImpactRatio,
            sensitiveGroupCount,
            nonSensitiveGroupCount,
            sensitiveCI,
            nonSensitiveCI
        );
    }
    
    /**
     * Confidence interval for a proportion (Wald method).
     */
    private static ConfidenceInterval calculateProportionCI(int successes, int trials, double confidenceLevel) {
        if (trials == 0) {
            return new ConfidenceInterval(0, 0);
        }
        
        double p = (double) successes / trials;
        double z = 1.96; // z-value for the 95% confidence level
        
        double se = Math.sqrt(p * (1 - p) / trials);
        double margin = z * se;
        
        double lower = Math.max(0, p - margin);
        double upper = Math.min(1, p + margin);
        
        return new ConfidenceInterval(lower, upper);
    }
    
    // Inner classes
    public static class DemographicParityMetrics {
        private final double sensitiveGroupRate;
        private final double nonSensitiveGroupRate;
        private final double demographicParityDifference;
        private final double disparateImpactRatio;
        private final int sensitiveGroupCount;
        private final int nonSensitiveGroupCount;
        private final ConfidenceInterval sensitiveGroupCI;
        private final ConfidenceInterval nonSensitiveGroupCI;
        
        // constructor and getters omitted
        
        public boolean passesEightyPercentRule() {
            return disparateImpactRatio >= 0.8;
        }
        
        public boolean hasSignificantDifference(double threshold) {
            // Use (non-)overlap of the confidence intervals to judge significance
            return (sensitiveGroupCI.getUpper() < nonSensitiveGroupCI.getLower() - threshold) ||
                   (nonSensitiveGroupCI.getUpper() < sensitiveGroupCI.getLower() - threshold);
        }
    }
    
    public static class ConfidenceInterval {
        private final double lower;
        private final double upper;
        
        // constructor and getters omitted
    }
}
```

```vue
<!-- Vue implementation: demographic parity visualization component -->
<template>
  <div class="demographic-parity-dashboard">
    <div class="dashboard-header">
      <h2>Demographic Parity Analysis</h2>
      <div class="fairness-score" :class="getFairnessScoreClass(overallScore)">
        {{ overallScore }}%
      </div>
    </div>
    
    <div class="metrics-summary">
      <div class="metric-card" v-for="metric in summaryMetrics" :key="metric.name">
        <div class="metric-name">{{ metric.name }}</div>
        <div class="metric-value" :class="getMetricValueClass(metric.value, metric.threshold)">
          {{ metric.value }}
        </div>
        <div class="metric-threshold">阈值: {{ metric.threshold }}</div>
      </div>
    </div>
    
    <div class="visualization-section">
      <div class="chart-container">
        <h3>Pass Rates by Group</h3>
        <div ref="approvalRateChart" style="width: 100%; height: 400px;"></div>
      </div>
      
      <div class="chart-container">
        <h3>Demographic Parity Difference Over Time</h3>
        <div ref="parityTrendChart" style="width: 100%; height: 400px;"></div>
      </div>
    </div>
    
    <div class="detailed-analysis">
      <h3>Detailed Group Analysis</h3>
      <el-table :data="groupAnalysisData" style="width: 100%" border>
        <el-table-column prop="group" label="Group" width="120"></el-table-column>
        <el-table-column prop="sampleSize" label="Samples" width="100"></el-table-column>
        <el-table-column prop="positiveCount" label="Positives" width="100"></el-table-column>
        <el-table-column prop="positiveRate" label="Pass rate" width="150">
          <template #default="{ row }">
            <div class="rate-display">
              <span class="rate-value">{{ (row.positiveRate * 100).toFixed(1) }}%</span>
              <el-progress 
                :percentage="row.positiveRate * 100" 
                :stroke-width="12"
                :show-text="false"
                :status="getRateStatus(row.positiveRate, row.baselineRate)">
              </el-progress>
            </div>
          </template>
        </el-table-column>
        <el-table-column prop="confidenceInterval" label="置信区间" width="180">
          <template #default="{ row }">
            {{ formatConfidenceInterval(row.confidenceInterval) }}
          </template>
        </el-table-column>
        <el-table-column prop="disparateImpact" label="不同影响" width="120">
          <template #default="{ row }">
            <span :class="getDisparateImpactClass(row.disparateImpact)">
              {{ row.disparateImpact.toFixed(2) }}
            </span>
          </template>
        </el-table-column>
        <el-table-column label="状态" width="120">
          <template #default="{ row }">
            <el-tag :type="getStatusType(row)">
              {{ getStatusText(row) }}
            </el-tag>
          </template>
        </el-table-column>
      </el-table>
    </div>
    
    <div class="interpretation-section">
      <h3>Interpretation and Recommendations</h3>
      <div class="interpretation-content">
        <div v-for="(item, index) in interpretations" :key="index" class="interpretation-item">
          <div class="interpretation-icon" :class="item.type">
            <i :class="item.icon"></i>
          </div>
          <div class="interpretation-text">{{ item.text }}</div>
        </div>
      </div>
      
      <div class="recommendations" v-if="recommendations.length > 0">
        <h4>Recommendations</h4>
        <ul>
          <li v-for="(rec, index) in recommendations" :key="index">
            {{ rec }}
          </li>
        </ul>
      </div>
    </div>
  </div>
</template>

<script>
import * as echarts from 'echarts';
import { ElTable, ElTableColumn, ElTag, ElProgress } from 'element-plus';

export default {
  components: { ElTable, ElTableColumn, ElTag, ElProgress },
  data() {
    return {
      overallScore: 76,
      summaryMetrics: [
        { name: 'Demographic parity difference', value: '0.18', threshold: '< 0.1', status: 'warning' },
        { name: 'Disparate impact ratio', value: '0.72', threshold: '≥ 0.8', status: 'danger' },
        { name: 'Max group difference', value: '24.5%', threshold: '< 10%', status: 'danger' },
        { name: 'Statistical significance', value: 'Significant', threshold: 'Not significant', status: 'warning' }
      ],
      groupAnalysisData: [
        {
          group: 'Male',
          sampleSize: 4250,
          positiveCount: 1785,
          positiveRate: 0.42,
          baselineRate: 0.36,
          confidenceInterval: [0.405, 0.435],
          disparateImpact: 1.17,
          status: 'advantaged'
        },
        {
          group: 'Female',
          sampleSize: 1750,
          positiveCount: 525,
          positiveRate: 0.30,
          baselineRate: 0.36,
          confidenceInterval: [0.278, 0.322],
          disparateImpact: 0.83,
          status: 'disadvantaged'
        },
        {
          group: 'Other',
          sampleSize: 500,
          positiveCount: 150,
          positiveRate: 0.30,
          baselineRate: 0.36,
          confidenceInterval: [0.265, 0.335],
          disparateImpact: 0.83,
          status: 'disadvantaged'
        }
      ],
      interpretations: [
        {
          type: 'warning',
          icon: 'fas fa-exclamation-triangle',
          text: 'Significant demographic disparity: the female pass rate is 12 percentage points below the male rate'
        },
        {
          type: 'danger',
          icon: 'fas fa-times-circle',
          text: 'Violates the 80% rule: the female pass rate is about 71% of the male rate (0.30 vs 0.42), below the 80% threshold'
        },
        {
          type: 'info',
          icon: 'fas fa-info-circle',
          text: 'The difference is statistically significant (p < 0.05), not random fluctuation'
        }
      ],
      recommendations: [
        'Review feature weights in the hiring algorithm that may disadvantage women',
        'Consider retraining the model with fairness constraints',
        'Adopt anonymized resume screening to reduce the influence of demographic information',
        'Establish a recurring fairness audit process'
      ]
    };
  },
  mounted() {
    this.renderApprovalRateChart();
    this.renderParityTrendChart();
  },
  methods: {
    renderApprovalRateChart() {
      const chart = echarts.init(this.$refs.approvalRateChart);
      
      const option = {
        tooltip: {
          trigger: 'axis',
          formatter: function(params) {
            let result = params[0].name + '<br/>';
            params.forEach(param => {
              result += `${param.seriesName}: ${(param.value * 100).toFixed(1)}%<br/>`;
            });
            return result;
          }
        },
        legend: {
          data: ['Actual pass rate', 'Expected pass rate'],
          bottom: 10
        },
        xAxis: {
          type: 'category',
          data: this.groupAnalysisData.map(item => item.group),
          axisLabel: {
            interval: 0,
            rotate: 0
          }
        },
        yAxis: {
          type: 'value',
          name: 'Pass rate',
          axisLabel: {
            formatter: '{value}%'
          },
          min: 0,
          max: 50
        },
        series: [
          {
            name: 'Actual pass rate',
            type: 'bar',
            data: this.groupAnalysisData.map(item => item.positiveRate * 100),
            itemStyle: {
              color: function(params) {
                const group = params.name;
                const dataItem = this.groupAnalysisData.find(item => item.group === group);
                return dataItem.status === 'disadvantaged' ? '#f56c6c' : 
                       dataItem.status === 'advantaged' ? '#67c23a' : '#e6a23c';
              }.bind(this)
            },
            label: {
              show: true,
              position: 'top',
              formatter: '{c}%'
            }
          },
          {
            name: 'Expected pass rate',
            type: 'line',
            data: this.groupAnalysisData.map(() => 36),
            lineStyle: {
              type: 'dashed',
              color: '#909399'
            },
            symbol: 'none'
          }
        ],
        grid: {
          left: '3%',
          right: '4%',
          bottom: '15%',
          containLabel: true
        }
      };
      
      chart.setOption(option);
    },
    
    renderParityTrendChart() {
      // Placeholder trend chart; a real dashboard would plot the demographic
      // parity difference over successive evaluation runs (values illustrative)
      const chart = echarts.init(this.$refs.parityTrendChart);
      chart.setOption({
        xAxis: { type: 'category', data: ['W1', 'W2', 'W3', 'W4'] },
        yAxis: { type: 'value', name: 'DP difference' },
        series: [{ type: 'line', data: [0.21, 0.19, 0.18, 0.18] }]
      });
    },
    
    getMetricValueClass(value, threshold) {
      // Map the precomputed status of the summary metric onto the CSS classes
      const metric = this.summaryMetrics.find(m => m.value === value && m.threshold === threshold);
      if (!metric) return 'good';
      return metric.status === 'danger' ? 'poor' : metric.status === 'warning' ? 'fair' : 'good';
    },
    
    getRateStatus(rate, baseline) {
      return rate >= baseline ? 'success' : 'exception';
    },
    
    getStatusText(row) {
      return row.status === 'advantaged' ? 'Advantaged' :
             row.status === 'disadvantaged' ? 'Disadvantaged' : 'Neutral';
    },
    
    getFairnessScoreClass(score) {
      if (score >= 90) return 'excellent';
      if (score >= 70) return 'good';
      if (score >= 50) return 'fair';
      return 'poor';
    },
    
    getDisparateImpactClass(value) {
      if (value >= 0.9) return 'good';
      if (value >= 0.8) return 'fair';
      return 'poor';
    },
    
    getStatusType(row) {
      if (row.status === 'advantaged') return 'success';
      if (row.status === 'disadvantaged') return 'danger';
      return 'warning';
    },
    
    formatConfidenceInterval(interval) {
      return `[${(interval[0] * 100).toFixed(1)}%, ${(interval[1] * 100).toFixed(1)}%]`;
    }
  }
};
</script>

<style>
.fairness-score {
  display: inline-block;
  padding: 8px 16px;
  background: #409eff;
  color: white;
  border-radius: 20px;
  font-weight: bold;
  font-size: 18px;
  margin-left: 20px;
}

.fairness-score.excellent { background: #67c23a; }
.fairness-score.good { background: #e6a23c; }
.fairness-score.fair { background: #f56c6c; }
.fairness-score.poor { background: #909399; }

.metrics-summary {
  display: flex;
  gap: 16px;
  margin: 20px 0;
}

.metric-card {
  flex: 1;
  padding: 16px;
  background: #f5f7fa;
  border-radius: 4px;
  text-align: center;
}

.metric-name {
  font-size: 14px;
  color: #606266;
  margin-bottom: 8px;
}

.metric-value {
  font-size: 24px;
  font-weight: bold;
  margin-bottom: 4px;
}

.metric-value.good { color: #67c23a; }
.metric-value.fair { color: #e6a23c; }
.metric-value.poor { color: #f56c6c; }

.metric-threshold {
  font-size: 12px;
  color: #909399;
}

.rate-display {
  display: flex;
  align-items: center;
  gap: 10px;
}

.rate-value {
  min-width: 50px;
}

.interpretation-item {
  display: flex;
  align-items: center;
  padding: 12px;
  margin-bottom: 8px;
  background: #f9f9f9;
  border-radius: 4px;
}

.interpretation-icon {
  width: 30px;
  height: 30px;
  border-radius: 50%;
  display: flex;
  align-items: center;
  justify-content: center;
  margin-right: 12px;
}

.interpretation-icon.warning {
  background: #fdf6ec;
  color: #e6a23c;
}

.interpretation-icon.danger {
  background: #fef0f0;
  color: #f56c6c;
}

.interpretation-icon.info {
  background: #f4f4f5;
  color: #909399;
}
</style>
```

Equal Opportunity: Group Fairness in True Positive Rates

Equal opportunity demands more of the classifier than demographic parity: rather than comparing overall pass rates, it requires that people with the same true outcome have the same chance of a correct prediction regardless of group. Concretely, equally qualified candidates should be selected at the same rate (equal true positive rates); equalized odds extends this requirement to false positive rates as well.

```python
# Python implementation: equal opportunity metrics with AIF360
import numpy as np
import pandas as pd
from aif360.metrics import ClassificationMetric
from aif360.datasets import BinaryLabelDataset

def calculate_equalized_odds(y_true, y_pred, sensitive_features, privileged_group):
    """
    计算机会平等指标
    
    参数:
    y_true: 真实标签
    y_pred: 预测标签
    sensitive_features: 敏感属性
    privileged_group: 特权群体标识
    
    返回:
    机会平等性差异指标
    """
    
    # Build the AIF360 dataset; the sensitive feature is assumed binary (e.g. gender)
    dataset = BinaryLabelDataset(
        favorable_label=1,
        unfavorable_label=0,
        df=pd.DataFrame({
            'y_true': y_true,
            'sensitive': sensitive_features
        }),
        label_names=['y_true'],
        protected_attribute_names=['sensitive'],
        privileged_protected_attributes=[[privileged_group]]
    )
    
    # A copy whose labels are the model's predictions
    dataset_pred = dataset.copy(deepcopy=True)
    dataset_pred.labels = np.array(y_pred).reshape(-1, 1)
    
    # Classification metric comparing ground truth against predictions
    metric = ClassificationMetric(
        dataset,
        dataset_pred,
        unprivileged_groups=[{'sensitive': 0}],  # unprivileged group
        privileged_groups=[{'sensitive': 1}]     # privileged group
    )
    
    # Equal opportunity difference
    equal_odds_difference = metric.equal_opportunity_difference()
    
    # Detailed metrics
    metrics = {
        # True positive rate difference (the core of equal opportunity)
        'true_positive_rate_difference': metric.true_positive_rate_difference(),
        'false_positive_rate_difference': metric.false_positive_rate_difference(),
        'false_negative_rate_difference': metric.false_negative_rate_difference(),
        'false_omission_rate_difference': metric.false_omission_rate_difference(),
        
        # Per-group statistics
        'privileged_true_positive_rate': metric.true_positive_rate(privileged=True),
        'unprivileged_true_positive_rate': metric.true_positive_rate(privileged=False),
        'privileged_false_positive_rate': metric.false_positive_rate(privileged=True),
        'unprivileged_false_positive_rate': metric.false_positive_rate(privileged=False),
        
        # Aggregate equalized odds metrics
        'equal_opportunity_difference': equal_odds_difference,
        'average_odds_difference': metric.average_odds_difference(),
        
        # Statistical parity checks
        'statistical_parity_difference': metric.statistical_parity_difference(),
        'disparate_impact': metric.disparate_impact()
    }
    
    # Interpret the results
    metrics['interpretation'] = interpret_equalized_odds(metrics)
    
    return metrics

def interpret_equalized_odds(metrics):
    """Interpret equalized odds metrics."""
    
    interpretations = []
    threshold = 0.1  # 10% difference threshold
    
    # True positive rate difference
    tpr_diff = abs(metrics['true_positive_rate_difference'])
    if tpr_diff < 0.05:
        interpretations.append("Excellent TPR parity: qualified candidates are correctly selected at similar rates across groups")
    elif tpr_diff < threshold:
        interpretations.append("Acceptable TPR parity: some difference, but within tolerance")
    else:
        interpretations.append(f"TPR parity problem: a gap of {tpr_diff:.1%} may be unfair to some groups")
    
    # False positive rate difference
    fpr_diff = abs(metrics['false_positive_rate_difference'])
    if fpr_diff > threshold:
        interpretations.append(f"FPR disparity: unqualified candidates are wrongly selected at rates differing by {fpr_diff:.1%}")
    
    # Overall assessment
    avg_odds_diff = abs(metrics['average_odds_difference'])
    if avg_odds_diff < 0.05:
        interpretations.append("Equalized odds look good overall")
    elif avg_odds_diff < 0.1:
        interpretations.append("Equalized odds need attention")
    else:
        interpretations.append("Significant equalized odds problem")
    
    return interpretations
```
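A usage sketch with synthetic arrays (values are illustrative; requires aif360, and the sensitive attribute is encoded 0/1 to match the group definitions above):

```python
import numpy as np

rng = np.random.default_rng(1)
sensitive = rng.integers(0, 2, size=500)   # 1 = privileged group
y_true = rng.integers(0, 2, size=500)
# Biased predictor: fewer label flips (errors) for the privileged group
flip = rng.random(500) < np.where(sensitive == 1, 0.1, 0.3)
y_pred = np.where(flip, 1 - y_true, y_true)

metrics = calculate_equalized_odds(y_true, y_pred, sensitive, privileged_group=1)
print(metrics['equal_opportunity_difference'])
```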

```java
// Java implementation: equalized odds from confusion matrices
public class EqualizedOddsCalculator {
    
    /**
     * Compute equalized odds metrics.
     */
    public static EqualizedOddsMetrics calculateEqualizedOdds(
            int[] yTrue, 
            int[] yPred,
            int[] sensitiveAttributes,
            int privilegedValue) {
        
        if (yTrue.length != yPred.length || yTrue.length != sensitiveAttributes.length) {
            throw new IllegalArgumentException("input arrays must have the same length");
        }
        
        // Per-group confusion matrices
        ConfusionMatrix privilegedMatrix = new ConfusionMatrix();
        ConfusionMatrix unprivilegedMatrix = new ConfusionMatrix();
        
        // Fill the confusion matrices
        for (int i = 0; i < yTrue.length; i++) {
            boolean isPrivileged = sensitiveAttributes[i] == privilegedValue;
            boolean isPositivePrediction = yPred[i] == 1;
            boolean isPositiveActual = yTrue[i] == 1;
            
            if (isPrivileged) {
                privilegedMatrix.update(isPositiveActual, isPositivePrediction);
            } else {
                unprivilegedMatrix.update(isPositiveActual, isPositivePrediction);
            }
        }
        
        // Rates per group
        double privilegedTPR = privilegedMatrix.getTruePositiveRate();
        double unprivilegedTPR = unprivilegedMatrix.getTruePositiveRate();
        double tprDifference = Math.abs(privilegedTPR - unprivilegedTPR);
        
        double privilegedFPR = privilegedMatrix.getFalsePositiveRate();
        double unprivilegedFPR = unprivilegedMatrix.getFalsePositiveRate();
        double fprDifference = Math.abs(privilegedFPR - unprivilegedFPR);
        
        double privilegedFNR = privilegedMatrix.getFalseNegativeRate();
        double unprivilegedFNR = unprivilegedMatrix.getFalseNegativeRate();
        double fnrDifference = Math.abs(privilegedFNR - unprivilegedFNR);
        
        // Average odds difference
        double averageOddsDifference = (tprDifference + fprDifference) / 2.0;
        
        // Equal opportunity difference
        double equalOpportunityDifference = tprDifference;
        
        return new EqualizedOddsMetrics(
            privilegedTPR, unprivilegedTPR, tprDifference,
            privilegedFPR, unprivilegedFPR, fprDifference,
            privilegedFNR, unprivilegedFNR, fnrDifference,
            averageOddsDifference, equalOpportunityDifference,
            privilegedMatrix.getTotalCount(),
            unprivilegedMatrix.getTotalCount()
        );
    }
    
    /**
     * Confusion matrix accumulator.
     */
    public static class ConfusionMatrix {
        private int truePositives = 0;
        private int falsePositives = 0;
        private int trueNegatives = 0;
        private int falseNegatives = 0;
        
        public void update(boolean actualPositive, boolean predictedPositive) {
            if (actualPositive && predictedPositive) {
                truePositives++;
            } else if (!actualPositive && predictedPositive) {
                falsePositives++;
            } else if (actualPositive && !predictedPositive) {
                falseNegatives++;
            } else {
                trueNegatives++;
            }
        }
        
        public double getTruePositiveRate() {
            int totalPositives = truePositives + falseNegatives;
            return totalPositives > 0 ? (double) truePositives / totalPositives : 0;
        }
        
        public double getFalsePositiveRate() {
            int totalNegatives = falsePositives + trueNegatives;
            return totalNegatives > 0 ? (double) falsePositives / totalNegatives : 0;
        }
        
        public double getFalseNegativeRate() {
            int totalPositives = truePositives + falseNegatives;
            return totalPositives > 0 ? (double) falseNegatives / totalPositives : 0;
        }
        
        public int getTotalCount() {
            return truePositives + falsePositives + trueNegatives + falseNegatives;
        }
    }
    
    /**
     * Equalized odds metrics holder.
     */
    public static class EqualizedOddsMetrics {
        private final double privilegedTPR;
        private final double unprivilegedTPR;
        private final double tprDifference;
        
        private final double privilegedFPR;
        private final double unprivilegedFPR;
        private final double fprDifference;
        
        private final double privilegedFNR;
        private final double unprivilegedFNR;
        private final double fnrDifference;
        
        private final double averageOddsDifference;
        private final double equalOpportunityDifference;
        
        private final int privilegedSampleCount;
        private final int unprivilegedSampleCount;
        
        // constructor and getters omitted
        
        /**
         * Check whether equalized odds are satisfied.
         * @param threshold tolerance (e.g. 0.1)
         * @return true if both TPR and FPR differences are within the threshold
         */
        public boolean satisfiesEqualizedOdds(double threshold) {
            return tprDifference <= threshold && fprDifference <= threshold;
        }
        
        /**
         * Check whether equal opportunity (TPR only) is satisfied.
         */
        public boolean satisfiesEqualOpportunity(double threshold) {
            return tprDifference <= threshold;
        }
    }
}
```

Disparate Impact: Quantifying the 80% Rule

Disparate impact operationalizes the "80% rule" (four-fifths rule) articulated by the US Equal Employment Opportunity Commission (EEOC) for judging discriminatory effect. It is computed as the disadvantaged group's pass rate divided by the advantaged group's pass rate; a result below 0.8 (80%) may be treated as evidence of disparate impact. For example, if 30% of female applicants pass versus 42% of male applicants, the ratio is 0.30 / 0.42 ≈ 0.71, below the threshold.

```python
# Python + Java dual implementation: disparate impact with cross-language result alignment
# Python implementation
def calculate_disparate_impact_python(y_pred, sensitive_features, protected_group):
    """
    Compute disparate impact in Python.
    
    Parameters:
    y_pred: predictions
    sensitive_features: sensitive attribute values
    protected_group: identifier of the protected group
    
    Returns:
    Disparate impact ratio and detailed statistics
    """
    import numpy as np
    
    # Convert to numpy arrays
    y_pred = np.array(y_pred)
    sensitive_features = np.array(sensitive_features)
    
    # Masks for the protected and unprotected groups
    protected_mask = sensitive_features == protected_group
    unprotected_mask = ~protected_mask
    
    # Pass rates
    protected_rate = np.mean(y_pred[protected_mask]) if np.sum(protected_mask) > 0 else 0
    unprotected_rate = np.mean(y_pred[unprotected_mask]) if np.sum(unprotected_mask) > 0 else 0
    
    # Disparate impact ratio, conventionally:
    # protected group's pass rate / unprotected group's pass rate
    if unprotected_rate > 0:
        disparate_impact_ratio = protected_rate / unprotected_rate
    else:
        disparate_impact_ratio = float('inf')  # no positives in the unprotected group
    
    # Reverse ratio (sometimes used as well)
    if protected_rate > 0:
        reverse_ratio = unprotected_rate / protected_rate
    else:
        reverse_ratio = float('inf')
    
    # Statistical significance
    from scipy import stats
    
    n_protected = np.sum(protected_mask)
    n_unprotected = np.sum(unprotected_mask)
    
    protected_success = np.sum(y_pred[protected_mask])
    unprotected_success = np.sum(y_pred[unprotected_mask])
    
    # Chi-square test: is the pass-rate difference significant?
    from scipy.stats import chi2_contingency
    
    contingency_table = [
        [protected_success, n_protected - protected_success],
        [unprotected_success, n_unprotected - unprotected_success]
    ]
    
    chi2, p_value, dof, expected = chi2_contingency(contingency_table)
    
    # Confidence intervals (Wilson method)
    from statsmodels.stats.proportion import proportion_confint
    
    protected_ci = proportion_confint(
        protected_success, n_protected, alpha=0.05, method='wilson'
    ) if n_protected > 0 else (0, 0)
    
    unprotected_ci = proportion_confint(
        unprotected_success, n_unprotected, alpha=0.05, method='wilson'
    ) if n_unprotected > 0 else (0, 0)
    
    result = {
        'disparate_impact_ratio': disparate_impact_ratio,
        'reverse_ratio': reverse_ratio,
        'protected_rate': protected_rate,
        'unprotected_rate': unprotected_rate,
        'protected_count': int(n_protected),
        'unprotected_count': int(n_unprotected),
        'protected_success': int(protected_success),
        'unprotected_success': int(unprotected_success),
        'chi2_statistic': chi2,
        'p_value': p_value,
        'significant_difference': p_value < 0.05,
        'protected_confidence_interval': protected_ci,
        'unprotected_confidence_interval': unprotected_ci,
        'passes_80_percent_rule': disparate_impact_ratio >= 0.8 if unprotected_rate > 0 else True
    }
    
    return result
```
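A quick usage sketch for the Python function above (hypothetical toy arrays; requires scipy and statsmodels):

```python
preds = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
groups = ['A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'B']

result = calculate_disparate_impact_python(preds, groups, protected_group='A')
print(f"DI ratio: {result['disparate_impact_ratio']:.2f}, "
      f"passes 80% rule: {result['passes_80_percent_rule']}")
```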

```java
// Java implementation
import org.apache.commons.math3.stat.inference.ChiSquareTest;
import java.util.Map;

public class DisparateImpactCalculator {
    
    /**
     * Compute disparate impact in Java.
     */
    public static DisparateImpactResult calculateDisparateImpact(
            int[] predictions,
            int[] sensitiveAttributes,
            int protectedValue) {
        
        // Tally the protected and unprotected groups
        int protectedCount = 0;
        int unprotectedCount = 0;
        int protectedSuccess = 0;
        int unprotectedSuccess = 0;
        
        for (int i = 0; i < predictions.length; i++) {
            boolean isProtected = sensitiveAttributes[i] == protectedValue;
            
            if (isProtected) {
                protectedCount++;
                if (predictions[i] == 1) {
                    protectedSuccess++;
                }
            } else {
                unprotectedCount++;
                if (predictions[i] == 1) {
                    unprotectedSuccess++;
                }
            }
        }
        
        // Pass rates
        double protectedRate = protectedCount > 0 ? 
            (double) protectedSuccess / protectedCount : 0;
        double unprotectedRate = unprotectedCount > 0 ? 
            (double) unprotectedSuccess / unprotectedCount : 0;
        
        // Disparate impact ratio
        double disparateImpactRatio = 0;
        boolean passesEightyPercentRule = false;
        
        if (unprotectedRate > 0) {
            disparateImpactRatio = protectedRate / unprotectedRate;
            passesEightyPercentRule = disparateImpactRatio >= 0.8;
        } else {
            // no positives in the unprotected group
            disparateImpactRatio = Double.POSITIVE_INFINITY;
            passesEightyPercentRule = true;
        }
        
        // Chi-square test
        ChiSquareTest chiSquareTest = new ChiSquareTest();
        
        long[][] contingencyTable = {
            {(long) protectedSuccess, (long) (protectedCount - protectedSuccess)},
            {(long) unprotectedSuccess, (long) (unprotectedCount - unprotectedSuccess)}
        };
        
        double chiSquare = chiSquareTest.chiSquare(contingencyTable);
        double pValue = chiSquareTest.chiSquareTest(contingencyTable);
        
        // Wilson confidence intervals
        ConfidenceInterval protectedCI = calculateWilsonCI(protectedSuccess, protectedCount, 0.95);
        ConfidenceInterval unprotectedCI = calculateWilsonCI(unprotectedSuccess, unprotectedCount, 0.95);
        
        return new DisparateImpactResult(
            disparateImpactRatio,
            protectedRate,
            unprotectedRate,
            protectedCount,
            unprotectedCount,
            protectedSuccess,
            unprotectedSuccess,
            chiSquare,
            pValue,
            pValue < 0.05,
            protectedCI,
            unprotectedCI,
            passesEightyPercentRule
        );
    }
    
    /**
     * Wilson confidence interval for a proportion.
     */
    private static ConfidenceInterval calculateWilsonCI(int successes, int trials, double confidenceLevel) {
        if (trials == 0) {
            return new ConfidenceInterval(0, 0);
        }
        
        double z = 1.96; // z-value for the 95% confidence level
        double p = (double) successes / trials;
        
        double denominator = 1 + z * z / trials;
        double center = (p + z * z / (2 * trials)) / denominator;
        double spread = z * Math.sqrt(
            (p * (1 - p) + z * z / (4 * trials)) / trials
        ) / denominator;
        
        double lower = Math.max(0, center - spread);
        double upper = Math.min(1, center + spread);
        
        return new ConfidenceInterval(lower, upper);
    }
    
    /**
     * Result alignment: compare the Java result against the Python result.
     */
    public static ComparisonResult compareWithPython(
            DisparateImpactResult javaResult,
            Map<String, Object> pythonResult) {
        
        ComparisonResult comparison = new ComparisonResult();
        
        // Compare the core ratio
        double javaRatio = javaResult.getDisparateImpactRatio();
        double pythonRatio = (double) pythonResult.get("disparate_impact_ratio");
        
        comparison.setRatioDifference(Math.abs(javaRatio - pythonRatio));
        
        // Compare pass rates
        double javaProtectedRate = javaResult.getProtectedRate();
        double pythonProtectedRate = (double) pythonResult.get("protected_rate");
        comparison.setProtectedRateDifference(Math.abs(javaProtectedRate - pythonProtectedRate));
        
        // Compare statistical significance conclusions
        boolean javaSignificant = javaResult.isSignificantDifference();
        boolean pythonSignificant = (boolean) pythonResult.get("significant_difference");
        comparison.setSignificanceMatch(javaSignificant == pythonSignificant);
        
        // Compare 80%-rule conclusions
        boolean javaPasses = javaResult.isPassesEightyPercentRule();
        boolean pythonPasses = (boolean) pythonResult.get("passes_80_percent_rule");
        comparison.setRuleMatch(javaPasses == pythonPasses);
        
        // Overall consistency score
        double consistencyScore = calculateConsistencyScore(comparison);
        comparison.setConsistencyScore(consistencyScore);
        
        return comparison;
    }
    
    private static double calculateConsistencyScore(ComparisonResult comparison) {
        double score = 100.0;
        
        // Penalty for ratio differences
        if (comparison.getRatioDifference() > 0.01) {
            score -= 20;
        } else if (comparison.getRatioDifference() > 0.001) {
            score -= 10;
        }
        
        // Penalty for pass-rate differences
        if (comparison.getProtectedRateDifference() > 0.01) {
            score -= 20;
        }
        
        // Penalty if the significance conclusions disagree
        if (!comparison.isSignificanceMatch()) {
            score -= 30;
        }
        
        // Penalty if the 80%-rule conclusions disagree
        if (!comparison.isRuleMatch()) {
            score -= 30;
        }
        
        return Math.max(0, score);
    }
    
    // 内部类定义
    public static class DisparateImpactResult {
        // 字段和getter/setter省略
    }
    
    public static class ConfidenceInterval {
        private final double lower;
        private final double upper;
        // 构造器、getter省略
    }
    
    public static class ComparisonResult {
        private double ratioDifference;
        private double protectedRateDifference;
        private boolean significanceMatch;
        private boolean ruleMatch;
        private double consistencyScore;
        // getter/setter省略
    }
}
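For reference, `calculateWilsonCI` implements the standard Wilson score interval. With observed proportion $\hat p = k/n$ and critical value $z$, the bounds it computes are, in closed form:

latex
\hat{p}_{\pm} \;=\; \frac{\hat{p} + \dfrac{z^2}{2n} \;\pm\; z\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n} + \dfrac{z^2}{4n^2}}}{1 + \dfrac{z^2}{n}}

The `center` and `spread` variables in the Java code correspond exactly to the two numerator terms, and the final clamp to [0, 1] is a practical safeguard against floating-point drift at extreme proportions.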

04 Causal Fairness Testing: Beyond Statistical Correlation

Causal fairness testing aims to identify and quantify causal relationships between sensitive attributes and decision outcomes, not mere statistical correlation. It lets us distinguish direct discrimination (a direct causal effect of the sensitive attribute on the outcome) from indirect discrimination (effects transmitted through other variables).

python
# Python implementation: causal fairness analysis with DoWhy
import dowhy
from dowhy import CausalModel
import pandas as pd
import numpy as np

class CausalFairnessAnalyzer:
    def __init__(self, data, treatment, outcome, sensitive_attribute):
        """
        初始化因果公平性分析器
        
        参数:
        data: 数据集
        treatment: 处理变量(如是否被聘用)
        outcome: 结果变量(如工作表现)
        sensitive_attribute: 敏感属性(如性别)
        """
        self.data = data
        self.treatment = treatment
        self.outcome = outcome
        self.sensitive_attribute = sensitive_attribute
        
    def build_causal_graph(self, common_causes=None):
        """
        构建因果图
        
        参数:
        common_causes: 共同原因变量列表
        """
        # 定义因果图
        graph = f"""
        digraph {{
            {self.sensitive_attribute} -> {self.treatment};
            {self.treatment} -> {self.outcome};
            {self.sensitive_attribute} -> {self.outcome};
        """
        
        # 添加共同原因
        if common_causes:
            for cause in common_causes:
                graph += f"    {cause} -> {self.treatment};\n"
                graph += f"    {cause} -> {self.outcome};\n"
        
        graph += "}"
        
        self.causal_graph = graph
        return graph
    
    def estimate_causal_effects(self, method="backdoor.propensity_score_stratification"):
        """
        估计因果效应
        """
        # Create the causal model; since an explicit graph is supplied,
        # DoWhy derives the common causes from it
        model = CausalModel(
            data=self.data,
            treatment=self.treatment,
            outcome=self.outcome,
            graph=self.causal_graph
        )
        
        # 识别因果效应
        identified_estimand = model.identify_effect()
        print(f"识别到的因果估计量:\n{identified_estimand}")
        
        # 估计因果效应
        estimate = model.estimate_effect(
            identified_estimand,
            method_name=method,
            method_params={}
        )
        
        # 进行反驳测试
        refutations = self.perform_refutation_tests(model, identified_estimand, estimate)
        
        return {
            'estimand': identified_estimand,
            'estimate': estimate,
            'refutations': refutations,
            'interpretation': self.interpret_causal_effects(estimate)
        }
    
    def perform_refutation_tests(self, model, estimand, estimate):
        """
        执行反驳测试验证因果推断的稳健性
        """
        refutations = {}
        
        # 1. 随机共同原因测试
        refutation_random = model.refute_estimate(
            estimand, estimate, method_name="random_common_cause"
        )
        refutations['random_common_cause'] = refutation_random
        
        # 2. 添加未观测混杂测试
        refutation_unobserved = model.refute_estimate(
            estimand, estimate, 
            method_name="add_unobserved_common_cause",
            method_params={
                'confounders_effect_on_treatment': "binary_flip",
                'confounders_effect_on_outcome': "linear",
                'effect_strength_on_treatment': 0.1,
                'effect_strength_on_outcome': 0.1
            }
        )
        refutations['unobserved_confounding'] = refutation_unobserved
        
        # 3. 数据子集验证
        refutation_subset = model.refute_estimate(
            estimand, estimate, method_name="data_subset_refuter",
            method_params={'subset_fraction': 0.8}
        )
        refutations['data_subset'] = refutation_subset
        
        return refutations
    
    def analyze_discrimination_paths(self):
        """
        分析歧视路径:直接效应 vs 间接效应
        """
        # 使用mediation分析
        # 直接效应:敏感属性 -> 结果
        # 间接效应:敏感属性 -> 中介变量 -> 结果
        
        # 构建更详细的因果图
        detailed_graph = f"""
        digraph {{
            // 直接路径
            {self.sensitive_attribute} -> {self.outcome} [label="direct"];
            
            // 间接路径通过treatment
            {self.sensitive_attribute} -> {self.treatment};
            {self.treatment} -> {self.outcome} [label="indirect"];
            
            // 可能的混淆变量
            education -> {self.treatment};
            education -> {self.outcome};
            experience -> {self.treatment};
            experience -> {self.outcome};
        }}
        """
        
        # 计算自然直接效应和自然间接效应
        # 这里简化实现,实际需要使用mediation分析技术
        mediation_results = self.estimate_mediation_effects()
        
        return {
            'causal_graph': detailed_graph,
            'mediation_analysis': mediation_results,
            'direct_effect': mediation_results.get('direct_effect', 0),
            'indirect_effect': mediation_results.get('indirect_effect', 0),
            'total_effect': mediation_results.get('total_effect', 0)
        }
    
    def estimate_mediation_effects(self):
        """
        估计中介效应(简化实现)
        """
        # 在实际应用中,这里应该使用正式的mediation分析
        # 如使用mediation包或自定义估计
        
        # 简化的mediation分析
        from sklearn.linear_model import LinearRegression
        
        # 估计总效应
        X_total = self.data[[self.sensitive_attribute]]
        y = self.data[self.outcome]
        model_total = LinearRegression().fit(X_total, y)
        total_effect = model_total.coef_[0]
        
        # 估计直接效应(控制中介变量)
        # 这里假设treatment是主要的中介变量
        X_direct = self.data[[self.sensitive_attribute, self.treatment]]
        model_direct = LinearRegression().fit(X_direct, y)
        direct_effect = model_direct.coef_[0]  # 敏感属性的系数
        
        # 间接效应 = 总效应 - 直接效应
        indirect_effect = total_effect - direct_effect
        
        return {
            'total_effect': total_effect,
            'direct_effect': direct_effect,
            'indirect_effect': indirect_effect,
            'proportion_mediated': indirect_effect / total_effect if total_effect != 0 else 0
        }
    
    def interpret_causal_effects(self, estimate):
        """
        解释因果效应结果
        """
        interpretations = []
        
        # 提取估计值
        ate = estimate.value  # 平均处理效应
        
        if hasattr(estimate, 'conf_int'):
            ci_lower, ci_upper = estimate.conf_int()
        else:
            ci_lower, ci_upper = ate - 0.1, ate + 0.1
        
        # 解释ATE
        if abs(ate) < 0.05:
            interpretations.append("平均因果效应很小,表明处理对结果影响有限")
        elif abs(ate) < 0.1:
            interpretations.append(f"中等因果效应:处理使结果变化{ate:.2f}")
        else:
            interpretations.append(f"强因果效应:处理使结果变化{ate:.2f}")
        
        # 检查置信区间是否包含0
        if ci_lower <= 0 <= ci_upper:
            interpretations.append("因果效应在统计上不显著(置信区间包含0)")
        else:
            interpretations.append("因果效应在统计上显著")
        
        # 公平性解释
        if ate > 0.1:
            interpretations.append("⚠️  发现显著的因果不公平性:敏感属性对结果有强影响")
        elif ate > 0.05:
            interpretations.append("⚠️  存在中等程度的因果不公平性")
        else:
            interpretations.append("因果公平性可接受:敏感属性对结果影响有限")
        
        return interpretations

# 使用示例
def causal_fairness_example():
    # 生成模拟数据
    np.random.seed(42)
    n = 2000
    
    # 生成数据
    data = pd.DataFrame({
        'gender': np.random.choice([0, 1], n, p=[0.4, 0.6]),  # 0:女性, 1:男性
        'education': np.random.normal(15, 3, n).clip(8, 22),
        'experience': np.random.exponential(5, n).clip(0, 30),
        'skill_test': np.random.normal(70, 15, n)
    })
    
    # 生成雇佣决定(包含偏见)
    # 男性更可能被雇佣,即使控制其他变量
    hire_prob = 1 / (1 + np.exp(-(
        -1.5 +
        0.1 * (data['education'] - 15) / 3 +
        0.2 * (data['experience'] - 5) / 5 +
        0.3 * (data['skill_test'] - 70) / 15 +
        0.4 * data['gender']  # 性别偏见:男性更可能被雇佣
    )))
    
    data['hired'] = (np.random.random(n) < hire_prob).astype(int)
    
    # 生成工作表现(与技能相关,但与性别无关)
    performance = 60 + 0.3 * data['skill_test'] + 0.2 * data['education'] + np.random.normal(0, 10, n)
    data['performance'] = performance.clip(0, 100)
    
    # 进行因果公平性分析
    analyzer = CausalFairnessAnalyzer(
        data=data,
        treatment='hired',
        outcome='performance',
        sensitive_attribute='gender'
    )
    
    # 构建因果图
    graph = analyzer.build_causal_graph(common_causes=['education', 'experience', 'skill_test'])
    print("因果图:")
    print(graph)
    
    # 估计因果效应
    results = analyzer.estimate_causal_effects(method="backdoor.propensity_score_weighting")
    
    print("\n因果效应估计:")
    print(f"平均处理效应(ATE): {results['estimate'].value:.4f}")
    
    if hasattr(results['estimate'], 'conf_int'):
        ci = results['estimate'].conf_int()
        print(f"95%置信区间: [{ci[0]:.4f}, {ci[1]:.4f}]")
    
    print("\n解释:")
    for interpretation in results['interpretation']:
        print(f"  - {interpretation}")
    
    # 分析歧视路径
    print("\n歧视路径分析:")
    path_analysis = analyzer.analyze_discrimination_paths()
    
    print(f"总效应: {path_analysis['total_effect']:.4f}")
    print(f"直接效应: {path_analysis['direct_effect']:.4f}")
    print(f"间接效应: {path_analysis['indirect_effect']:.4f}")
    print(f"中介比例: {path_analysis['mediation_analysis']['proportion_mediated']:.2%}")
    
    return analyzer, results, path_analysis

[Figure: causal-path analysis flow. Gender reaches job performance either directly (direct path) or through the hiring decision (indirect/mediated path); education level, work experience, the skill test, and unobserved confounders feed both the decision and the outcome. A significant direct effect indicates direct discrimination, an unjustified indirect effect indicates indirect discrimination; either calls for intervention, otherwise the system is judged acceptable.]
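The path analysis above relies on the standard mediation decomposition implemented in `estimate_mediation_effects`:

latex
\text{Total Effect} = \text{Direct Effect} + \text{Indirect Effect},
\qquad
\text{Proportion Mediated} = \frac{\text{Indirect Effect}}{\text{Total Effect}}

For example, a total gender effect of 0.12 with a direct effect of 0.08 leaves an indirect effect of 0.04, i.e. one third of the disparity is mediated through the hiring decision.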
05 Hands-On: Fairness Testing a Recruitment AI Resume-Screening System

In this hands-on case we build a complete fairness-testing platform for a recruitment AI resume-screening system: it detects gender and regional bias, generates a detailed fairness report, and proposes remediation steps.

python
# Python implementation: fairness-testing platform for a recruitment AI system
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score
from fairlearn.metrics import demographic_parity_difference, equalized_odds_difference
from fairlearn.reductions import ExponentiatedGradient, DemographicParity
import warnings
warnings.filterwarnings('ignore')

class RecruitmentAIFairnessTester:
    """招聘AI系统公平性测试器"""
    
    def __init__(self):
        self.data = None
        self.model = None
        self.results = {}
        self.fairness_metrics = {}
        
    def load_and_preprocess_data(self, filepath=None):
        """加载和预处理数据"""
        if filepath:
            self.data = pd.read_csv(filepath)
        else:
            # 生成模拟招聘数据
            self.data = self.generate_simulated_recruitment_data()
        
        # 数据预处理
        self.preprocess_data()
        
        return self.data
    
    def generate_simulated_recruitment_data(self, n_samples=5000):
        """生成模拟招聘数据(包含多种偏见)"""
        np.random.seed(42)
        
        # 基础特征
        data = pd.DataFrame({
            'candidate_id': [f'C{10000 + i}' for i in range(n_samples)],
            'age': np.random.normal(35, 8, n_samples).clip(22, 60).astype(int),
            'years_experience': np.random.exponential(8, n_samples).clip(0, 30),
            'education_level': np.random.choice([1, 2, 3, 4], n_samples, p=[0.1, 0.4, 0.4, 0.1]),
            'skill_score': np.random.normal(70, 15, n_samples).clip(30, 100),
            'interview_score': np.random.normal(75, 12, n_samples).clip(40, 100),
            'portfolio_quality': np.random.beta(3, 2, n_samples) * 100,
        })
        
        # 敏感属性
        # 性别分布:60%男性,40%女性
        data['gender'] = np.random.choice(['M', 'F'], n_samples, p=[0.6, 0.4])
        
        # 地域分布(模拟地域偏见)
        regions = ['North', 'South', 'East', 'West', 'Central']
        region_probs = [0.25, 0.20, 0.25, 0.15, 0.15]  # 故意让某些地区样本较少
        data['region'] = np.random.choice(regions, n_samples, p=region_probs)
        
        # 历史偏见:女性候选人历史上获得的机会较少
        base_hiring_prob = 1 / (1 + np.exp(-(
            -1.8 +
            0.05 * (data['skill_score'] - 70) / 15 +
            0.08 * (data['interview_score'] - 75) / 12 +
            0.1 * (data['education_level'] - 2) +
            0.06 * (data['years_experience'] - 8) / 8 -
            0.1 * (data['age'] - 35) / 8 +  # 年龄稍大略有不利
            0.07 * data['portfolio_quality'] / 50
        )))
        
        # 添加性别偏见
        gender_bias = np.where(data['gender'] == 'F', -0.3, 0)
        
        # 添加地域偏见
        region_bias_map = {'North': 0, 'South': -0.2, 'East': 0.1, 'West': -0.1, 'Central': 0}
        region_bias = data['region'].map(region_bias_map)
        
        # 最终雇佣概率
        hiring_logit = np.log(base_hiring_prob / (1 - base_hiring_prob))
        hiring_logit += gender_bias + region_bias
        hiring_prob = 1 / (1 + np.exp(-hiring_logit))
        
        # 生成雇佣结果
        data['hired'] = (np.random.random(n_samples) < hiring_prob).astype(int)
        
        # 添加一些噪声
        noise = np.random.normal(0, 0.1, n_samples)
        data['skill_score'] += noise * 5
        
        return data
    
    def preprocess_data(self):
        """数据预处理"""
        if self.data is None:
            raise ValueError("数据未加载")
        
        # 编码分类变量
        self.data_encoded = self.data.copy()
        
        # 对性别进行编码
        self.data_encoded['gender_encoded'] = self.data_encoded['gender'].map({'M': 0, 'F': 1})
        
        # 对地域进行one-hot编码
        region_dummies = pd.get_dummies(self.data_encoded['region'], prefix='region')
        self.data_encoded = pd.concat([self.data_encoded, region_dummies], axis=1)
        
        # 准备特征和目标变量
        feature_columns = [
            'age', 'years_experience', 'education_level', 
            'skill_score', 'interview_score', 'portfolio_quality',
            'gender_encoded'
        ] + [col for col in self.data_encoded.columns if col.startswith('region_')]
        
        self.X = self.data_encoded[feature_columns]
        self.y = self.data_encoded['hired']
        
        # 敏感属性
        self.sensitive_attributes = {
            'gender': self.data_encoded['gender'].values,
            'region': self.data_encoded['region'].values
        }
    
    def train_model(self, use_fairness_constraint=False):
        """训练招聘预测模型"""
        # Split indices once so features, labels, and sensitive attributes stay
        # aligned (two separate train_test_split calls are fragile even with a
        # fixed random_state)
        train_idx, test_idx = train_test_split(
            np.arange(len(self.data)), test_size=0.3, random_state=42, stratify=self.y
        )
        X_train, X_test = self.X.iloc[train_idx], self.X.iloc[test_idx]
        y_train, y_test = self.y.iloc[train_idx], self.y.iloc[test_idx]
        
        self.X_test = X_test
        self.y_test = y_test
        
        self.test_gender = self.sensitive_attributes['gender'][test_idx]
        self.test_region = self.sensitive_attributes['region'][test_idx]
        
        if use_fairness_constraint:
            # 使用公平性约束训练模型
            print("使用公平性约束训练模型...")
            self.model = self.train_fair_model(X_train, y_train)
        else:
            # 标准随机森林
            print("训练标准模型...")
            self.model = RandomForestClassifier(
                n_estimators=100,
                max_depth=10,
                min_samples_leaf=10,
                random_state=42,
                class_weight='balanced'
            )
            self.model.fit(X_train, y_train)
        
        # Evaluate on the test set
        self.y_pred = self.model.predict(X_test)
        # ExponentiatedGradient exposes no predict_proba; fall back to hard labels
        if hasattr(self.model, 'predict_proba'):
            self.y_pred_proba = self.model.predict_proba(X_test)[:, 1]
        else:
            self.y_pred_proba = np.asarray(self.y_pred, dtype=float)
        
        # 计算性能指标
        self.calculate_performance_metrics(y_test, self.y_pred)
        
        # 计算公平性指标
        self.calculate_fairness_metrics()
        
        return self.model
    
    def train_fair_model(self, X_train, y_train):
        """训练带公平性约束的模型"""
        from fairlearn.reductions import ExponentiatedGradient, DemographicParity
        
        # 基础分类器
        base_classifier = RandomForestClassifier(
            n_estimators=50,
            max_depth=8,
            random_state=42
        )
        
        # 定义公平性约束(人口平等)
        constraint = DemographicParity()
        
        # 使用指数梯度法
        mitigator = ExponentiatedGradient(
            estimator=base_classifier,
            constraints=constraint,
            eps=0.01,  # 公平性约束的松弛度
            max_iter=50,
            eta0=2.0
        )
        
        # 需要敏感属性进行训练
        # 这里使用性别作为敏感属性
        sensitive_train = self.sensitive_attributes['gender'][
            X_train.index if hasattr(X_train, 'index') else 
            np.arange(len(X_train))
        ]
        
        mitigator.fit(X_train, y_train, sensitive_features=sensitive_train)
        
        return mitigator
    
    def calculate_performance_metrics(self, y_true, y_pred):
        """计算性能指标"""
        from sklearn.metrics import (
            accuracy_score, precision_score, recall_score, 
            f1_score, roc_auc_score, confusion_matrix
        )
        
        self.performance_metrics = {
            'accuracy': accuracy_score(y_true, y_pred),
            'precision': precision_score(y_true, y_pred, zero_division=0),
            'recall': recall_score(y_true, y_pred, zero_division=0),
            'f1_score': f1_score(y_true, y_pred, zero_division=0),
            'roc_auc': roc_auc_score(y_true, self.y_pred_proba),
            'confusion_matrix': confusion_matrix(y_true, y_pred).tolist()
        }
        
        # 计算各类别的详细指标
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        
        self.performance_metrics.update({
            'true_positive': int(tp),
            'false_positive': int(fp),
            'true_negative': int(tn),
            'false_negative': int(fn),
            'true_positive_rate': tp / (tp + fn) if (tp + fn) > 0 else 0,
            'false_positive_rate': fp / (fp + tn) if (fp + tn) > 0 else 0
        })
    
    def calculate_fairness_metrics(self):
        """计算公平性指标"""
        self.fairness_metrics = {}
        
        # 按性别分析
        if hasattr(self, 'test_gender'):
            gender_metrics = self.analyze_by_sensitive_attribute(
                self.y_test, self.y_pred, self.test_gender, 'gender'
            )
            self.fairness_metrics['gender'] = gender_metrics
        
        # 按地域分析
        if hasattr(self, 'test_region'):
            region_metrics = self.analyze_by_sensitive_attribute(
                self.y_test, self.y_pred, self.test_region, 'region'
            )
            self.fairness_metrics['region'] = region_metrics
        
        # 计算全局公平性指标
        self.calculate_global_fairness_metrics()
    
    def analyze_by_sensitive_attribute(self, y_true, y_pred, sensitive_array, attribute_name):
        """按敏感属性分析公平性"""
        from scipy.stats import chi2_contingency
        
        groups = np.unique(sensitive_array)
        metrics_by_group = {}
        
        for group in groups:
            mask = sensitive_array == group
            group_y_true = y_true[mask]
            group_y_pred = y_pred[mask]
            
            if len(group_y_true) == 0:
                continue
            
            # 计算基本指标
            accuracy = np.mean(group_y_pred == group_y_true)
            positive_rate = np.mean(group_y_pred)
            
            # 计算混淆矩阵
            tn, fp, fn, tp = self.calculate_confusion_matrix(group_y_true, group_y_pred)
            
            tpr = tp / (tp + fn) if (tp + fn) > 0 else 0
            fpr = fp / (fp + tn) if (fp + tn) > 0 else 0
            
            metrics_by_group[group] = {
                'sample_size': len(group_y_true),
                'accuracy': accuracy,
                'positive_rate': positive_rate,
                'true_positive_rate': tpr,
                'false_positive_rate': fpr,
                'true_positives': int(tp),
                'false_positives': int(fp),
                'true_negatives': int(tn),
                'false_negatives': int(fn)
            }
        
        # 计算群体间差异
        if len(metrics_by_group) >= 2:
            # 找出参考组(通常是最大的群体)
            reference_group = max(metrics_by_group.keys(), 
                                key=lambda x: metrics_by_group[x]['sample_size'])
            
            differences = {}
            for group in metrics_by_group:
                if group != reference_group:
                    diff = {}
                    for metric in ['positive_rate', 'true_positive_rate', 'false_positive_rate']:
                        val_ref = metrics_by_group[reference_group][metric]
                        val_group = metrics_by_group[group][metric]
                        diff[metric] = val_group - val_ref
                    
                    differences[f'{group}_vs_{reference_group}'] = diff
            
            metrics_by_group['differences'] = differences
        
        # 计算统计显著性
        contingency_table = []
        for group in groups:
            mask = sensitive_array == group
            group_y_pred = y_pred[mask]
            positive_count = np.sum(group_y_pred)
            negative_count = len(group_y_pred) - positive_count
            contingency_table.append([positive_count, negative_count])
        
        if len(contingency_table) >= 2:
            chi2, p_value, dof, expected = chi2_contingency(contingency_table)
            metrics_by_group['chi2_test'] = {
                'chi2': chi2,
                'p_value': p_value,
                'significant': p_value < 0.05
            }
        
        return metrics_by_group
    
    def calculate_confusion_matrix(self, y_true, y_pred):
        """计算混淆矩阵"""
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        tn = np.sum((y_pred == 0) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        
        return tn, fp, fn, tp
    
    def calculate_global_fairness_metrics(self):
        """计算全局公平性指标"""
        # 使用fairlearn计算
        if hasattr(self, 'test_gender'):
            gender_array = self.test_gender
            
            # 人口平等性差异
            dp_diff = demographic_parity_difference(
                y_true=self.y_test,
                y_pred=self.y_pred,
                sensitive_features=gender_array
            )
            
            # 机会平等性差异
            eo_diff = equalized_odds_difference(
                y_true=self.y_test,
                y_pred=self.y_pred,
                sensitive_features=gender_array
            )
            
            passes_rule, rule_ratio = self.check_80_percent_rule()
            self.fairness_metrics['global'] = {
                'demographic_parity_difference': dp_diff,
                'equalized_odds_difference': eo_diff,
                'meets_80_percent_rule': passes_rule,
                'disparate_impact_ratio': rule_ratio
            }
    
    def check_80_percent_rule(self):
        """检查80%规则"""
        if 'gender' in self.fairness_metrics:
            gender_metrics = self.fairness_metrics['gender']
            
            # 找出通过率最高和最低的群体
            groups = [g for g in gender_metrics.keys() if isinstance(g, str) and g != 'differences' and g != 'chi2_test']
            
            if len(groups) >= 2:
                positive_rates = {g: gender_metrics[g]['positive_rate'] for g in groups}
                max_group = max(positive_rates, key=positive_rates.get)
                min_group = min(positive_rates, key=positive_rates.get)
                
                max_rate = positive_rates[max_group]
                min_rate = positive_rates[min_group]
                
                if max_rate > 0:
                    ratio = min_rate / max_rate
                    return ratio >= 0.8, ratio
                
        return False, 0
    
    def generate_fairness_report(self):
        """生成公平性测试报告"""
        report = {
            'performance_summary': self.performance_metrics,
            'fairness_analysis': self.fairness_metrics,
            'timestamp': pd.Timestamp.now().isoformat(),
            'data_summary': {
                'total_samples': len(self.data),
                'hiring_rate': self.data['hired'].mean(),
                'gender_distribution': self.data['gender'].value_counts().to_dict(),
                'region_distribution': self.data['region'].value_counts().to_dict()
            }
        }
        
        # 添加解释和建议
        report['interpretations'] = self.interpret_results()
        report['recommendations'] = self.generate_recommendations()
        
        self.report = report
        return report
    
    def interpret_results(self):
        """解释测试结果"""
        interpretations = []
        
        # 解释性能
        accuracy = self.performance_metrics['accuracy']
        if accuracy >= 0.85:
            interpretations.append(f"模型性能优秀,准确率达到{accuracy:.1%}")
        elif accuracy >= 0.75:
            interpretations.append(f"模型性能良好,准确率为{accuracy:.1%}")
        else:
            interpretations.append(f"模型性能有待提高,准确率仅{accuracy:.1%}")
        
        # 解释性别公平性
        if 'gender' in self.fairness_metrics:
            gender_metrics = self.fairness_metrics['gender']
            
            if 'M' in gender_metrics and 'F' in gender_metrics:
                male_rate = gender_metrics['M']['positive_rate']
                female_rate = gender_metrics['F']['positive_rate']
                diff = abs(male_rate - female_rate)
                
                if diff < 0.05:
                    interpretations.append(f"性别公平性良好:男女通过率差异仅{diff:.1%}")
                elif diff < 0.1:
                    interpretations.append(f"性别公平性可接受:男女通过率差异为{diff:.1%}")
                else:
                    interpretations.append(f"⚠️ 性别公平性问题:男女通过率差异达{diff:.1%}")
                
                # 检查80%规则
                meets_80_percent, ratio = self.check_80_percent_rule()
                if meets_80_percent:
                    interpretations.append(f"满足80%规则:弱势群体通过率为优势群体的{ratio:.1%}")
                else:
                    interpretations.append(f"⚠️ 违反80%规则:弱势群体通过率仅为优势群体的{ratio:.1%}")
        
        # 解释地域公平性
        if 'region' in self.fairness_metrics:
            region_metrics = self.fairness_metrics['region']
            
            # 找出通过率最高和最低的地域
            regions = [r for r in region_metrics.keys()
                       if isinstance(r, str) and r not in ('differences', 'chi2_test')]
            
            if len(regions) >= 2:
                positive_rates = {r: region_metrics[r]['positive_rate'] for r in regions}
                max_region = max(positive_rates, key=positive_rates.get)
                min_region = min(positive_rates, key=positive_rates.get)
                
                max_rate = positive_rates[max_region]
                min_rate = positive_rates[min_region]
                diff = max_rate - min_rate
                
                if diff > 0.15:
                    interpretations.append(f"⚠️ 严重地域偏见:{max_region}通过率比{min_region}高{diff:.1%}")
                elif diff > 0.08:
                    interpretations.append(f"地域偏见:{max_region}通过率比{min_region}高{diff:.1%}")
        
        # 全局公平性指标
        if 'global' in self.fairness_metrics:
            global_metrics = self.fairness_metrics['global']
            
            dp_diff = global_metrics.get('demographic_parity_difference', 0)
            if dp_diff < 0.05:
                interpretations.append("人口平等性良好")
            elif dp_diff < 0.1:
                interpretations.append(f"人口平等性可接受,差异为{dp_diff:.3f}")
            else:
                interpretations.append(f"⚠️ 人口平等性问题,差异达{dp_diff:.3f}")
        
        return interpretations
    
    def generate_recommendations(self):
        """生成改进建议"""
        recommendations = []
        
        # 基于性别偏见的建议
        if 'gender' in self.fairness_metrics:
            gender_metrics = self.fairness_metrics['gender']
            
            if 'M' in gender_metrics and 'F' in gender_metrics:
                male_rate = gender_metrics['M']['positive_rate']
                female_rate = gender_metrics['F']['positive_rate']
                
                if abs(male_rate - female_rate) > 0.1:
                    recommendations.append("实施匿名简历筛选,隐藏候选人性别信息")
                    recommendations.append("审查模型中可能与性别相关的特征权重")
                    recommendations.append("考虑使用公平性约束重新训练模型")
        
        # 基于地域偏见的建议
        if 'region' in self.fairness_metrics:
            region_metrics = self.fairness_metrics['region']
            regions = [r for r in region_metrics.keys()
                       if isinstance(r, str) and r not in ('differences', 'chi2_test')]
            
            if len(regions) >= 2:
                positive_rates = {r: region_metrics[r]['positive_rate'] for r in regions}
                max_region = max(positive_rates, key=positive_rates.get)
                min_region = min(positive_rates, key=positive_rates.get)
                
                if positive_rates[max_region] - positive_rates[min_region] > 0.15:
                    recommendations.append(f"调查{min_region}地区候选人评分偏低的原因")
                    recommendations.append("确保地域信息不直接或间接影响招聘决策")
                    recommendations.append("考虑为不同地域设置差异化的公平性阈值")
        
        # 通用建议
        recommendations.append("建立定期的公平性审计机制")
        recommendations.append("收集更详细的公平性相关数据")
        recommendations.append("实施多维度偏见检测(交叉偏见分析)")
        recommendations.append("建立偏见投诉和修正流程")
        
        return recommendations
    
    def save_report(self, filepath='fairness_report.json'):
        """保存报告到文件"""
        import json
        
        report = self.generate_fairness_report()
        
        # 转换numpy类型为Python原生类型
        def convert_numpy_types(obj):
            if isinstance(obj, (np.integer, np.floating)):
                return obj.item()
            elif isinstance(obj, np.ndarray):
                return obj.tolist()
            elif isinstance(obj, dict):
                return {k: convert_numpy_types(v) for k, v in obj.items()}
            elif isinstance(obj, list):
                return [convert_numpy_types(v) for v in obj]
            else:
                return obj
        
        report_converted = convert_numpy_types(report)
        
        with open(filepath, 'w', encoding='utf-8') as f:
            json.dump(report_converted, f, indent=2, ensure_ascii=False)
        
        print(f"报告已保存到: {filepath}")
        return filepath

# 使用示例
def run_recruitment_fairness_test():
    """运行招聘系统公平性测试"""
    print("=" * 60)
    print("招聘AI系统公平性测试平台")
    print("=" * 60)
    
    # 创建测试器
    tester = RecruitmentAIFairnessTester()
    
    # 加载数据
    print("\n1. 加载数据...")
    data = tester.load_and_preprocess_data()
    print(f"   加载完成,共{len(data)}条记录")
    print(f"   雇佣率: {data['hired'].mean():.2%}")
    print(f"   性别分布: {data['gender'].value_counts().to_dict()}")
    print(f"   地域分布: {data['region'].value_counts().to_dict()}")
    
    # 训练标准模型
    print("\n2. 训练标准模型...")
    standard_model = tester.train_model(use_fairness_constraint=False)
    
    # 生成报告
    print("\n3. 生成公平性测试报告...")
    report = tester.generate_fairness_report()
    
    # 打印关键发现
    print("\n4. 关键发现:")
    print("-" * 40)
    
    for i, interpretation in enumerate(report['interpretations'], 1):
        print(f"   {i}. {interpretation}")
    
    print("\n5. 改进建议:")
    print("-" * 40)
    
    for i, recommendation in enumerate(report['recommendations'], 1):
        print(f"   {i}. {recommendation}")
    
    # 保存报告
    print("\n6. 保存详细报告...")
    report_path = tester.save_report()
    
    print("\n" + "=" * 60)
    print("测试完成!详细报告已生成。")
    print("=" * 60)
    
    # 返回结果用于进一步分析
    return tester, report

# 运行测试
if __name__ == "__main__":
    tester, report = run_recruitment_fairness_test()
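The driver above trains only the standard model. A natural follow-up, sketched here under the assumption that the fairlearn reduction converges on this simulated data, is to retrain with the constraint enabled and compare the demographic-parity gap before and after (typically it shrinks at a small cost in accuracy):

python
# Hedged sketch: retrain with the DemographicParity constraint and compare
# against the standard model's report produced above (`report` comes from
# run_recruitment_fairness_test())
fair_tester = RecruitmentAIFairnessTester()
fair_tester.load_and_preprocess_data()
fair_tester.train_model(use_fairness_constraint=True)
fair_report = fair_tester.generate_fairness_report()

for label, rep in [('standard', report), ('constrained', fair_report)]:
    gap = rep['fairness_analysis']['global']['demographic_parity_difference']
    acc = rep['performance_summary']['accuracy']
    print(f"{label}: demographic parity diff = {gap:.3f}, accuracy = {acc:.3f}")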

06 Pitfall Guide: Practical Lessons from Fairness Testing

Pitfall 1: Metrics Pass, Yet the Business Still Receives Complaints

Problem: all statistical fairness metrics pass (e.g., demographic parity difference < 0.1, disparate impact ratio > 0.8), yet business stakeholders or users still complain that the system discriminates.

Root-cause analysis

  1. Metric limitations: statistical fairness metrics capture only group-level average differences; they miss unfairness at the subgroup or individual level.

  2. Intersectional bias: single-dimension metrics can mask bias across combined dimensions. A system may treat "young women" and "older men" fairly yet still be biased against "older women".

  3. Divergent business definitions: business teams may care about different dimensions of fairness, or define "fair" differently altogether.

  4. Context sensitivity: small gaps can be amplified in certain contexts; a slight difference at the final hiring stage can flip the outcome entirely.

Solution

python
# Intersectional bias detection
def detect_intersectional_bias(data, predictions, sensitive_attributes):
    """
    检测交叉偏见:多维度敏感属性的组合偏见
    """
    # 创建交叉分组
    data['cross_group'] = data[sensitive_attributes].apply(
        lambda row: '_'.join([str(row[attr]) for attr in sensitive_attributes]), 
        axis=1
    )
    
    cross_groups = data['cross_group'].unique()
    results = {}
    
    for group in cross_groups:
        group_mask = data['cross_group'] == group
        group_size = group_mask.sum()
        
        if group_size < 10:  # 忽略太小的组
            continue
        
        group_positive_rate = predictions[group_mask].mean()
        
        # 与整体平均比较
        overall_rate = predictions.mean()
        deviation = group_positive_rate - overall_rate
        
        # Significance test: compare the group against the *rest* of the data;
        # testing against the overall pool would count the group's own samples twice
        from statsmodels.stats.proportion import proportions_ztest
        rest_mask = ~group_mask
        group_success = int(predictions[group_mask].sum())
        rest_success = int(predictions[rest_mask].sum())
        
        z_stat, p_value = proportions_ztest(
            [group_success, rest_success],
            [group_size, int(rest_mask.sum())]
        )
        
        results[group] = {
            'sample_size': group_size,
            'positive_rate': group_positive_rate,
            'deviation_from_overall': deviation,
            'z_score': z_stat,
            'p_value': p_value,
            'significant': p_value < 0.05
        }
    
    # 识别最受偏见的交叉组
    most_disadvantaged = min(results.items(), 
                           key=lambda x: x[1]['positive_rate'])
    most_advantaged = max(results.items(), 
                         key=lambda x: x[1]['positive_rate'])
    
    return {
        'detailed_results': results,
        'most_disadvantaged': most_disadvantaged,
        'most_advantaged': most_advantaged,
        'max_disparity': most_advantaged[1]['positive_rate'] - most_disadvantaged[1]['positive_rate']
    }
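A minimal usage sketch follows; the DataFrame `df`, the prediction array `preds`, and the column names are illustrative assumptions, but any dataset with categorical sensitive columns and an aligned 0/1 prediction array works:

python
# Hypothetical usage: `df` holds 'gender' and 'age_band' columns and `preds`
# is a 0/1 numpy array aligned with df's rows
result = detect_intersectional_bias(df, preds, ['gender', 'age_band'])

worst_group, worst_stats = result['most_disadvantaged']
print(f"Most disadvantaged intersection: {worst_group} "
      f"(rate {worst_stats['positive_rate']:.2%}, p={worst_stats['p_value']:.4f})")
print(f"Max disparity across intersections: {result['max_disparity']:.2%}")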

Pitfall 2: The Complexity of Multi-Dimensional Bias Detection

Challenge: once a system involves multiple sensitive attributes (gender, age, region, ethnicity, and so on), fairness detection becomes vastly more complex.

  1. Combinatorial explosion: n binary sensitive attributes already yield 2^n intersectional groups to check, and multi-valued attributes yield even more.

  2. Data sparsity: some intersections may contain too few samples for a reliable statistical test.

  3. Metric conflicts: improving fairness along one dimension can degrade it along another.

Solution

python
def multi_dimensional_fairness_analysis(data, predictions, sensitive_attributes_list):
    """
    多维度公平性分析框架
    """
    results = {}
    
    # 1. 单维度分析
    single_dim_results = {}
    for attr in sensitive_attributes_list:
        single_dim_results[attr] = analyze_single_attribute_fairness(
            data[attr], predictions
        )
    
    results['single_dimension'] = single_dim_results
    
    # 2. 两两交叉分析
    pairwise_results = {}
    for i in range(len(sensitive_attributes_list)):
        for j in range(i + 1, len(sensitive_attributes_list)):
            attr1, attr2 = sensitive_attributes_list[i], sensitive_attributes_list[j]
            key = f"{attr1}_x_{attr2}"
            
            pairwise_results[key] = detect_intersectional_bias(
                data, predictions, [attr1, attr2]
            )
    
    results['pairwise_interactions'] = pairwise_results
    
    # 3. 识别最严重的偏见维度
    fairness_scores = {}
    for attr, analysis in single_dim_results.items():
        # 综合多个指标计算公平性分数
        score = calculate_fairness_score(analysis)
        fairness_scores[attr] = score
    
    # 4. 生成优先级建议
    priority_list = sorted(fairness_scores.items(), key=lambda x: x[1])
    
    results['priority_recommendations'] = [
        {
            'attribute': attr,
            'fairness_score': score,
            'recommendation': generate_recommendation(attr, single_dim_results[attr])
        }
        for attr, score in priority_list[:3]  # 优先处理最严重的三个
    ]
    
    return results

def calculate_fairness_score(analysis):
    """综合计算公平性分数"""
    # 考虑多个指标:人口平等性差异、不同影响比率、统计显著性等
    score = 100
    
    # 人口平等性差异惩罚
    if 'demographic_parity_difference' in analysis:
        dp_diff = analysis['demographic_parity_difference']
        if dp_diff > 0.2:
            score -= 40
        elif dp_diff > 0.1:
            score -= 20
        elif dp_diff > 0.05:
            score -= 10
    
    # 不同影响比率惩罚
    if 'disparate_impact_ratio' in analysis:
        di_ratio = analysis['disparate_impact_ratio']
        if di_ratio < 0.7:
            score -= 30
        elif di_ratio < 0.8:
            score -= 15
    
    # 统计显著性惩罚
    if 'significant_difference' in analysis and analysis['significant_difference']:
        score -= 20
    
    return max(0, score)
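The framework above calls two helpers that never appear in the article. Minimal sketches are given below so the block runs end to end; the thresholds and wording are illustrative assumptions, not part of the original framework:

python
# Minimal sketches of the two helpers referenced above (illustrative assumptions)
import numpy as np

def analyze_single_attribute_fairness(attribute_values, predictions):
    """Per-group selection rates plus the summary metrics the scorer reads."""
    values = np.asarray(attribute_values)
    preds = np.asarray(predictions)
    rates = {g: float(preds[values == g].mean()) for g in np.unique(values)}
    max_rate, min_rate = max(rates.values()), min(rates.values())
    return {
        'positive_rates': rates,
        'demographic_parity_difference': max_rate - min_rate,
        'disparate_impact_ratio': (min_rate / max_rate) if max_rate > 0 else 0.0,
    }

def generate_recommendation(attribute, analysis):
    """Turn a per-attribute analysis into a one-line suggestion."""
    ratio = analysis.get('disparate_impact_ratio', 1.0)
    if ratio < 0.8:
        return f"'{attribute}' violates the 80% rule (ratio {ratio:.2f}); audit correlated features"
    return f"'{attribute}' looks acceptable (ratio {ratio:.2f}); keep monitoring"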

Pitfall 3: Keeping Vue Fairness Charts Responsive Under Real-Time Data

Problem: with large data volumes or real-time streams, the Vue fairness charts can suffer delayed updates, jank, or memory issues.

Optimization strategies

vue
<!-- Vue implementation: a high-performance fairness monitoring dashboard -->
<template>
  <div class="high-performance-fairness-dashboard">
    <!-- 使用虚拟化技术的大型数据表格 -->
    <div class="virtualized-table-container">
      <RecycleScroller
        :items="visibleData"
        :item-size="50"
        key-field="id"
        v-slot="{ item }"
        class="scroller"
        @resize="handleResize"
        @visible="handleVisibleChange">
        <FairnessDataRow :data="item" />
      </RecycleScroller>
    </div>
    
    <!-- 使用Web Worker进行后台计算的指标 -->
    <div class="metrics-display">
      <div v-for="metric in computedMetrics" :key="metric.name" class="metric-card">
        <h4>{{ metric.name }}</h4>
        <div class="metric-value">{{ metric.value }}</div>
        <div class="metric-trend" :class="metric.trendClass">
          <i :class="`fas fa-${metric.trendIcon}`"></i>
          {{ metric.trendValue }}
        </div>
      </div>
    </div>
    
    <!-- 使用Canvas渲染的大型图表 -->
    <div class="chart-container">
      <canvas ref="fairnessChart" @wheel="handleChartZoom"></canvas>
    </div>
    
    <!-- 实时数据流更新指示器 -->
    <div class="update-indicator" :class="{ updating: isUpdating }">
      <span v-if="isUpdating">
        <i class="fas fa-sync fa-spin"></i> 更新数据...
      </span>
      <span v-else>
        <i class="fas fa-check"></i> 数据已就绪
      </span>
      <span class="last-update">最后更新: {{ lastUpdateTime }}</span>
    </div>
    
    <!-- 性能监控面板 -->
    <div class="performance-monitor" v-if="showPerformance">
      <div class="performance-metric">
        <span>FPS: </span>
        <span :class="fpsClass">{{ fps.toFixed(1) }}</span>
      </div>
      <div class="performance-metric">
        <span>内存: </span>
        <span>{{ formatMemory(memoryUsage) }}</span>
      </div>
      <div class="performance-metric">
        <span>数据点: </span>
        <span>{{ totalDataPoints }}</span>
      </div>
    </div>
  </div>
</template>

<script>
import { RecycleScroller } from 'vue-virtual-scroller'
import 'vue-virtual-scroller/dist/vue-virtual-scroller.css'
import FairnessDataRow from './FairnessDataRow.vue'
import { Message } from 'element-ui'

// 导入Web Worker
import FairnessWorker from './fairness.worker.js'

export default {
  components: {
    RecycleScroller,
    FairnessDataRow
  },
  data() {
    return {
      // 数据相关
      allData: [],  // 全部数据
      visibleData: [],  // 当前可见数据
      
      // 性能相关
      fps: 60,
      memoryUsage: 0,
      isUpdating: false,
      lastUpdateTime: null,
      frameCount: 0,
      lastTime: performance.now(),
      
      // 计算指标
      computedMetrics: [],
      
      // 图表相关
      chartInstance: null,
      zoomLevel: 1,
      
      // 配置
      showPerformance: true,
      dataUpdateInterval: 5000,  // 5秒更新一次
      maxDataPoints: 10000,  // 最大数据点数
      
      // Web Worker
      fairnessWorker: null,
      workerReady: false
    }
  },
  computed: {
    totalDataPoints() {
      return this.allData.length
    },
    fpsClass() {
      if (this.fps >= 50) return 'good'
      if (this.fps >= 30) return 'warning'
      return 'danger'
    }
  },
  mounted() {
    this.initDashboard()
    
    // 初始化Web Worker
    this.initWebWorker()
    
    // 开始数据更新循环
    this.startDataUpdateCycle()
    
    // 开始性能监控
    this.startPerformanceMonitoring()
    
    // 监听窗口大小变化
    window.addEventListener('resize', this.handleResize)
    
    // 初始化图表
    this.initChart()
  },
  beforeDestroy() {
    // 清理资源
    this.stopDataUpdateCycle()
    this.stopPerformanceMonitoring()
    
    // 终止Web Worker
    if (this.fairnessWorker) {
      this.fairnessWorker.terminate()
    }
    
    // 移除事件监听器
    window.removeEventListener('resize', this.handleResize)
    
    // 销毁图表
    if (this.chartInstance) {
      this.chartInstance.destroy()
    }
  },
  methods: {
    initDashboard() {
      // 初始化加载数据
      this.loadInitialData()
    },
    
    initWebWorker() {
      // 创建Web Worker进行后台计算
      if (window.Worker) {
        this.fairnessWorker = new FairnessWorker()
        
        this.fairnessWorker.onmessage = (event) => {
          const { type, data } = event.data
          
          switch (type) {
            case 'metrics_calculated':
              this.handleWorkerMetrics(data)
              break
            case 'bias_detected':
              this.handleWorkerBiasDetection(data)
              break
            case 'worker_ready':
              this.workerReady = true
              console.log('公平性计算Worker已就绪')
              break
          }
        }
        
        // 发送初始化消息
        this.fairnessWorker.postMessage({
          type: 'init',
          config: {
            maxDataPoints: this.maxDataPoints
          }
        })
      } else {
        console.warn('浏览器不支持Web Worker,将使用主线程计算')
      }
    },
    
    loadInitialData() {
      // 模拟加载初始数据
      this.isUpdating = true
      
      // 在实际应用中,这里应该是API调用
      setTimeout(() => {
        const mockData = this.generateMockData(1000)
        this.allData = mockData
        this.visibleData = mockData.slice(0, 100)  // 初始只显示100条
        
        this.isUpdating = false
        this.lastUpdateTime = new Date().toLocaleTimeString()
        
        // 发送数据到Web Worker进行计算
        if (this.workerReady) {
          this.fairnessWorker.postMessage({
            type: 'calculate_metrics',
            data: mockData
          })
        } else {
          // 降级:在主线程计算
          this.calculateMetricsLocally(mockData)
        }
      }, 1000)
    },
    
    generateMockData(count) {
      // 生成模拟公平性数据
      const data = []
      const groups = ['Male', 'Female', 'Other']
      const regions = ['North', 'South', 'East', 'West']
      
      for (let i = 0; i < count; i++) {
        const group = groups[Math.floor(Math.random() * groups.length)]
        const region = regions[Math.floor(Math.random() * regions.length)]
        const age = 20 + Math.floor(Math.random() * 40)
        
        // 模拟偏见:某些群体通过率较低
        let baseProb = 0.5
        if (group === 'Female') baseProb *= 0.8
        if (region === 'South') baseProb *= 0.7
        if (age > 50) baseProb *= 0.6
        
        const hired = Math.random() < baseProb ? 1 : 0
        const score = 50 + Math.random() * 50
        
        data.push({
          id: `candidate_${i}`,
          group,
          region,
          age,
          hired,
          score,
          timestamp: Date.now() - Math.random() * 1000000000
        })
      }
      
      return data
    },
    
    startDataUpdateCycle() {
      // 定时更新数据
      this.updateInterval = setInterval(() => {
        this.updateData()
      }, this.dataUpdateInterval)
    },
    
    stopDataUpdateCycle() {
      if (this.updateInterval) {
        clearInterval(this.updateInterval)
      }
    },
    
    updateData() {
      if (this.isUpdating) return  // 防止重复更新
      
      this.isUpdating = true
      
      // 模拟数据更新
      setTimeout(() => {
        const newData = this.generateMockData(100)
        this.allData = [...this.allData.slice(-this.maxDataPoints + 100), ...newData]
        
        // 更新可见数据
        this.updateVisibleData()
        
        this.isUpdating = false
        this.lastUpdateTime = new Date().toLocaleTimeString()
        
        // 发送到Web Worker更新计算
        if (this.workerReady) {
          this.fairnessWorker.postMessage({
            type: 'update_data',
            data: newData
          })
        }
      }, 500)
    },
    
    updateVisibleData() {
      // 根据当前滚动位置更新可见数据
      // 这里简化实现,实际应该基于虚拟滚动位置
      const startIndex = 0
      const endIndex = Math.min(100, this.allData.length)
      this.visibleData = this.allData.slice(startIndex, endIndex)
    },
    
    handleWorkerMetrics(metrics) {
      // 处理从Worker返回的指标计算结果
      this.computedMetrics = metrics.map(metric => ({
        name: metric.name,
        value: metric.value.toFixed(3),
        trend: metric.trend,
        trendIcon: metric.trend > 0 ? 'arrow-up' : metric.trend < 0 ? 'arrow-down' : 'arrow-right',
        trendValue: metric.trend !== 0 ? `${metric.trend > 0 ? '+' : ''}${metric.trend.toFixed(2)}` : '0.00',
        trendClass: metric.trend > 0 ? 'positive' : metric.trend < 0 ? 'negative' : 'neutral'
      }))
    },
    
    handleWorkerBiasDetection(biasInfo) {
      // Surface bias alerts from the Worker (Element UI exports Message, not ElMessage)
      if (biasInfo.severity === 'high') {
        Message.warning(`Severe bias detected: ${biasInfo.message}`)
      } else if (biasInfo.severity === 'medium') {
        Message.info(`Bias detected: ${biasInfo.message}`)
      }
    },
    
    calculateMetricsLocally(data) {
      // 主线程计算指标(Web Worker不可用时的降级方案)
      
      // 使用requestIdleCallback避免阻塞主线程
      if ('requestIdleCallback' in window) {
        requestIdleCallback(() => {
          const metrics = this.computeFairnessMetrics(data)
          this.computedMetrics = metrics
        })
      } else {
        // 降级:直接计算
        setTimeout(() => {
          const metrics = this.computeFairnessMetrics(data)
          this.computedMetrics = metrics
        }, 0)
      }
    },
    
    computeFairnessMetrics(data) {
      // 简化的公平性指标计算
      const groups = ['Male', 'Female', 'Other']
      const metrics = []
      
      // 计算各群体雇佣率
      for (const group of groups) {
        const groupData = data.filter(d => d.group === group)
        if (groupData.length === 0) continue
        
        const hireRate = groupData.filter(d => d.hired).length / groupData.length
        
        metrics.push({
          name: `${group}雇佣率`,
          value: hireRate,
          trend: Math.random() * 0.1 - 0.05  // 模拟趋势
        })
      }
      
      // 计算公平性差异
      if (metrics.length >= 2) {
        const maxRate = Math.max(...metrics.map(m => m.value))
        const minRate = Math.min(...metrics.map(m => m.value))
        const disparity = maxRate - minRate
        
        metrics.push({
          name: '最大群体差异',
          value: disparity,
          trend: Math.random() * 0.05 - 0.025
        })
      }
      
      return metrics
    },
    
    initChart() {
      // 初始化Canvas图表
      const canvas = this.$refs.fairnessChart
      const ctx = canvas.getContext('2d')
      
      // 设置Canvas大小
      this.resizeChartCanvas()
      
      // 绘制初始图表
      this.drawChart()
    },
    
    resizeChartCanvas() {
      // 根据容器大小调整Canvas
      const canvas = this.$refs.fairnessChart
      const container = canvas.parentElement
      
      canvas.width = container.clientWidth * window.devicePixelRatio
      canvas.height = container.clientHeight * window.devicePixelRatio
      
      canvas.style.width = `${container.clientWidth}px`
      canvas.style.height = `${container.clientHeight}px`
    },
    
    drawChart() {
      // 使用Canvas绘制图表
      const canvas = this.$refs.fairnessChart
      const ctx = canvas.getContext('2d')
      
      // 清除画布
      ctx.clearRect(0, 0, canvas.width, canvas.height)
      
      // 绘制图表内容
      // 这里简化实现,实际应该使用图表库或自定义绘制
      
      // Example: a simple bar chart
      const metrics = this.computedMetrics
      if (!metrics.length) return
      
      // Work in CSS pixels, since the context below is scaled by devicePixelRatio
      const cssWidth = canvas.width / window.devicePixelRatio
      const cssHeight = canvas.height / window.devicePixelRatio
      const barWidth = cssWidth / (metrics.length * 1.5)
      const maxValue = Math.max(...metrics.map(m => m.value), 1)
      
      ctx.save()
      ctx.scale(window.devicePixelRatio, window.devicePixelRatio)
      
      metrics.forEach((metric, index) => {
        const x = index * barWidth * 1.5 + 50
        const barHeight = (metric.value / maxValue) * (cssHeight - 100)
        const y = cssHeight - barHeight - 50
        
        // Draw the bar
        ctx.fillStyle = this.getMetricColor(metric.value, maxValue)
        ctx.fillRect(x, y, barWidth, barHeight)
        
        // Draw the labels
        ctx.fillStyle = '#333'
        ctx.font = '12px Arial'
        ctx.textAlign = 'center'
        ctx.fillText(metric.name, x + barWidth / 2, cssHeight - 30)
        ctx.fillText(metric.value.toFixed(3), x + barWidth / 2, y - 10)
      })
      
      ctx.restore()
    },
    
    getMetricColor(value, maxValue) {
      // 根据值返回颜色
      const ratio = value / maxValue
      
      if (ratio > 0.8) return '#67c23a'  // 绿色
      if (ratio > 0.6) return '#e6a23c'  // 黄色
      if (ratio > 0.4) return '#f56c6c'  // 红色
      return '#909399'  // 灰色
    },
    
    handleChartZoom(event) {
      // 处理图表缩放
      event.preventDefault()
      
      const delta = event.deltaY > 0 ? 0.9 : 1.1
      this.zoomLevel = Math.max(0.1, Math.min(5, this.zoomLevel * delta))
      
      // 重绘图表
      this.drawChart()
    },
    
    startPerformanceMonitoring() {
      // Count real rendered frames via requestAnimationFrame; incrementing a
      // counter inside a 1-second interval would always report roughly 1 FPS
      const countFrame = () => {
        this.frameCount++
        this.rafId = requestAnimationFrame(countFrame)
      }
      this.rafId = requestAnimationFrame(countFrame)
      
      this.performanceInterval = setInterval(() => {
        this.updatePerformanceMetrics()
      }, 1000)
    },
    
    stopPerformanceMonitoring() {
      if (this.performanceInterval) {
        clearInterval(this.performanceInterval)
      }
      if (this.rafId) {
        cancelAnimationFrame(this.rafId)
      }
    },
    
    updatePerformanceMetrics() {
      // Derive FPS from the frames counted since the last tick
      const now = performance.now()
      const delta = now - this.lastTime
      
      if (delta >= 1000) {
        this.fps = (this.frameCount * 1000) / delta
        this.frameCount = 0
        this.lastTime = now
      }
      
      // Update memory usage (non-standard performance.memory API, Chrome only)
      if (window.performance && window.performance.memory) {
        this.memoryUsage = window.performance.memory.usedJSHeapSize
      }
    },
    
    formatMemory(bytes) {
      const units = ['B', 'KB', 'MB', 'GB']
      let size = bytes
      let unitIndex = 0
      
      while (size >= 1024 && unitIndex < units.length - 1) {
        size /= 1024
        unitIndex++
      }
      
      return `${size.toFixed(1)} ${units[unitIndex]}`
    },
    
    handleResize() {
      // 处理窗口大小变化
      this.resizeChartCanvas()
      this.drawChart()
      
      // 更新虚拟滚动
      this.updateVisibleData()
    },
    
    handleVisibleChange(startIndex, endIndex) {
      // RecycleScroller's 'update' event reports the rendered index range
      this.visibleData = this.allData.slice(startIndex, endIndex + 1)
    }
  }
}
</script>

<style scoped>
.high-performance-fairness-dashboard {
  position: relative;
  height: 100vh;
  display: flex;
  flex-direction: column;
  background: #f5f7fa;
}

.virtualized-table-container {
  flex: 1;
  min-height: 300px;
  border: 1px solid #ebeef5;
  border-radius: 4px;
  background: white;
  overflow: hidden;
}

.scroller {
  height: 100%;
}

.metrics-display {
  display: flex;
  gap: 16px;
  padding: 16px;
  background: white;
  border-bottom: 1px solid #ebeef5;
}

.metric-card {
  flex: 1;
  padding: 12px;
  background: #f5f7fa;
  border-radius: 4px;
  text-align: center;
}

.metric-card h4 {
  margin: 0 0 8px 0;
  font-size: 14px;
  color: #606266;
}

.metric-value {
  font-size: 24px;
  font-weight: bold;
  color: #303133;
  margin-bottom: 4px;
}

.metric-trend.positive {
  color: #67c23a;
}

.metric-trend.negative {
  color: #f56c6c;
}

.metric-trend.neutral {
  color: #909399;
}

.chart-container {
  flex: 2;
  min-height: 400px;
  padding: 16px;
  background: white;
  border-radius: 4px;
  margin: 16px;
  border: 1px solid #ebeef5;
}

.chart-container canvas {
  width: 100%;
  height: 100%;
}

.update-indicator {
  position: fixed;
  bottom: 20px;
  right: 20px;
  padding: 8px 16px;
  background: white;
  border-radius: 20px;
  box-shadow: 0 2px 12px rgba(0, 0, 0, 0.1);
  display: flex;
  align-items: center;
  gap: 8px;
  font-size: 14px;
  z-index: 1000;
}

.update-indicator.updating {
  background: #f0f9eb;
  color: #67c23a;
}

.last-update {
  color: #909399;
  font-size: 12px;
}

.performance-monitor {
  position: fixed;
  top: 20px;
  right: 20px;
  padding: 8px 12px;
  background: rgba(0, 0, 0, 0.7);
  color: white;
  border-radius: 4px;
  font-size: 12px;
  z-index: 1000;
}

.performance-metric {
  display: flex;
  justify-content: space-between;
  margin-bottom: 4px;
}

.performance-metric:last-child {
  margin-bottom: 0;
}

.good { color: #67c23a; }
.warning { color: #e6a23c; }
.danger { color: #f56c6c; }
</style>
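The component imports `./fairness.worker.js` (via a bundler worker loader such as worker-loader), but that file never appears in the article. A minimal sketch compatible with the message protocol used above might look like this; the metric logic is a placeholder assumption, not the full implementation:

javascript
// fairness.worker.js — minimal sketch of the Worker imported by the component.
// The message shapes ('init', 'calculate_metrics', 'update_data') mirror the
// component's postMessage calls.
let allData = []
let maxDataPoints = 10000

function computeMetrics(data) {
  // Per-group hire rates, shaped for handleWorkerMetrics (name/value/trend)
  const groups = [...new Set(data.map(d => d.group))]
  return groups.map(group => {
    const rows = data.filter(d => d.group === group)
    const rate = rows.length ? rows.filter(d => d.hired).length / rows.length : 0
    return { name: `${group} hire rate`, value: rate, trend: 0 }
  })
}

self.onmessage = (event) => {
  const { type, data, config } = event.data
  switch (type) {
    case 'init':
      maxDataPoints = (config && config.maxDataPoints) || maxDataPoints
      self.postMessage({ type: 'worker_ready' })
      break
    case 'calculate_metrics':
      allData = data
      self.postMessage({ type: 'metrics_calculated', data: computeMetrics(allData) })
      break
    case 'update_data':
      allData = allData.concat(data).slice(-maxDataPoints)
      self.postMessage({ type: 'metrics_calculated', data: computeMetrics(allData) })
      break
  }
}

A production worker would additionally compare group rates against a threshold and post 'bias_detected' messages with a severity field, matching handleWorkerBiasDetection in the component.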

Summary of key optimization techniques

  1. Virtual scrolling: render large lists efficiently with libraries such as vue-virtual-scroller.

  2. Web Workers: move compute-heavy tasks (fairness metrics, bias detection) off the main thread.

  3. Canvas rendering: for complex charts, Canvas outperforms SVG/DOM rendering.

  4. Incremental updates: update only the data that changed, not the whole dataset.

  5. Throttling and debouncing: rate-limit frequently fired events such as scroll and resize (see the sketch after this list).

  6. Memory management: release stale data and event listeners promptly to prevent leaks.

  7. Idle-time work: use requestIdleCallback to run non-critical tasks while the browser is idle.

  8. Pagination and lazy loading: load and render only the data currently visible.
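As a concrete instance of point 5, here is a minimal leading-edge throttle sketch; in practice a utility such as lodash's throttle, which also handles trailing calls, is usually preferable:

javascript
// Minimal leading-edge throttle: run fn at most once every `wait` milliseconds
function throttle(fn, wait) {
  let last = 0
  return function (...args) {
    const now = Date.now()
    if (now - last >= wait) {
      last = now
      fn.apply(this, args)
    }
  }
}

// Hypothetical usage in the dashboard component:
// window.addEventListener('resize', throttle(this.handleResize.bind(this), 200))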

07 Conclusion: Building Responsible AI Systems

Fairness testing is not only a technical challenge but an ethical responsibility. With the Python, Java, and Vue full-stack approach presented in this article, we can build a systematic mechanism for bias detection and fairness assurance:

  1. Layered detection framework: from data-bias detection to algorithmic fairness testing, covering the full AI system lifecycle.

  2. Multi-dimensional fairness metrics: balancing statistical, causal, and business fairness.

  3. Full-stack implementation: Python/Java handle computation on the backend, while Vue provides interactive visualization on the frontend.

  4. Practice-oriented methods: complete solutions for real business scenarios such as recruitment AI.

  5. Continuous optimization: engineering answers to large-scale data processing and real-time updates.

Looking ahead: as AI-ethics regulation matures and the technology advances, fairness testing will become more automated and standardized. Federated learning, explainable AI, and causal inference will offer ever stronger tools. Yet however the field evolves, the core principle stands: build AI systems that are fair and accountable to everyone.
