【DeepID】《Deep Learning Face Representation from Predicting 10,000 Classes》

CVPR-2014

Sun Y, Wang X, Tang X. Deep learning face representation from predicting 10,000 classes[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2014: 1891-1898.


Contents

  • [1、Background and Motivation](#1、Background and Motivation)
  • [2、Related Work](#2、Related Work)
  • [3、Advantages / Contributions](#3、Advantages / Contributions)
  • 4、Method
    • [4.1、Deep ConvNets](#4.1、Deep ConvNets)
    • [4.2、Feature extraction](#4.2、Feature extraction)
    • [4.3、Face verification](#4.3、Face verification)
  • 5、Experiments
    • [5.1、Datasets and Metrics](#5.1、Datasets and Metrics)
    • [5.2、Multi-scale ConvNets](#5.2、Multi-scale ConvNets)
    • [5.3、Learning effective features](#5.3、Learning effective features)
    • [5.4、Over-complete representation](#5.4、Over-complete representation)
    • [5.5、Method comparison](#5.5、Method comparison)
  • [6、Conclusion(own)/ Future work](#6、Conclusion(own)/ Future work)

1、Background and Motivation

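With the rapid progress of computer vision and deep learning, face verification, an important form of biometric recognition, has shown great potential in security surveillance, human-computer interaction, social media, and other fields.

In unconstrained environments, however, variations in lighting, pose, and expression make face images differ enormously, which makes accurate face verification very difficult.

Traditional methods built on low-level feature extraction and shallow models (over-complete low-level features, followed by shallow models) perform poorly under such complex variation, so stronger feature representations and models are needed to improve the accuracy and robustness of face verification.

This paper proposes learning high-level face representations with deep learning, called Deep hidden IDentity features (DeepID), to improve the accuracy and generalization of face verification.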

2、Related Work

  • high dimensional over-complete face descriptors, followed by shallow models.
    • 26K learning-based (LE) descriptors
    • 1.7M SIFT
    • 1.2M CMD
  • learned identity related features based on low-level features
  • deep models

3、Advantages / Contributions

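  • Proposes the DeepID feature representation. Compared with traditional approaches based on low-level features or shallow models, DeepID captures richer, more intrinsic identity information and noticeably improves face verification accuracy.
  • DeepID generalizes to a degree: it can be generalized to other tasks (such as verification) and new identities unseen in the training set.
  • Strong results on LFW: 97.45% verification accuracy on LFW is achieved with only weakly aligned faces.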

4、Method

Features are extracted from many face regions to form complementary, over-complete representations.

Within each ConvNet, highly compact and discriminative features are acquired.

4.1、Deep ConvNets

Feature extraction uses a ConvNet with four convolutional layers; the inputs are different face patches.

Rectangular patches are 39×31×k and square patches are 31×31×k, with k = 3 for color patches and k = 1 for gray patches.
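To get a feel for how quickly these small patches shrink through four convolutional layers, here is a minimal shape-tracing sketch. The kernel sizes (4, 3, 3, 2), the 2×2 max pooling after the first three convolutions, and the stride-1, no-padding convention are assumptions recalled from the paper's architecture figure; none of them is stated in these notes. The helper names are made up for illustration.

```python
def conv_out(size, kernel):
    """Output length of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool_out(size, window):
    """Output length of non-overlapping max pooling."""
    return size // window


def trace_shapes(height, width, conv_kernels=(4, 3, 3, 2), pool=2):
    """Return the (height, width) of the feature map after each conv layer.

    Kernel sizes and pooling window are assumptions (see the text above).
    Pooling is applied after every conv layer except the last one.
    """
    shapes = []
    for index, kernel in enumerate(conv_kernels):
        height, width = conv_out(height, kernel), conv_out(width, kernel)
        shapes.append((height, width))
        if index < len(conv_kernels) - 1:
            height, width = pool_out(height, pool), pool_out(width, pool)
    return shapes


rect_shapes = trace_shapes(39, 31)    # rectangular patch
square_shapes = trace_shapes(31, 31)  # square patch
print(rect_shapes)
print(square_shapes)
```

Under these assumptions the last convolutional map is only a couple of pixels in size, which fits the idea of squeezing the input into a compact representation.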

The features extracted from different face regions are complementary and further boost the performance

The paper gives an explicit formula for the convolution operation.

The activation function is ReLU.
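For reference, a convolutional layer followed by ReLU is commonly written as below. The notation is an assumption chosen for this note ($x^{i}$ is the $i$-th input map, $y^{j}$ the $j$-th output map, $k^{ij}$ the kernel between them, $b^{j}$ the bias, and $*$ denotes convolution); it may differ in detail from the paper's own equation.

```latex
y^{j} = \max\Bigl(0,\; b^{j} + \sum_{i} k^{ij} * x^{i}\Bigr)
```

The outer $\max(0, \cdot)$ is the ReLU nonlinearity.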

Max pooling is also given as a formula in the paper.
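As a rough illustration of max pooling, the sketch below takes the maximum over each non-overlapping `size × size` window of a single feature map. It is plain Python on nested lists, and `max_pool_2d` is a hypothetical helper, not code from the paper.

```python
def max_pool_2d(feature_map, size=2):
    """Non-overlapping max pooling over a 2-D list of numbers.

    Rows and columns that do not fill a whole window are dropped.
    """
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(
                feature_map[r * size + m][c * size + n]
                for m in range(size)
                for n in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


example = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 1],
]
pooled = max_pool_2d(example)
print(pooled)  # [[4, 5], [6, 7]]
```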

A bypass (shortcut) connection links conv3 directly to the last hidden layer, so the DeepID layer is connected to both conv3 and conv4. Nice touch.

The ConvNet output is an n-way softmax predicting the probability distribution over n different identities
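The n-way softmax can be sketched in a few lines. This is a generic, numerically stabilized softmax rather than the authors' implementation, and the three scores are made-up values standing in for the n identity logits.

```python
import math


def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for n = 3 identities.
probs = softmax([2.0, 1.0, 0.1])
predicted_identity = probs.index(max(probs))
print(probs, predicted_identity)
```

During training the network is pushed to put high probability on the correct identity; at test time the softmax layer is discarded and only the hidden features are kept.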

4.2、Feature extraction

Features are extracted from 60 face patches with ten regions, three scales, and RGB or gray channels

10 x 3 x 2 = 60 patches

The total length of DeepID is 19,200 (160×2×60), which is ready for the final face verification

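Each of the 60 patch types gets its own network, so 60 networks are trained. Each network outputs a 160-dimensional feature, and the factor of 2 comes from also extracting features from the horizontally flipped patch.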

We trained 60 ConvNets, each of which extracts two 160-dimensional DeepID vectors from a particular patch and its horizontally flipped counterpart
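The arithmetic behind the patch count and the total feature length can be checked directly. The constant names below are made up for this sketch; the numbers are the ones quoted above.

```python
REGIONS = 10               # face regions
SCALES = 3                 # scales per region
CHANNEL_TYPES = 2          # RGB or gray
FEATURES_PER_VECTOR = 160  # DeepID dimension per ConvNet
VIEWS = 2                  # original patch + horizontally flipped patch

num_patches = REGIONS * SCALES * CHANNEL_TYPES
deepid_length = FEATURES_PER_VECTOR * VIEWS * num_patches

print(num_patches)    # 60
print(deepid_length)  # 19200
```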

4.3、Face verification

Either Joint Bayesian or a neural network can be used. The input is the features extracted by the ConvNets, and the output is the 1:1 verification result.

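(1) Joint Bayesian

The core of the method is a log-likelihood ratio between the intra-personal and extra-personal hypotheses.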
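For reference, the usual Joint Bayesian formulation models a face feature $x$ as the sum of an identity component $\mu$ and an intra-personal variation $\epsilon$, both zero-mean Gaussians, and scores a pair by the log-likelihood ratio between the same-person hypothesis $H_I$ and the different-person hypothesis $H_E$. The notation below is an assumption for this note and may differ in detail from the paper's.

```latex
x = \mu + \epsilon, \qquad \mu \sim N(0, S_{\mu}), \quad \epsilon \sim N(0, S_{\epsilon})

r(x_1, x_2) = \log \frac{P(x_1, x_2 \mid H_I)}{P(x_1, x_2 \mid H_E)}
```

A larger $r(x_1, x_2)$ means the two faces are more likely to belong to the same person; thresholding it gives the verification decision.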

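(2) Neural network

Note how the 640 is computed. There are 60 patches, and each patch yields 160 features, or only 320 once the horizontal flip is included, so where does 640 come from?

Face verification takes two faces as input each time, so each patch contributes 320 × 2 = 640 features.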
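The 640 can be reproduced by building the input group for one patch by hand. This is only a sketch: `patch_pair_group` and `dummy_face` are hypothetical helpers, and the order in which the four vectors are concatenated is an assumption.

```python
FEATURES_PER_VECTOR = 160


def dummy_face(value):
    """Stand-in for the two DeepID vectors one ConvNet extracts from one face."""
    return {
        "original": [value] * FEATURES_PER_VECTOR,
        "flipped": [value] * FEATURES_PER_VECTOR,
    }


def patch_pair_group(face_a, face_b):
    """Concatenate original and flipped vectors of both faces for one patch."""
    return (
        face_a["original"] + face_a["flipped"]
        + face_b["original"] + face_b["flipped"]
    )


group = patch_pair_group(dummy_face(0.0), dummy_face(1.0))
group_length = len(group)
print(group_length)  # 640
```

With 60 patches there are 60 such groups, one per ConvNet.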

5、Experiments

5.1、Datasets and Metrics

LFW

  • 5749 people; only 85 have more than 15 images, and 4069 people have only one image

CelebFaces

  • 87,628 face images of 5436 celebrities from the Internet, with approximately 16 images per person on average.

CelebFaces+

  • CelebFaces extended to CelebFaces+, which contains 202,599 face images of 10,177 celebrities

The model is trained on CelebFaces and evaluated on LFW.

We randomly choose 80% (4349) people from CelebFaces to learn the DeepID, and use the remaining 20% people to learn the face verification model (Joint Bayesian or neural networks).
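The split is by identity, not by image, so no person appears on both sides. A minimal sketch of such a split is below; `split_identities`, the fixed seed, and the use of integer IDs are assumptions for illustration, not the authors' procedure.

```python
import random


def split_identities(identity_ids, train_fraction=0.8, seed=0):
    """Randomly split identities into two disjoint groups."""
    ids = list(identity_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    cut = round(train_fraction * len(ids))
    return ids[:cut], ids[cut:]


# CelebFaces has 5436 identities; integer IDs stand in for the real people.
deepid_ids, verifier_ids = split_identities(range(5436))
print(len(deepid_ids), len(verifier_ids))  # 4349 1087
```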

Evaluation metrics

  • top-1 error rates
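The top-1 error rate is the fraction of samples whose highest-scoring class is not the true class. A small sketch with made-up scores follows; `top1_error_rate` is a hypothetical helper.

```python
def top1_error_rate(score_rows, true_labels):
    """Fraction of samples whose arg-max class differs from the true label."""
    if len(score_rows) != len(true_labels):
        raise ValueError("scores and labels must have the same length")
    wrong = 0
    for scores, label in zip(score_rows, true_labels):
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted != label:
            wrong += 1
    return wrong / len(true_labels)


# Four samples, three identity classes (made-up numbers).
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
]
labels = [0, 2, 1, 1]
error = top1_error_rate(scores, labels)
print(error)  # 0.25
```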

5.2、Multi-scale ConvNets

The lower error rates indicate the better hidden features learned.

5.3、Learning effective features

Even as the number of identities grows, a hidden layer of unchanged size still holds up.

More identity classes help to learn better hidden representations that can distinguish more people (discriminative) without increasing the feature length (compact).

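The learned 160-dimensional hidden features (far fewer dimensions than the number of training identities) can be visualized.

Faces of the same identity have more similar activations (shown in white), while faces of different identities activate differently.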

5.4、Over-complete representation

Features from k patches are combined for verification:

  • best performing single patch (k = 1)
  • global color patches in a single scale (k = 5)
  • all the global color patches (k = 15)
  • all the color patches (k = 30)
  • all the patches (k = 60)

The curves show that the performance may be further improved if more features are extracted.
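The values of k follow from how the 60 patches are organized. The sketch below enumerates them and counts each subset; it assumes the ten regions split into five global and five local ones, which comes from the paper's patch figure and is not stated in these notes. The field names are made up.

```python
from itertools import product

# Assumption: 5 global regions + 5 local regions = 10 regions.
REGIONS = [("global", i) for i in range(5)] + [("local", i) for i in range(5)]
SCALES = [0, 1, 2]
CHANNELS = ["color", "gray"]

patches = [
    {"kind": kind, "region": idx, "scale": scale, "channel": channel}
    for (kind, idx), scale, channel in product(REGIONS, SCALES, CHANNELS)
]


def count(predicate):
    """Number of patches satisfying the predicate."""
    return sum(1 for p in patches if predicate(p))


subset_sizes = {
    "global color, single scale": count(
        lambda p: p["kind"] == "global" and p["channel"] == "color" and p["scale"] == 0
    ),
    "all global color": count(
        lambda p: p["kind"] == "global" and p["channel"] == "color"
    ),
    "all color": count(lambda p: p["channel"] == "color"),
    "all patches": len(patches),
}
print(subset_sizes)
```

The counts come out as 5, 15, 30, and 60, matching the k values listed for the combined-patch settings.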

5.5、Method comparison

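Number of points refers to the number of facial landmarks used for face alignment. Some methods rely on heavy preprocessing (e.g., one utilized 3D alignment and pose transform), while others use just five simple landmarks on the eyes, nose, and mouth.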

Low feature dimensions indicate efficient face recognition systems

The feature dimension here is only 150.

6、Conclusion(own)/ Future work

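  • Reference material: 人脸识别合集 | 2 DeepID解析 (Face Recognition Collection | 2: DeepID Explained)
  • Authors: Yi Sun, Xiaoou Tang, and Xiaogang Wang of CUHK (The Chinese University of Hong Kong)
  • Q: What is the difference between face identification and face recognition?
  • A: Face identification is mainly a one-to-many matching process, while face recognition covers a broader set of steps, including face detection, preprocessing, feature extraction, and matching.
  • Face Identification, 1:N
  • Face Verification, 1:1
  • Pay attention to how the feature dimension is computed: 160 per patch is not much, but with 60 patches per image plus the horizontal flip, the extracted feature is still large, 60 × 2 × 160 = 19,200.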

For more paper walkthroughs, see 【Paper Reading】.
