【CVPR2023】人像卡通化(2D图像->3D卡通)

1. 3DAvatarGAN Bridging Domains for Personalized Editable Avatars

Affiliation: KAUST (Peter Wonka), Snap Inc. (Hsin-Ying Lee, Menglei Chai, Aliaksandr Siarohin, Sergey Tulyakov)

Authors: Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov

Keywords: 3D-GAN, personalized avatars, artistic datasets, deformation-based modeling, inversion method

Summary:

(1) This paper addresses the domain of 3D cartoon character modeling, aiming to provide a method for generating and editing personalized 3D cartoon characters from a single 2D image.

(2) Previous methods primarily relied on known 3D models and image pairs to generate 3D characters, requiring extensive 3D models and parameter annotations, which produced low-quality cartoon characters. This paper proposes a method using existing 2D artistic datasets and 3D GAN models, which does not need camera parameter annotations and is capable of generating and editing high-quality personalized cartoon characters.

(3) The paper presents an optimization method based on adjusting camera parameter distributions and texture quality, alongside deformation modeling techniques, which allows the knowledge of 2D GAN generators to be transferred to 3D generators, hence enabling the creation and editing of 3D cartoon characters.

(4) The method has been tested on artistic datasets such as Caricatures, Pixar toons, Cartoons, Comics, etc., achieving high-quality generation results. The paper also introduces a novel 3D-GAN inversion method, achieving a latent space mapping from the target domain to the source domain.

Methods:

(1) The paper proposes a method for generating personalized 3D cartoon characters from 2D images based on a 3D GAN model, which does not require camera parameter annotations and can generate high-quality 3D cartoon characters.

(2) The method achieves this by adapting the knowledge of 2D GAN generators to 3D generators through deformation modeling techniques and an optimization method that adjusts camera parameter distributions and texture quality. A new 3D-GAN inversion method is also provided, which maps the latent space from the target domain to the source domain.

(3) Tested on artistic datasets, the method has yielded high-quality results and offers personalized editing capabilities, supporting local adjustments based on geometry and semantics while maintaining the identity of the target.

Conclusion:

(1) The paper proposes a method for generating personalized 3D cartoon characters from 2D images, as well as a domain adaptation method for 3D GANs, filling a gap in the field of cartoon character generation. The method contributes to improving the quality and efficiency of 3D cartoon character generation.

(2) Innovation: The paper introduces a new method for generating and editing 3D cartoon characters based on deformation modeling techniques without needing camera parameter annotations, and a novel 3D-GAN inversion method, tackling the challenges in cartoon character generation. Performance: The experimental results show that the method can generate high-quality personalized 3D cartoon characters and supports local adjustments while maintaining the identity of the target. Workload: The method requires optimization of factors such as camera parameter distributions and texture quality, and it has been tested on a limited number of datasets, necessitating further validation and expansion.

2. 人像卡通化 (Photo to Cartoon)实现原理

有个公式做这个的:这个项目名叫「人像卡通化 (Photo to Cartoon)」,已经在 GitHub 上开源。但对于不想动手下载各种软件、数据集、训练模型的普通用户,该公司开放了一个名为「AI 卡通秀」的小程序,可以生成各种风格的卡通照片、gif 表情包,完全可以满足社交需求。

此外还有一个开源项目

2.1 定义

人像卡通风格渲染的目标是,在保持原图像 ID 信息和纹理细节的同时,将真实照片转换为卡通风格的非真实感图像。

2.2 图像卡通化任务面临着一些难题

卡通图像往往有清晰的边缘,平滑的色块和经过简化的纹理,与其他艺术风格有很大区别。使用传统图像处理技术生成的卡通图无法自适应地处理复杂的光照和纹理,效果较差;基于风格迁移的方法无法对细节进行准确地勾勒。

数据获取难度大。绘制风格精美且统一的卡通画耗时较多、成本较高,且转换后的卡通画和原照片的脸型及五官形状有差异,因此不构成像素级的成对数据,难以采用基于成对数据的图像翻译(Paired Image Translation)方法。

照片卡通化后容易丢失身份信息。基于非成对数据的图像翻译(Unpaired Image Translation)方法中的循环一致性损失(Cycle Loss)无法对输入输出的 id 进行有效约束。

那么如何解决这些问题呢?

小视科技的研究团队提出了一种基于生成对抗网络的卡通化模型,只需少量非成对训练数据,就能获得漂亮的结果。卡通风格渲染网络是该解决方案的核心,它主要由特征提取、特征融合和特征重建三部分组成。

整体框架由下图所示,该框架基于近期研究 U-GAT-IT(论文《U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation》。

2.3 详细原理与测试见论文与项目

论文《U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation》
项目代码与部署

相关推荐
代数狂人10 小时前
《深入浅出Godot 4与C# 3D游戏开发》第二章:编辑器导航
3d·编辑器·游戏引擎·godot
VBsemi-专注于MOSFET研发定制10 小时前
高端汽车零部件尺寸3D检测设备功率MOSFET选型方案:精密高效运动与成像电源驱动系统适配指南
3d·汽车
threelab12 小时前
从工厂模式到简化封装:三维引擎架构演进之路 threejs设计
javascript·3d·架构·webgl
三毛的二哥13 小时前
BEV:MapTR
人工智能·算法·计算机视觉·3d
地知通16 小时前
地下管网AR可视化更新:支持AR拍照、智能巡检、3dtiles等4项功能
3d·ar·数字孪生·地下管网·移动巡检·参数化建模
林恒smileZAZ1 天前
Three.js实现更真实的3D地球[特殊字符]动态昼夜交替
开发语言·javascript·3d
三维频道1 天前
不止于精度:汽车精密锻铸件质检的“数据降维”与方案重构
3d·重构·汽车·工业数字化·蓝光三维扫描仪·汽车供应链·汽车智能制造
yeflx1 天前
正射模块-odm_orthophoto
3d
Hali_Botebie1 天前
【3D 生成技术】3D Gaussian Splatting 的基础理解
3d
3DVisionary1 天前
【深度实测】从三坐标到全场扫描:汽车精密锻铸件 3D 质检的数字化进阶
3d·汽车·质量检测·精密制造·三维扫描仪·汽车工业·xtop3d