【CVPR2023】人像卡通化(2D图像->3D卡通)

1. 3DAvatarGAN Bridging Domains for Personalized Editable Avatars

Affiliation: KAUST (Peter Wonka), Snap Inc. (Hsin-Ying Lee, Menglei Chai, Aliaksandr Siarohin, Sergey Tulyakov)

Authors: Rameen Abdal, Hsin-Ying Lee, Peihao Zhu, Menglei Chai, Aliaksandr Siarohin, Peter Wonka, Sergey Tulyakov

Keywords: 3D-GAN, personalized avatars, artistic datasets, deformation-based modeling, inversion method

Summary:

(1) This paper addresses the domain of 3D cartoon character modeling, aiming to provide a method for generating and editing personalized 3D cartoon characters from a single 2D image.

(2) Previous methods primarily relied on known 3D models and image pairs to generate 3D characters, requiring extensive 3D models and parameter annotations, which produced low-quality cartoon characters. This paper proposes a method using existing 2D artistic datasets and 3D GAN models, which does not need camera parameter annotations and is capable of generating and editing high-quality personalized cartoon characters.

(3) The paper presents an optimization method based on adjusting camera parameter distributions and texture quality, alongside deformation modeling techniques, which allows the knowledge of 2D GAN generators to be transferred to 3D generators, hence enabling the creation and editing of 3D cartoon characters.

(4) The method has been tested on artistic datasets such as Caricatures, Pixar toons, Cartoons, Comics, etc., achieving high-quality generation results. The paper also introduces a novel 3D-GAN inversion method, achieving a latent space mapping from the target domain to the source domain.

Methods:

(1) The paper proposes a method for generating personalized 3D cartoon characters from 2D images based on a 3D GAN model, which does not require camera parameter annotations and can generate high-quality 3D cartoon characters.

(2) The method achieves this by adapting the knowledge of 2D GAN generators to 3D generators through deformation modeling techniques and an optimization method that adjusts camera parameter distributions and texture quality. A new 3D-GAN inversion method is also provided, which maps the latent space from the target domain to the source domain.

(3) Tested on artistic datasets, the method has yielded high-quality results and offers personalized editing capabilities, supporting local adjustments based on geometry and semantics while maintaining the identity of the target.

Conclusion:

(1) The paper proposes a method for generating personalized 3D cartoon characters from 2D images, as well as a domain adaptation method for 3D GANs, filling a gap in the field of cartoon character generation. The method contributes to improving the quality and efficiency of 3D cartoon character generation.

(2) Innovation: The paper introduces a new method for generating and editing 3D cartoon characters based on deformation modeling techniques without needing camera parameter annotations, and a novel 3D-GAN inversion method, tackling the challenges in cartoon character generation. Performance: The experimental results show that the method can generate high-quality personalized 3D cartoon characters and supports local adjustments while maintaining the identity of the target. Workload: The method requires optimization of factors such as camera parameter distributions and texture quality, and it has been tested on a limited number of datasets, necessitating further validation and expansion.

2. 人像卡通化 (Photo to Cartoon)实现原理

有个公式做这个的:这个项目名叫「人像卡通化 (Photo to Cartoon)」,已经在 GitHub 上开源。但对于不想动手下载各种软件、数据集、训练模型的普通用户,该公司开放了一个名为「AI 卡通秀」的小程序,可以生成各种风格的卡通照片、gif 表情包,完全可以满足社交需求。

此外还有一个开源项目

2.1 定义

人像卡通风格渲染的目标是,在保持原图像 ID 信息和纹理细节的同时,将真实照片转换为卡通风格的非真实感图像。

2.2 图像卡通化任务面临着一些难题

卡通图像往往有清晰的边缘,平滑的色块和经过简化的纹理,与其他艺术风格有很大区别。使用传统图像处理技术生成的卡通图无法自适应地处理复杂的光照和纹理,效果较差;基于风格迁移的方法无法对细节进行准确地勾勒。

数据获取难度大。绘制风格精美且统一的卡通画耗时较多、成本较高,且转换后的卡通画和原照片的脸型及五官形状有差异,因此不构成像素级的成对数据,难以采用基于成对数据的图像翻译(Paired Image Translation)方法。

照片卡通化后容易丢失身份信息。基于非成对数据的图像翻译(Unpaired Image Translation)方法中的循环一致性损失(Cycle Loss)无法对输入输出的 id 进行有效约束。

那么如何解决这些问题呢?

小视科技的研究团队提出了一种基于生成对抗网络的卡通化模型,只需少量非成对训练数据,就能获得漂亮的结果。卡通风格渲染网络是该解决方案的核心,它主要由特征提取、特征融合和特征重建三部分组成。

整体框架由下图所示,该框架基于近期研究 U-GAT-IT(论文《U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation》。

2.3 详细原理与测试见论文与项目

论文《U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation》
项目代码与部署

相关推荐
ykjhr_3d5 小时前
3D 演示动画在汽车培训与教育领域中的应用
3d·汽车
三月的一天9 小时前
React Three Fiber 实现 3D 模型点击高亮交互的核心技巧
3d·webgl·threejs·reactthreefiber
云空1 天前
《PyQt6-3D应用开发技术文档》
3d·pyqt
鹧鸪云光伏1 天前
光伏无人机3D建模:毫秒级精度设计
3d·无人机
杀生丸学AI1 天前
【三维生成】FlashDreamer:基于扩散模型的单目图像到3D场景
人工智能·3d·大模型·aigc·蒸馏与迁移学习·扩散模型与生成模型
gis分享者2 天前
学习threejs,使用自定义GLSL 着色器,生成漂流的3D能量球
3d·threejs·着色器·glsl·shadermaterial·能量球
m0_743106462 天前
【论文笔记】BlockGaussian:巧妙解决大规模场景重建中的伪影问题
论文阅读·计算机视觉·3d·aigc·几何学
向宇it2 天前
【unity小技巧】在 Unity 中将 2D 精灵添加到 3D 游戏中,并实现阴影投射效果,实现类《八分旅人》《饥荒》等等的2.5D游戏效果
游戏·3d·unity·编辑器·游戏引擎·材质
荔枝味啊~3 天前
相机位姿估计
人工智能·计算机视觉·3d
在下胡三汉3 天前
什么是 3D 文件?
3d