A GAN-Based Text-to-Image Algorithm Explained: ControlGAN (Controllable Text-to-Image Generation)

Video walkthrough 1: Bilibili video walkthrough

Video walkthrough 2: https://www.douyin.com/video/7600973855217208610?count=10&cursor=0&enter_method=post&modeFrom=userPost&previous_page=personal_homepage&secUid=MS4wLjABAAAA0NVS_BfnZjuBUqHzrh-1oSxoNxExvuesrznu1Wu4-fc

Paper: https://arxiv.org/abs/1909.07083

Code: https://github.com/mrlibw/ControlGAN

Paper walkthrough: GALIP: Generative Adversarial CLIPs for Text-to-Image Synthesis (with code walkthrough)

Paper walkthrough: Generative Adversarial Text to Image Synthesis

Paper walkthrough: DF-GAN: A Simple and Effective Baseline for Text-to-Image Synthesis

Paper walkthrough: StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Paper walkthrough: StackGAN++

Paper walkthrough: HDGAN (Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network)

Visual-semantic similarity evaluation (similarity between text and images, HDGAN)

Paper walkthrough: AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

Text and image encoders (AttnGAN) explained

Describing images with text (MirrorGAN)

Paper walkthrough: MirrorGAN: Learning Text-to-image Generation by Redescription

GAN-based text-to-image generation (DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis)

A unified image generation framework based on supervised contrastive learning (A Framework For Image Synthesis Using Supervised Contrastive Learning)

A GAN-based text-to-image algorithm explained (Text to Image Generation with Semantic-Spatial Aware GAN)

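This article reviews several GAN-based text-to-image models and analyzes where existing methods fall short in controllability and fine-grained control. Models such as StackGAN++ and AttnGAN suffer from uncontrollable generation and entangled attributes: changing one word in the text can change unrelated parts of the image. To address this, ControlGAN introduces a channel-wise attention mechanism and a word-level discriminator, which strengthen the model's focus on the semantic parts that each word refers to, and adopts a perceptual loss to reduce randomness and keep unmodified content consistent. Experiments show that the method achieves more accurate text-image alignment and keeps the remaining visual content stable when a specific attribute is modified. The work offers a new approach to improving the controllability and generation quality of text-to-image models.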
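The channel-wise attention mentioned above can be illustrated with a small NumPy sketch. This is only a minimal sketch based on one reading of the ControlGAN paper, not the authors' implementation: the function name `channel_wise_attention`, the variable names, the tensor shapes, and the random matrix `F` that stands in for the learned perception layer are all assumptions made for illustration. The idea is that each feature channel (rather than each spatial location) computes a softmax over the words, so a channel can specialize in the words it correlates with most.

```python
import numpy as np

def channel_wise_attention(v, w, F):
    """Hypothetical sketch of channel-wise attention.

    v: (C, H*W) visual features, one row per channel
    w: (D, L)   word features, one column per word
    F: (H*W, D) stand-in for the learned perception layer (assumption)
    """
    # Map word features into the visual feature space: (H*W, L)
    w_tilde = F @ w
    # Correlation between every channel and every word: (C, L)
    m = v @ w_tilde
    # Softmax over the word axis; subtract the row max for numerical stability
    m = m - m.max(axis=1, keepdims=True)
    alpha = np.exp(m)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    # Word-weighted features for each channel: (C, H*W)
    f = alpha @ w_tilde.T
    return f, alpha

# Toy sizes chosen only for illustration
rng = np.random.default_rng(0)
C, H, W, D, L = 8, 4, 4, 16, 5
v = rng.standard_normal((C, H * W))
w = rng.standard_normal((D, L))
F = 0.1 * rng.standard_normal((H * W, D))

f, alpha = channel_wise_attention(v, w, F)
print(f.shape, alpha.shape)
```

Each row of `alpha` is a distribution over the words for one channel, and `f` has the same shape as `v`, so it could be combined with the visual features in a later generator stage. In a real model `F` would be a trained layer and the inputs would come from the text and image encoders.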