AI-Generated Code: A Full Review from Copilot to Claude Code

Original post: drublic.de/blog/ai-fro...

AI code-generation tools are riding a wave of popularity and adoption. Barely a week goes by without a new tool being launched or announced, and the big tech companies have all entered the field: Google, Microsoft, OpenAI, and Anthropic have each shipped their own AI coding solutions, making the space fiercely competitive. GitHub Copilot, one of the earliest tools, has amassed more than 15 million users in just a few years. With attention on AI-driven development tools like Claude Code still rising, now is a good moment to dig into these platforms and understand how they actually differ.

I spent some time systematically testing a number of these tools, and I am sharing my detailed observations and findings here.

| Metric | Lovable | Replit | Vercel v0 | base44 | Cursor | GitHub Copilot | Claude Code |
|---|---|---|---|---|---|---|---|
| Website | lovable.dev | replit.com | v0.dev | base44.com | cursor.com | github.com/copilot | www.claude.com/product/cla... |
| Generated output | speakit-core-mvp.lovable.app | 10609a1a-c1a1-4f69-8816-e84db3d05d38-00-efwvfsaqwnb.worf.replit.dev/ | v0-speakit-mvp-build.vercel.app/ | speakit-52ac2ce2.base44.app/ | - | - | - |
| Backend functionality | Yes, on request; Supabase DB, auth integrated | Yes, connected to Firebase | Yes | Yes, own implementations | Yes, partial implementation | Yes, partial implementation | Yes |
| Performance (Lighthouse) | Mobile 98 / Desktop 100 | Mobile 54 / Desktop 55 (dev version) | Mobile 92 / Desktop 93 | Mobile 84 / Desktop 88 | Mobile 84 / Desktop 100 | Mobile 98 / Desktop 100 | Mobile 98 / Desktop 100 |
| Code quality (lizard: LOC, complexity) | NLOC 4182, avg CCN 2.1, avg tokens 47.2, 250 functions; well organised; modern code; no tests included | NLOC 5368, avg CCN 1.9, avg tokens 44.3, 296 functions; well organised but somewhat unusual; modern code; no tests included | NLOC 1886, avg CCN 2.4, avg tokens 43.1, 117 functions; well organised; modern code; no tests included | - | NLOC 1433, avg CCN 2.3, avg tokens 41.3, 114 functions; well organised; modern code; no tests included | NLOC 614, avg CCN 3.5, avg tokens 50.6, 54 functions; well organised; modern code; no tests included | NLOC 614, avg CCN 3.5, avg tokens 50.6, 54 functions; well organised; modern code; no tests included |
| Export/portability | Each file individually; download to GitHub | Each file individually; download as zip | Each file individually; download as zip; CLI download | Download only in the $50 subscription plan, not even Starter | Code is right at your hand | Same as Cursor | Same as Cursor |
| Cost | $25/month for 100 credits; $50 for 200 | $25/month + pay as you go for additional usage | $20/month + pay as you go for additional usage | $25/month; backend only with the $50/month plan | $20/month | Free; next plan $10 for unlimited usage and latest models | $20/month, or pay as you go via API |
| Tech stack | TypeScript, React, Shadcn UI with Radix UI, Tailwind, Vite | TypeScript, React, Radix UI, Framer Motion, Tailwind, Vite | TypeScript, React, Next.js, Shadcn with Radix UI, Tailwind | - | TypeScript, React, Tailwind | TypeScript, React, Next.js, Tailwind | TypeScript, React, Next.js, Tailwind |
| Version control integration | Yes, GitHub, 2-way sync | No; there seems to be a .git folder, but I have not found it | Yes, GitHub, 2-way sync | No, only in the $50 subscription plan | Did not generate .git; manual work | Did not generate .git; manual work | Did not generate .git; manual work |
| Accessibility | 96 | 92 | 100 | 96 | 88 | 95 | 95 |
| Error handling | No (inserting 404 pages displays their content; no error message for PDFs >10 MB) | Yes | Yes | Yes | Yes | Yes | Yes |
| SEO | 100 | - (dev mode only) | 100 | 100 | 91 | 100 | 100 |
| Developer experience | 8/10; good; code visible, easy to change; not too verbose about what is going on | 5/10; a bit hard to find all the right options in the interface, a bit overloaded; changing code is possible | 9/10; good; code visible, easy to change, easy to change each individual element; verbose about what is going on | 2/10; possible to view the configuration of functionality, but no real code visible and no possibility of editing | 7/10; manual work involved; all code visible and easy to change; verbose about what is going on | 7/10; manual work involved; easy-to-read codebase; verbose about what is going on; no shiny interface | 7/10; manual work involved; all code visible and easy to change; verbose about what is going on |
| Iteration speed | Initial: 58 s; worked for 5:41 min; verify: 2:29 min | Initial: 2 min; worked for 14 min | Worked for 9:46 min; verify: 4:21 min | Worked for 3 min; verify: 1:40 min | Worked for roughly 7 min; verify: 3:47 min | Worked for roughly 16 min; verify: 2 min | Worked for roughly 14 min; verify: roughly 4 min |
| Automatic publishing | Yes; can set up a custom domain | No; publishing requires a subscription | Yes; can set up a custom domain; integrates with Vercel | Yes; can set up custom domains | No (editor) | No (editor) | No (editor) |
| Additional notes | + Security scanner found 1 issue (missing RLS protection for users); + page-speed analysis; + uses its own AI gateway to enable AI functionality; − components installed (e.g. "recharts") but not used | + Asks to do design first; + made the app usable in guest mode without a Firebase connection; − verify only possible with a subscription; − URL had to be pasted twice before the form submitted; − no README file; − components installed (e.g. "recharts") but not used | - | - | + Verbose output while working; − no Markdown upload; − initial build raised an error ("fs can't be resolved") that I fixed to get any result; − very minimalistic landing page | + Verbose output while working; − landing page has only minimal functionality; − no Markdown upload; − no mention of filling the .env file; − worked very long for little result | − A subscription is required even just to test; − no Markdown upload; − the text in the app was hard to read |
| Repository link | github.com/drublic/lov... | github.com/drublic/rep... | github.com/drublic/v0-... | - | github.com/drublic/cur... | | |

Hypothesis and Background

This experiment builds on the hypothesis from my earlier article, The AI Future of Frontend Development. I wanted to find out whether frontend development as a discipline will gradually be displaced or reshaped by new technology over time.

My core hypothesis: everything from design and the construction of design systems to production frontend code could eventually be done by AI, with experts guiding and supervising the whole process. Some industry leaders have voiced similar views; Anthropic CEO Dario Amodei has predicted that AI will eventually write up to 90% of the code for software engineers.

As more developers explore AI's potential and fold it into their workflows, we will no doubt see further innovation that keeps reshaping the industry. The numbers reflect this trend: in 2025, about 84% of developers said they were using, or planning to use, AI tools in their development work, and more than half of professional developers use such tools daily.

That moment now seems to have arrived.

As more tools emerge that can automatically generate attractive, performant, and accessible user interfaces, designers, product managers, and even engineers may increasingly reach for these tools instead of hiring a frontend engineer to build the interface for them. This marks a fundamental shift in how software is built, captured by the term Andrej Karpathy recently coined: "vibe coding". In this mode, developers no longer write code by hand; they describe the project to an AI in natural language, and the AI generates the code from those prompts. Developers judge the code by running and testing the result and let the AI keep refining it, rather than reviewing every line themselves.

This approach favors iterative experimentation over traditional code craftsmanship and lets non-programmers produce working software with AI. It also raises questions about maintainability, understanding, and accountability: the lack of human review can introduce security vulnerabilities and uncontrolled outcomes.

Experiment Scope and Structured Approach

The experiment focuses on web frontend work. I chose a greenfield project, deliberately avoiding complex long-lived codebases and multi-user collaboration features. It was a single-person test, to keep collaboration effects out of the picture as much as possible.

To keep the evaluation rigorous, I followed a structured test procedure. First, I used Google's Gemini AI model to generate a feature list for a new project called "Speakit" (as a Markdown file, which is easy for large language models to consume). I then fed the same prompt and this feature list into each tool under test.

Feature list file: speakit-feature-list.md

Example prompt:

> Task: Build a minimum viable product (MVP) for an app called "Speakit". See the attached file for the complete feature set, constraints, and scope.
> Final output: complete, working application code that meets the requirements.

I deliberately did not give the LLM an explicit role, because I think these tools should be able to work out their role on their own.

After the initial generation, I sent each tool a follow-up verification prompt to make sure every feature had been implemented:

> Verification task: Go through the feature list again item by item, confirm that the app fully implements every feature, and add anything that is missing.

This process produced one version of the app per tool. I then compared the generated apps against a set of objective and subjective metrics:

Objective metrics: performance, code quality, portability, cost, tech stack, version control integration, accessibility, error handling.
Subjective metrics: developer experience, iteration speed.
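The code-quality numbers in the comparison table (NLOC, average CCN, average token count, function count) come from the lizard analyzer, which measures these values per function. As a rough illustration of how per-function measurements roll up into the table's summary figures, here is a small TypeScript sketch; the `FunctionMetrics` shape and the sample rows are hypothetical, not lizard's actual output format.

```typescript
// Per-function measurements as reported by a complexity analyzer such as
// lizard (the field names here are illustrative, not lizard's API).
interface FunctionMetrics {
  nloc: number;   // non-comment lines of code
  ccn: number;    // cyclomatic complexity number
  tokens: number; // token count
}

// Aggregate per-function rows into the summary metrics used in the table:
// total NLOC, average CCN, average token count, and function count.
function summarize(fns: FunctionMetrics[]) {
  const count = fns.length;
  const avg = (pick: (f: FunctionMetrics) => number) =>
    count === 0 ? 0 : fns.reduce((sum, f) => sum + pick(f), 0) / count;
  return {
    nloc: fns.reduce((sum, f) => sum + f.nloc, 0),
    avgCcn: avg((f) => f.ccn),
    avgTokens: avg((f) => f.tokens),
    functionCount: count,
  };
}

// Hypothetical example rows:
const report = summarize([
  { nloc: 10, ccn: 1, tokens: 40 },
  { nloc: 30, ccn: 3, tokens: 60 },
]);
// report.nloc === 40, report.avgCcn === 2, report.functionCount === 2
```

This mirrors how a single codebase-wide "AvgCCN" can look low even when a few individual functions are complex, which is worth keeping in mind when reading the table.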

The tools under test were Lovable, Replit, Vercel v0, and base44, plus the editor-based AI tools Cursor and GitHub Copilot, and finally Claude Code.


Comparison and Results Overview

| Feature ID | Feature | Lovable | Replit | Vercel's v0 | base44 | Cursor | Copilot | Claude |
|---|---|---|---|---|---|---|---|---|
| 1.1 | URL Input | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 1.2 | PDF Upload (max 10 MB, text-based) | Yes | Yes | Yes | Yes, allowed >10 MB | Yes | Yes, allowed >10 MB | Yes, allowed >10 MB |
| 1.3 | AI Summarization (gemini-2.5-flash) | Yes | Yes | No, there was an error | Yes | No | No (reader page not accessible) | Yes |
| 2.1 | Web Content Extraction | Yes, but footer, header, etc. were also extracted | Yes | Yes | Partially; some pages just did not load | Yes | No (reader page not accessible) | Yes |
| 2.2 | PDF Text Extraction | No, PDF reading did not work as it should; did not find text | No, failed for all tested PDFs | No, failed to extract content | Yes | No, failed for all tested PDFs | No (reader page not accessible) | No, failed for all tested PDFs |
| 3.1 | TTS Voice Selection (min 2 voices) | Yes | Yes | Yes | Yes, significantly fewer voices than others | Yes, but voice selection did not work | No (reader page not accessible) | Yes, just 2 voices, but that's the spec ;) |
| 3.2 | Playback Controls (Play/Pause/Resume/Stop) | Yes, but only 1 word played | Yes | Yes | Sometimes played, sometimes not | Buttons available, but did not work | No (reader page not accessible) | Yes |
| 3.3 | Reading Speed (0.5x to 2.0x) | Yes | Yes | Yes | Yes | Yes | No (reader page not accessible) | Yes |
| 3.4 | Instant Playback | Yes | No | Yes | Sometimes yes, sometimes not | Yes | No (reader page not accessible) | Yes |
| 4.1 | Real-Time Word Highlighting | Yes | Yes, after initial voice selection | Yes, but not accurate | No | Yes | No (reader page not accessible) | Yes |
| 4.2 | Auto-Scrolling | Yes | Yes, after initial voice selection | Yes | No | Yes | No (reader page not accessible) | Yes |
| 5.1 | Landing Page (URL input + PDF upload) | Yes | Yes, inserted an AI-generated image | Yes | No | Yes | Yes, very limited | Yes |
| 5.2 | Minimal Reader Interface | Yes, but could be more minimalistic | Yes | Yes, but could be more minimalistic | Yes | Yes | No (reader page not accessible) | Yes |
| 5.3 | Progress Bar | No | Yes (X of XXX words played) | Yes | Yes | Yes | No (reader page not accessible) | Yes |
| 5.4 | Responsive Design (desktop/tablet/mobile) | Partially; controls were not fully visible | Yes | Yes | Yes | Yes | Yes | Yes |
| 5.5 | Authentication UI (Log In/Sign Up) | Yes | No, error while calling the Firebase login | Yes | Yes | Yes | Yes (same page with two buttons) | Yes |
| 6.1 | Guest Mode + Authenticated Accounts (Firebase) | Yes, used its own database | No, Firebase failure | No, failed to create account | No | No, failed to create account | No, failed to create accounts | No, failed to create accounts |
| 6.2 | Data Persistence (session-based for guests, Firestore for authenticated) | Yes, used its own database | No | No | No | No | No | No |
| 6.3 | Security (HTTPS) | Yes | Yes | Yes | Yes | | | |
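The TTS features in the list above (voice selection, playback speed, real-time word highlighting) map fairly directly onto the browser's Web Speech API: `SpeechSynthesisUtterance.rate` covers the 0.5x–2.0x range from the spec, and the utterance's `boundary` event reports a `charIndex` that can be translated into the word currently being spoken. Here is a minimal sketch of that mapping logic; the helper names are my own invention, not taken from any of the generated apps.

```typescript
// Split text into words, recording each word's start offset so that a
// boundary event's charIndex can later be mapped to a word index.
function wordOffsets(text: string): { word: string; start: number }[] {
  const out: { word: string; start: number }[] = [];
  const re = /\S+/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    out.push({ word: m[0], start: m.index });
  }
  return out;
}

// Map a charIndex from a speech-synthesis boundary event to the index of
// the word being spoken: the last word starting at or before charIndex.
function wordIndexAt(offsets: { start: number }[], charIndex: number): number {
  let idx = 0;
  for (let i = 0; i < offsets.length; i++) {
    if (offsets[i].start <= charIndex) idx = i;
    else break;
  }
  return idx;
}

// In the browser, these helpers wire up roughly like this:
//   const offsets = wordOffsets(text);
//   const u = new SpeechSynthesisUtterance(text);
//   u.rate = Math.min(2.0, Math.max(0.5, rate)); // spec range: 0.5x-2.0x
//   u.onboundary = (e) => highlight(wordIndexAt(offsets, e.charIndex));
//   speechSynthesis.speak(u);
```

Inaccurate highlighting (as observed with v0) typically comes from a mismatch between the text the utterance was created from and the text rendered on screen, which breaks exactly this offset mapping.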

The overall results:

  • Best performance: the apps generated by Lovable, Claude Code, and GitHub Copilot achieved the highest Lighthouse scores.
  • Clean structure, no tests: none of the tools generated unit tests automatically.
  • Portability varies widely: base44 offers no source-code export.
  • Similar price points: monthly fees cluster around $20-25.
  • Converging tech stacks: most tools use TypeScript + React + Tailwind CSS with Vite or Next.js.
  • High accessibility: Lighthouse accessibility scores generally range from 90 to 100.
  • Functional defects are common: PDF parsing failed and login was unreliable across tools.
  • Developer-experience gap: v0 and Lovable were smooth; base44 was poor.
  • Automatic deployment: Lovable and v0 support one-click deployment.

Tool-by-Tool Analysis

  • Lovable: clean interface and quick to get started, but the first version was missing features.
  • Replit: polished interface and clear logic, but generation took a long time.
  • Vercel v0: highly integrated and directly deployable, but some features were missing.
  • base44: very poor experience; no source-code export.
  • Cursor: powerful "planning mode" and high-quality output, but requires a technical background.
  • GitHub Copilot: better suited to assisting with coding than generating an entire project.
  • Claude Code: high code quality and complete features, but generation is on the slow side.


Limitations

Testing time was limited, so the results represent only a snapshot from October 2025; the tools are updated frequently, and some of the metrics are subjective.


Conclusion

The tech stack for AI-generated web apps is converging on a de-facto standard: React + Vite/Next.js + Tailwind CSS + Firebase.

Authentication remains a weak spot.

The tools fall roughly into three categories: productivity boosters (Copilot, Claude Code), refactoring specialists (Cursor), and prototype generators (Vercel v0, Lovable).

In-editor tools are more flexible but require a technical background; standalone platforms are easier to pick up but less customizable.

Average cost: $20-25 per month.

Overall, AI tools excel at generating UIs quickly, but complex logic and robustness still need human intervention.

Recommendations:

  • For everyday development productivity: Copilot, Claude Code;
  • For refactoring and automation: Cursor;
  • For rapid prototyping: Vercel v0, Lovable.

Summary: AI tools have shown enormous potential in frontend development, but they cannot yet replace human engineers. They act more like a second pair of hands that takes over repetitive work, while developers focus on architecture, performance, and creative value.
