AI-Generated Code: A Full Review from Copilot to Claude Code

Original post: drublic.de/blog/ai-fro...

AI code-generation tools are riding a wave of popularity and adoption. Barely a week goes by without a new tool being launched or announced, and the big tech companies have all entered the field: Google, Microsoft, OpenAI, and Anthropic have each shipped their own AI coding solutions, making the space fiercely competitive. GitHub Copilot, one of the earliest tools, has amassed more than 15 million users in just a few years. With attention on AI-driven development tools like Claude Code still rising, now is a good moment to dig into these platforms and understand how they actually differ.

I spent some time systematically testing a number of these tools, and I am sharing my detailed observations and findings here.

| Metric | Lovable | Replit | Vercel v0 | base44 | Cursor | GitHub Copilot | Claude Code |
|---|---|---|---|---|---|---|---|
| Website | lovable.dev | replit.com | v0.dev | base44.com | cursor.com | github.com/copilot | www.claude.com/product/cla... |
| Generated output | speakit-core-mvp.lovable.app | 10609a1a-c1a1-4f69-8816-e84db3d05d38-00-efwvfsaqwnb.worf.replit.dev/ | v0-speakit-mvp-build.vercel.app/ | speakit-52ac2ce2.base44.app/ | - | - | - |
| Backend functionality | Yes, on request; Supabase DB, auth integrated | Yes, connected to Firebase | Yes | Yes, own implementations | Yes, partial implementation | Yes, partial implementation | Yes |
| Performance (Lighthouse) | Mobile 98 / Desktop 100 | Mobile 54 / Desktop 55 (dev version) | Mobile 92 / Desktop 93 | Mobile 84 / Desktop 88 | Mobile 84 / Desktop 100 | Mobile 98 / Desktop 100 | Mobile 98 / Desktop 100 |
| Code quality (lizard: LOC, complexity) | NLOC 4182, avg CCN 2.1, avg tokens 47.2, 250 functions; well organised; modern code; no tests included | NLOC 5368, avg CCN 1.9, avg tokens 44.3, 296 functions; well organised but somewhat unusual; modern code; no tests included | NLOC 1886, avg CCN 2.4, avg tokens 43.1, 117 functions; well organised; modern code; no tests included | - | NLOC 1433, avg CCN 2.3, avg tokens 41.3, 114 functions; well organised; modern code; no tests included | NLOC 614, avg CCN 3.5, avg tokens 50.6, 54 functions; well organised; modern code; no tests included | NLOC 614, avg CCN 3.5, avg tokens 50.6, 54 functions; well organised; modern code; no tests included |
| Export/portability | Each file individually; download to GitHub | Each file individually; download as zip | Each file individually; download as zip; CLI download | Download only in the $50 subscription plan, not even Starter | Code is right at your hand | Same as Cursor | Same as Cursor |
| Cost | $25/month for 100 credits; $50 for 200 | $25/month + pay as you go for additional usage | $20/month + pay as you go for additional usage | $25/month; backend only with the $50/month plan | $20/month | Free; next plan $10 for unlimited usage and latest models | $20/month, or pay as you go via API |
| Tech stack | TypeScript, React, Shadcn UI with Radix UI, Tailwind, Vite | TypeScript, React, Radix UI, Framer Motion, Tailwind, Vite | TypeScript, React, Next.js, Shadcn with Radix UI, Tailwind | - | TypeScript, React, Tailwind | TypeScript, React, Next.js, Tailwind | TypeScript, React, Next.js, Tailwind |
| Version control integration | Yes, GitHub, 2-way sync | No; there seems to be a .git folder, but I have not found it | Yes, GitHub, 2-way sync | No, only in the $50 subscription plan | Did not generate .git; manual work | Did not generate .git; manual work | Did not generate .git; manual work |
| Accessibility | 96 | 92 | 100 | 96 | 88 | 95 | 95 |
| Error handling | No (inserting 404 pages displays their content; no error message for PDFs >10 MB) | Yes | Yes | Yes | Yes | Yes | Yes |
| SEO | 100 | - (dev mode only) | 100 | 100 | 91 | 100 | 100 |
| Developer experience | 8/10; good; code visible, easy to change; not too verbose about what is going on | 5/10; a bit hard to find all the right options in the interface, a bit overloaded; changing code is possible | 9/10; good; code visible, easy to change, easy to change each individual element; verbose about what is going on | 2/10; possible to view the configuration of functionality, but no real code visible and no possibility of editing | 7/10; manual work involved; all code visible and easy to change; verbose about what is going on | 7/10; manual work involved; easy-to-read codebase; verbose about what is going on; no shiny interface | 7/10; manual work involved; all code visible and easy to change; verbose about what is going on |
| Iteration speed | Initial: 58 s; worked for 5:41 min; verify: 2:29 min | Initial: 2 min; worked for 14 min | Worked for 9:46 min; verify: 4:21 min | Worked for 3 min; verify: 1:40 min | Worked for roughly 7 min; verify: 3:47 min | Worked for roughly 16 min; verify: 2 min | Worked for roughly 14 min; verify: roughly 4 min |
| Automatic publishing | Yes; can set up a custom domain | No; publishing requires a subscription | Yes; can set up a custom domain; integrates with Vercel | Yes; can set up custom domains | No (editor) | No (editor) | No (editor) |
| Additional notes | + Security scanner found 1 issue (missing RLS protection for users); + page-speed analysis; + uses its own AI gateway to enable AI functionality; − components installed (e.g. "recharts") but not used | + Asks to do design first; + made the app usable in guest mode without a Firebase connection; − verify only possible with a subscription; − URL had to be pasted twice before the form submitted; − no README file; − components installed (e.g. "recharts") but not used | - | - | + Verbose output while working; − no Markdown upload; − initial build raised an error ("fs can't be resolved") that I fixed to get any result; − very minimalistic landing page | + Verbose output while working; − landing page has only minimal functionality; − no Markdown upload; − no mention of filling the .env file; − worked very long for little result | − A subscription is required even just to test; − no Markdown upload; − the text in the app was hard to read |
| Repository link | github.com/drublic/lov... | github.com/drublic/rep... | github.com/drublic/v0-... | - | github.com/drublic/cur... | | |

Hypothesis and Background

This experiment builds on the hypothesis from my earlier article, The AI Future of Frontend Development. I wanted to find out whether frontend development as a discipline will gradually be displaced or reshaped by new technology over time.

My core hypothesis: everything from design and the construction of design systems to production frontend code could eventually be done by AI, with experts guiding and supervising the whole process. Some industry leaders have voiced similar views; Anthropic CEO Dario Amodei has predicted that AI will eventually write up to 90% of the code for software engineers.

As more developers explore AI's potential and fold it into their workflows, we will no doubt see further innovation that keeps reshaping the industry. The numbers reflect this trend: in 2025, about 84% of developers said they were using, or planning to use, AI tools in their development work, and more than half of professional developers use such tools daily.

That moment now seems to have arrived.

As more tools emerge that can automatically generate attractive, performant, and accessible user interfaces, designers, product managers, and even engineers may increasingly reach for these tools instead of hiring a frontend engineer to build the interface for them. This marks a fundamental shift in how software is built, captured by the term Andrej Karpathy recently coined: "vibe coding". In this mode, developers no longer write code by hand; they describe the project to an AI in natural language, and the AI generates the code from those prompts. Developers judge the code by running and testing the result and let the AI keep refining it, rather than reviewing every line themselves.

This approach favors iterative experimentation over traditional code craftsmanship and lets non-programmers produce working software with AI. It also raises questions about maintainability, understanding, and accountability: the lack of human review can introduce security vulnerabilities and uncontrolled outcomes.

Experiment Scope and Structured Approach

The experiment focuses on web frontend work. I chose a greenfield project, deliberately avoiding complex long-lived codebases and multi-user collaboration features. It was a single-person test, to keep collaboration effects out of the picture as much as possible.

To keep the evaluation rigorous, I followed a structured test procedure. First, I used Google's Gemini AI model to generate a feature list for a new project called "Speakit" (as a Markdown file, which is easy for large language models to consume). I then fed the same prompt and this feature list into each tool under test.

Feature list file: speakit-feature-list.md

Example prompt:

> Task: Build a minimum viable product (MVP) for an app called "Speakit". See the attached file for the complete feature set, constraints, and scope.
> Final output: complete, working application code that meets the requirements.

I deliberately did not give the LLM an explicit role, because I think these tools should be able to work out their role on their own.

After the initial generation, I sent each tool a follow-up verification prompt to make sure every feature had been implemented:

> Verification task: Go through the feature list again item by item, confirm that the app fully implements every feature, and add anything that is missing.

This process produced one version of the app per tool. I then compared the generated apps against a set of objective and subjective metrics:

Objective metrics: performance, code quality, portability, cost, tech stack, version control integration, accessibility, error handling.
Subjective metrics: developer experience, iteration speed.
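The code-quality numbers in the comparison table (NLOC, average CCN, average token count, function count) come from the lizard analyzer, which measures these values per function. As a rough illustration of how per-function measurements roll up into the table's summary figures, here is a small TypeScript sketch; the `FunctionMetrics` shape and the sample rows are hypothetical, not lizard's actual output format.

```typescript
// Per-function measurements as reported by a complexity analyzer such as
// lizard (the field names here are illustrative, not lizard's API).
interface FunctionMetrics {
  nloc: number;   // non-comment lines of code
  ccn: number;    // cyclomatic complexity number
  tokens: number; // token count
}

// Aggregate per-function rows into the summary metrics used in the table:
// total NLOC, average CCN, average token count, and function count.
function summarize(fns: FunctionMetrics[]) {
  const count = fns.length;
  const avg = (pick: (f: FunctionMetrics) => number) =>
    count === 0 ? 0 : fns.reduce((sum, f) => sum + pick(f), 0) / count;
  return {
    nloc: fns.reduce((sum, f) => sum + f.nloc, 0),
    avgCcn: avg((f) => f.ccn),
    avgTokens: avg((f) => f.tokens),
    functionCount: count,
  };
}

// Hypothetical example rows:
const report = summarize([
  { nloc: 10, ccn: 1, tokens: 40 },
  { nloc: 30, ccn: 3, tokens: 60 },
]);
// report.nloc === 40, report.avgCcn === 2, report.functionCount === 2
```

This mirrors how a single codebase-wide "AvgCCN" can look low even when a few individual functions are complex, which is worth keeping in mind when reading the table.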

The tools under test were Lovable, Replit, Vercel v0, and base44, plus the editor-based AI tools Cursor and GitHub Copilot, and finally Claude Code.


Comparison and Results Overview

| Feature ID | Feature | Lovable | Replit | Vercel's v0 | base44 | Cursor | Copilot | Claude |
|---|---|---|---|---|---|---|---|---|
| 1.1 | URL Input | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| 1.2 | PDF Upload (max 10 MB, text-based) | Yes | Yes | Yes | Yes, allowed >10 MB | Yes | Yes, allowed >10 MB | Yes, allowed >10 MB |
| 1.3 | AI Summarization (gemini-2.5-flash) | Yes | Yes | No, there was an error | Yes | No | No (reader page not accessible) | Yes |
| 2.1 | Web Content Extraction | Yes, but footer, header, etc. were also extracted | Yes | Yes | Partially; some pages just did not load | Yes | No (reader page not accessible) | Yes |
| 2.2 | PDF Text Extraction | No, PDF reading did not work as it should; did not find text | No, failed for all tested PDFs | No, failed to extract content | Yes | No, failed for all tested PDFs | No (reader page not accessible) | No, failed for all tested PDFs |
| 3.1 | TTS Voice Selection (min 2 voices) | Yes | Yes | Yes | Yes, significantly fewer voices than others | Yes, but voice selection did not work | No (reader page not accessible) | Yes, just 2 voices, but that's the spec ;) |
| 3.2 | Playback Controls (Play/Pause/Resume/Stop) | Yes, but only 1 word played | Yes | Yes | Sometimes played, sometimes not | Buttons available, but did not work | No (reader page not accessible) | Yes |
| 3.3 | Reading Speed (0.5x to 2.0x) | Yes | Yes | Yes | Yes | Yes | No (reader page not accessible) | Yes |
| 3.4 | Instant Playback | Yes | No | Yes | Sometimes yes, sometimes not | Yes | No (reader page not accessible) | Yes |
| 4.1 | Real-Time Word Highlighting | Yes | Yes, after initial voice selection | Yes, but not accurate | No | Yes | No (reader page not accessible) | Yes |
| 4.2 | Auto-Scrolling | Yes | Yes, after initial voice selection | Yes | No | Yes | No (reader page not accessible) | Yes |
| 5.1 | Landing Page (URL input + PDF upload) | Yes | Yes, inserted an AI-generated image | Yes | No | Yes | Yes, very limited | Yes |
| 5.2 | Minimal Reader Interface | Yes, but could be more minimalistic | Yes | Yes, but could be more minimalistic | Yes | Yes | No (reader page not accessible) | Yes |
| 5.3 | Progress Bar | No | Yes (X of XXX words played) | Yes | Yes | Yes | No (reader page not accessible) | Yes |
| 5.4 | Responsive Design (desktop/tablet/mobile) | Partially; controls were not fully visible | Yes | Yes | Yes | Yes | Yes | Yes |
| 5.5 | Authentication UI (Log In/Sign Up) | Yes | No, error while calling the Firebase login | Yes | Yes | Yes | Yes (same page with two buttons) | Yes |
| 6.1 | Guest Mode + Authenticated Accounts (Firebase) | Yes, used its own database | No, Firebase failure | No, failed to create account | No | No, failed to create account | No, failed to create accounts | No, failed to create accounts |
| 6.2 | Data Persistence (session-based for guests, Firestore for authenticated) | Yes, used its own database | No | No | No | No | No | No |
| 6.3 | Security (HTTPS) | Yes | Yes | Yes | Yes | | | |
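The TTS features in the list above (voice selection, playback speed, real-time word highlighting) map fairly directly onto the browser's Web Speech API: `SpeechSynthesisUtterance.rate` covers the 0.5x–2.0x range from the spec, and the utterance's `boundary` event reports a `charIndex` that can be translated into the word currently being spoken. Here is a minimal sketch of that mapping logic; the helper names are my own invention, not taken from any of the generated apps.

```typescript
// Split text into words, recording each word's start offset so that a
// boundary event's charIndex can later be mapped to a word index.
function wordOffsets(text: string): { word: string; start: number }[] {
  const out: { word: string; start: number }[] = [];
  const re = /\S+/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    out.push({ word: m[0], start: m.index });
  }
  return out;
}

// Map a charIndex from a speech-synthesis boundary event to the index of
// the word being spoken: the last word starting at or before charIndex.
function wordIndexAt(offsets: { start: number }[], charIndex: number): number {
  let idx = 0;
  for (let i = 0; i < offsets.length; i++) {
    if (offsets[i].start <= charIndex) idx = i;
    else break;
  }
  return idx;
}

// In the browser, these helpers wire up roughly like this:
//   const offsets = wordOffsets(text);
//   const u = new SpeechSynthesisUtterance(text);
//   u.rate = Math.min(2.0, Math.max(0.5, rate)); // spec range: 0.5x-2.0x
//   u.onboundary = (e) => highlight(wordIndexAt(offsets, e.charIndex));
//   speechSynthesis.speak(u);
```

Inaccurate highlighting (as observed with v0) typically comes from a mismatch between the text the utterance was created from and the text rendered on screen, which breaks exactly this offset mapping.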

The overall results:

  • Best performance: the apps generated by Lovable, Claude Code, and GitHub Copilot achieved the highest Lighthouse scores.
  • Clean structure, no tests: none of the tools generated unit tests automatically.
  • Portability varies widely: base44 offers no source-code export.
  • Similar price points: monthly fees cluster around $20-25.
  • Converging tech stacks: most tools use TypeScript + React + Tailwind CSS with Vite or Next.js.
  • High accessibility: Lighthouse accessibility scores generally range from 90 to 100.
  • Functional defects are common: PDF parsing failed and login was unreliable across tools.
  • Developer-experience gap: v0 and Lovable were smooth; base44 was poor.
  • Automatic deployment: Lovable and v0 support one-click deployment.

Tool-by-Tool Analysis

  • Lovable: clean interface and quick to get started, but the first version was missing features.
  • Replit: polished interface and clear logic, but generation took a long time.
  • Vercel v0: highly integrated and directly deployable, but some features were missing.
  • base44: very poor experience; no source-code export.
  • Cursor: powerful "planning mode" and high-quality output, but requires a technical background.
  • GitHub Copilot: better suited to assisting with coding than generating an entire project.
  • Claude Code: high code quality and complete features, but generation is on the slow side.


Limitations

Testing time was limited, so the results represent only a snapshot from October 2025; the tools are updated frequently, and some of the metrics are subjective.


Conclusion

The tech stack for AI-generated web apps is converging on a de-facto standard: React + Vite/Next.js + Tailwind CSS + Firebase.

Authentication remains a weak spot.

The tools fall roughly into three categories: productivity boosters (Copilot, Claude Code), refactoring specialists (Cursor), and prototype generators (Vercel v0, Lovable).

In-editor tools are more flexible but require a technical background; standalone platforms are easier to pick up but less customizable.

Average cost: $20-25 per month.

Overall, AI tools excel at generating UIs quickly, but complex logic and robustness still need human intervention.

Recommendations:

  • For everyday development productivity: Copilot, Claude Code;
  • For refactoring and automation: Cursor;
  • For rapid prototyping: Vercel v0, Lovable.

Summary: AI tools have shown enormous potential in frontend development, but they cannot yet replace human engineers. They act more like a second pair of hands that takes over repetitive work, while developers focus on architecture, performance, and creative value.
