Incremark: true streaming parsing plus rendering optimization for the AI era

How it started

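This started with wanting streaming AI output inside a rich-text editor. The pipeline is roughly AI -> chunk -> parser -> ProseMirror JSONContent, so that the editor streams output the way an AI chat box does, at as little performance cost as possible.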

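That is how Incremark came about. Along the way I also looked at vue-stream-markdown, but it only optimizes the Vue rendering layer: for every Markdown chunk the AI emits, it still concatenates the chunk with all previous output and parses the whole text into tokens again. Rendering is streamed, but the parsing core does not meet my needs.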

Implementing the idea

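When choosing the underlying Markdown parser, I looked at marked (markedjs), which Tiptap uses, and at markdown-it, which I had used earlier when implementing Tiptap Markdown conversion myself. In the end I chose micromark, the low-level parser underneath remark. The markdown-it token structure is far more complex than mdast: it is a flat stream built around the notion of entering and exiting (open and close tokens), whereas mdast is simply a tree.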
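To make the difference concrete, here is a minimal sketch of the two shapes for the input `> hello`. The `FlatToken` and `TreeNode` types are simplified stand-ins invented for this illustration; the real markdown-it `Token` and mdast node types carry more fields, and inline content in markdown-it is itself a nested token list.

```typescript
// Simplified shapes for illustration only (assumed, not the libraries' real types).
interface FlatToken {
  type: string;
  nesting: 1 | 0 | -1; // 1 opens a container, -1 closes it, 0 is self-contained
  content?: string;
}

interface TreeNode {
  type: string;
  value?: string;
  children?: TreeNode[];
}

// Roughly how "> hello" looks as a flat, markdown-it-style token stream.
const tokens: FlatToken[] = [
  { type: "blockquote_open", nesting: 1 },
  { type: "paragraph_open", nesting: 1 },
  { type: "inline", nesting: 0, content: "hello" },
  { type: "paragraph_close", nesting: -1 },
  { type: "blockquote_close", nesting: -1 },
];

// The same content as an mdast-style tree.
const tree: TreeNode = {
  type: "root",
  children: [
    {
      type: "blockquote",
      children: [
        { type: "paragraph", children: [{ type: "text", value: "hello" }] },
      ],
    },
  ],
};

// Recovering the nesting from a flat stream needs a stack of open containers.
function tokensToTree(flat: FlatToken[]): TreeNode {
  const root: TreeNode = { type: "root", children: [] };
  const stack: TreeNode[] = [root];
  for (const t of flat) {
    const parent = stack[stack.length - 1]!;
    if (t.nesting === 1) {
      const node: TreeNode = { type: t.type.replace(/_open$/, ""), children: [] };
      parent.children!.push(node);
      stack.push(node);
    } else if (t.nesting === -1) {
      if (stack.length > 1) stack.pop(); // never pop the root
    } else {
      parent.children!.push({ type: "text", value: t.content ?? "" });
    }
  }
  return root;
}

// With a tree, attaching a finished block is a single push onto root.children.
function appendBlock(root: TreeNode, block: TreeNode): void {
  if (!root.children) root.children = [];
  root.children.push(block);
}
```

With the flat stream, merging newly parsed output into earlier output means tracking which containers are still open; with a tree, a completed block can be appended as one node.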

Stitching a tree together is therefore the simpler problem. The overall design of the tool is:

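  1. Maintain a text buffer that receives the streamed input
  2. Detect "stable boundaries" (blank lines, headings, and so on) and mark the blocks that are finished as completed
  3. Re-parse the block that is still being received on every update, but parse only that block's content
  4. Treat complex nested nodes (lists, blockquotes) as a single unit until they are confirmed complete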
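The steps above can be sketched as follows. This is a hypothetical illustration, not Incremark's actual API: the class name `StreamingBlocks`, the `Block` shape, and the `parseBlock` callback (a stand-in for a real parser call on one block) are all assumptions. It only treats a blank line as a stable boundary, so it ignores the cases from step 4 such as blank lines inside fenced code or lists.

```typescript
// Hypothetical sketch; names and shapes are assumptions for illustration.
interface Block {
  id: number; // stable across updates: index of the block in the document
  raw: string;
  status: "completed" | "pending";
  ast: unknown;
}

type ParseBlock = (raw: string) => unknown;

class StreamingBlocks {
  private tail = ""; // text of the block still being received
  private completed: Block[] = []; // never re-parsed once stored
  private parseBlock: ParseBlock;
  parsedChars = 0; // bookkeeping: how many characters were handed to the parser

  constructor(parseBlock: ParseBlock) {
    this.parseBlock = parseBlock;
  }

  // Step 1: buffer the chunk. Step 2: cut at stable boundaries (blank lines).
  append(chunk: string): Block[] {
    this.tail += chunk;
    let boundary = this.tail.indexOf("\n\n");
    while (boundary !== -1) {
      const raw = this.tail.slice(0, boundary);
      this.tail = this.tail.slice(boundary + 2);
      if (raw.trim() !== "") {
        this.completed.push(this.parse(raw, "completed"));
      }
      boundary = this.tail.indexOf("\n\n");
    }
    // Step 3: only the unfinished tail is re-parsed on each update.
    const pending =
      this.tail.trim() === "" ? [] : [this.parse(this.tail, "pending")];
    return [...this.completed, ...pending];
  }

  private parse(raw: string, status: Block["status"]): Block {
    this.parsedChars += raw.length;
    // A pending block gets the id it will keep once it is completed.
    return { id: this.completed.length, raw, status, ast: this.parseBlock(raw) };
  }
}
```

Feeding it `"# Ti"`, `"tle\n\nHel"`, and `"lo"` in turn leaves one completed block (`# Title`) and one pending block (`Hello`); the heading is parsed again only once, at the moment it completes.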

This brings the cost of parsing streamed Markdown down from O(n²) to O(n): the more the AI outputs, the more work is saved.
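The quadratic term comes from re-parsing everything received so far on every chunk. The small helper below (a hypothetical function written for this explanation, not part of Incremark) counts the characters handed to the parser in that case. Its results agree with the "Total chars parsed" lines for the traditional approach in the raw benchmark output in this post, while the incremental path is reported there as one pass over the document per iteration.

```typescript
// Characters fed to the parser when every chunk triggers a full re-parse.
// `length` is the document length, `chunkSize` the size of each streamed chunk.
function charsReparsedFromScratch(length: number, chunkSize: number): number {
  let total = 0;
  for (let received = chunkSize; received < length; received += chunkSize) {
    total += received; // parse everything received so far
  }
  return total + length; // the final parse sees the whole document
}

// 1000 chars in chunks of 10, over 20 iterations: 1,010,000 chars
const shortDoc = charsReparsedFromScratch(1000, 10) * 20;

// 771 chars in chunks of 10 (last chunk is 1 char), over 100 iterations: 3,080,100 chars
const firstRun = charsReparsedFromScratch(771, 10) * 100;

// The incremental counterpart in the benchmark is length * iterations,
// for example 1000 * 20 = 20,000.
const incremental = 1000 * 20;
```

For a document of n characters split into k equal chunks, the full re-parse total is n(k + 1)/2, which grows quadratically when the chunk size is fixed. The linear figure assumes individual blocks stay small relative to the whole document.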

Benchmark results

Parsing is roughly 2-10x faster for small Markdown documents, 10-20x faster for medium ones, and 20-46x faster for long ones. Testing even longer documents did not seem necessary at this point.

Raw benchmark output:

============================================================
Incremark Benchmark
============================================================
Markdown length: 771 chars
Chunk size: 10 chars
Total chunks: 78
Iterations: 100
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 2608.41 ms
   Parse count: 7800
   Avg time per parse: 0.3344 ms
   Total chars parsed: 3,080,100

⚡ Incremark (incremental)
   Total time: 638.36 ms
   Parse count: 7800
   Avg time per parse: 0.0818 ms
   Total chars parsed: 77,100

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 75.5%
   Chars parsing saved: 97.5%
   Speedup: 4.09x faster

============================================================

🔬 Running Incremark Benchmark Suite


============================================================
📄 Document Size: Short (~1KB) (1000 chars)
============================================================

📦 Chunk size: 10 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 1000 chars
Chunk size: 10 chars
Total chunks: 100
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 435.85 ms
   Parse count: 2000
   Avg time per parse: 0.2179 ms
   Total chars parsed: 1,010,000

⚡ Incremark (incremental)
   Total time: 171.50 ms
   Parse count: 2000
   Avg time per parse: 0.0858 ms
   Total chars parsed: 20,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 60.7%
   Chars parsing saved: 98.0%
   Speedup: 2.54x faster

============================================================

📦 Chunk size: 50 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 1000 chars
Chunk size: 50 chars
Total chunks: 20
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 92.33 ms
   Parse count: 400
   Avg time per parse: 0.2308 ms
   Total chars parsed: 210,000

⚡ Incremark (incremental)
   Total time: 43.77 ms
   Parse count: 400
   Avg time per parse: 0.1094 ms
   Total chars parsed: 20,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 52.6%
   Chars parsing saved: 90.5%
   Speedup: 2.11x faster

============================================================

============================================================
📄 Document Size: Medium (~5KB) (5000 chars)
============================================================

📦 Chunk size: 10 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 5000 chars
Chunk size: 10 chars
Total chunks: 500
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 10335.94 ms
   Parse count: 10000
   Avg time per parse: 1.0336 ms
   Total chars parsed: 25,050,000

⚡ Incremark (incremental)
   Total time: 916.48 ms
   Parse count: 10000
   Avg time per parse: 0.0916 ms
   Total chars parsed: 100,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 91.1%
   Chars parsing saved: 99.6%
   Speedup: 11.28x faster

============================================================

📦 Chunk size: 50 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 5000 chars
Chunk size: 50 chars
Total chunks: 100
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 2120.47 ms
   Parse count: 2000
   Avg time per parse: 1.0602 ms
   Total chars parsed: 5,050,000

⚡ Incremark (incremental)
   Total time: 223.64 ms
   Parse count: 2000
   Avg time per parse: 0.1118 ms
   Total chars parsed: 100,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 89.5%
   Chars parsing saved: 98.0%
   Speedup: 9.48x faster

============================================================

============================================================
📄 Document Size: Long (~10KB) (10000 chars)
============================================================

📦 Chunk size: 10 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 10000 chars
Chunk size: 10 chars
Total chunks: 1000
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 40596.85 ms
   Parse count: 20000
   Avg time per parse: 2.0298 ms
   Total chars parsed: 100,100,000

⚡ Incremark (incremental)
   Total time: 1781.89 ms
   Parse count: 20000
   Avg time per parse: 0.0891 ms
   Total chars parsed: 200,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 95.6%
   Chars parsing saved: 99.8%
   Speedup: 22.78x faster

============================================================

📦 Chunk size: 50 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 10000 chars
Chunk size: 50 chars
Total chunks: 200
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 8095.40 ms
   Parse count: 4000
   Avg time per parse: 2.0239 ms
   Total chars parsed: 20,100,000

⚡ Incremark (incremental)
   Total time: 473.23 ms
   Parse count: 4000
   Avg time per parse: 0.1183 ms
   Total chars parsed: 200,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 94.2%
   Chars parsing saved: 99.0%
   Speedup: 17.11x faster

============================================================

============================================================
📄 Document Size: Very Long (~20KB) (20000 chars)
============================================================

📦 Chunk size: 10 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 20000 chars
Chunk size: 10 chars
Total chunks: 2000
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 183844.78 ms
   Parse count: 40000
   Avg time per parse: 4.5961 ms
   Total chars parsed: 400,200,000

⚡ Incremark (incremental)
   Total time: 3997.77 ms
   Parse count: 40000
   Avg time per parse: 0.0999 ms
   Total chars parsed: 400,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 97.8%
   Chars parsing saved: 99.9%
   Speedup: 45.99x faster

============================================================

📦 Chunk size: 50 chars

============================================================
Incremark Benchmark
============================================================
Markdown length: 20000 chars
Chunk size: 50 chars
Total chunks: 400
Iterations: 20
============================================================

Warming up...

Running benchmark...

Results:
------------------------------------------------------------

📊 Traditional (re-parse all)
   Total time: 37400.52 ms
   Parse count: 8000
   Avg time per parse: 4.6751 ms
   Total chars parsed: 80,200,000

⚡ Incremark (incremental)
   Total time: 1001.10 ms
   Parse count: 8000
   Avg time per parse: 0.1251 ms
   Total chars parsed: 400,000

------------------------------------------------------------

🎯 Performance Improvement:
   Time saved: 97.3%
   Chars parsing saved: 99.5%
   Speedup: 37.36x faster

============================================================


================================================================================
📈 Complete Benchmark Summary
================================================================================

| Document Size    | Chunk | Time Saved | Chars Saved | Speedup |
|------------------|-------|------------|-------------|---------|
| Short (~1KB)     | 10    |      60.7% |       98.0% |   2.54x |
| Short (~1KB)     | 50    |      52.6% |       90.5% |   2.11x |
| Medium (~5KB)    | 10    |      91.1% |       99.6% |  11.28x |
| Medium (~5KB)    | 50    |      89.5% |       98.0% |   9.48x |
| Long (~10KB)     | 10    |      95.6% |       99.8% |  22.78x |
| Long (~10KB)     | 50    |      94.2% |       99.0% |  17.11x |
| Very Long (~20KB) | 10    |      97.8% |       99.9% |  45.99x |
| Very Long (~20KB) | 50    |      97.3% |       99.5% |  37.36x |

--------------------------------------------------------------------------------

📊 Average by Document Size:

   Short (~1KB): 2.33x faster, 56.7% time saved
   Medium (~5KB): 10.38x faster, 90.3% time saved
   Long (~10KB): 19.94x faster, 94.9% time saved
   Very Long (~20KB): 41.67x faster, 97.5% time saved

--------------------------------------------------------------------------------

🎯 Overall Average:
   Time Saved: 84.8%
   Chars Saved: 98.0%
   Speedup: 18.58x

================================================================================

Framework support

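Because streaming is implemented at the parsing layer and the results are recorded as mdast, the tool is easy to port to different frameworks. Vue and React versions exist today; Svelte, Solid, and other less common frameworks would be simple to support but are not done yet.
Vue demo: incremark-vue.vercel.app/
React demo: incremark-react.vercel.app/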
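As a rough illustration of why an mdast result ports easily, a renderer is essentially a function from node to output, and a framework binding would swap the string templates for components. The `MdNode` type below is a deliberately tiny subset invented for this sketch; real mdast has many more node types, and this is not how Incremark's Vue or React packages are implemented.

```typescript
// Minimal mdast-like node shape (an assumed subset, for illustration only).
interface MdNode {
  type: "root" | "heading" | "paragraph" | "strong" | "text";
  depth?: number;
  value?: string;
  children?: MdNode[];
}

function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Framework-agnostic rendering: recurse over children, then wrap by node type.
function render(node: MdNode): string {
  const inner = (node.children ?? []).map(render).join("");
  switch (node.type) {
    case "heading": {
      const level = node.depth ?? 1;
      return `<h${level}>${inner}</h${level}>`;
    }
    case "paragraph":
      return `<p>${inner}</p>`;
    case "strong":
      return `<strong>${inner}</strong>`;
    case "text":
      return escapeHtml(node.value ?? "");
    default:
      return inner; // root: just the concatenated children
  }
}
```

Because completed blocks do not change, a framework version can render each block once and update only the pending block on each new chunk.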

devtools

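Beyond the core functionality, there is also a devtools panel for inspecting the state of the current parse:

Markdown content statistics

Markdown block statistics

mdast result

Chunk append log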

Summary

Give it a try. Documentation: incremark-docs.vercel.app/zh/
