Implementing a Markdown AST Renderer with Taro

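In WeChat Mini Program and Taro projects, we often need to render Markdown content as page components. The built-in <rich-text> component is limiting here: styles are hard to customize, and selecting and copying text is not fully supported. To work around this, I built a rendering pipeline that goes Markdown → AST → Taro components. This article walks through the approach, the core code, and directions for further optimization.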

1. Markdown to AST

Core type definition

export type ASTNode = {
  type: string;
  attrs?: Record<string, any>;
  children?: ASTNode[];
  text?: string;
};

- type: the node type (paragraph, heading, image, and so on)
- attrs: node attributes, such as latitude and longitude on <map>
- children: child nodes
- text: text content, used by text and inline nodes
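
To make the shape concrete, here is a small hand-written tree for the input "Hello **Taro**". It is only a sketch of roughly what the converter in the next section should produce; the plainText helper is a hypothetical name added for illustration and is not part of the renderer.

```typescript
// Same shape as the ASTNode type above, repeated so the snippet stands alone.
type ASTNode = {
  type: string;
  attrs?: Record<string, any>;
  children?: ASTNode[];
  text?: string;
};

// Hand-written tree for "Hello **Taro**" (illustrative, not parser output).
const example: ASTNode[] = [
  {
    type: 'paragraph',
    attrs: {},
    children: [
      { type: 'text', text: 'Hello ' },
      { type: 'strong', attrs: {}, children: [{ type: 'text', text: 'Taro' }] },
    ],
  },
];

// Hypothetical helper: collect the plain text of a tree, depth-first.
function plainText(nodes: ASTNode[]): string {
  return nodes.map(n => (n.text ?? '') + plainText(n.children ?? [])).join('');
}

const flattened = plainText(example);
```

Container nodes carry children and leaf nodes carry text, so a single recursive walk is enough to visit everything.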

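Converting tokens to an AST

We parse the Markdown with markdown-it to get a flat token list, then convert it into an AST. A stack tracks block nesting through the *_open / *_close token pairs, and the children of inline tokens are converted recursively: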

import MarkdownIt from 'markdown-it';

export function tokensToAST(tokens: any[]): ASTNode[] {
  const ast: ASTNode[] = [];
  const stack: { node: ASTNode }[] = [];

  const addNode = (node: ASTNode) => {
    if (stack.length > 0) stack[stack.length - 1].node.children!.push(node);
    else ast.push(node);
  };

  tokens.forEach(token => {
    if (token.type === 'text' || token.type === 'code_inline') {
      addNode({ type: 'text', text: token.content });
      return;
    }

    if (token.type === 'html_inline' || token.type === 'html_block') {
      const content = token.content.trim();
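      const MAP_TAG_OPEN_REGEX = /^<map\s+([^>]*)>/i;
      // The slash must be escaped inside a regex literal
      const MAP_TAG_CLOSE_REGEX = /^<\/map>/i;
      // Collects double-quoted attributes only, e.g. latitude="39.9"
      const parseAttrs = (str: string) => Object.fromEntries([...str.matchAll(/([\w:-]+)="([^"]*)"/g)].map(m => [m[1], m[2]]));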

      if (MAP_TAG_CLOSE_REGEX.test(content)) return;
      const match = content.match(MAP_TAG_OPEN_REGEX);
      if (match) {
        const attrs = parseAttrs(match[1]);
        addNode({ type: 'map', attrs, children: [] });
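        // Drop the opening tag and a trailing </map> so only real text content remains
        const remaining = content.replace(match[0], '').replace(/<\/map>\s*$/i, '').trim();
        if (remaining) addNode({ type: 'text', text: remaining });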
        return;
      }

      addNode({ type: 'text', text: content });
      return;
    }

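    if (token.type.endsWith('_open')) {
      const type = token.type.replace(/_open$/, '');
      const node: ASTNode = { type, attrs: {}, children: [] };
      if (token.attrs) token.attrs.forEach(([k, v]: [string, string]) => (node.attrs![k] = v));
      // markdown-it exposes the heading level only through the tag name (h1..h6)
      if (type === 'heading') node.attrs!.level = Number(token.tag.slice(1));
      addNode(node);
      stack.push({ node });
      return;
    }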

    if (token.type.endsWith('_close')) {
      stack.pop();
      return;
    }

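    if (token.type === 'inline' && token.children) {
      tokensToAST(token.children).forEach(addNode);
      return;
    }

    // Leaf tokens without an open/close pair (image, softbreak, hardbreak, fence,
    // code_block, hr) would otherwise be dropped; keep their type, attributes, and content
    const attrs = Object.fromEntries(token.attrs ?? []);
    addNode({ type: token.type, attrs, text: token.content });
  });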

  return ast;
}

export function mdToAST(markdown: string): ASTNode[] {
  const md = new MarkdownIt({ html: true });
  const tokens = md.parse(markdown, {});
  return tokensToAST(tokens);
}

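Highlights:

- Supports the custom HTML tag <map> and parses its attributes
- Recursively parses block-level and inline nodes
- Handles text, line breaks, and code blocks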
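
The <map> attribute parsing can be tried in isolation. The sketch below mirrors the regular expressions used in tokensToAST; the function name parseMapTag and the sample coordinates are assumptions made for illustration. Note that it only recognizes double-quoted attributes, and that a bare <map> with no attributes does not match because the pattern requires whitespace after the tag name.

```typescript
// Hypothetical standalone helper mirroring the <map> handling in tokensToAST.
function parseMapTag(html: string): Record<string, string> | null {
  const open = /^<map\s+([^>]*)>/i.exec(html.trim());
  if (!open) return null; // not an opening <map ...> tag

  const attrs: Record<string, string> = {};
  const attrRegex = /([\w:-]+)="([^"]*)"/g;
  let m: RegExpExecArray | null;
  // Walk every name="value" pair inside the opening tag.
  while ((m = attrRegex.exec(open[1])) !== null) {
    attrs[m[1]] = m[2];
  }
  return attrs;
}

const mapAttrs = parseMapTag('<map latitude="39.9087" longitude="116.3975"></map>');
```

Attribute values stay strings at this stage; the renderer converts latitude and longitude with Number() when it builds the <Map> component.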

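2. Rendering the AST as Taro components

Rendering principles

- Block-level nodes are wrapped in <View>
- Plain text and inline nodes use <Text>
- Custom components (such as <Map> and <Image>) and custom styles are supported
- Native text selection is preserved through the userSelect prop on <Text>

Core component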

import React from 'react';
import { Text, View, Image, Map } from '@tarojs/components';
import { ASTNode } from './mdToAST';

interface RendererProps {
  nodes: ASTNode[];
  keyPrefix?: string;
}

export const ASTRenderer: React.FC<RendererProps> = ({ nodes, keyPrefix = 'node' }) => {
  if (!Array.isArray(nodes) || nodes.length === 0) return null;

  const isBlockNode = (node: ASTNode) => [
    'map', 'image', 'code_block', 'fence', 'hr', 'heading', 'paragraph'
  ].includes(node.type);

  const isAllTextNodes = (nodes: ASTNode[]): boolean =>
    nodes.every(n => !isBlockNode(n) && (!n.children || isAllTextNodes(n.children)));

  const renderInlineNode = (node: ASTNode, key: string) => {
    switch (node.type) {
      case 'text': return <Text key={key}>{node.text}</Text>;
      case 'strong': return <Text key={key} style={{ fontWeight: 'bold' }}>{node.children?.map((c, i) => renderInlineNode(c, `${key}-${i}`))}</Text>;
      case 'em': return <Text key={key} style={{ fontStyle: 'italic' }}>{node.children?.map((c, i) => renderInlineNode(c, `${key}-${i}`))}</Text>;
      case 'link': return <Text key={key} style={{ color: '#1a0dab' }}>{node.children?.map((c, i) => renderInlineNode(c, `${key}-${i}`))}</Text>;
      case 'softbreak':
      case 'hardbreak': return <Text key={key}>{'\n'}</Text>;
      default: return node.text ? <Text key={key}>{node.text}</Text> : null;
    }
  };

  const renderNode = (node: ASTNode, key: string) => {
    if (!node) return null;
    switch (node.type) {
      case 'paragraph':
        if (node.children && node.children.every(c => !isBlockNode(c))) {
          // userSelect is a boolean prop on Taro's <Text>
          return <Text key={key} userSelect>{node.children.map((c, i) => renderInlineNode(c, `${key}-${i}`))}</Text>;
        } else {
          return <View key={key}>{node.children?.map((c, i) => renderNode(c, `${key}-${i}`))}</View>;
        }
      case 'heading': {
        // Levels are 1-based (h1..h6) while the size table is 0-based
        const level = Number(node.attrs?.level) || 1;
        return <Text key={key} style={{ fontWeight: 'bold', fontSize: [24, 20, 18, 16, 14, 12][level - 1] || 16 }}>{node.children?.map((c, i) => renderInlineNode(c, `${key}-${i}`))}</Text>;
      }
      case 'image':
        return <Image key={key} src={node.attrs?.src} style={{ width: 200, height: 200 }} />;
      case 'map':
        return <Map key={key} latitude={Number(node.attrs?.latitude)} longitude={Number(node.attrs?.longitude)} scale={16} style={{ width: '100%', height: 200 }} />;
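      default:
        // Containers with block-level descendants (lists, blockquotes) recurse; everything else is inline
        if (node.children?.length && !isAllTextNodes([node])) {
          return <View key={key}>{node.children.map((c, i) => renderNode(c, `${key}-${i}`))}</View>;
        }
        return renderInlineNode(node, key);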
    }
  };

  if (isAllTextNodes(nodes)) {
    return <Text userSelect>{nodes.map((n, i) => renderInlineNode(n, `${keyPrefix}-${i}`))}</Text>;
  }

  return <>{nodes.map((n, i) => renderNode(n, `${keyPrefix}-${i}`))}</>;
};

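Highlights:

- Plain-text fast path: when the whole document is inline text, it is wrapped in a single <Text>
- Block/inline separation: a paragraph that contains block-level nodes is rendered with <View>, otherwise with <Text>
- Custom components: <Map> for maps, <Image> for images
- Selectable text: native copy behavior is preserved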

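3. Possible improvements

3.1 Performance

- Merge consecutive text nodes to reduce <Text> nesting
- Virtualize the rendering of complex structures such as lists and tables
- Cache the results of isAllTextNodes and isParagraphAllText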
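
One way the caching idea could look is a WeakMap keyed by the children array, so entries disappear together with the AST. This is a sketch under the assumption that the AST is never mutated after parsing; isAllTextNodesCached and allTextCache are hypothetical names, and the block-type list is copied from the renderer above.

```typescript
type ASTNode = {
  type: string;
  attrs?: Record<string, any>;
  children?: ASTNode[];
  text?: string;
};

const BLOCK_TYPES = ['map', 'image', 'code_block', 'fence', 'hr', 'heading', 'paragraph'];
const isBlockNode = (node: ASTNode): boolean => BLOCK_TYPES.indexOf(node.type) !== -1;

// Keyed by array identity; assumes arrays are not mutated after parsing.
const allTextCache = new WeakMap<ASTNode[], boolean>();

function isAllTextNodesCached(nodes: ASTNode[]): boolean {
  const cached = allTextCache.get(nodes);
  if (cached !== undefined) return cached;

  const result = nodes.every(
    n => !isBlockNode(n) && (!n.children || isAllTextNodesCached(n.children)),
  );
  allTextCache.set(nodes, result);
  return result;
}
```

Because the renderer re-runs on every React render, repeated checks on the same array become a single map lookup. If nodes are ever edited in place, the cache would return stale answers, which is why immutability is assumed here.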

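3.2 Style extensions

Styles are currently hard-coded. Later versions could support:
- Theming (dark mode, font-size adjustment)
- Custom styles injected through props, such as paragraphStyle and headingStyle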
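
A minimal sketch of how theme defaults and injected styles might be layered. The prop names paragraphStyle and headingStyle come from the list above; the theme objects, the resolveStyle helper, and the concrete colors and sizes are assumptions chosen for illustration.

```typescript
type Style = Record<string, string | number>;

// Props the renderer could accept for per-node-type overrides.
interface RendererStyles {
  paragraphStyle?: Style;
  headingStyle?: Style;
}

// Hypothetical themes; the values are placeholders.
const lightTheme: Style = { color: '#222222', fontSize: 16 };
const darkTheme: Style = { color: '#eeeeee', fontSize: 16 };

// Later sources win: theme defaults < built-in node style < caller override.
function resolveStyle(theme: Style, base: Style, override?: Style): Style {
  return { ...theme, ...base, ...(override ?? {}) };
}

const injected: RendererStyles = { headingStyle: { fontSize: 28 } };
const headingStyle = resolveStyle(
  darkTheme,
  { fontWeight: 'bold', fontSize: 24 },
  injected.headingStyle,
);
```

The resolved object can be passed straight to the style prop of <Text> or <View>, so switching themes only changes the first argument.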

3.3 Richer interaction

- Link taps that open external URLs
- Image taps that open a preview
- Markers and tap events on the map component
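
For link taps, one possible first step is deciding what a given href should do. The sketch below is an assumption about how that decision could be modeled, not part of the current renderer: in-app paths are navigated to, while external URLs are copied, since a Mini Program generally cannot open arbitrary web pages outside a web-view. LinkAction and resolveLinkAction are hypothetical names.

```typescript
// What the tap handler should do with a link target.
type LinkAction =
  | { kind: 'navigate'; url: string } // in-app page path
  | { kind: 'copy'; url: string };    // external http(s) URL

function resolveLinkAction(href: string): LinkAction {
  // Treat anything starting with http:// or https:// as external.
  if (/^https?:\/\//i.test(href)) {
    return { kind: 'copy', url: href };
  }
  return { kind: 'navigate', url: href };
}

const external = resolveLinkAction('https://example.com/docs');
const internal = resolveLinkAction('/pages/detail/index?id=1');
```

In an onClick handler on the link's <Text>, the navigate case could call Taro.navigateTo({ url }) and the copy case Taro.setClipboardData({ data: url }). Image preview could similarly call Taro.previewImage({ urls: [src] }) from an onClick on <Image>.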

3.4 Markdown feature coverage

Support more Markdown features:
- Merged table cells (colspan/rowspan)
- Task lists (checkbox lists)
- Footnotes, blockquotes, math formulas
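
Task lists could be handled without a plugin by inspecting the leading text of each list item. The following is a sketch under that assumption; parseTaskMarker is a hypothetical helper that recognizes the "[ ]" and "[x]" prefixes and returns the remaining label.

```typescript
interface TaskMarker {
  checked: boolean;
  label: string;
}

// Recognizes "[ ] label", "[x] label", and "[X] label" at the start of a list item.
function parseTaskMarker(text: string): TaskMarker | null {
  const m = /^\[([ xX])\]\s+([\s\S]*)$/.exec(text);
  if (!m) return null; // ordinary list item
  return { checked: m[1] !== ' ', label: m[2] };
}

const done = parseTaskMarker('[x] ship the renderer');
const todo = parseTaskMarker('[ ] add table support');
const plain = parseTaskMarker('just a normal item');
```

When a marker is found, the converter could store checked in the list item's attrs and replace the first text node with label, leaving the renderer to draw a checkbox in front of the item.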

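3.5 Rendering component refinements

- Adjacent inline text nodes can be merged to reduce node nesting
- Line-break behavior within paragraphs can be tuned further so that it displays consistently in Mini Programs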
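
Merging adjacent text nodes can be done as a pure pass over the AST before rendering. This is a minimal sketch; mergeAdjacentText is a hypothetical name, and it returns new nodes rather than mutating the input.

```typescript
type ASTNode = {
  type: string;
  attrs?: Record<string, any>;
  children?: ASTNode[];
  text?: string;
};

function mergeAdjacentText(nodes: ASTNode[]): ASTNode[] {
  const merged: ASTNode[] = [];
  for (const node of nodes) {
    const prev = merged[merged.length - 1];
    if (node.type === 'text' && prev !== undefined && prev.type === 'text') {
      // Replace the previous entry with a combined node; the input stays untouched.
      merged[merged.length - 1] = { ...prev, text: (prev.text ?? '') + (node.text ?? '') };
    } else if (node.children) {
      // Recurse so nested runs of text are merged too.
      merged.push({ ...node, children: mergeAdjacentText(node.children) });
    } else {
      merged.push(node);
    }
  }
  return merged;
}

const compact = mergeAdjacentText([
  { type: 'text', text: 'a' },
  { type: 'text', text: 'b' },
  { type: 'strong', children: [{ type: 'text', text: 'c' }, { type: 'text', text: 'd' }] },
  { type: 'text', text: 'e' },
]);
```

Since tokensToAST maps both text and code_inline tokens to type 'text', a run such as plain text followed by inline code collapses into one node and therefore one <Text>.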

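4. Summary

With the Markdown → AST → Taro rendering pipeline, we get:

- Markdown rendering with custom components and styles
- Predictable performance, with a fast path for plain text
- Native copy behavior preserved

Further work covers performance, styling, interaction, and Markdown feature coverage. The approach is easy to extend and suits WeChat Mini Program, Taro, and React Native projects that need more flexible Markdown rendering than <rich-text> offers.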