Rendering Custom Components in Markdown

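I'm currently building an AI Q&A feature in which a large language model streams Markdown text. For rendering Markdown, v-html is probably the first thing that comes to mind, and it is what I had always used. Recently, though, I got a new requirement: hovering over a file reference inside the Markdown text must show a popover card.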

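v-html takes a string of HTML and sets it as the element's innerHTML, replacing the whole content on every update. It cannot render custom components: Vue does not compile the string as a template, so component tags and event bindings inside it do nothing. For display-only content it is simple and convenient, but once you need interaction such as hover tracking it no longer fits. That brings us to markdown-it.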

markdown-it is a powerful, extensible Markdown parser that converts Markdown text to HTML and comes with a rich plugin system. Plugins are the real subject of this article.


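import MarkdownIt from 'markdown-it';
import hljs from 'highlight.js';
/**
 * Markdown renderer whose code blocks get a title bar with a copy button.
 * Exported because the MarkdownRenderer component imports `md` from this module.
 */
export const md: MarkdownIt = new MarkdownIt({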
  html: true,
  highlight: function (str: string, lang: string): string {
    try {
      if (lang && hljs.getLanguage(lang)) {
        const languageName = lang.charAt(0).toUpperCase() + lang.slice(1);
        return '<pre class="code-block hljs relative">' +
          // escapeHtml also escapes & < >, so code containing entities survives the attribute round trip
          `<div class="code-title mb-3 mx-[-12px] sticky left-[-12px] px-3">${languageName}<span class="copy-icon cursor-pointer" data-code="${md.utils.escapeHtml(str)}">复制</span></div>` +
          '<code class="code-wrapper">' +
          hljs.highlight(str, { language: lang }).value +
          '</code></pre>';
      }
    } catch (e) {
      console.error("Highlighting error:", e);
    }

    // No language given, or highlighting failed: fall back to plain escaped HTML
    return '<pre class="hljs"><code>' + md.utils.escapeHtml(str) + '</code></pre>';
  }
}).use(referencePlugin); // referencePlugin is the custom plugin defined below

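/**
 * Custom plugin.
 * Recognizes markers of the form [ID:123] in the Markdown text,
 * parses out the number 123, and
 * emits a token of type 'reference' that carries the number.
 * The token can later be rendered as custom HTML or as a Vue component.
 */
function referencePlugin(md: MarkdownIt) {
  // Custom inline rule that handles [ID:x]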
  md.inline.ruler.before('link', 'reference', (state, silent) => {
    const max = state.posMax;
    const start = state.pos;

    // Bail out unless the text at this position starts with [ID:
    if (state.src.charCodeAt(start) !== 0x5B/* [ */) return false;
    if (state.src.slice(start, start + 4) !== '[ID:') return false;

    const end = state.src.indexOf(']', start + 4);

    if (end === -1) return false;

    // Extract the ID value
    const idText = state.src.slice(start + 4, end);

    const id = parseInt(idText, 10);
    if (isNaN(id)) return false;

    if (!silent) {
      // Create the token
      const token = state.push('reference', '', 0);
      token.content = id.toString();
      token.attrSet('id', id.toString());
    }

    state.pos = end + 1;
    return true;
  });
}
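
To make it easier to see what the rule accepts and rejects, here is a small standalone sketch of the same matching step on a plain string. `matchReference` and `ReferenceMatch` are hypothetical names of mine, not part of markdown-it. Note one deliberate difference: the rule above uses `parseInt`, which is lenient (`parseInt("12abc", 10)` is `12`, so `[ID:12abc]` would be accepted as ID 12), while this sketch swaps in a digits-only regular expression. Treat it as an illustration, not as the plugin's exact behavior.

```typescript
// Hypothetical helper, not part of markdown-it: mirrors the checks the
// inline rule performs, but on a plain string so it can run on its own.
interface ReferenceMatch {
  id: number;  // numeric ID found between "[ID:" and "]"
  end: number; // index just past the closing "]"
}

function matchReference(src: string, start: number): ReferenceMatch | null {
  // Anchor the pattern at `start` and require digits only between "[ID:" and "]".
  const m = /^\[ID:(\d+)\]/.exec(src.slice(start));
  if (m === null) return null;
  return { id: Number(m[1]), end: start + m[0].length };
}

console.log(matchReference("see [ID:12] here", 4)); // { id: 12, end: 11 }
console.log(matchReference("see [ID:12abc]", 4));   // null (stricter than parseInt)
console.log(matchReference("see [ID:12", 4));       // null (no closing bracket)
```

The returned `end` plays the role of the `state.pos = end + 1` assignment in the rule: it is where scanning would continue after the marker.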

Next, wrap the Markdown rendering in a component and use it like this:


  <!-- Render the bot's answer as Markdown -->
  <MarkdownRenderer
    v-if="message.output.answer"
    :markdownText="message.output.answer"
    :chunks="message?.chunks || []"
  />

Here is the implementation of the MarkdownRenderer component:


<template>
  <div>
    <template v-for="(node, index) in renderedNodes" :key="index">
      <span v-if="typeof node === 'string'">{{ node }}</span>
      <component v-else :is="node" />
    </template>
  </div>
</template>

<script setup lang="ts">
import { computed, h } from "vue";
import type { VNode } from "vue";
import ReferencePopover from "@/views/aiHome/components/referencePopover/Index.vue";
import { md } from "../../data";

const props = defineProps<{
  markdownText: string;
  chunks: any[];
}>();

const tokens = computed(() => md.parse(props.markdownText || "", {}));

/**
 * Recursively render all inline tokens
 */
function renderInlineTokens(tokens: any[], idx = 0): (string | VNode)[] {
  const children: (string | VNode)[] = [];
  while (idx < tokens.length) {
    const token = tokens[idx];

    if (token.type.endsWith("_open")) {
      const tag = token.tag;
      const closeType = token.type.replace("_open", "_close");
      let level = 1;
      let j = idx + 1;
      while (j < tokens.length) {
        if (tokens[j].type === token.type) level++;
        else if (tokens[j].type === closeType) level--;
        if (level === 0) break;
        j++;
      }
      const innerChildren = renderInlineTokens(tokens.slice(idx + 1, j));
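      // Pass the token's attributes through (e.g. href on links); attrs is [name, value][] or null
      children.push(h(tag, Object.fromEntries(token.attrs ?? []), innerChildren));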
      idx = j + 1;
      continue;
    } else if (token.type === "text") {
      children.push(token.content);
    } else if (token.type === "reference") {
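      // attrGet returns a string (or null), so convert it before using it as an index into props.chunks
      const id = parseInt(token.attrGet("id") ?? "", 10);
      if (Number.isInteger(id) && id >= 0 && id < props.chunks.length) {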
        const chunk = props.chunks[id];
        // h(componentOrTag, props?, children?)
        children.push(h(ReferencePopover, { chunk }));
      }
    } else if (token.type === "softbreak" || token.type === "hardbreak") {
      children.push(h("br"));
    } else {
      children.push(token.content || "");
    }
    idx++;
  }
  return children;
}

function renderBlockTokens(tokens: any[], idx = 0): (string | VNode)[] {
  const children: (string | VNode)[] = [];
  while (idx < tokens.length) {
    const token = tokens[idx];

    if (token.type.endsWith("_open")) {
      const tag = token.tag;
      const closeType = token.type.replace("_open", "_close");
      let level = 1;
      let j = idx + 1;
      while (j < tokens.length) {
        if (tokens[j].type === token.type) level++;
        else if (tokens[j].type === closeType) level--;
        if (level === 0) break;
        j++;
      }

      const innerChildren = renderBlockTokens(tokens.slice(idx + 1, j));
      children.push(h(tag, {}, innerChildren));
      idx = j + 1;
      continue;
    } else if (token.type === "inline" && token.children) {
      children.push(...renderInlineTokens(token.children));
      idx++;
    } else if (token.type === "text") {
      children.push(token.content);
      idx++;
    } else {
      idx++;
    }
  }
  return children;
}

const renderedNodes = computed(() => renderBlockTokens(tokens.value));
</script>
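
Both render functions above rely on the same trick: find the matching `_close` for each `_open` token, then recurse on the slice in between. As a rough sketch of that pairing idea in isolation, here is a dependency-free version that turns a flat token list into a tree. It swaps the scan-and-slice recursion for a single pass with a stack, driven by the `nesting` field that markdown-it tokens carry (1 for opening, 0 for self-contained, -1 for closing). `FlatToken`, `TreeNode` and `buildTree` are hypothetical names, and the tokens are hand-written stand-ins rather than real parser output.

```typescript
// Minimal stand-in for a markdown-it token; only the fields used here.
interface FlatToken {
  type: string;        // e.g. "strong_open", "text", "strong_close"
  tag: string;         // e.g. "strong"; empty for text
  content: string;
  nesting: 1 | 0 | -1; // 1 = opening, 0 = self-contained, -1 = closing
}

type TreeNode = string | { tag: string; children: TreeNode[] };

function buildTree(tokens: FlatToken[]): TreeNode[] {
  const root: TreeNode[] = [];
  // The top of the stack is the child list currently being filled.
  const stack: TreeNode[][] = [root];
  for (const token of tokens) {
    const current = stack[stack.length - 1];
    if (token.nesting === 1) {
      const node = { tag: token.tag, children: [] as TreeNode[] };
      current.push(node);
      stack.push(node.children); // descend into the new element
    } else if (token.nesting === -1) {
      if (stack.length > 1) stack.pop(); // return to the parent element
    } else {
      current.push(token.content);
    }
  }
  return root;
}

const flat: FlatToken[] = [
  { type: "strong_open", tag: "strong", content: "", nesting: 1 },
  { type: "text", tag: "", content: "hi", nesting: 0 },
  { type: "strong_close", tag: "strong", content: "", nesting: -1 },
  { type: "text", tag: "", content: " there", nesting: 0 },
];
console.log(JSON.stringify(buildTree(flat)));
// [{"tag":"strong","children":["hi"]}," there"]
```

In the component, the place where this sketch pushes a plain object is where `h(tag, ...)` would be called, and the `nesting === 0` branch is where the `reference` check would go.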

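One thing I personally find unsatisfying: the custom plugin only tells markdown-it how to recognize the new syntax, while the actual rendering still depends on our hand-written renderInlineTokens / renderBlockTokens. As a result, every new markdownText has to go through md.parse() again to produce a complete token stream; otherwise the tokens recognized by the plugin never get rendered.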

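A brief look at how markdown-it works

1. Parse phase (parse)

md.parse(markdownText, env) returns an array of tokens. This step breaks the plain text into block tokens such as paragraph_open, inline, and paragraph_close; inline content such as text, reference, and strong_open sits in the children array of each inline token.

2. Render phase (render)

By default, md.render() turns the tokens into an HTML string. Because we render the tokens ourselves, we skip this step and use h() to turn them into VNodes.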
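
To show why the string-producing render phase is the part we replace, here is a simplified, dependency-free sketch of it. The token list is my hand-written approximation of the inline children for `**hi** [ID:0]` with the reference rule installed, not captured library output, and `MiniToken`, `escapeHtml` and `renderToHtml` are hypothetical names.

```typescript
// Simplified token shape; real markdown-it tokens carry more fields.
interface MiniToken {
  type: string;
  tag: string;
  content: string;
  nesting: 1 | 0 | -1; // 1 = opening tag, 0 = self-contained, -1 = closing tag
}

// Hand-written approximation of the inline tokens for "**hi** [ID:0]".
const inlineTokens: MiniToken[] = [
  { type: "strong_open", tag: "strong", content: "", nesting: 1 },
  { type: "text", tag: "", content: "hi", nesting: 0 },
  { type: "strong_close", tag: "strong", content: "", nesting: -1 },
  { type: "text", tag: "", content: " ", nesting: 0 },
  { type: "reference", tag: "", content: "0", nesting: 0 },
];

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render phase, string flavour: every token becomes a piece of HTML text.
function renderToHtml(tokens: MiniToken[]): string {
  return tokens
    .map((token) => {
      if (token.nesting === 1) return `<${token.tag}>`;
      if (token.nesting === -1) return `</${token.tag}>`;
      if (token.type === "reference") {
        // A string can only carry a placeholder, never a live component.
        return `<span data-ref-id="${escapeHtml(token.content)}"></span>`;
      }
      return escapeHtml(token.content);
    })
    .join("");
}

console.log(renderToHtml(inlineTokens));
// <strong>hi</strong> <span data-ref-id="0"></span>
```

The custom renderer in this article walks the same kind of token list, but at the `reference` branch it creates a component VNode instead of returning a string, which is what makes the hover popover possible.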

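Why re-parsing is required

When markdownText changes, the content and order of the text change (and, if the md instance has been reconfigured, so do the plugin rules).
The old token array was derived from the old text, so it lacks the tokens the plugin would recognize in the new text, or carries attributes that are no longer valid.
That is why md.parse() has to run again and produce a fresh token stream before the custom syntax can be recognized and rendered correctly.
Regenerating all tokens is therefore not wasted work; it is what keeps the output in step with the latest content and the latest plugin rules.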
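
A concrete case from streaming: while only `[ID:` has arrived, the rule finds no closing bracket and the text stays plain text; once `0]` arrives, the same characters become a reference token. So tokens from a prefix cannot simply be reused. What can be reduced is how often the full parse runs. Below is a minimal sketch, assuming chunks arrive faster than the UI needs to repaint: buffer the chunks and parse the whole text at most once per flush. `createStreamBuffer` is a hypothetical helper, and the `parse` argument stands in for something like `(text) => md.parse(text, {})`.

```typescript
// Hypothetical helper: `parse` stands in for (text) => md.parse(text, {}).
function createStreamBuffer<T>(parse: (text: string) => T) {
  let text = "";
  let dirty = true; // nothing has been parsed yet
  let cached: T | undefined;

  return {
    // Called for every streamed chunk; cheap, no parsing happens here.
    append(chunk: string): void {
      text += chunk;
      dirty = true;
    },
    // Called when the UI is about to update; re-parses the full text
    // only if something was appended since the previous flush.
    flush(): T {
      if (dirty) {
        cached = parse(text);
        dirty = false;
      }
      return cached as T;
    },
  };
}

// Usage with a counting stub in place of markdown-it.
let parseCalls = 0;
const buffer = createStreamBuffer((text: string) => {
  parseCalls++;
  return text.split(" ");
});
buffer.append("Hello ");
buffer.append("[ID:");
buffer.append("0]");
console.log(buffer.flush(), parseCalls); // [ 'Hello', '[ID:0]' ] 1
```

Three chunks cost one parse here instead of three. In a real component `flush` could be driven by a timer or requestAnimationFrame; whether that is worth it depends on how long the answers get.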