Chapter 5: Helper Functions and End-to-End Integration

🧩 Function 1: `condenseWhitespace(nodes)`

function condenseWhitespace(nodes: TemplateChildNode[]): TemplateChildNode[] {
  const shouldCondense = currentOptions.whitespace !== 'preserve'
  let removedWhitespace = false
  for (let i = 0; i < nodes.length; i++) {
    const node = nodes[i]
    if (node.type === NodeTypes.TEXT) {
      if (!inPre) {
        if (isAllWhitespace(node.content)) {
          const prev = nodes[i - 1] && nodes[i - 1].type
          const next = nodes[i + 1] && nodes[i + 1].type
          if (
            !prev ||
            !next ||
            (shouldCondense &&
              ((prev === NodeTypes.COMMENT &&
                (next === NodeTypes.COMMENT || next === NodeTypes.ELEMENT)) ||
                (prev === NodeTypes.ELEMENT &&
                  (next === NodeTypes.COMMENT ||
                    (next === NodeTypes.ELEMENT &&
                      hasNewlineChar(node.content))))))
          ) {
            removedWhitespace = true
            nodes[i] = null as any
          } else {
            node.content = ' '
          }
        } else if (shouldCondense) {
          node.content = condense(node.content)
        }
      } else {
        node.content = node.content.replace(windowsNewlineRE, '\n')
      }
    }
  }
  return removedWhitespace ? nodes.filter(Boolean) : nodes
}

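📖 What it does

Normalizes whitespace across a list of child nodes: it removes or collapses whitespace-only text nodes and condenses runs of whitespace inside text, which keeps the AST and the generated render code smaller.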

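🔍 Logic breakdown

| Scenario | Handling |
| --- | --- |
| Inside `<pre>` | Content is kept as is; only `\r\n` is converted to `\n`. |
| Whitespace-only text node at the start or end of the child list | Removed, regardless of the `whitespace` option. |
| Whitespace-only text node between two siblings | Unless `whitespace` is `'preserve'`: removed when it sits between two comments or between a comment and an element, and removed between two elements only if it contains a newline. In every other case it is collapsed to a single space `' '`. |
| Runs of whitespace inside ordinary text | Collapsed to a single space via `condense()` (skipped when `whitespace` is `'preserve'`). |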
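
The rule for whitespace-only nodes outside `<pre>` can be isolated into a small pure function. This is only a sketch of the condition in the listing above: `Kind` and `whitespaceAction` are hypothetical names introduced here, and string literals stand in for Vue's `NodeTypes` enum.

```typescript
// Hypothetical stand-ins for the node kinds that matter to the decision.
type Kind = "ELEMENT" | "TEXT" | "COMMENT" | "INTERPOLATION";

/**
 * Decide what happens to a whitespace-only text node outside <pre>.
 * `prev` / `next` are the kinds of the neighbouring siblings
 * (undefined when the node is first or last in the list).
 */
function whitespaceAction(
  prev: Kind | undefined,
  next: Kind | undefined,
  hasNewline: boolean,
  shouldCondense: boolean,
): "remove" | "collapse" {
  // Leading or trailing whitespace is dropped even in 'preserve' mode.
  if (!prev || !next) return "remove";
  if (shouldCondense) {
    // comment + (comment | element): always removed
    if (prev === "COMMENT" && (next === "COMMENT" || next === "ELEMENT")) return "remove";
    // element + comment: always removed
    if (prev === "ELEMENT" && next === "COMMENT") return "remove";
    // element + element: removed only when the whitespace spans a line break
    if (prev === "ELEMENT" && next === "ELEMENT" && hasNewline) return "remove";
  }
  // Everything else becomes a single space.
  return "collapse";
}
```

Under these assumptions, `<span/> <span/>` on one line keeps a single space between the two elements, while the same two elements on separate lines lose the whitespace node entirely.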

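🧠 Design rationale

With the default `whitespace: 'condense'` setting, Vue does not keep redundant whitespace from templates. This keeps the compiled output smaller and stays consistent with how browsers collapse whitespace when rendering HTML.

Content inside `<pre>` is left as written, because whitespace is significant there.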

📘 Example

<div>
   Hello
   World
</div>

After condensing, the children of `<div>` are equivalent to:

[{ "type": "TEXT", "content": " Hello World " }]

🧩 Function 2: `condense(str)`

function condense(str: string) {
  let ret = ''
  let prevCharIsWhitespace = false
  for (let i = 0; i < str.length; i++) {
    if (isWhitespace(str.charCodeAt(i))) {
      if (!prevCharIsWhitespace) {
        ret += ' '
        prevCharIsWhitespace = true
      }
    } else {
      ret += str[i]
      prevCharIsWhitespace = false
    }
  }
  return ret
}

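📖 What it does

Collapses each run of consecutive whitespace in a string into a single space.

For example:
`"foo   bar \n baz"` → `"foo bar baz"`

💡 How it works

The flag `prevCharIsWhitespace` records whether the previous character was whitespace.

For a run of whitespace characters, a single space is emitted for the first one and the rest are skipped.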
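
As a mental model, the loop behaves like one regular-expression replacement. The sketch below is not Vue's code: `condenseWithRegex` is a hypothetical name, and the character class assumes that `isWhitespace()` accepts space, tab, newline, form feed, and carriage return.

```typescript
// Assumed whitespace set: space, \t, \n, \f, \r.
const WHITESPACE_RUN = /[ \t\n\f\r]+/g;

// Replace every run of whitespace with exactly one space,
// which is what the flag-based loop does character by character.
function condenseWithRegex(str: string): string {
  return str.replace(WHITESPACE_RUN, " ");
}

const condensedSample = condenseWithRegex("  foo \n\t bar   baz\n");
// condensedSample is " foo bar baz " (leading and trailing runs become one space each)
```

Note that leading and trailing whitespace is collapsed rather than trimmed, which is why the earlier `<div>` example ends up with `" Hello World "`.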

🧩 Function 3: `isAllWhitespace(str)` and `hasNewlineChar(str)`

function isAllWhitespace(str: string) {
  for (let i = 0; i < str.length; i++) {
    if (!isWhitespace(str.charCodeAt(i))) {
      return false
    }
  }
  return true
}

function hasNewlineChar(str: string) {
  for (let i = 0; i < str.length; i++) {
    const c = str.charCodeAt(i)
    if (c === CharCodes.NewLine || c === CharCodes.CarriageReturn) {
      return true
    }
  }
  return false
}

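📖 What it does

Two small text predicates:

`isAllWhitespace()` → checks whether a string consists entirely of whitespace;
`hasNewlineChar()` → checks whether a string contains a newline character (`\n` or `\r`).

💡 Usage

These two functions are used for:

removing whitespace-only nodes;
the branching logic in `condenseWhitespace()`;
newline handling in `onCloseTag()`.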

🧩 Function 4: `lookAhead()` and `backTrack()`

function lookAhead(index: number, c: number) {
  let i = index
  while (currentInput.charCodeAt(i) !== c && i < currentInput.length - 1) i++
  return i
}

function backTrack(index: number, c: number) {
  let i = index
  while (currentInput.charCodeAt(i) !== c && i >= 0) i--
  return i
}

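📖 What it does

Character-scanning helpers that search `currentInput` forward or backward from a given index for a specific character code.

📘 Example

`lookAhead(end, CharCodes.Gt)` finds the position of the next `>`.
`backTrack(start, CharCodes.Lt)` walks back to the previous `<` (for example, when a tag is closed implicitly).

🧠 Design notes

Compared with a regular expression, this manual scan is cheap, and it returns plain offsets that the parser can later convert into exact line and column positions.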
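
To try the two scanners outside the parser, the sketch below passes the input explicitly instead of reading the module-level `currentInput`. The names `lookAheadIn` and `backTrackIn` and the constants `GT` / `LT` are introduced here for illustration; the loop bodies mirror the listing above.

```typescript
const GT = 62; // char code of '>'
const LT = 60; // char code of '<'

// Scan forward from `index` until `c` is found or the last index is reached.
function lookAheadIn(input: string, index: number, c: number): number {
  let i = index;
  while (input.charCodeAt(i) !== c && i < input.length - 1) i++;
  return i;
}

// Scan backward from `index` until `c` is found or the start is passed.
function backTrackIn(input: string, index: number, c: number): number {
  let i = index;
  while (input.charCodeAt(i) !== c && i >= 0) i--;
  return i;
}

const scanSource = "<div><p>text</div>";
const firstGt = lookAheadIn(scanSource, 0, GT);                      // 4
const lastLt = backTrackIn(scanSource, scanSource.length - 1, LT);   // 12
const missAhead = lookAheadIn("abc", 0, GT);                         // 2
const missBack = backTrackIn("abc", 2, LT);                          // -1
```

One detail worth noticing in the loops as written: when the character is absent, the forward scan stops at the last index of the input, while the backward scan returns `-1`.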

🧩 Function 5: `setLocEnd()` and `getLoc()`

function setLocEnd(loc: SourceLocation, end: number) {
  loc.end = tokenizer.getPos(end)
  loc.source = getSlice(loc.start.offset, end)
}

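📖 What it does

Updates the end position of an AST node's location.

It is called when nodes are merged, when a tag is closed, or when a text node is extended.

💡 How it works

`tokenizer.getPos()` converts an offset into a source position (line, column, offset), and `loc.source` is re-sliced to cover the new range,

so every AST node carries a location that can be traced back to the original source.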
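
The offset-to-position step can be sketched without the tokenizer. Everything below is an illustration under stated assumptions: `offsetToPosition` and `setLocEndSketch` are hypothetical names, lines and columns are assumed to be 1-based with 0-based offsets, and a linear scan replaces whatever bookkeeping `tokenizer.getPos()` actually does.

```typescript
interface SketchPosition {
  offset: number; // 0-based index into the source
  line: number;   // 1-based
  column: number; // 1-based
}

interface SketchLoc {
  start: SketchPosition;
  end: SketchPosition;
  source: string;
}

// Count the newlines before `offset` and measure the distance to the last one.
function offsetToPosition(source: string, offset: number): SketchPosition {
  let line = 1;
  let lastNewline = -1; // index of the most recent '\n' before `offset`
  for (let i = 0; i < offset && i < source.length; i++) {
    if (source.charCodeAt(i) === 10) {
      line++;
      lastNewline = i;
    }
  }
  return { offset, line, column: offset - lastNewline };
}

// Same shape as setLocEnd(): move the end position and re-slice the source text.
function setLocEndSketch(source: string, loc: SketchLoc, end: number): void {
  loc.end = offsetToPosition(source, end);
  loc.source = source.slice(loc.start.offset, end);
}

const locTemplate = "<div>\n  hi\n</div>";
const textLoc: SketchLoc = {
  start: offsetToPosition(locTemplate, 5),
  end: offsetToPosition(locTemplate, 5),
  source: "",
};
setLocEndSketch(locTemplate, textLoc, 11); // extend the text node up to "</div>"
```

After the call, `textLoc.end` points at line 3, column 1, and `textLoc.source` holds the text between the two tags.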

🧩 Function 6: `emitError()`

function emitError(code: ErrorCodes, index: number, message?: string) {
  currentOptions.onError(
    createCompilerError(code, getLoc(index, index), undefined, message),
  )
}

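📖 What it does

A single entry point for reporting errors.

🔍 Characteristics

The `ErrorCodes` enum identifies the kind of error;
location information is generated automatically through `getLoc(index, index)`;
the error is passed to the `onError` callback (the default handler in compiler-core throws it).

📘 Common error codes

| Error code | Description |
| --- | --- |
| `X_MISSING_END_TAG` | An element is missing its end tag |
| `X_INVALID_EXPRESSION` | Syntax error in an expression |
| `EOF_IN_TAG` | The input ended before the tag was closed |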
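
The callback-based design means a caller can collect errors instead of stopping at the first one. The sketch below models that pattern with simplified, hypothetical types (`SketchErrorCode`, `SketchCompilerError`, `createReporter`); it does not use the real `@vue/compiler-core` types, and the message strings are illustrative.

```typescript
// Hypothetical, simplified error model for illustration only.
type SketchErrorCode = "EOF_IN_TAG" | "X_MISSING_END_TAG" | "X_INVALID_EXPRESSION";

interface SketchCompilerError extends Error {
  code: SketchErrorCode;
  offset: number;
}

const sketchMessages: Record<SketchErrorCode, string> = {
  EOF_IN_TAG: "Unexpected EOF in tag.",
  X_MISSING_END_TAG: "Element is missing end tag.",
  X_INVALID_EXPRESSION: "Error parsing JavaScript expression: ",
};

// Build an emitError-like function bound to a caller-supplied handler.
function createReporter(onError: (error: SketchCompilerError) => void) {
  return function emitErrorSketch(code: SketchErrorCode, index: number, message?: string): void {
    const error: SketchCompilerError = Object.assign(
      new Error(sketchMessages[code] + (message ?? "")),
      { code, offset: index },
    );
    onError(error);
  };
}

// Collect errors in an array instead of throwing.
const collectedErrors: SketchCompilerError[] = [];
const report = createReporter((error) => {
  collectedErrors.push(error);
});
report("X_MISSING_END_TAG", 5);
report("X_INVALID_EXPRESSION", 12, "Unexpected token");
```

A throwing handler and a collecting handler plug into the same reporter, which is the flexibility the `onError` option provides.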

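🧩 Function 7: `isFragmentTemplate()` and `isComponent()`

We covered these functions in the previous chapter; here is some additional context on where they run:

`isFragmentTemplate()` is called in `onCloseTag()` to check whether a `<template>` wraps a structural directive such as `v-if` or `v-for`.
`isComponent()` is called at the same point to decide whether the tag is a component (based on rules such as a capitalized first letter or a dynamic `:is` binding).

These two functions are the last step in classifying a node's semantics.

This is where the Vue compiler decides which kind of code each tag will generate:

component, template, slot, or plain element.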

🧩 Function 8: `baseParse()`, the full pipeline (final review)

export function baseParse(input: string, options?: ParserOptions): RootNode {
  reset()                               // ① reset parser state
  currentInput = input
  currentOptions = extend({}, defaultParserOptions)
  if (options) Object.assign(currentOptions, options)

  tokenizer.mode = ...                  // ② select the mode (HTML / SFC / Base)
  tokenizer.inXML = ...
  const delimiters = options?.delimiters
  if (delimiters) {
    tokenizer.delimiterOpen = toCharCodes(delimiters[0])
    tokenizer.delimiterClose = toCharCodes(delimiters[1])
  }

  const root = (currentRoot = createRoot([], input)) // ③ create the root AST node
  tokenizer.parse(currentInput)          // ④ run the state machine
  root.loc = getLoc(0, input.length)
  root.children = condenseWhitespace(root.children)
  currentRoot = null
  return root                            // ⑤ return the AST
}

🔍 End-to-end flow

template string
    ↓
[Tokenizer]
  ├── onopentagname() → ElementNode
  ├── onattribname() / ondirname() → Attribute / DirectiveNode
  ├── ontext() → TextNode
  ├── oninterpolation() → InterpolationNode
  └── onclosetag() → pop the stack, close the element
    ↓
[AST complete]
    ↓
condenseWhitespace() → whitespace cleanup
    ↓
return RootNode

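⚙️ Core architecture of the Vue parser

| Module | Main responsibility | Example |
| --- | --- | --- |
| Tokenizer | Lexical analysis: recognizes tags, text, and interpolations | `<div>{{ msg }}</div>` |
| Directive Parser | Semantic parsing: recognizes directives, arguments, and modifiers | `v-if="ok" @click.stop` |
| Expression Parser | Syntax analysis: builds expression ASTs with Babel | `"ok && show"` |
| Tree Builder | Maintains the node stack and builds the hierarchy | `<div><span></span></div>` |
| Whitespace Optimizer | Removes whitespace and redundant nodes | `condenseWhitespace()` |
| Error Reporter | Unified error reporting with location tracking | `emitError()` |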

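💡 How it runs, in brief

Vue's template parser combines an event-driven state machine with a stack-based builder:

the Tokenizer scans the character stream;
the Parser listens for events (such as `onopentagname` and `ontext`);
the Builder maintains the stack and the parent-child relationships;
the Expression parser produces Babel ASTs;
the Whitespace handler condenses whitespace based on node context.

The system as a whole offers:

⚙️ single-pass scanning (events are emitted as the tokenizer walks the input);
🔍 location tracking (for precise error messages);
🧩 pluggable extension points (SFC support, compat mode).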
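
The split between an event-emitting tokenizer and a stack-based builder can be shown with a toy model. This is a deliberately tiny sketch, not Vue's implementation: all names (`toyTokenize`, `toyParse`, `ToyNode`) are hypothetical, and it handles only `<tag>`, `</tag>`, and text, with no attributes, comments, interpolation, or error reporting.

```typescript
interface ToyElement { type: "element"; tag: string; children: ToyNode[] }
interface ToyText { type: "text"; content: string }
type ToyNode = ToyElement | ToyText;

interface ToyCallbacks {
  onopentag(name: string): void;
  onclosetag(name: string): void;
  ontext(text: string): void;
}

// "Tokenizer": walks the input once and emits events; it knows nothing about trees.
function toyTokenize(input: string, cbs: ToyCallbacks): void {
  let i = 0;
  while (i < input.length) {
    if (input[i] === "<") {
      const end = input.indexOf(">", i);
      if (end === -1) {
        cbs.ontext(input.slice(i)); // unterminated tag: treat the rest as text
        return;
      }
      const raw = input.slice(i + 1, end);
      if (raw.startsWith("/")) cbs.onclosetag(raw.slice(1));
      else cbs.onopentag(raw);
      i = end + 1;
    } else {
      let next = input.indexOf("<", i);
      if (next === -1) next = input.length;
      cbs.ontext(input.slice(i, next));
      i = next;
    }
  }
}

// "Parser/Builder": listens to events and keeps a stack of open elements.
function toyParse(input: string): ToyNode[] {
  const root: ToyNode[] = [];
  const stack: ToyElement[] = [];
  const addNode = (node: ToyNode): void => {
    const parent = stack[stack.length - 1];
    (parent ? parent.children : root).push(node);
  };
  toyTokenize(input, {
    onopentag(name) {
      const el: ToyElement = { type: "element", tag: name, children: [] };
      addNode(el);
      stack.push(el);
    },
    onclosetag(name) {
      // Pop back to the matching open tag; unmatched close tags are ignored.
      const idx = stack.map((e) => e.tag).lastIndexOf(name);
      if (idx !== -1) stack.length = idx;
    },
    ontext(text) {
      addNode({ type: "text", content: text });
    },
  });
  return root;
}

const toyAst = toyParse("<div><span>hi</span> there</div>");
```

Here `toyAst` contains a single `div` element whose children are a `span` (holding the text `hi`) and the text node `" there"`. The tokenizer never touches the tree and the builder never inspects characters, which is the separation the list above describes.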

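🧾 Final summary

Vue 3's `baseParse()` is a template parser that combines compiler-grade rigor with framework-level flexibility.

It brings together:

HTML state-machine parsing;
Vue-specific semantic rules;
Babel-based expression analysis;
a strictly structured AST.

📚 One-sentence summary

Vue's template parser is an HTML state machine that understands semantics: it can read `<div>`, and it also knows what `v-if` and `{{ msg }}` mean.

Parts of this article were generated with AI assistance, then compiled and reviewed by the author.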
