How CSS Handles White Space Characters

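This article takes a close look at how CSS processes white space. CSS only processes four white space characters: Space (U+0020), Tab (U+0009), line feed (LF, U+000A), and carriage return (CR, U+000D). Depending on the value of the white-space-collapse property, CSS collapses, preserves, or converts these characters, which directly affects text layout and rendering. The sections below cover the processing rules and where they matter in practice, then provide a demo page and the relevant spec text with plain-language notes.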

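CSS processes white space according to the white-space-collapse property:

| Value | Behavior |
| --- | --- |
| collapse (default) | All white space is converted to Space, then runs of Space are collapsed |
| preserve-breaks | LF is preserved; Tab, Space, and CR are converted to Space and collapsed |
| preserve-spaces | Space is preserved without collapsing; Tab, LF, and CR are converted to Space |
| preserve | All white space is preserved |
| break-spaces | Like preserve, but white space at the end of a line does not hang |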
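
The table can be read as a string transformation. The sketch below is a simplified model of it, not a browser implementation: processWhiteSpace is a hypothetical helper name, it handles a single run of text, it skips the later per-line trimming, and it assumes that under collapse every line feed turns into a space (the spec leaves that choice to the user agent).

```javascript
// Simplified model of what each white-space-collapse value does to a string.
// processWhiteSpace is a hypothetical helper, not a browser or library API.
function processWhiteSpace(text, mode) {
  switch (mode) {
    case "preserve":
    case "break-spaces":
      return text; // every white space character is kept as-is
    case "preserve-spaces":
      return text.replace(/[\t\n\r]/g, " "); // converted to spaces, never collapsed
    case "collapse":
    case "preserve-breaks": {
      let out = text.replace(/\r/g, " ");         // CR is treated like a space here
      out = out.replace(/[ \t]*\n[ \t]*/g, "\n"); // drop spaces/tabs touching an LF
      if (mode === "collapse") {
        out = out.replace(/\n+/g, " ");           // assumption: LF always becomes a space
      }
      out = out.replace(/\t/g, " ");              // remaining tabs become spaces
      return out.replace(/ {2,}/g, " ");          // keep only the first space of each run
    }
    default:
      throw new RangeError(`unknown white-space-collapse value: ${mode}`);
  }
}

console.log(JSON.stringify(processWhiteSpace("a \t b\n\n  c", "collapse")));        // "a b c"
console.log(JSON.stringify(processWhiteSpace("a \t b\n\n  c", "preserve-breaks"))); // "a b\n\nc"
```

Leading and trailing spaces that survive this step are removed later, line by line, during layout.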

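Key behaviors

1. White space affects inline element layout

White space affects more than text: it also shifts the horizontal position of inline elements.

html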
<style>
span {
    tab-size: 100px;
    white-space-collapse: preserve;
}
</style>
<span>A	<input>	</span>

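Because of the Tab characters, the <input> element is pushed to the right.

2. Tab handling

When white-space-collapse is preserve or break-spaces, Tab characters are preserved and push the following inline content horizontally. Every other value converts Tab to Space.

html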
<style>
.tab-example {
  tab-size: 4;
  white-space-collapse: preserve;
}
</style>
<div class="tab-example">
Name	Age	City
John	25	NYC
Alice	30	LA
</div>

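Tab layout rules

  • tab-size: 0: preserved tabs are not rendered
  • Tab stops sit at integer multiples of tab-size, measured from the starting content edge of the nearest block container
  • A tab advances to the next tab stop; if that stop is less than 0.5ch away, the one after it is used instead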
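
The tab layout rules amount to a small calculation. Here is a minimal sketch of it; nextTabStop and its parameters are hypothetical names, and all lengths are assumed to already be resolved to the same unit (a numeric tab-size such as 4 would first have to be multiplied by the advance width of a space).

```javascript
// Where does the content after a preserved tab start?
// x:       position where the tab begins, measured from the block container's
//          starting content edge
// tabSize: the tab size as a length
// ch:      advance width of the "0" glyph in the current font
function nextTabStop(x, tabSize, ch) {
  if (tabSize <= 0) return x; // tab-size: 0 means the tab is not rendered
  let stop = (Math.floor(x / tabSize) + 1) * tabSize; // first tab stop after x
  if (stop - x < 0.5 * ch) stop += tabSize;           // too close: use the following stop
  return stop;
}

console.log(nextTabStop(10, 100, 8)); // 100
console.log(nextTabStop(98, 100, 8)); // 200, because 100 is only 2 units away (less than 0.5ch = 4)
```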

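3. CR handling

The HTML parser replaces every CR and CRLF sequence with a single LF, but a CR can still reach the DOM through a character reference or JavaScript, so CSS still has to handle it.

When white-space-collapse is preserve or break-spaces, CR characters are preserved but do not affect layout (as if they were absent). Every other value converts CR to Space.

html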
<style>
.preserve-breaks { white-space-collapse: preserve-breaks; }
.preserve { white-space-collapse: preserve; }
</style>

<div class="preserve-breaks">&#x000D;&#x000D;A&#x000D;&#x000D;B&#x000D;&#x000D;</div>
<div class="preserve">&#x000D;&#x000D;A&#x000D;&#x000D;B&#x000D;&#x000D;</div>
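
To see why the example above has to spell CR as &#x000D;, it helps to model the newline normalization the HTML parser applies to its input. The sketch below is an illustration only; normalizeNewlines is a hypothetical name, not a platform API.

```javascript
// Models the parser's newline normalization: every CRLF pair and every lone CR
// in the source text becomes a single LF.
function normalizeNewlines(source) {
  return source.replace(/\r\n?/g, "\n");
}

console.log(JSON.stringify(normalizeNewlines("a\r\nb\rc"))); // "a\nb\nc"

// A CR written as a character reference is decoded after this step, so it
// survives into the DOM. Script can also insert one directly; this line is a
// comment because it needs a browser DOM:
//   element.textContent = "A\rB";
```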

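4. LF handling

When white-space-collapse is collapse or preserve-spaces, LF characters are converted to Space. Every other value preserves LF.

5. Space collapsing

When white-space-collapse is collapse or preserve-breaks, Space characters are collapsed. Every other value preserves them.

Collapsing rules

  1. Spaces immediately before or after an LF are removed (the first line behaves as if preceded by an LF, the last line as if followed by one)
html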
<style>
.collapse { white-space-collapse: collapse; }
.preserve-breaks { white-space-collapse: preserve-breaks; }
.preserve { white-space-collapse: preserve; }
</style>

<div class="collapse">  &#x000A;&#x000A; A &#x000A;&#x000A;  B  &#x000A;&#x000A; </div>
<div class="preserve-breaks">  &#x000A;&#x000A; A &#x000A;&#x000A;  B  &#x000A;&#x000A; </div>
<div class="preserve">  &#x000A;&#x000A; A &#x000A;&#x000A;  B  &#x000A;&#x000A;  </div>
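  2. Spaces collapse forward: in a run of consecutive spaces, only the first is kept

Cross-element collapsing example

html
<!-- Where the space between the two letters ends up. Div 1: no space; div 2: in the first span; div 3: in the second span; div 4: in the first span -->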
<div><span> A</span><span>A </span></div>
<div><span> A </span><span> A </span></div>
<div><span> A</span><span> A </span></div>
<div><span> A </span><span>A </span></div>
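
The four cases above can be checked with a small model. This is a sketch under stated assumptions: collapseAcrossInlines is a hypothetical helper, each array entry stands for the text of one inline on a single line, only U+0020 spaces occur, and white-space-collapse is collapse.

```javascript
// Decides which inline keeps the space when spaces meet across element boundaries.
function collapseAcrossInlines(runs) {
  let prevIsSpace = false;
  // A space that directly follows another space is dropped, even if the
  // previous space belongs to an earlier run.
  const out = runs.map((run) => {
    let kept = "";
    for (const ch of run) {
      if (ch === " " && prevIsSpace) continue;
      kept += ch;
      prevIsSpace = ch === " ";
    }
    return kept;
  });
  // Line trimming: remove the single space left at the start and end of the line.
  const first = out.findIndex((run) => run !== "");
  if (first !== -1 && out[first].startsWith(" ")) out[first] = out[first].slice(1);
  const last = out.map((run) => run !== "").lastIndexOf(true);
  if (last !== -1 && out[last].endsWith(" ")) out[last] = out[last].slice(0, -1);
  return out;
}

console.log(collapseAcrossInlines([" A", "A "]));   // [ 'A', 'A' ]
console.log(collapseAcrossInlines([" A ", " A "])); // [ 'A ', 'A' ]
console.log(collapseAcrossInlines([" A", " A "]));  // [ 'A', ' A' ]
console.log(collapseAcrossInlines([" A ", "A "]));  // [ 'A ', 'A' ]
```

In a browser the dropped spaces still exist in the DOM; they are rendered with zero advance width rather than deleted.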

Collapsing across an element and plain text

html
<!-- Where the space between the two letters ends up. Div 1: no space; div 2: in the span; div 3: in the div's own text; div 4: in the span -->
<div><span> A</span>A </div>
<div><span> A </span> A </div>
<div><span> A</span> A </div>
<div><span> A </span>A </div>

Collapsing in RTL direction

html
<style>
#container { direction: rtl; }
</style>
<div id="container">
<!-- Where the space between the two letters ends up. Div 1: no space; div 2: in the span; div 3: in the div's own text; div 4: in the span -->
  <div><span> A</span>B </div>
  <div><span> A </span> B </div>
  <div><span> A</span> B </div>
  <div><span> A </span>B </div>
</div>

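6. break-spaces vs preserve

The key difference from preserve is that with break-spaces, white space at the end of a line does not hang.

| Aspect | break-spaces | preserve |
| --- | --- | --- |
| White space at line end | Always takes up space | May hang (takes up no space) |
| Wrap opportunities | After every white space character | Only at the end of each run of white space |
| Box size | Affects min-content and max-content | Hanging white space may not affect the size |

html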
<div style="width: 100px; white-space-collapse: preserve;">  Hello    World    
</div>

<div style="width: 100px; white-space-collapse: break-spaces;">  Hello    World    
</div>
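
The difference in wrap opportunities can be listed explicitly. The sketch below only looks at opportunities created by spaces and tabs and ignores every other line breaking rule; whiteSpaceWrapOpportunities is a hypothetical helper name.

```javascript
// Returns the string indices before which a line may wrap because of white space.
// mode is "preserve" or "break-spaces".
function whiteSpaceWrapOpportunities(text, mode) {
  const isSpaceOrTab = (ch) => ch === " " || ch === "\t";
  const breaks = [];
  for (let i = 0; i < text.length; i++) {
    if (!isSpaceOrTab(text[i])) continue;
    const endsRun = i + 1 === text.length || !isSpaceOrTab(text[i + 1]);
    // break-spaces: after every space or tab; preserve: only after the whole run
    if (mode === "break-spaces" || endsRun) breaks.push(i + 1);
  }
  return breaks;
}

console.log(whiteSpaceWrapOpportunities("ab  cd ", "preserve"));     // [ 4, 7 ]
console.log(whiteSpaceWrapOpportunities("ab  cd ", "break-spaces")); // [ 3, 4, 7 ]
```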

7. CSS only processes Space, Tab, CR, and LF

Note: JavaScript's replace(/\s+/g, ' ') replaces every white space character, including ones CSS leaves alone, such as the no-break space (U+00A0).
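
A short comparison makes the mismatch concrete. Both function names below are hypothetical; the second one restricts the character class to the four characters CSS processes.

```javascript
// \s also matches U+00A0, U+3000 and other Unicode space separators.
function collapseWithBackslashS(text) {
  return text.replace(/\s+/g, " ");
}

// Only Space, Tab, LF, and CR, which is the set CSS white space processing touches.
function collapseCssWhiteSpace(text) {
  return text.replace(/[ \t\n\r]+/g, " ");
}

const sample = "a \t\n b\u00A0\u00A0c\u3000d";
console.log(JSON.stringify(collapseWithBackslashS(sample))); // "a b c d"
console.log(collapseCssWhiteSpace(sample) === "a b\u00A0\u00A0c\u3000d"); // true
```

When a script needs to predict what a browser will render under white-space-collapse: collapse, the narrower character class is the closer match.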

The following demo shows how different white space characters are handled:

html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>White space processing demo</title>
    <style>
        .demo {
            border: 1px solid #ccc;
            margin: 10px 0;
            padding: 10px;
            background: #f9f9f9;
        }
        
        .normal {
            white-space: normal;
        }
        
        .code {
            font-family: monospace;
            background: #f0f0f0;
            padding: 5px;
        }
    </style>
</head>
<body>
    <h1>CSS white space processing demo</h1>
    
    <div class="demo">
        <h3>1. Space (U+0020): collapsed</h3>
        <div class="normal">
            <span class="code">"a    b    c"</span> → 
            <span>a    b    c</span>
        </div>
    </div>
    
    <div class="demo">
        <h3>2. Tab (U+0009): collapsed</h3>
        <div class="normal">
            <span class="code">"a\t\tb\t\tc"</span> → 
            <span>a		b		c</span>
        </div>
    </div>
    
    <div class="demo">
        <h3>3. Line feed (U+000A): a segment break, collapsed</h3>
        <div class="normal">
            <span class="code">"a\n\nb\n\nc"</span> → 
            <span>a

b

c</span>
        </div>
    </div>
    
    <div class="demo">
        <h3>4. No-break space (U+00A0): not collapsed</h3>
        <div class="normal">
            <span class="code">"a&nbsp;&nbsp;&nbsp;b&nbsp;&nbsp;&nbsp;c"</span> → 
            <span>a&nbsp;&nbsp;&nbsp;b&nbsp;&nbsp;&nbsp;c</span>
        </div>
    </div>
    
    <div class="demo">
        <h3>5. Ideographic space (U+3000): not collapsed</h3>
        <div class="normal">
            <span class="code">"a  b  c"</span> → 
            <span>a  b  c</span>
        </div>
    </div>
    
    <div class="demo">
        <h3>6. Zero-width space (U+200B): not collapsed</h3>
        <div class="normal">
            <span class="code">"a​​​b​​​c"</span> → 
            <span>a​​​b​​​c</span>
        </div>
    </div>
    
    <div class="demo">
        <h3>7. Mixed white space comparison</h3>
        <div class="normal">
            <p>Space + Tab + line feed:</p>
            <span class="code">"a   \t\n   b   \t\n   c"</span> → 
            <span>a  	
   b  	
   c</span>
        </div>
        
        <div class="normal">
            <p>No-break space + ideographic space:</p>
            <span class="code">"a&nbsp; b&nbsp; c"</span> → 
            <span>a&nbsp; b&nbsp; c</span>
        </div>
    </div> 
</body>
</html>

CSS spec excerpt with notes: The White Space Processing Rules

Phase I: Collapsing and Transformation

For each inline (including anonymous inlines; see CSS 2 § 9.2.2.1 Anonymous inline boxes [CSS2]) within an inline formatting context, white space characters are processed as follows prior to line breaking and bidi reordering, ignoring bidi formatting characters (characters with the Bidi_Control property [UAX9]) as if they were not there:

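In other words: for every inline in an inline formatting context, white space characters are processed before line breaking and bidi reordering, and bidi formatting characters are ignored during this step as if they were absent.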

  • If white-space-collapse is set to collapse or preserve-breaks, white space characters are considered collapsible and are processed by performing the following steps:
  1. Any sequence of collapsible spaces and tabs immediately preceding or following a segment break is removed.
  2. Collapsible segment breaks are transformed for rendering according to the segment break transformation rules.
  3. Every collapsible tab is converted to a collapsible space (U+0020).
  4. Any collapsible space immediately following another collapsible space---even one outside the boundary of the inline containing that space, provided both spaces are within the same inline formatting context---is collapsed to have zero advance width. (It is invisible, but retains its soft wrap opportunity, if any.)

In short, under collapse and preserve-breaks every white space character is collapsible. Spaces and tabs touching a segment break are removed first, segment breaks are transformed, tabs become spaces, and then each space that directly follows another space is collapsed to zero advance width, even when the two spaces sit in different inlines of the same inline formatting context. A collapsed space is invisible but keeps its soft wrap opportunity, if it has one.

  • If white-space-collapse is set to preserve-spaces, each tab and segment break is converted to a space.

Note that preserve-spaces does not keep tabs: every tab and every segment break becomes a space, and those spaces are then left uncollapsed.

  • If white-space-collapse is set to preserve or preserve-spaces, any sequence of spaces is treated as a sequence of non-breaking spaces except that a soft wrap opportunity exists at the end of each maximal sequence of spaces and/or tabs. For break-spaces, a soft wrap opportunity exists after every space and every tab.

In short, under preserve and preserve-spaces a run of spaces cannot be broken in the middle; a line may only wrap at the end of each maximal run of spaces and/or tabs. Under break-spaces, a line may wrap after every single space or tab.

Soft wrap opportunity: a position where the line is allowed to break if the content would otherwise overflow. It is not a forced line break.

Phase II: Trimming and Positioning

Then, the entire block is rendered. Inlines are laid out, taking bidi reordering into account, and wrapping as specified by the text-wrap-mode and text-wrap-style property. As each line is laid out,

In short: once Phase I is done, the block is laid out line by line, with bidi reordering and wrapping applied, and the following steps run for each line:

  1. A sequence of collapsible spaces at the beginning of a line is removed.

  2. If the tab size is zero, preserved tabs are not rendered. Otherwise, each preserved tab is rendered as a horizontal shift that lines up the start edge of the next glyph with the next tab stop. If this distance is less than 0.5ch, then the subsequent tab stop is used instead. Tab stops occur at points that are multiples of the tab size from the starting content edge of the preserved tab's nearest block container ancestor. The tab size is given by the tab-size property.

  3. A sequence of collapsible spaces at the end of a line is removed, as well as any trailing U+1680 OGHAM SPACE MARK whose white-space-collapse property is collapse or preserve-breaks.

  4. If there remains any sequence of white space, other space separators, and/or preserved tabs at the end of a line (after bidi reordering [CSS-WRITING-MODES-4]):

  • If white-space-collapse is collapse or preserve-breaks, the UA must hang this sequence (unconditionally).
  • If white-space-collapse is preserve and text-wrap-mode is not nowrap, the UA must (unconditionally) hang this sequence, unless the sequence is followed by a forced line break, in which case it must conditionally hang the sequence instead. It may also visually collapse the character advance widths of any that would otherwise overflow.
  • If white-space-collapse is set to break-spaces, spaces, tabs, and other space separators are treated the same as other visible characters: they cannot hang nor have their advance width collapsed.

Notes on these steps:

  • Steps 1 and 3 only remove collapsible spaces, so they apply under collapse and preserve-breaks; preserved spaces are never trimmed.
  • Step 2: each preserved tab shifts the next glyph to the next tab stop, skipping a stop that is less than 0.5ch away. Tab stops are multiples of tab-size from the starting content edge of the nearest block container ancestor.
  • Step 4: white space left at the end of a line hangs under collapse, preserve-breaks, and preserve (when wrapping is allowed); before a forced line break, preserve hangs it only conditionally. Under break-spaces it never hangs and never has its advance width collapsed.

Segment Break Transformation Rules

When white-space-collapse is not collapse, segment breaks are not collapsible. For values other than collapse or preserve-spaces (which transforms them into spaces), segment breaks are instead transformed into a preserved line feed (U+000A).

So segment breaks are collapsible only under collapse. preserve-spaces turns them into spaces, and the remaining values (preserve-breaks, preserve, break-spaces) turn them into preserved LF characters.

When white-space-collapse is collapse, segment breaks are collapsible, and are collapsed as follows:

  1. First, any collapsible segment break immediately following another collapsible segment break is removed.

  2. Then any remaining segment break is either transformed into a space (U+0020) or removed depending on the context before and after the break. The rules for this operation are UA-defined in this level.

In short, a run of consecutive segment breaks first shrinks to a single break, and that break then becomes either a space or nothing depending on the surrounding text. The exact choice is left to the user agent at this level of the spec.
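
Because the second step is UA-defined, one way to sketch the transformation is to pass that decision in as a function. Everything here is illustrative: transformSegmentBreaks, the policy parameter, and hanAwarePolicy are hypothetical names, the Han-script policy is only an example of what a user agent might choose, and the input is assumed to have no spaces or tabs next to its line feeds (Phase I removes them earlier).

```javascript
// policy(before, after) returns " " or "" for the one break left between two characters.
function transformSegmentBreaks(text, mode, policy = () => " ") {
  if (mode === "preserve-spaces") return text.replace(/\n/g, " ");
  if (mode !== "collapse") return text; // breaks stay as preserved line feeds
  return text
    .replace(/\n+/g, "\n") // step 1: a break following another break is removed
    .replace(/\n/g, (match, offset, whole) =>
      policy(whole[offset - 1], whole[offset + 1])); // step 2: space or nothing
}

// Example policy (an assumption, not a rule from the spec text quoted above):
// drop the break when both neighbours are Han characters.
const isHan = (ch) => ch !== undefined && /\p{Script=Han}/u.test(ch);
const hanAwarePolicy = (before, after) => (isHan(before) && isHan(after) ? "" : " ");

console.log(JSON.stringify(transformSegmentBreaks("a\n\n\nb", "collapse")));             // "a b"
console.log(JSON.stringify(transformSegmentBreaks("中\n文", "collapse", hanAwarePolicy))); // "中文"
```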
