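Under the hood, Kafka does not put messages one by one into a traditional queue; it writes each Topic's data to Partitions as append-only logs. Understanding this storage layout explains why Kafka can sustain high throughput and why its logs can be cleaned up by time or by size.
In one sentence: a Topic's data lives in Partitions, and each Partition is split into multiple Segments; each Segment typically consists of a .log data file, an .index offset index, and a .timeindex time index. Segmenting makes lookups faster and makes deleting expired logs easier.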

[Diagram: Topic itheima contains Partition 0 and Partition 1; a Partition contains Segment 0 and Segment 1; a Segment contains 000.log (data file), 000.index (offset index), and 000.timeindex (time index).]
The relationship between Topic, Partition, and Segment
Kafka's storage structure can be pictured like this:

    Topic
    ├── Partition 0
    │   ├── Segment 0
    │   │   ├── .log
    │   │   ├── .index
    │   │   └── .timeindex
    │   └── Segment 1
    ├── Partition 1
    └── Partition 2

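| Level | Role |
|---|---|
| Topic | A business subject, such as order events or user behavior |
| Partition | A physical shard of a Topic; the unit of parallelism |
| Segment | A slice of a Partition's log; the unit for lookup and cleanup |
| .log | Stores the actual message data |
| .index | Sparse index from Offset to physical position in the .log file |
| .timeindex | Index from timestamp to Offset |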
Partitions are the basis of Kafka's parallelism; Segments are the basis of how Kafka manages files on disk.
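
To make the on-disk layout concrete, here is a minimal Python sketch of how the file names of one Segment can be derived. It relies on two Kafka conventions: a Partition's directory is named `<topic>-<partition>`, and each Segment's files are named after the Segment's base offset, zero-padded to 20 digits. The function names `partition_dir` and `segment_file_names` are made up for illustration and are not part of any Kafka API.

```python
def partition_dir(topic: str, partition: int) -> str:
    """Directory name of one partition under the broker's log directory."""
    return f"{topic}-{partition}"


def segment_file_names(base_offset: int) -> dict:
    """Return the three file names of the segment starting at base_offset."""
    if base_offset < 0:
        raise ValueError("base_offset must be non-negative")
    stem = f"{base_offset:020d}"  # base offset, zero-padded to 20 digits
    return {
        "log": f"{stem}.log",              # message data
        "index": f"{stem}.index",          # offset -> file position
        "timeindex": f"{stem}.timeindex",  # timestamp -> offset
    }
```

For example, `partition_dir("itheima", 0)` gives `itheima-0`, and the first Segment of that Partition (base offset 0) has a stem of twenty zeros, which is what the diagram abbreviates as 000.log.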
Why segment the log
If a Partition mapped to a single huge file, both lookup and deletion would be cumbersome.
Segmenting brings two clear benefits:
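| Benefit | Explanation |
|---|---|
| Easier lookup | Locate the Segment first, then use its index to locate the message |
| Easier deletion | An old Segment that holds only expired data can be deleted as a whole |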
[Diagram: look up offset=10520 → locate the owning Segment → consult .index → jump to the corresponding physical position in .log]
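
The two-step lookup described above (locate the Segment, then consult its index) can be sketched in Python as two binary searches. This is a simplified in-memory model, not Kafka's implementation: the `SEGMENTS` data is invented, and the index is modelled as a list of (relative offset, byte position) pairs, which mirrors the idea of a sparse index that stores only some offsets.

```python
from bisect import bisect_right

# Hypothetical model of one partition: each segment has a base offset and a
# sparse index of (offset relative to base, byte position in .log) entries.
SEGMENTS = [
    {"base": 0,     "index": [(0, 0), (4000, 1_048_000)]},
    {"base": 10000, "index": [(0, 0), (500, 65_000), (900, 131_000)]},
    {"base": 20000, "index": [(0, 0)]},
]


def locate(segments, target_offset):
    """Return (segment_base, byte_position) from which to start scanning."""
    bases = [s["base"] for s in segments]
    # Step 1: the owning segment is the last one whose base offset <= target.
    i = bisect_right(bases, target_offset) - 1
    if i < 0:
        raise KeyError(f"offset {target_offset} is below the first segment")
    seg = segments[i]
    # Step 2: in the sparse index, take the last entry <= the relative offset.
    rel = target_offset - seg["base"]
    rel_offsets = [entry[0] for entry in seg["index"]]
    j = bisect_right(rel_offsets, rel) - 1
    position = seg["index"][j][1] if j >= 0 else 0
    # Step 3 (not modelled): seek to `position` in the .log file and scan
    # forward until the record with target_offset is reached.
    return seg["base"], position
```

With this sample data, `locate(SEGMENTS, 10520)` selects the Segment with base offset 10000 and the index entry for relative offset 500, so scanning would start at byte 65000 of that Segment's .log file. Because the index is sparse, the final step is always a short forward scan rather than a direct hit.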
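This is why Kafka's log cleanup can usually run at Segment granularity rather than deleting messages one at a time.
Log cleanup strategy 1: by retention time
The courseware lists time as the first cleanup strategy: once messages have been kept in Kafka longer than the configured time, cleanup is triggered.
The default retention time is typically 168 hours, that is, 7 days (the broker setting log.retention.hours).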
[Diagram: Segment finished writing → wait out the retention period → has the retention time been exceeded? No: keep it. Yes: delete the expired Segment.]
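This strategy suits most log, behavioral-data, and event-stream scenarios: the business only cares about a recent window of data, and anything past the retention period can be cleaned up.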
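
As a rough illustration of time-based cleanup at Segment granularity, the Python sketch below selects the Segments that would be eligible for deletion. It is a simplification under stated assumptions: the `Segment` class and `expired_by_time` function are hypothetical, a Segment's age is judged by the timestamp of its newest record, and the last (active) Segment is simply skipped.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    size_bytes: int
    max_timestamp_ms: int  # timestamp of the newest record in the segment


def expired_by_time(segments, now_ms, retention_ms):
    """Return the closed segments whose newest record is past retention.

    `segments` is ordered oldest first; the last one is treated as the
    active segment and is never selected in this sketch.
    """
    expired = []
    for seg in segments[:-1]:
        if now_ms - seg.max_timestamp_ms > retention_ms:
            expired.append(seg)
        else:
            break  # only a contiguous run of the oldest segments is removed
    return expired
```

Judging a Segment by its newest record means that no message is dropped before it has been kept for the full retention period, at the cost of some messages staying on disk a little longer than the configured time.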
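Log cleanup strategy 2: by storage size
The second strategy is based on how much space the Topic's log occupies: when the log size exceeds a threshold, Kafka deletes the oldest data. Note that the threshold (log.retention.bytes on the broker, retention.bytes per Topic) is enforced per Partition rather than across the whole Topic, and it defaults to -1, meaning no size limit.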
[Diagram: Topic log keeps growing → size threshold exceeded? No: keep writing. Yes: delete starting from the oldest Segment.]
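Size-based cleanup is typically used to control disk cost. It has to be configured against the retention window the business can tolerate; otherwise data may be purged before downstream consumers have had a chance to process it.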
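
Size-based cleanup can be sketched the same way. The Python function below is a simplified model, not Kafka's code: `expired_by_size` is a hypothetical name, sizes are plain integers, and the rule assumed here is that an old Segment is dropped only if the log would still hold at least the threshold afterwards.

```python
def expired_by_size(segment_sizes, retention_bytes):
    """Return how many of the oldest segments to delete.

    `segment_sizes` lists the byte size of each segment, oldest first.
    A segment is dropped only if the log still holds at least
    `retention_bytes` after dropping it.
    """
    if retention_bytes < 0:  # a negative value means "no size limit"
        return 0
    excess = sum(segment_sizes) - retention_bytes
    count = 0
    for size in segment_sizes[:-1]:  # the last (active) segment is kept
        if excess - size < 0:
            break
        excess -= size
        count += 1
    return count
```

With four Segments of 100 units each and a threshold of 250, only the oldest Segment is removed, leaving 300 units on disk. Under this rule the log can therefore sit above the threshold by up to almost one Segment, which is worth remembering when sizing disks.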
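Engineering implications of the cleanup mechanism
Kafka does not delete a message as soon as it has been consumed. A consumer merely commits its own Offset; the message stays in Kafka until a retention policy triggers.
This has two important consequences: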
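| Consequence | Explanation |
|---|---|
| Messages can be re-consumed | As long as the log still exists, the Offset can be reset to consume again |
| Disk must be planned | High-throughput Topics need retention time and disk capacity estimated up front |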
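
The disk-planning row can be turned into a back-of-the-envelope calculation. The Python sketch below is only a starting point under assumptions not stated in the text above: the replication factor and the `headroom` safety margin are parameters introduced here for illustration, the write rate is assumed constant, and compression and index files are ignored.

```python
def estimate_disk_bytes(write_mb_per_sec, retention_hours,
                        replication_factor, headroom=0.3):
    """Rough cluster-wide disk need for one topic under time-based retention."""
    seconds = retention_hours * 3600
    raw = write_mb_per_sec * 1024 * 1024 * seconds  # bytes written in the window
    return raw * replication_factor * (1 + headroom)


# Example: 10 MB/s sustained, 168 hours of retention, 3 replicas.
tib = estimate_disk_bytes(10, 168, 3) / 2**40
print(f"about {tib:.1f} TiB")
```

For these example inputs the estimate comes out at roughly 22.5 TiB across the cluster, which shows how quickly the default 7-day retention adds up for a busy Topic.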
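If the business needs to backfill data, for example after fixing a bug in a consumer program, the consumer group's Offset can be rewound to an earlier position and the data consumed again. The precondition is that the old logs have not yet been cleaned up.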
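
Whether a replay is still possible comes down to comparing the Offset you want to rewind to with the oldest Offset still on disk. The Python sketch below expresses that check for a single Partition; `plan_replay` and its parameter names are hypothetical, and in practice the two boundary Offsets would be obtained from the cluster before the consumer group is reset.

```python
def plan_replay(requested_offset, log_start_offset, log_end_offset):
    """Decide where a replay can actually start in one partition.

    log_start_offset: oldest offset still on disk after cleanup.
    log_end_offset:   offset the next produced record would receive.
    """
    if requested_offset >= log_end_offset:
        # Nothing to replay: the requested position is at or past the end.
        return {"start": log_end_offset, "lost": 0}
    start = max(requested_offset, log_start_offset)
    # `lost` counts the offsets that cleanup has already removed.
    return {"start": start, "lost": start - requested_offset}
```

For instance, asking to replay from Offset 5000 when cleanup has already advanced the log start to 8000 yields a start of 8000 and 3000 Offsets that can no longer be recovered from Kafka.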
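Interview answer template
You can answer like this:
Kafka stores data in a three-level structure of Topic, Partition, and Segment. A Topic is split into multiple Partitions, and each Partition is further divided into multiple Segments on disk. Each Segment typically contains a .log data file, an .index offset index file, and a .timeindex time index file. Segmenting keeps individual files small, speeds up lookups, and makes it easy to clean up expired data. Kafka's log cleanup has two main strategies: first, by retention time, where messages kept longer than the configured time are deleted (the default is typically 168 hours); second, by log size, where the oldest data is deleted once the log exceeds a threshold, which is enforced per Partition. A consumer committing its Offset does not cause messages to be deleted immediately; whether a message is deleted is decided by the log retention policy.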
Summary
Kafka's storage structure can be remembered in one sentence:
[Diagram: Topic → Partition → Segment → .log, .index, .timeindex]
Partitions provide parallelism, Segments handle file management, and Retention handles cleanup.