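Introduction: Why do we need a dedicated sensitive-word filtering system?
With the explosive growth of online content, user-generated content (UGC) platforms face unprecedented content-safety challenges. A single piece of inappropriate content can expose the platform to legal risk and seriously damage its reputation. Simple keyword replacement no longer meets the needs of modern applications; we need a sensitive-word filtering solution that is scalable, efficient, and easy to maintain.
This article walks through a hands-on build: a Spring Boot back-end filtering system that combines the strategy pattern with the chain-of-responsibility pattern, paired with real-time detection on a Vue3 + TypeScript front end, so that content is checked on both sides.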
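1. Back-end design: a multi-layer filtering system with Spring Boot
1.1 Core architecture
We use the strategy pattern to define the different filtering algorithms and the chain-of-responsibility pattern to link multiple filters together, which keeps the filtering system highly extensible.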
java
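// Filtering strategy interface
public interface SensitiveWordFilter {
    /**
     * Filters sensitive words.
     * @param content the original content
     * @return the filtered content
     */
    String filter(String content);

    /**
     * Checks whether the content contains a sensitive word.
     * @param content the content to check
     * @return true if the content contains a sensitive word
     */
    boolean hasSensitiveWord(String content);
}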
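1.2 A sensitive-word filter based on the DFA algorithm
The DFA (Deterministic Finite Automaton) algorithm is the core of the sensitive-word filter. The word list is compiled once into a tree keyed by character, so checking a position in the text only requires following map lookups character by character instead of comparing the text against every word in the list.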
java
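import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import javax.annotation.PostConstruct; // jakarta.annotation.PostConstruct on Spring Boot 3+
import org.springframework.stereotype.Component;

/**
 * Sensitive-word filter based on the DFA algorithm.
 * It plugs into the chain of responsibility, so it can be combined with other filters.
 */
@Component
public class DFAFilter implements SensitiveWordFilter {
    // DFA sensitive-word tree
    private Map<Object, Object> sensitiveWordMap = new HashMap<>();
    // Minimum match rule: a match is reported as soon as any sensitive word is found
    public static final int MIN_MATCH_TYPE = 1;
    // Maximum match rule: match the longest sensitive word
    public static final int MAX_MATCH_TYPE = 2;

    /**
     * Initializes the DFA sensitive-word tree once the bean has been constructed.
     */
    @PostConstruct
    public void init() {
        Set<String> sensitiveWords = loadSensitiveWords(); // load from a database or file
        addSensitiveWordsToHashMap(sensitiveWords);
    }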
    /**
     * Builds the DFA tree from the sensitive-word list.
     * Structure for the single word "敏感词": {敏={isEnd=0, 感={isEnd=0, 词={isEnd=1}}}}
     */
    @SuppressWarnings("unchecked") // child nodes are stored as Object and cast back to Map
    private void addSensitiveWordsToHashMap(Set<String> sensitiveWords) {
        sensitiveWordMap = new HashMap<>(sensitiveWords.size());
        for (String word : sensitiveWords) {
            Map<Object, Object> nowMap = sensitiveWordMap;
            for (int i = 0; i < word.length(); i++) {
                char keyChar = word.charAt(i);
                Object wordMap = nowMap.get(keyChar);
                if (wordMap != null) {
                    // The character already has a node at this depth: descend into it
                    nowMap = (Map<Object, Object>) wordMap;
                } else {
                    // First time this character appears at this depth: create its node
                    Map<Object, Object> newMap = new HashMap<>();
                    newMap.put("isEnd", "0");
                    nowMap.put(keyChar, newMap);
                    nowMap = newMap;
                }
                if (i == word.length() - 1) {
                    nowMap.put("isEnd", "1"); // last character of the word
                }
            }
        }
    }
    /**
     * Checks whether the text contains a sensitive word.
     */
    @Override
    public boolean hasSensitiveWord(String text) {
        for (int i = 0; i < text.length(); i++) {
            // A positive result means a sensitive word starts at position i
            int matchFlag = checkSensitiveWord(text, i, MIN_MATCH_TYPE);
            if (matchFlag > 0) {
                return true;
            }
        }
        return false;
    }

    /**
     * Filters sensitive words, replacing them with '*'.
     */
    @Override
    public String filter(String text) {
        return filter(text, '*');
    }
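    /**
     * Filters sensitive words with a custom replacement character.
     */
    public String filter(String text, char replaceChar) {
        String result = text;
        Set<String> set = getSensitiveWords(text);
        for (String word : set) {
            String replaceString = getReplaceChars(replaceChar, word.length());
            // String.replace treats both arguments literally; replaceAll would
            // interpret the word as a regular expression
            result = result.replace(word, replaceString);
        }
        return result;
    }
    // Other helper methods...
}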
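The class above leaves its helpers (`checkSensitiveWord`, `getSensitiveWords`, `getReplaceChars`) elided. The following is a minimal standalone sketch of how the lookup side could work, not the original helper code. It assumes that `checkSensitiveWord` returns the length of the match starting at a given index (0 when there is none), and it swaps the `Map<Object, Object>` tree for a small typed `Node` class; the class name `DfaSketch` is hypothetical. It needs Java 9 or later for `Set.of`.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Standalone sketch of trie-based matching; an illustration, not the article's helper code. */
class DfaSketch {
    static final int MIN_MATCH_TYPE = 1; // stop at the shortest word found
    static final int MAX_MATCH_TYPE = 2; // keep going to the longest word found

    /** One tree node: child nodes by character plus an end-of-word flag. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean end;
    }

    private final Node root = new Node();

    DfaSketch(Set<String> words) {
        for (String word : words) {
            if (word.isEmpty()) continue; // an empty word would match everywhere
            Node node = root;
            for (char c : word.toCharArray()) {
                node = node.children.computeIfAbsent(c, k -> new Node());
            }
            node.end = true;
        }
    }

    /** Returns the length of the sensitive word starting at beginIndex, or 0 if none. */
    int checkSensitiveWord(String text, int beginIndex, int matchType) {
        Node node = root;
        int matched = 0;
        for (int i = beginIndex; i < text.length(); i++) {
            node = node.children.get(text.charAt(i));
            if (node == null) break;            // no word continues with this character
            if (node.end) {
                matched = i - beginIndex + 1;   // a complete word ends here
                if (matchType == MIN_MATCH_TYPE) break;
            }
        }
        return matched;
    }

    /** Collects the sensitive words found in the text, taking the longest match at each position. */
    Set<String> getSensitiveWords(String text) {
        Set<String> found = new LinkedHashSet<>();
        int i = 0;
        while (i < text.length()) {
            int len = checkSensitiveWord(text, i, MAX_MATCH_TYPE);
            if (len > 0) {
                found.add(text.substring(i, i + len));
                i += len;
            } else {
                i++;
            }
        }
        return found;
    }

    /** Masks every matched span in one pass, without going through string replacement. */
    String filter(String text, char replaceChar) {
        StringBuilder out = new StringBuilder(text);
        int i = 0;
        while (i < text.length()) {
            int len = checkSensitiveWord(text, i, MAX_MATCH_TYPE);
            for (int j = i; j < i + len; j++) {
                out.setCharAt(j, replaceChar);
            }
            i += Math.max(len, 1);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        DfaSketch dfa = new DfaSketch(Set.of("bad", "badge"));
        System.out.println(dfa.filter("a badge, a bad day", '*')); // a *****, a *** day
    }
}
```

Masking by position, as `filter` does here, sidesteps a side effect of collecting words first and replacing them afterwards: when one sensitive word is a prefix of another, replacing the shorter one first can leave part of the longer one unmasked.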
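1.3 Managing the filter chain
java
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

/**
 * Filter chain.
 * Combines multiple filters using the chain-of-responsibility pattern.
 */
@Component
public class FilterChain {
    // Spring injects every SensitiveWordFilter bean in the application context
    @Autowired
    private List<SensitiveWordFilter> filters;

    /**
     * Runs the content through every filter in turn.
     */
    public String process(String content) {
        String result = content;
        for (SensitiveWordFilter filter : filters) {
            result = filter.filter(result);
        }
        return result;
    }

    /**
     * Checks whether any filter detects a sensitive word.
     */
    public boolean check(String content) {
        for (SensitiveWordFilter filter : filters) {
            if (filter.hasSensitiveWord(content)) {
                return true;
            }
        }
        return false;
    }
}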
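To see the chain's behavior without starting a Spring context, it can be wired by hand. The sketch below is only an illustration: `ChainSketch` and `FixedWordFilter` are hypothetical names, and the interface is repeated so that the example compiles on its own. It needs Java 11 or later for `String.repeat`.

```java
import java.util.List;

/** Hand-wired equivalent of FilterChain, without Spring injection. */
class ChainSketch {
    private final List<SensitiveWordFilter> filters;

    ChainSketch(List<SensitiveWordFilter> filters) {
        this.filters = List.copyOf(filters);
    }

    String process(String content) {
        String result = content;
        for (SensitiveWordFilter filter : filters) {
            result = filter.filter(result); // each filter sees the previous filter's output
        }
        return result;
    }

    boolean check(String content) {
        return filters.stream().anyMatch(f -> f.hasSensitiveWord(content));
    }

    public static void main(String[] args) {
        ChainSketch chain = new ChainSketch(
                List.of(new FixedWordFilter("spam"), new FixedWordFilter("scam")));
        System.out.println(chain.process("spam or scam")); // **** or ****
        System.out.println(chain.check("hello"));          // false
    }
}

/** Same contract as the article's interface, repeated so the sketch is self-contained. */
interface SensitiveWordFilter {
    String filter(String content);
    boolean hasSensitiveWord(String content);
}

/** Hypothetical filter that masks one fixed word (case-sensitive, literal match). */
class FixedWordFilter implements SensitiveWordFilter {
    private final String word;

    FixedWordFilter(String word) {
        this.word = word;
    }

    @Override
    public String filter(String content) {
        return content.replace(word, "*".repeat(word.length()));
    }

    @Override
    public boolean hasSensitiveWord(String content) {
        return content.contains(word);
    }
}
```

Because every filter receives the output of the one before it, order matters: text that an earlier filter has already masked is no longer visible to later filters. In the Spring version, the order of the injected list can be controlled with `@Order` or the `Ordered` interface.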
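1.4 RESTful API
java
import lombok.Data;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

/**
 * REST controller for sensitive-word detection.
 */
@RestController
@RequestMapping("/api/sensitive-word")
public class SensitiveWordController {
    @Autowired
    private FilterChain filterChain;

    /**
     * Checks whether the text contains a sensitive word.
     */
    @PostMapping("/check")
    public ResponseEntity<CheckResult> check(@RequestBody CheckRequest request) {
        boolean hasSensitive = filterChain.check(request.getContent());
        return ResponseEntity.ok(new CheckResult(hasSensitive));
    }

    /**
     * Filters the sensitive words out of the text.
     */
    @PostMapping("/filter")
    public ResponseEntity<FilterResult> filter(@RequestBody FilterRequest request) {
        String filteredContent = filterChain.process(request.getContent());
        return ResponseEntity.ok(new FilterResult(filteredContent));
    }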
    // Request and response DTOs
    @Data
    public static class CheckRequest {
        private String content;
    }

    @Data
    public static class CheckResult {
        private boolean hasSensitiveWord;

        public CheckResult(boolean hasSensitiveWord) {
            this.hasSensitiveWord = hasSensitiveWord;
        }
    }
    // Other DTOs...
}
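The controller refers to `FilterRequest` and `FilterResult` but leaves them under "Other DTOs". The front-end composable in section 2.1 sends `{ content }` and reads `response.data.content`, so both DTOs presumably carry a `content` property. The following is a plain-Java sketch under that assumption, written without Lombok and as top-level classes so it compiles on its own; the field layout is a guess, not the author's code, and `DtoSketch` is a hypothetical demo class.

```java
/** Demo entry point; only shows how the two hypothetical DTOs are populated. */
class DtoSketch {
    public static void main(String[] args) {
        FilterRequest request = new FilterRequest();
        request.setContent("some text");
        FilterResult result = new FilterResult(request.getContent());
        System.out.println(result.getContent());
    }
}

/** Assumed request body for POST /api/sensitive-word/filter, e.g. {"content": "..."}. */
class FilterRequest {
    private String content;

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }
}

/** Assumed response body; the getter exposes a property named "content" to the serializer. */
class FilterResult {
    private final String content;

    FilterResult(String content) {
        this.content = content;
    }

    public String getContent() {
        return content;
    }
}
```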
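2. Front-end implementation: real-time detection with Vue3 + TypeScript
2.1 The sensitive-word detection composable
We wrap the sensitive-word detection logic in a composable built on the Vue3 Composition API.
typescript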
// composables/useSensitiveWord.ts
import { ref } from 'vue';
import api from '@/api';

interface CheckResult {
  hasSensitiveWord: boolean;
}

interface FilterResult {
  content: string;
}

/**
 * Sensitive-word detection composable.
 * Provides real-time detection and filtering.
 */
export const useSensitiveWord = () => {
  const isLoading = ref(false);
  const error = ref<string | null>(null);
  /**
   * Checks whether the text contains a sensitive word.
   */
  const checkSensitiveWord = async (content: string): Promise<boolean> => {
    if (!content.trim()) return false;
    isLoading.value = true;
    error.value = null;
    try {
      const response = await api.post<CheckResult>('/sensitive-word/check', { content });
      return response.data.hasSensitiveWord;
    } catch (err) {
      error.value = 'Sensitive-word check failed';
      console.error('Sensitive-word check error:', err);
      return false; // on error, the text is reported as containing no sensitive word
    } finally {
      isLoading.value = false;
    }
  };
  /**
   * Filters the sensitive words out of the text.
   */
  const filterSensitiveWord = async (content: string): Promise<string> => {
    if (!content.trim()) return content;
    isLoading.value = true;
    error.value = null;
    try {
      const response = await api.post<FilterResult>('/sensitive-word/filter', { content });
      return response.data.content;
    } catch (err) {
      error.value = 'Sensitive-word filtering failed';
      console.error('Sensitive-word filtering error:', err);
      return content; // on error, return the original content
    } finally {
      isLoading.value = false;
    }
  };

  return {
    isLoading,
    error,
    checkSensitiveWord,
    filterSensitiveWord
  };
};
2.2 Real-time detection input component
vue
<!-- components/SensitiveInput.vue -->
<template>
  <div class="sensitive-input">
    <textarea
      :value="modelValue"
      @input="onInput"
      @blur="onBlur"
      :class="{ 'has-sensitive': hasSensitiveWord }"
      :placeholder="placeholder"
      class="sensitive-textarea"
    ></textarea>
    <div v-if="hasSensitiveWord" class="sensitive-warning">
      ⚠️ The content contains sensitive words and has been flagged
    </div>
    <div v-if="isLoading" class="loading-indicator">
      Checking...
    </div>
  </div>
</template>
<script setup lang="ts">
// defineProps and defineEmits are compiler macros in <script setup>; they need no import
import { ref, watch } from 'vue';
import { useSensitiveWord } from '@/composables/useSensitiveWord';

const props = defineProps({
  modelValue: {
    type: String,
    default: ''
  },
  placeholder: {
    type: String,
    default: 'Enter content'
  },
  // Whether real-time detection is enabled
  realtimeCheck: {
    type: Boolean,
    default: true
  }
});

const emits = defineEmits(['update:modelValue', 'sensitive-detected']);
const { checkSensitiveWord, isLoading } = useSensitiveWord();
const hasSensitiveWord = ref(false);
// ReturnType<typeof setTimeout> is correct under both browser and Node typings
let checkTimeout: ReturnType<typeof setTimeout> | null = null;

// Input handling
const onInput = (event: Event) => {
  const value = (event.target as HTMLTextAreaElement).value;
  emits('update:modelValue', value);
  if (props.realtimeCheck) {
    // Debounce to avoid sending a request on every keystroke
    if (checkTimeout) {
      clearTimeout(checkTimeout);
    }
    checkTimeout = setTimeout(() => {
      checkContent(value);
    }, 500);
  }
};

// Check again when the field loses focus
const onBlur = () => {
  checkContent(props.modelValue);
};

// Run the check
const checkContent = async (content: string) => {
  if (!content) {
    hasSensitiveWord.value = false;
    return;
  }
  hasSensitiveWord.value = await checkSensitiveWord(content);
  emits('sensitive-detected', hasSensitiveWord.value);
};

// Clear the warning when the content is emptied
watch(() => props.modelValue, (newValue) => {
  if (!newValue) {
    hasSensitiveWord.value = false;
  }
});
</script>
<style scoped>
.sensitive-textarea {
  width: 100%;
  min-height: 100px;
  padding: 8px;
  border: 1px solid #ddd;
  border-radius: 4px;
}

.sensitive-textarea.has-sensitive {
  border-color: #ff4d4f;
  background-color: #fff2f0;
}

.sensitive-warning {
  color: #ff4d4f;
  font-size: 12px;
  margin-top: 4px;
}

.loading-indicator {
  font-size: 12px;
  color: #1890ff;
  margin-top: 4px;
}
</style>
2.3 Using sensitive-word detection in a page
vue
<!-- pages/CreatePost.vue -->
<template>
  <div class="create-post">
    <h2>Create a new post</h2>
    <form @submit.prevent="submitForm">
      <div class="form-group">
        <label>Title</label>
        <SensitiveInput
          v-model="formData.title"
          placeholder="Enter a title"
          @sensitive-detected="onTitleSensitiveDetected"
        />
      </div>
      <div class="form-group">
        <label>Content</label>
        <SensitiveInput
          v-model="formData.content"
          placeholder="Enter content"
          :realtime-check="true"
          @sensitive-detected="onContentSensitiveDetected"
        />
      </div>
      <div v-if="hasSensitiveWord" class="submit-warning">
        ⚠️ The content contains sensitive words; please edit it before submitting
      </div>
      <button
        type="submit"
        :disabled="hasSensitiveWord || isSubmitting"
        class="submit-button"
      >
        {{ isSubmitting ? 'Submitting...' : 'Submit' }}
      </button>
    </form>
  </div>
</template>
<script setup lang="ts">
import { ref, reactive, computed } from 'vue';
import SensitiveInput from '@/components/SensitiveInput.vue';
import { useSensitiveWord } from '@/composables/useSensitiveWord';
// submitForm calls api.createPost, so the API module must be imported here as well;
// createPost is assumed to be a method that the project's API module provides
import api from '@/api';

interface FormData {
  title: string;
  content: string;
}

const formData = reactive<FormData>({
  title: '',
  content: ''
});
const titleHasSensitive = ref(false);
const contentHasSensitive = ref(false);
const isSubmitting = ref(false);
const { filterSensitiveWord } = useSensitiveWord();

const hasSensitiveWord = computed(() => {
  return titleHasSensitive.value || contentHasSensitive.value;
});

const onTitleSensitiveDetected = (hasSensitive: boolean) => {
  titleHasSensitive.value = hasSensitive;
};

const onContentSensitiveDetected = (hasSensitive: boolean) => {
  contentHasSensitive.value = hasSensitive;
};
const submitForm = async () => {
  if (hasSensitiveWord.value) {
    return;
  }
  isSubmitting.value = true;
  try {
    // Filter once more before submitting, as an extra safeguard
    const filteredTitle = await filterSensitiveWord(formData.title);
    const filteredContent = await filterSensitiveWord(formData.content);
    // Call the API to submit the data
    await api.createPost({
      title: filteredTitle,
      content: filteredContent
    });
    // Handle a successful submission
    alert('Submitted successfully!');
    formData.title = '';
    formData.content = '';
  } catch (error) {
    console.error('Submission failed:', error);
    alert('Submission failed, please try again');
  } finally {
    isSubmitting.value = false;
  }
};
</script>
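3. Advanced extensions
3.1 Sensitive-word management back office
Implement dynamic create, read, update, and delete operations for sensitive words, with support for multiple matching modes (exact match, fuzzy match, pinyin match, and so on).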
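Editing words at runtime means the in-memory word structure has to change while requests are still being checked. One common way to handle this, sketched below under stated assumptions, is to build the new structure off to the side and publish it with a single reference swap. `ReloadableWordStore` and its `reload` hook are hypothetical names, and a plain `Set` with substring matching stands in for the DFA tree to keep the example short. It needs Java 10 or later for `Set.copyOf`.

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

/** Sketch: a word list that an admin back office can replace while requests are in flight. */
class ReloadableWordStore {
    // Readers always see one complete, immutable snapshot of the list.
    private final AtomicReference<Set<String>> snapshot = new AtomicReference<>(Set.of());

    /** Hypothetical hook, called after a word is added, edited, or deleted. */
    void reload(Set<String> latestWords) {
        // Build the new structure first, then publish it with a single reference swap.
        snapshot.set(Set.copyOf(latestWords));
    }

    /** Naive stand-in for the DFA lookup: true if any listed word occurs in the text. */
    boolean hasSensitiveWord(String text) {
        return snapshot.get().stream().anyMatch(text::contains);
    }

    public static void main(String[] args) {
        ReloadableWordStore store = new ReloadableWordStore();
        System.out.println(store.hasSensitiveWord("spam here")); // false: the list is empty
        store.reload(Set.of("spam"));
        System.out.println(store.hasSensitiveWord("spam here")); // true after the reload
    }
}
```

A request that started before the swap finishes against the old snapshot, and the next one sees the new list, so readers never observe a half-built structure.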
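3.2 Extending the filtering strategies
New filtering algorithms are easy to add. Each one is another implementation of SensitiveWordFilter; once it is registered as a Spring bean, it is injected into the FilterChain list along with the existing filters: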
java
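/**
 * Pinyin sensitive-word filter
 */
@Component
public class PinyinFilter implements SensitiveWordFilter {
    // Implement filter(...) and hasSensitiveWord(...) with the pinyin detection logic
}

/**
 * Look-alike character filter
 */
@Component
public class SimilarCharacterFilter implements SensitiveWordFilter {
    // Implement filter(...) and hasSensitiveWord(...) with the look-alike character detection logic
}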
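As one illustration of what such an extension might look like, the sketch below normalizes look-alike characters before delegating to another filter. It is an assumption-heavy example, not the `SimilarCharacterFilter` from the article: the four-entry mapping table, the wrapping (decorator) approach, and the names `LookalikeFilter` and `FixedWordFilter` are all hypothetical. It also assumes the wrapped filter masks character by character, so that positions line up with the original text. It needs Java 11 or later.

```java
import java.util.Map;

/** Sketch: maps look-alike characters to a canonical form, then delegates the matching. */
class LookalikeFilter implements SensitiveWordFilter {
    // Hypothetical mapping; a real table would be far larger and maintained as data.
    private static final Map<Character, Character> CANONICAL =
            Map.of('0', 'o', '@', 'a', '$', 's', '1', 'l');

    private final SensitiveWordFilter delegate;

    LookalikeFilter(SensitiveWordFilter delegate) {
        this.delegate = delegate;
    }

    /** One-to-one character mapping plus lower-casing, so the length never changes. */
    static String normalize(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            char mapped = CANONICAL.getOrDefault(c, Character.toLowerCase(c));
            sb.append(mapped);
        }
        return sb.toString();
    }

    @Override
    public String filter(String content) {
        String normalized = normalize(content);
        String masked = delegate.filter(normalized);
        if (masked.length() != content.length()) {
            return masked; // the delegate changed the length, so positions cannot be mapped back
        }
        // Copy only the masked positions, leaving the rest of the original text untouched.
        StringBuilder out = new StringBuilder(content);
        for (int i = 0; i < content.length(); i++) {
            if (masked.charAt(i) != normalized.charAt(i)) {
                out.setCharAt(i, masked.charAt(i));
            }
        }
        return out.toString();
    }

    @Override
    public boolean hasSensitiveWord(String content) {
        return delegate.hasSensitiveWord(normalize(content));
    }

    public static void main(String[] args) {
        SensitiveWordFilter filter = new LookalikeFilter(new FixedWordFilter("spam"));
        System.out.println(filter.filter("Buy $p@m now")); // Buy **** now
    }
}

/** Same contract as the article's interface, repeated so the sketch is self-contained. */
interface SensitiveWordFilter {
    String filter(String content);
    boolean hasSensitiveWord(String content);
}

/** Hypothetical inner filter that masks one fixed word. */
class FixedWordFilter implements SensitiveWordFilter {
    private final String word;

    FixedWordFilter(String word) {
        this.word = word;
    }

    @Override
    public String filter(String content) {
        return content.replace(word, "*".repeat(word.length()));
    }

    @Override
    public boolean hasSensitiveWord(String content) {
        return content.contains(word);
    }
}
```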
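3.3 Performance tips
- Cache the sensitive-word tree: cache the DFA tree structure in Redis to reduce initialization time
- Asynchronous detection: process long texts asynchronously so they do not block the main thread
- Tiered filtering: grade words by sensitivity and apply a different strategy to each level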
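The tiered-filtering idea can be sketched as a mapping from the most severe word found to an action. Everything in this example is hypothetical: the three levels, the four actions, the class name `GradedFilterSketch`, and the use of substring matching in place of the DFA lookup. It needs Java 10 or later for `Map.copyOf`.

```java
import java.util.Map;

/** Sketch: picks an action based on the most severe sensitive word found in the content. */
class GradedFilterSketch {
    enum Level { LOW, MEDIUM, HIGH }              // hypothetical severity levels, mildest first
    enum Action { PASS, MASK, REVIEW, REJECT }    // hypothetical handling strategies

    private final Map<String, Level> wordLevels;

    GradedFilterSketch(Map<String, Level> wordLevels) {
        this.wordLevels = Map.copyOf(wordLevels);
    }

    Action decide(String content) {
        Level worst = null;
        for (Map.Entry<String, Level> entry : wordLevels.entrySet()) {
            boolean present = content.contains(entry.getKey()); // naive stand-in for DFA matching
            if (present && (worst == null || entry.getValue().compareTo(worst) > 0)) {
                worst = entry.getValue();
            }
        }
        if (worst == null) {
            return Action.PASS;
        }
        switch (worst) {
            case HIGH:
                return Action.REJECT;  // block the submission outright
            case MEDIUM:
                return Action.REVIEW;  // hold the content for manual review
            default:
                return Action.MASK;    // replace the word and let the content through
        }
    }

    public static void main(String[] args) {
        GradedFilterSketch graded = new GradedFilterSketch(
                Map.of("darn", Level.LOW, "scam", Level.MEDIUM, "threat", Level.HIGH));
        System.out.println(graded.decide("a darn scam")); // REVIEW
    }
}
```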
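Summary
This article showed how to build a complete sensitive-word filtering system with Spring Boot and Vue3 + TypeScript. Applying the strategy and chain-of-responsibility patterns makes the system easy to extend and maintain. The DFA algorithm keeps detection efficient, and the separated front end and back end give users real-time feedback.
Key strengths:
- High performance: the DFA algorithm delivers millisecond-level detection
- Extensible: the strategy and chain-of-responsibility patterns make new features easy to add
- Front-end and back-end cooperation: real-time detection on the front end plus reliable filtering on the back end
- Easy to manage: supports dynamic word-list management
The same design applies beyond sensitive-word filtering. It can be extended to other text-processing scenarios such as ad filtering and spam detection, which makes it a practical reference.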