在Word中,用VBA比较两段文本的相似度

效果1:

去掉字符串中回车,进行改进后效果:

代码:

vbscript 复制代码
Function LevenshteinDistance(s As String, t As String) As Integer
    Dim d() As Integer
    Dim i As Integer
    Dim j As Integer
    Dim cost As Integer

    Dim sLen As Integer
    Dim tLen As Integer

    sLen = Len(s)
    tLen = Len(t)

    ReDim d(sLen, tLen)

    For i = 0 To sLen
        d(i, 0) = i
    Next i

    For j = 0 To tLen
        d(0, j) = j
    Next j

    For i = 1 To sLen
        For j = 1 To tLen
            If mid(s, i, 1) = mid(t, j, 1) Then
                cost = 0
            Else
                cost = 1
            End If

            d(i, j) = GetMinValue(GetMinValue(d(i - 1, j) + 1, d(i, j - 1) + 1), d(i - 1, j - 1) + cost)
        Next j
    Next i

    LevenshteinDistance = d(sLen, tLen)
End Function
Function GetMinValue(ByVal Num1, ByVal Num2)
    Dim MinValue As Double
    MinValue = Num1
    If Num2 < MinValue Then MinValue = Num2
    GetMinValue = MinValue
End Function
Function similarity1(s As String, t As String) As Double
    Dim maxLen As Integer
    Dim dist As Integer
    If Len(s) > Len(t) Then
        maxLen = Len(s)
        
    Else
        maxLen = Len(t)
    End If
    If maxLen = 0 Then
        similarity1 = 1#  ' 如果两个字符串都为空,视为完全相似
        Exit Function
    End If

    dist = LevenshteinDistance(s, t)
    similarity1 = 1# - (dist / maxLen)
End Function

Sub TestSimilarity()
    Dim str1 As String
    Dim str2 As String
    Dim similarity As Double

    str1 = ActiveDocument.Content.Paragraphs(1).Range.text
    str2 = ActiveDocument.Content.Paragraphs(3).Range.text
    str1 = Replace(str1, vbCr, "")
    str2 = Replace(str2, vbCr, "")
    
    similarity = similarity1(str1, str2)
    MsgBox "文本相似度: " & Format(similarity, "0.00%")
End Sub
相关推荐
CodeCraft Studio14 小时前
纯前端文档编辑组件——Spire.WordJS全新发布
前端·javascript·word·office·spire.wordjs·web文档编辑·在线文档编辑器
伟贤AI之路18 小时前
原创分享:Markdown 转 Word 工具,一键导出Word/PDF文档
pdf·word·markdown·markdown转
爱吃山竹的大肚肚18 小时前
使用 poi-tl 生成 Word 文档并上传到 Minio
word
我的golang之路果然有问题19 小时前
word中latex插入矩阵的语法问题
笔记·学习·矩阵·word·latex·template method·分享
wtsolutions19 小时前
Excel to JSON by WTSolutions 4.0.0 版本更新公告
json·excel·wps·插件·转换·加载项·wtsolutions
wtsolutions19 小时前
Excel to JSON by WTSolutions 4.0.0 Update Announcement
json·excel·wps·addin·wtsolutions·conversion
liyayou19 小时前
WPS Word根据模板生成不同日期数据页
word
程序员柒叔20 小时前
Dify知识库- Word文档处理
大模型·word·workflow·知识库·工作流·dify
程序员学长李白2 天前
WPS绿色纯净版(无联网功能) v10.1.0.6876
经验分享·电脑·wps·推荐
VBAMatrix2 天前
新一代邮件合并!按Word模板批量生成个性化文档
word·办公自动化·邮件合并·审计·报告工具·批量生成合同