源码解析：Fuse.js 是如何实现模糊搜索的？

大家的阅读是我发帖的动力，本文首发于我的博客：deerblog.gu-nami.com/，转载请注明出处。

🎈前言

最近一直都在用 Fuse.js 来进行表格搜索，有点好奇是怎么实现的，于是就翻了一下源码。

Fuse.js 是一个功能强大、轻量级、零依赖关系的 JS 模糊搜索库，适用于前后端各种 JS 项目。Fuse.js 根据匹配度、匹配位置、句子长度的因素，给每个搜索项评分并排序，得到模糊搜索的结果。举个利兹🕊：

ts 复制代码

const list = ['apple', 'orange', 'banana']
const fuse = new Fuse(list)
console.log(fuse.search('appnana'));
// [ { item: 'banana', refIndex: 2 }, { item: 'apple', refIndex: 0 } ]

也可以用于复杂对象的搜索：

ts 复制代码

const list = [
  { item: 'vue', tag: [ 'front end', 'browser' ] },
  { item: 'axios', tag: [ 'request lib' ] },
  { item: 'echart', tag: [ 'graph lib' ] },
  { item: 'naive ui', tag: [ 'UI lib' ] },
  { item: 'next', tag: [ 'full stack', 'server side render' ] }
]
const fuse = new Fuse(list, {
  keys: ['item', 'tag']
})
console.log(fuse.search('livue'));
// [
//   { item: { item: 'vue', tag: [Array] }, refIndex: 0 },
//   { item: { item: 'naive ui', tag: [Array] }, refIndex: 3 },
//   { item: { item: 'axios', tag: [Array] }, refIndex: 1 }
// ]

它的核心模糊匹配算法是源自 Bitap 算法。Bitap算法（也被称为 shift-or、shift-and 或 Baeza-Yates-Gonnet 算法）是一种模糊字符串搜索算法。该算法可以判断给定文本是否包含与给定模式"大致相等"的子字符串，这里的"大致相等"是根据子串和模式之间的 Levenshtein 编辑距离是否在给定的距离以内来判断的。Bitap 算法预先计算模式串字符位置，使用二进制运算来进行，速度非常的快，时间复杂度是 <math xmlns="http://www.w3.org/1998/Math/MathML"> O ( m n ) O(mn) </math>O(mn)， <math xmlns="http://www.w3.org/1998/Math/MathML"> m m </math>m、 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n 分别是目标串和模式串长度。

Levenshtein 编辑距离，就是两个字符串之间，通过增加、减少、替换字符，由一个转换成另一个所需的最少编辑操作次数。

下面以字符串数组为例，分析 Fuse.js 的算法。

✨初始化

我们一般先会创建Fuse的实例：

ts 复制代码

const list = [/*...*/]
const fuse = new Fuse(list, {/*...*/})

创建实例时，内部会根据传入的数据创建索引FuseIndex，它的数据结构大概长这样子：

ts 复制代码

{
  v: string // 待匹配的目标串
  i: number // 目标串在原数据的位置
  n: number // 范数（？），随目标串单词个数上升，影响搜索结果的排序
}[]

📑创建索引

函数入口在 src/core/index.js。创建Fuse实例后，根据传入的docs构建索引。

ts 复制代码

export default class Fuse {
  constructor(docs, options = {}, index) {
    this.options = { ...Config, ...options }
    if (
      this.options.useExtendedSearch &&
      !process.env.EXTENDED_SEARCH_ENABLED
    ) {
      throw new Error(ErrorMsg.EXTENDED_SEARCH_UNAVAILABLE)
    }
    this._keyStore = new KeyStore(this.options.keys)
    this.setCollection(docs, index)
  }

  setCollection(docs, index) {
    this._docs = docs

    if (index && !(index instanceof FuseIndex)) {
      throw new Error(ErrorMsg.INCORRECT_INDEX_TYPE)
    }

    this._myIndex =
      index ||
      createIndex(this.options.keys, this._docs, {
        getFn: this.options.getFn,
        fieldNormWeight: this.options.fieldNormWeight
      })
  }
}

这里由于我们以字符串数组为例，这里就没有this.options.keys了，KeyStore先不看了，来看createIndex：

ts 复制代码

export function createIndex(
  keys,
  docs,
  { getFn = Config.getFn, fieldNormWeight = Config.fieldNormWeight } = {}
) {
  const myIndex = new FuseIndex({ getFn, fieldNormWeight })
  myIndex.setKeys(keys.map(createKey))
  myIndex.setSources(docs)
  myIndex.create()
  return myIndex
}

export default class FuseIndex {
  constructor({
    getFn = Config.getFn,
    fieldNormWeight = Config.fieldNormWeight
  } = {}) {
    this.norm = normGenerator(fieldNormWeight, 3)
    this.getFn = getFn
    this.isCreated = false

    this.setIndexRecords()
  }
  setIndexRecords(records = []) {
    this.records = records
  }
  setSources(docs = []) {
    this.docs = docs
  }
}

FuseIndex创建后，加入keys和docs，然后调用create创建索引：

ts 复制代码

create() {
  if (this.isCreated || !this.docs.length) {
    return
  }
  this.isCreated = true
  // List is Array<String>
  if (isString(this.docs[0])) {
    this.docs.forEach((doc, docIndex) => {
      this._addString(doc, docIndex)
    })
  } else {
    // List is Array<Object>
    this.docs.forEach((doc, docIndex) => {
      this._addObject(doc, docIndex)
    })
  }
  this.norm.clear()
}
_addString(doc, docIndex) {
  if (!isDefined(doc) || isBlank(doc)) {
    return
  }
  let record = {
    v: doc,
    i: docIndex,
    n: this.norm.get(doc)
  }
  this.records.push(record)
}

对于字符串的数据源来说，每一条索引是由v文本（也就是匹配时的目标串）、i在原数据数组中的下标、nnorm（大概叫...范数吧，和长度有关的一个值）。this.norm.get是一个随文本长度下降的权重，计算方法如下：

ts 复制代码

const SPACE = /[^ ]+/g
// Field-length norm: the shorter the field, the higher the weight.
// Set to 3 decimals to reduce index size.
export default function norm(weight = 1, mantissa = 3) {
  const cache = new Map()
  const m = Math.pow(10, mantissa)
  return {
    get(value) {
      // 这里 Token 是按空格分割算的，那中文就只有一个 Token，感觉像 bug
      const numTokens = value.match(SPACE).length
      if (cache.has(numTokens)) {
        return cache.get(numTokens)
      }
      // Default function is 1/sqrt(x), weight makes that variable
      const norm = 1 / Math.pow(numTokens, 0.5 * weight)
      // In place of `toFixed(mantissa)`, for faster computation
      const n = parseFloat(Math.round(norm * m) / m)
      cache.set(numTokens, n)
      return n
    },
    clear() {
      cache.clear()
    }
  }
}

简单来说，也就是
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"> n u m T o k e n s − 0.5 ∗ w e i g h t numTokens^{-0.5*weight} </math>numTokens−0.5∗weight

可见，numTokens越小，norm就大，后续可以看到分数随norm增加而下降，数据项根据分数从小到大排序，这样子比较短的数据项的排名就靠前。

这里的weight，可以在创建时从options.fieldNormWeight传入，默认为 1，norm随weight减小而增加。

到这里，索引就创建完了，如果传入['apple', 'orange', 'banana']，就有索引

json 复制代码

[
  { v: 'apple', i: 0, n: 1 },
  { v: 'orange', i: 1, n: 1 },
  { v: 'banana', i: 2, n: 1 }
]

📙小结

简单总结一下大致流程：

graph LR A("new Fuse()")-->B("createIndex")-->C("myIndex = new FuseIndex()")-->D("myIndex.create()")

🗄搜索

创建完索引后，使用search函数进行模糊搜索，就像这样子

ts 复制代码

const fuse = new Fuse(/*...*/)
const ans = fuse.search('appnana')

🚪搜索入口

搜索的流程从search方法开始：

ts 复制代码

search(query, { limit = -1 } = {}) {
  // ...
  let results = isString(query)
    ? isString(this._docs[0])
      ? this._searchStringList(query)
      : this._searchObjectList(query)
    : this._searchLogical(query)

  computeScore(results, { ignoreFieldNorm })

  if (shouldSort) {
    results.sort(sortFn)
  }

  // ...
}

字符串的数据调用_searchStringList方法，得出符合要求的每一项及其得分。

ts 复制代码

_searchStringList(query) {
  const searcher = createSearcher(query, this.options)
  const { records } = this._myIndex
  const results = []
  // Iterate over every string in the index
  records.forEach(({ v: text, i: idx, n: norm }) => {
    if (!isDefined(text)) {
      return
    }
    const { isMatch, score, indices } = searcher.searchIn(text)
    if (isMatch) {
      results.push({
        item: text,
        idx,
        matches: [{ score, value: text, norm, indices }]
      })
    }
  })
  return results
}

结果中，results的每一项对于每一个源数据项的结果，matches对于字符串的数据来说只有一项，在搜索对象时，是每一个key对应的结果。computeScore计算其总分。

可以看到，结果是matches各个元素分数乘方之积。分数的取值范围是 <math xmlns="http://www.w3.org/1998/Math/MathML"> [ 0 , 1 ] [0, 1] </math>[0,1]，后面的results.sort(sortFn)默认从小到大排序，key的weight越大，norm越大，分数越小，排名也越靠前。

ts 复制代码

import Config from './config'

// Practical scoring function
export default function computeScore(
  results,
  { ignoreFieldNorm = Config.ignoreFieldNorm }
) {
  results.forEach((result) => {
    let totalScore = 1

    result.matches.forEach(({ key, norm, score }) => {
      const weight = key ? key.weight : null

      totalScore *= Math.pow(
        score === 0 && weight ? Number.EPSILON : score,
        (weight || 1) * (ignoreFieldNorm ? 1 : norm)
      )
    })

    result.score = totalScore
  })
}

下面进入 Fuse.js 的核心部分，就是它的模糊匹配。

🗜创建搜索器

回到_searchStringList方法，可以看到创建了一个搜索器createSearcher，然后把每项索引扔进去计算分值：

ts 复制代码

_searchStringList(query) {
  const searcher = createSearcher(query, this.options)
  const { records } = this._myIndex
  const results = []
  // Iterate over every string in the index
  records.forEach(({ v: text, i: idx, n: norm }) => {
    if (!isDefined(text)) {
      return
    }
    const { isMatch, score, indices } = searcher.searchIn(text)
    if (isMatch) {
      results.push({
        item: text,
        idx,
        matches: [{ score, value: text, norm, indices }]
      })
    }
  })
  return results
}

在createSearcher，它会根据搜索的模式串创建一个BitapSearch并缓存起来：

ts 复制代码

import { BitapSearch } from '../search'

const registeredSearchers = []

export default function register(...args) {
  registeredSearchers.push(...args)
}

export function createSearcher(pattern, options) {
  for (let i = 0, len = registeredSearchers.length; i < len; i += 1) {
    let searcherClass = registeredSearchers[i]
    if (searcherClass.condition(pattern, options)) {
      return new searcherClass(pattern, options)
    }
  }
  return new BitapSearch(pattern, options)
}

✒用二进制掩码记录目标串字符

Bitap 算法会把模式串转化为二进制掩码，利用位运算来快速计算模式串和目标串之间的相似度。来看BitapSearch的构造函数：

ts 复制代码

export default class BitapSearch {
  constructor(/* ... */) {
    // ...
    this.chunks = []
    if (!this.pattern.length) {
      return
    }
    const addChunk = (pattern, startIndex) => {
      this.chunks.push({
        pattern,
        alphabet: createPatternAlphabet(pattern),
        startIndex
      })
    }
    const len = this.pattern.length
    // const MAX_BITS = 32
    if (len > MAX_BITS) {
      let i = 0
      const remainder = len % MAX_BITS
      const end = len - remainder
      while (i < end) {
        addChunk(this.pattern.substr(i, MAX_BITS), i)
        i += MAX_BITS
      }
      if (remainder) {
        const startIndex = len - MAX_BITS
        addChunk(this.pattern.substr(startIndex), startIndex)
      }
    } else {
      addChunk(this.pattern, 0)
    }
  }
}

在构造函数中，BitapSearch按照 31 的长度为一节对模式串进行切分，用二进制掩码每一个字符的位置。

ts 复制代码

function createPatternAlphabet(pattern) {
  var mask = {};
  for (var i = 0, len = pattern.length; i < len; i += 1) {
    var _char = pattern.charAt(i);
    mask[_char] = (mask[_char] || 0) | 1 << len - i - 1;
  }
  return mask;
}

例如：abc，a 记录为 100，b 就是 010， c 就是 001。

至于为什么是 31 的长度，大概是因为 JS 二进制运算使用 32 位的二进制整数吧。另外，Bitap 算法中也是使用 31 位的二进制记录字符位置的。

🔍匹配入口

回到_searchStringList，serachIn方法用来进行字符串匹配。

ts 复制代码

_searchStringList(query) {
  const searcher = createSearcher(query, this.options)
  const { records } = this._myIndex
  const results = []
  records.forEach(({ v: text, i: idx, n: norm }) => {
    // ...
    const { isMatch, score, indices } = searcher.searchIn(text)
    // ...
  })
  return results
}

searchIn遇到完全匹配直接返回：

ts 复制代码

searchIn(text) {
  const { isCaseSensitive, includeMatches } = this.options
  if (!isCaseSensitive) {
    text = text.toLowerCase()
  }
  // Exact match
  if (this.pattern === text) {
    let result = {
      isMatch: true,
      score: 0
    }
    if (includeMatches) {
      result.indices = [[0, text.length - 1]]
    }
    return result
  }
  // ... Bitap
}

如果模式串this.pattern和目标串text相等，则记分值为 0，也就是完全匹配，结束逻辑。否则进入 Bitap 模糊匹配的逻辑。

后面是 Bitap 的逻辑，这里用this.chunks中每一个模式串分块的掩码和目标串进行匹配，最后分数取每一分块分数的平均值。这里有几个值得注意的变量，后面会用到。

location：预期匹配子串出现的位置；
distance：影响匹配的子串与期望位置偏离对分数的影响，越大，偏离对分数影响越小；ignoreLocation可以使得分数计算忽略距离因素，如果你想搜索超长的文本，这大概会有用吧。
threshold：分数阈值，分数大于此值会被视为不匹配。

这些变量的值都是可以在new Fuse()的时候配置的，具体看Fuse.js 的文档。

ts 复制代码

searchIn(text) {
  // ...
  
  let totalScore = 0
  let hasMatches = false
  this.chunks.forEach(({ pattern, alphabet, startIndex }) => {
    const { isMatch, score, indices } = search(text, pattern, alphabet, {
      location: location + startIndex,
      distance,
      threshold,
      findAllMatches,
      minMatchCharLength,
      includeMatches,
      ignoreLocation
    })
    if (isMatch) {
      hasMatches = true
    }
    totalScore += score
    // ...
  })
  let result = {
    isMatch: hasMatches,
    score: hasMatches ? totalScore / this.chunks.length : 1
  }
  if (hasMatches && includeMatches) {
    result.indices = allIndices
  }
  return result
}

📘小结

下面进一步介绍算法，这里先小结一下。Fuse 进行模糊匹配的总体流程是，先根据传入的数据源对每一项数据进行索引创建，搜索的时候，输入模式串，根据索引使用 Bitap 算法计算出每一项数据的得分，经过处理后返回允许的匹配度下符合匹配的数据。

graph LR D("search")-->E("searcher = new BitapSearch")-->|用二进制掩码记录目标串|H(addChunk) D-->F("searcher.searchIn")-->|完全匹配|I("return { isMatch: true, score: 0 }") F-->|Bitap 算法|G(search)

💻基于 Bitap 的模糊匹配

下面来看模糊匹配算法的实现，也是 Fuse.js 的核心。文件位置在 src/search/bitap/search.js，这里进行分段地解析。下面定义了一些变量，用到的时候下面再说，先可以不用关注它们：

ts 复制代码

export default function search(
  text,
  pattern,
  patternAlphabet,
  {
    location = Config.location,
    distance = Config.distance,
    threshold = Config.threshold,
    findAllMatches = Config.findAllMatches,
    minMatchCharLength = Config.minMatchCharLength,
    includeMatches = Config.includeMatches,
    ignoreLocation = Config.ignoreLocation
  } = {}
) {
  if (pattern.length > MAX_BITS) {
    throw new Error(ErrorMsg.PATTERN_LENGTH_TOO_LARGE(MAX_BITS))
  }

  const patternLen = pattern.length
  // Set starting location at beginning text and initialize the alphabet.
  const textLen = text.length
  // Handle the case when location > text.length
  const expectedLocation = Math.max(0, Math.min(location, textLen))
  // Highest score beyond which we give up.
  let currentThreshold = threshold
  // Is there a nearby exact match? (speedup)
  let bestLocation = expectedLocation

  // Performance: only computer matches when the minMatchCharLength > 1
  // OR if `includeMatches` is true.
  const computeMatches = minMatchCharLength > 1 || includeMatches
  // A mask of the matches, used for building the indices
  const matchMask = computeMatches ? Array(textLen) : []
  // ...
}

🧮计算分数

先来看看每个目标串和模式串匹配得分是怎么计算的。得分主要受 Bitap 算法中的错误（匹配子串和模式串的编辑距离）和与预期位置的偏离影响。简单的来说，也就是：
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block"> e r r o r s p a t t e r n . l e n g t h + abs ( e x p e c t e d L o c a t i o n − c u r r e n t L o c a t i o n ) d i s t a n c e \frac{errors}{pattern.length} + \frac{\text{abs}(expectedLocation - currentLocation)}{distance} </math>pattern.lengtherrors+distanceabs(expectedLocation−currentLocation)

错误越多、离预期位置越远分数越高，搜索结果排名越靠后，或者被排除出搜索结果。

ts 复制代码

export default function computeScore(
  pattern,
  {
    errors = 0,
    currentLocation = 0,
    expectedLocation = 0,
    distance = Config.distance,
    ignoreLocation = Config.ignoreLocation
  } = {}
) {
  const accuracy = errors / pattern.length
  if (ignoreLocation) {
    return accuracy
  }
  const proximity = Math.abs(expectedLocation - currentLocation)

  if (!distance) {
    // Dodge divide by zero error.
    return proximity ? 1.0 : accuracy
  }
  return accuracy + proximity / distance
}

🔦记录匹配分数下限

注意到currentThreshold的初始值是Config.threshold，在这里的作用是记录当前最小得分，如果说后面不可能比这个分数更小了，就退出。

类似地，expectedLocation也会根据当前最佳的匹配，更新位置。

先检测目标串text中是否包含模式串pattern，用于更新currentThreshold，后续减少运算。

ts 复制代码

// ...
let index
// Get all exact matches, here for speed up
while ((index = text.indexOf(pattern, bestLocation)) > -1) {
  let score = computeScore(pattern, {
    currentLocation: index,
    expectedLocation,
    distance,
    ignoreLocation
  })
  currentThreshold = Math.min(score, currentThreshold)
  bestLocation = index + patternLen
  // ...
}
// ...

📚模糊匹配

下面进入模糊匹配环节，它有两个循环，外层是允许的错误数，内层遍历目标串text。这里使用动态规划算法，状态记录在bitArr和lastBitArr中，分别表示当前和上次允许错误数的结果，finalScore记录最优的分数。在bitArr每一项都是二进制，每一位j表示在当前允许错误数下text.slice(j, j + patternLen)和pattern是否已成功匹配。

ts 复制代码

// Reset the best location
bestLocation = -1

let lastBitArr = []
let finalScore = 1
let binMax = patternLen + textLen

const mask = 1 << (patternLen - 1)

for (let i = 0; i < patternLen; i += 1) {
  // ...
}

🔎搜索匹配范围

下面进入外层循环，先根据bestLocation和currentThreshold计算出需要检索目标串的范围，也就是离bestLocation多远的地方分数不会大于currentThreshold。很明显这就是二分搜索。

ts 复制代码

for (let i = 0; i < patternLen; i += 1) {
  // Scan for the best match; each iteration allows for one more error.
  // Run a binary search to determine how far from the match location we can stray
  // at this error level.
  let binMin = 0
  let binMid = binMax

  while (binMin < binMid) {
    const score = computeScore(pattern, {
      errors: i,
      currentLocation: expectedLocation + binMid,
      expectedLocation,
      distance,
      ignoreLocation
    })
    if (score <= currentThreshold) {
      binMin = binMid
    } else {
      binMax = binMid
    }
    binMid = Math.floor((binMax - binMin) / 2 + binMin)
  }

  // Use the result from this iteration as the maximum for the next.
  binMax = binMid

  let start = Math.max(1, expectedLocation - binMid + 1)
  let finish = findAllMatches
    ? textLen
    : Math.min(expectedLocation + binMid, textLen) + patternLen
  
  // ...
}

搜索的左端点自然是expectedLocation - binMid + 1，右端点一般情况下是expectedLocation + binMid加上模式串长度patternLen，其实这里考虑的是在text末尾，在允许的错误下匹配了模式串pattern的情况。

这里有点没看懂：findAllMatches可以在创建Fuse时配置，为true表示：即使已经在字符串中找到了一个完全匹配，匹配函数会在字符串中继续执行，直到搜索模式的末尾。看起来像是考虑匹配text全文，但是这里没太看懂，有几个疑问，textLen为什么看起来是可能小于Math.min(expectedLocation + binMid, textLen) + patternLen，而且为什么它对start为什么没有影响。

⭕进行匹配

初始化状态后，从后往前开始匹配：

ts 复制代码

for (let i = 0; i < patternLen; i += 1) {
  // ...
  let bitArr = Array(finish + 2)
  bitArr[finish + 1] = (1 << i) - 1
  for (let j = finish; j >= start; j -= 1) {
    let currentLocation = j - 1
    let charMatch = patternAlphabet[text.charAt(currentLocation)]
    // ...
    
    // First pass: exact match
    bitArr[j] = ((bitArr[j + 1] << 1) | 1) & charMatch
    // Subsequent passes: fuzzy match
    if (i) {
      bitArr[j] |=
        ((lastBitArr[j + 1] | lastBitArr[j]) << 1) | 1 | lastBitArr[j + 1]
    }
    if (bitArr[j] & mask) {
      // 匹配完成...
    }
    // ...
  }
}

遍历匹配串时，每一步倒序加入目标串一个字符，这里用j记录循环变量，从finish递减到 1。

通过charMatch = patternAlphabet[text.charAt(j - 1)]获取当前目标串text的字符在模式串中的位置，通过((bitArr[j + 1] << 1) | 1) & charMatch记录当前位置是否与模式串匹配。举个栗子🌰：

如果目标串是'test'，模式串是tEst，patternAlphabet大概长这个样子：{ t: 1001, s: 10, E: 100 }。假设finish是 8，允许错误数i为 0，匹配情况如下所示：

j	text.charAt(j - 1)	charMatch	bitArr[j + 1]	bitArr[j]
8	undefined	undefined	0	0
7	undefined	undefined	0	0
6	undefined	undefined	0	0
5	undefined	undefined	0	0
4	't'	1001	0	1
3	's'	10	1	10
2	'e'	undefined	10	0
1	't'	1001	0	1

可以发现，如果上面匹配成功，最终bitArr[1]就是1000，也就是说bitArr[j] & (1 << (patternLen - 1))为真时，匹配成功。

我们看i > 1的情况，如果上次匹配当前位置离完全匹配就差 1 个编辑距离，那自然是匹配成功的。

到这里状态转移的过程大致描述清楚了，具体代码如下所示：

ts 复制代码

for (let i = 0; i < patternLen; i += 1) {
  // ...
  
  // Initialize the bit array
  let bitArr = Array(finish + 2)
  bitArr[finish + 1] = (1 << i) - 1

  for (let j = finish; j >= start; j -= 1) {
    let currentLocation = j - 1
    let charMatch = patternAlphabet[text.charAt(currentLocation)]
    // ...

    // First pass: exact match
    bitArr[j] = ((bitArr[j + 1] << 1) | 1) & charMatch

    // Subsequent passes: fuzzy match
    if (i) {
      bitArr[j] |=
        ((lastBitArr[j + 1] | lastBitArr[j]) << 1) | 1 | lastBitArr[j + 1]
    }
    if (bitArr[j] & mask) {
      finalScore = computeScore(pattern, {
        errors: i,
        currentLocation,
        expectedLocation,
        distance,
        ignoreLocation
      })

      // This match will almost certainly be better than any existing match.
      // But check anyway.
      if (finalScore <= currentThreshold) {
        // Indeed it is
        currentThreshold = finalScore
      
        bestLocation = currentLocation

        // Already passed `loc`, downhill from here on in.
        if (bestLocation <= expectedLocation) {
          break
        }

        // When passing `bestLocation`, don't exceed our current distance from `expectedLocation`.
        start = Math.max(1, 2 * expectedLocation - bestLocation)
      }
    }
  }

  // ...
  lastBitArr = bitArr
}

🆗后续处理

然后计算下次循环分数的最小值，如果高于阈值则退出

ts 复制代码

for (let i = 0; i < patternLen; i += 1) {
  // ...
  // No hope for a (better) match at greater error levels.
  const score = computeScore(pattern, {
    errors: i + 1,
    currentLocation: expectedLocation,
    expectedLocation,
    distance,
    ignoreLocation
  })

  if (score > currentThreshold) {
    break
  }
  // ...
}

返回匹配结果

ts 复制代码

export default function search(/* ... */) {
  // ...
  const result = {
    isMatch: bestLocation >= 0,
    // Count exact matches (those with a score of 0) to be "almost" exact
    score: Math.max(0.001, finalScore)
  }
  // ...
  return result
}

✔结语

到这里，Fuse.js 的源码大致解读完了。当然例如key分值的计算，对复杂数据项的处理还没有涉及到，可能后续更新一下，欢迎催更。总之我们对 Fuse.js 的工作原理和实现细节有了大致的理解。Fuse.js 通过其高效的索引机制和 Bitap 算法，提供了一种既快速又准确的模糊搜索解决方案。在前端长列表搜索等场景非常有用，作为一个轻量级无依赖的第三方库，在服务端小型项目也有用武之地。