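Implementing a Bloom Filter with the Factory and Strategy Patterns

The overall flow for implementing a Bloom filter with the factory pattern and the strategy pattern is as follows: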

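Define the Bloom filter interface: start with a Bloom filter interface that exposes two basic operations, adding an element and checking whether an element may exist.
Implement a concrete Bloom filter class: create a class that implements the methods of the interface. It holds the filter's data structure (for example, a bit array), its size, and other properties.
Define the hash strategy interface: define an interface with a method that computes hash values.
Implement concrete hash strategies: create several classes that implement the hash strategy interface, one per hashing scheme.
Create the Bloom filter factory: define a factory class with a factory method that takes the filter size and a hash strategy object and returns a concrete Bloom filter.
Use the factory: wherever a Bloom filter is needed, call the factory method and pass in the appropriate hash strategy.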

Code example (Java)

import java.util.BitSet;

// 1. Bloom filter interface
public interface BloomFilter {
    void add(String element);
    boolean mightContain(String element);
}

// 2. Hash strategy interface (the core of the strategy pattern)
public interface HashStrategy {
    int[] getHashValues(String input, int size);
}

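// 3. Concrete hash strategies (the concrete strategies of the strategy pattern)
public class SimpleHashStrategy implements HashStrategy {
    @Override
    public int[] getHashValues(String input, int size) {
        int[] hashValues = new int[3]; // use 3 hash functions
        
        // Simple hashes derived from String.hashCode(); Math.abs keeps each index in [0, size)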
        hashValues[0] = Math.abs(input.hashCode() % size);
        hashValues[1] = Math.abs((input.hashCode() * 31) % size);
        hashValues[2] = Math.abs((input.hashCode() * 37) % size);
        
        return hashValues;
    }
}

public class MurmurHashStrategy implements HashStrategy {
    @Override
    public int[] getHashValues(String input, int size) {
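        // Placeholder standing in for a MurmurHash-based implementation
        int[] hashValues = new int[4]; // use 4 hash functions
        
        // A real project would call an actual MurmurHash implementation here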
        hashValues[0] = Math.abs((input.hashCode() * 41) % size);
        hashValues[1] = Math.abs((input.hashCode() * 43) % size);
        hashValues[2] = Math.abs((input.hashCode() * 47) % size);
        hashValues[3] = Math.abs((input.hashCode() * 53) % size);
        
        return hashValues;
    }
}

// 4. Concrete Bloom filter implementation
// Checks whether a String element may exist.
public class StandardBloomFilter implements BloomFilter {
    private BitSet bitSet; // bit array
    private int size;
    private HashStrategy hashStrategy; // hash strategy in use
    
    public StandardBloomFilter(int size, HashStrategy hashStrategy) {
        this.size = size;
        this.bitSet = new BitSet(size);
        this.hashStrategy = hashStrategy;
    }
    
    @Override
    public void add(String element) {
        int[] hashValues = hashStrategy.getHashValues(element, size);
        
        for (int hashValue : hashValues) {
            bitSet.set(hashValue);
        }
    }
    
    @Override
    public boolean mightContain(String element) {
        int[] hashValues = hashStrategy.getHashValues(element, size);
        
        for (int hashValue : hashValues) {
            if (!bitSet.get(hashValue)) {
                return false; // definitely not present
            }
        }
        
        return true; // possibly present
    }
}
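
// Another Bloom filter implementation: the counting Bloom filter
// Unlike the standard Bloom filter, it keeps counters, so elements can be removed and occurrence counts estimated
public class CountingBloomFilter implements BloomFilter {
    private int[] counters; // an int array instead of a bit array, so occurrences can be counted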
    private int size;
    private HashStrategy hashStrategy;
    
    public CountingBloomFilter(int size, HashStrategy hashStrategy) {
        this.size = size;
        this.counters = new int[size];
        this.hashStrategy = hashStrategy;
    }
    
    @Override
    public void add(String element) {
        int[] hashValues = hashStrategy.getHashValues(element, size);
        
        for (int hashValue : hashValues) {
            counters[hashValue]++; // increment the counter instead of just setting a bit
        }
    }
    
    @Override
    public boolean mightContain(String element) {
        int[] hashValues = hashStrategy.getHashValues(element, size);
        
        for (int hashValue : hashValues) {
            if (counters[hashValue] == 0) {
                return false; // definitely not present
            }
        }
        
        return true; // possibly present
    }
    
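    // Additional method: remove an element.
    // Only remove elements that were actually added; otherwise counters shared with
    // other elements are decremented and mightContain can return false negatives.
    public void remove(String element) {
        int[] hashValues = hashStrategy.getHashValues(element, size);
        
        for (int hashValue : hashValues) {
            if (counters[hashValue] > 0) {
                counters[hashValue]--; // decrement the counter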
            }
        }
    }
    
    // Additional method: estimated count (the minimum counter across the element's positions)
    public int approximateCount(String element) {
        int[] hashValues = hashStrategy.getHashValues(element, size);
        int minCount = Integer.MAX_VALUE;
        
        for (int hashValue : hashValues) {
            minCount = Math.min(minCount, counters[hashValue]);
        }
        
        return minCount;
    }
}

// 5. Bloom filter factory (the core of the factory pattern)
public class BloomFilterFactory {
    // Original method: creates a standard Bloom filter
    public static BloomFilter createFilter(int size, HashStrategy hashStrategy) {
        return new StandardBloomFilter(size, hashStrategy);
    }
    
    // Additional method: creates a filter of the requested type
    public static BloomFilter createFilter(int size, HashStrategy hashStrategy, FilterType type) {
        switch (type) {
            case STANDARD:
                return new StandardBloomFilter(size, hashStrategy);
            case COUNTING:
                return new CountingBloomFilter(size, hashStrategy);
            // More types can be added here later
            default:
                return new StandardBloomFilter(size, hashStrategy);
        }
    }
    
    // Convenience method that creates a counting Bloom filter directly
    public static CountingBloomFilter createCountingFilter(int size, HashStrategy hashStrategy) {
        return new CountingBloomFilter(size, hashStrategy);
    }
}

// Enum that defines the filter types
public enum FilterType {
    STANDARD,
    COUNTING
}


// 6. Usage example
public class UserRegistrationSystem {
    public static void main(String[] args) {
        // Create a hash strategy
        HashStrategy simpleStrategy = new SimpleHashStrategy();
        
        // Use the factory to create a Bloom filter that stores registered usernames
        BloomFilter usernameFilter = BloomFilterFactory.createFilter(10000, simpleStrategy, FilterType.STANDARD);
        
        // Add some registered usernames
        usernameFilter.add("john_doe");
        usernameFilter.add("jane_smith");
        usernameFilter.add("mike_jones");
        
        // Check whether a username may already be taken
        String newUsername = "john_doe";
        if (usernameFilter.mightContain(newUsername)) {
            System.out.println("Username '" + newUsername + "' may already exist; please try another one");
        } else {
            System.out.println("Username '" + newUsername + "' is available!");
            usernameFilter.add(newUsername);
        }
        
        // Try a new username
        newUsername = "alex_wilson";
        if (usernameFilter.mightContain(newUsername)) {
            System.out.println("Username '" + newUsername + "' may already exist; please try another one");
        } else {
            System.out.println("Username '" + newUsername + "' is available!");
            usernameFilter.add(newUsername);
        }
        
        // Create another Bloom filter with a different hash strategy
        HashStrategy murmurStrategy = new MurmurHashStrategy();
        BloomFilter advancedFilter = BloomFilterFactory.createFilter(20000, murmurStrategy);

        // Counting Bloom filter: create one through the generic factory method.
        // The result is typed as BloomFilter, so only add and mightContain are visible.
        BloomFilter genericCountingFilter = BloomFilterFactory.createFilter(
            10000,
            new MurmurHashStrategy(),
            FilterType.COUNTING
        );
        
        // Or use the dedicated convenience method, which also exposes remove and approximateCount
        CountingBloomFilter countingFilter = BloomFilterFactory.createCountingFilter(
            10000,
            new MurmurHashStrategy()
        );
        
        // Add elements
        countingFilter.add("word1");
        countingFilter.add("word1");  // added twice
        countingFilter.add("word2");
        
        // Check an element
        System.out.println("Does 'word1' exist? " + countingFilter.mightContain("word1"));
        
        // Use the features specific to the counting Bloom filter
        System.out.println("Estimated count of 'word1': " + countingFilter.approximateCount("word1"));
        
        // Remove an element
        countingFilter.remove("word1");
        System.out.println("Estimated count of 'word1' after one removal: " + countingFilter.approximateCount("word1"));
    }
}

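Advantages of the factory pattern

The factory pattern encapsulates object-creation logic, so client code does not need to know which concrete class gets instantiated.

In our code:

BloomFilterFactory class: the core of the factory pattern here; it provides static factory methods for creating Bloom filters
createFilter method: creates the appropriate Bloom filter instance based on its arguments

The benefit is that when the Bloom filter implementation needs to change (for example, from StandardBloomFilter to OptimizedBloomFilter), only the factory class has to be modified; client code that uses the filter stays the same.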

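The real benefits of the factory pattern

The factory pattern offers more than "create different objects depending on the arguments passed in". That is only the surface feature; its core advantages are the following:

Encapsulating object-creation logic

The most important benefit is separating the process of "how to create an object" from the code that "uses the object".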
```java
// Without the factory:
BloomFilter filter = new StandardBloomFilter(10000, new SimpleHashStrategy());

// With the factory:
BloomFilter filter = BloomFilterFactory.createFilter(10000, new SimpleHashStrategy());
```
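The difference looks small, but once creation becomes complex (parameter validation, dependency injection, initialization settings, and so on), the advantage becomes clear.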
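
To make the point about complex creation concrete, here is a minimal sketch of a factory method that validates its arguments and supplies a default strategy before constructing the filter. `ValidatingBloomFilterFactory`, the choice of default strategy, and the compact copies of the other types are assumptions made for illustration; they are not part of the code above.

```java
import java.util.BitSet;
import java.util.Objects;

// Compact stand-ins for the types defined earlier, so the sketch compiles on its own.
interface BloomFilter {
    void add(String element);
    boolean mightContain(String element);
}

interface HashStrategy {
    int[] getHashValues(String input, int size);
}

class SimpleHashStrategy implements HashStrategy {
    @Override
    public int[] getHashValues(String input, int size) {
        int h = input.hashCode();
        // floorMod keeps every index in [0, size), even if h * k overflows to a negative value
        return new int[] {
            Math.floorMod(h, size),
            Math.floorMod(h * 31, size),
            Math.floorMod(h * 37, size)
        };
    }
}

class StandardBloomFilter implements BloomFilter {
    private final BitSet bitSet;
    private final int size;
    private final HashStrategy hashStrategy;

    StandardBloomFilter(int size, HashStrategy hashStrategy) {
        this.size = size;
        this.bitSet = new BitSet(size);
        this.hashStrategy = hashStrategy;
    }

    @Override
    public void add(String element) {
        for (int index : hashStrategy.getHashValues(element, size)) {
            bitSet.set(index);
        }
    }

    @Override
    public boolean mightContain(String element) {
        for (int index : hashStrategy.getHashValues(element, size)) {
            if (!bitSet.get(index)) {
                return false;
            }
        }
        return true;
    }
}

// Hypothetical factory: validation and defaults live here, not in client code.
class ValidatingBloomFilterFactory {
    static BloomFilter createFilter(int size, HashStrategy hashStrategy) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive, got " + size);
        }
        Objects.requireNonNull(hashStrategy, "hashStrategy must not be null");
        return new StandardBloomFilter(size, hashStrategy);
    }

    // Convenience overload that falls back to an assumed default strategy.
    static BloomFilter createFilter(int size) {
        return createFilter(size, new SimpleHashStrategy());
    }
}

class ValidationDemo {
    public static void main(String[] args) {
        BloomFilter filter = ValidatingBloomFilterFactory.createFilter(10000);
        filter.add("john_doe");
        System.out.println(filter.mightContain("john_doe"));
    }
}
```

Callers keep writing a single `createFilter(...)` call; if the validation rules or the default strategy change later, only the factory has to be edited.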

Reducing coupling

Client code does not depend directly on a concrete Bloom filter class; it depends only on the factory and the interface.
```java
// Client code depends only on the factory and the BloomFilter interface:
BloomFilter filter = BloomFilterFactory.createFilter(...);
filter.add("something");
```
Even if StandardBloomFilter is later replaced with OptimizedBloomFilter, the client code does not need any changes.
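
The following sketch illustrates that swap under some assumptions. The text only names `OptimizedBloomFilter`; the `long[]`-backed version here is a hypothetical implementation, and `UsernameChecker` is an invented client class. The only line that changes when the implementation is swapped is the `return` statement inside the factory.

```java
// Compact stand-ins for the interfaces defined earlier, so the sketch compiles on its own.
interface BloomFilter {
    void add(String element);
    boolean mightContain(String element);
}

interface HashStrategy {
    int[] getHashValues(String input, int size);
}

// Hypothetical replacement implementation: packs bits into a long[] instead of a BitSet.
class OptimizedBloomFilter implements BloomFilter {
    private final long[] words;
    private final int size;
    private final HashStrategy hashStrategy;

    OptimizedBloomFilter(int size, HashStrategy hashStrategy) {
        this.size = size;
        this.words = new long[(size + 63) / 64]; // 64 bits per long, rounded up
        this.hashStrategy = hashStrategy;
    }

    @Override
    public void add(String element) {
        for (int i : hashStrategy.getHashValues(element, size)) {
            words[i >>> 6] |= 1L << (i & 63); // set bit i
        }
    }

    @Override
    public boolean mightContain(String element) {
        for (int i : hashStrategy.getHashValues(element, size)) {
            if ((words[i >>> 6] & (1L << (i & 63))) == 0) {
                return false; // definitely not present
            }
        }
        return true; // possibly present
    }
}

class BloomFilterFactory {
    static BloomFilter createFilter(int size, HashStrategy hashStrategy) {
        // Previously: return new StandardBloomFilter(size, hashStrategy);
        return new OptimizedBloomFilter(size, hashStrategy);
    }
}

// Invented client class: it references only the factory and the BloomFilter interface.
class UsernameChecker {
    private final BloomFilter filter;

    UsernameChecker(int size, HashStrategy hashStrategy) {
        this.filter = BloomFilterFactory.createFilter(size, hashStrategy);
    }

    void register(String username) {
        filter.add(username);
    }

    boolean mightBeTaken(String username) {
        return filter.mightContain(username);
    }
}
```

`UsernameChecker` compiles and behaves the same whichever concrete class the factory returns, because it never names one.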
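
Supporting extension (open/closed principle)

Adding a new Bloom filter implementation only requires:
- creating the new filter class
- updating the factory class so it can create the new type

None of the code that uses Bloom filters has to change. From the client's point of view this follows the open/closed principle (open for extension, closed for modification): the factory is the only existing class that is touched.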
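
One possible way to push this further is sketched below. It swaps the `switch` statement for a registration map, which is a different technique from the factory shown earlier: new filter types are registered from outside, so even the factory class stays unchanged. `RegistryBloomFilterFactory`, the string keys, and `RegistryDemo` are hypothetical names introduced for this sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Compact stand-ins for the interfaces defined earlier, so the sketch compiles on its own.
interface BloomFilter {
    void add(String element);
    boolean mightContain(String element);
}

interface HashStrategy {
    int[] getHashValues(String input, int size);
}

// Hypothetical registry-based factory: creators are looked up by name instead of a switch.
class RegistryBloomFilterFactory {
    private static final Map<String, BiFunction<Integer, HashStrategy, BloomFilter>> CREATORS =
            new HashMap<>();

    static void register(String type, BiFunction<Integer, HashStrategy, BloomFilter> creator) {
        CREATORS.put(type, creator);
    }

    static BloomFilter createFilter(String type, int size, HashStrategy hashStrategy) {
        BiFunction<Integer, HashStrategy, BloomFilter> creator = CREATORS.get(type);
        if (creator == null) {
            throw new IllegalArgumentException("Unknown filter type: " + type);
        }
        return creator.apply(size, hashStrategy);
    }
}

// A newly added implementation (trimmed-down counting filter).
class CountingBloomFilter implements BloomFilter {
    private final int[] counters;
    private final int size;
    private final HashStrategy hashStrategy;

    CountingBloomFilter(int size, HashStrategy hashStrategy) {
        this.size = size;
        this.counters = new int[size];
        this.hashStrategy = hashStrategy;
    }

    @Override
    public void add(String element) {
        for (int index : hashStrategy.getHashValues(element, size)) {
            counters[index]++;
        }
    }

    @Override
    public boolean mightContain(String element) {
        for (int index : hashStrategy.getHashValues(element, size)) {
            if (counters[index] == 0) {
                return false;
            }
        }
        return true;
    }
}

class RegistryDemo {
    public static void main(String[] args) {
        // Adding a type is one registration call; the factory itself is not edited.
        RegistryBloomFilterFactory.register("counting", CountingBloomFilter::new);

        HashStrategy strategy = (input, size) -> new int[] { Math.floorMod(input.hashCode(), size) };
        BloomFilter filter = RegistryBloomFilterFactory.createFilter("counting", 10000, strategy);
        filter.add("word1");
        System.out.println(filter.mightContain("word1"));
    }
}
```

The trade-off is that an unknown type is only detected at runtime, whereas the enum-based `switch` gives compile-time checking of the type names.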
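
Hiding implementation details

Clients do not need to know how the Bloom filter is implemented, only what it can do. Changes to implementation details therefore do not affect client code.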

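The strategy pattern

The strategy pattern defines a family of algorithms, encapsulates each one, and makes them interchangeable. It lets the algorithm vary independently of the clients that use it.

In the code above:

HashStrategy interface: the core of the strategy pattern here; it defines the method for computing hash values

Concrete strategies:

- SimpleHashStrategy: a simple hash function implementation
- MurmurHashStrategy: stands in for a more sophisticated MurmurHash-based implementation (the sample code uses placeholder arithmetic rather than real MurmurHash)

The strategy pattern lets us swap the hash algorithm at any time without modifying the Bloom filter implementation. For example, we can switch from the simple hash to a better-performing MurmurHash, or plug in a custom hash strategy.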
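
As an illustration of plugging in a custom strategy, the sketch below derives any number of indexes from two base hashes (double hashing: index_i = h1 + i * h2, reduced modulo the filter size). `DoubleHashStrategy` is a hypothetical class, and pairing `String.hashCode()` with 32-bit FNV-1a is an assumption chosen to keep the example dependency-free; it is not the MurmurHash implementation mentioned in the text.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Compact stand-in for the interface defined earlier, so the sketch compiles on its own.
interface HashStrategy {
    int[] getHashValues(String input, int size);
}

// Hypothetical custom strategy: k indexes from two base hashes (double hashing).
class DoubleHashStrategy implements HashStrategy {
    private final int hashCount;

    DoubleHashStrategy(int hashCount) {
        if (hashCount <= 0) {
            throw new IllegalArgumentException("hashCount must be positive, got " + hashCount);
        }
        this.hashCount = hashCount;
    }

    @Override
    public int[] getHashValues(String input, int size) {
        int h1 = input.hashCode();
        int h2 = fnv1a(input);
        int[] indexes = new int[hashCount];
        for (int i = 0; i < hashCount; i++) {
            // floorMod keeps the index in [0, size) even when the sum is negative or overflows
            indexes[i] = Math.floorMod(h1 + i * h2, size);
        }
        return indexes;
    }

    // 32-bit FNV-1a over the UTF-8 bytes of the input.
    private static int fnv1a(String input) {
        int hash = 0x811C9DC5; // FNV offset basis
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 16777619; // FNV prime
        }
        return hash;
    }
}

class DoubleHashDemo {
    public static void main(String[] args) {
        HashStrategy strategy = new DoubleHashStrategy(5);
        System.out.println(Arrays.toString(strategy.getHashValues("john_doe", 10000)));
    }
}
```

Because the class implements `HashStrategy`, it can be passed to the existing factory unchanged, for example `BloomFilterFactory.createFilter(10000, new DoubleHashStrategy(5))`, and neither filter class needs to be modified.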