LeetCode Day4 -- 哈希表

[1. 啥是哈希表？](#1. 啥是哈希表？)

[2. 啥时候用哈希表？](#2. 啥时候用哈希表？)

[2.1 存在性检查 → 集合Set](#2.1 存在性检查 → 集合Set)

[2.2 键值映射 → 字典Dict](#2.2 键值映射 → 字典Dict)

[2.3 频率统计 → Dict or Counter](#2.3 频率统计 → Dict or Counter)

[3. LeetCode](#3. LeetCode)

[3.1 集合](#3.1 集合)

[（1）2215 找出两数组的不同](#（1）2215 找出两数组的不同)

[（2）1207 独一无二的出现次数](#（2）1207 独一无二的出现次数)

[（3）128 最长连续序列](#（3）128 最长连续序列)

[3.2 字典](#3.2 字典)

[（1）49 字母异位词分组](#（1）49 字母异位词分组)

[（2）2352 相等行列对](#（2）2352 相等行列对)

[（3）1 两数之和](#（3）1 两数之和)

[3.3 统计元素出现频率](#3.3 统计元素出现频率)

[（1）1657 确定两个字符串是否接近](#（1）1657 确定两个字符串是否接近)

1. 啥是哈希表？

（1）哈希表（Hash Table）是一种通过键值对 (key-value pairs)存储数据的数据结构，核心是使用哈希函数(hash function)将键(key)映射到数组的特定位置。

（2）在Python中，哈希表通常有两种形式：set和dict。

set: 存储唯一元素的集合，不支持键值对。
dict: 存储键值对，键是唯一的。

2. 啥时候用哈希表？

快速查找：如判断元素是否在集合中（O(1)时间复杂度）。

去重：如获取唯一元素集合。

计数：利用字典统计元素出现频率。

映射关系：存储键值对，如缓存、数据库索引等。

集合运算：如求交集、并集、差集等。

2.1 存在性检查 → 集合Set

（1）应用场景：

判断元素是否存在

去重处理

唯一性验证

交/并/差集运算

（2）实现模板：

python 复制代码

def existence_check(data):
    # 1. 创建哈希集合
    seen = set()
    result = []
    
    # 2. 遍历数据
    for item in data:
        # 3. 检查或添加
        if item not in seen:      # 存在性检查
            result.append(item)
            seen.add(item)        # 记录访问
    
    # 4. 返回结果
    return result

2.2 键值映射 → 字典Dict

（1）应用场景：

索引-值映射（如两数之和）

分组统计（如字母异位词）

缓存实现

频率统计

（2）实现模板：

python 复制代码

def value_mapping(data):
    # 1. 创建哈希字典
    mapping = {}
    
    # 2. 遍历数据
    for item in data:
        # 3. 检查或更新映射
        if item not in mapping:   # 首次出现
            mapping[item] = ...   # 初始化值
        else:
            mapping[item] = ...   # 更新值
    
    # 4. 返回结果
    return mapping

2.3 频率统计 → Dict or Counter

（1）应用场景：

元素计数

Top K 问题

频率唯一性检查

字谜判断

（2）实现模板：

python 复制代码

from collections import Counter

def frequency_analysis(data):
    # 方法1：手动统计
    freq = {}
    for item in data:
        freq[item] = freq.get(item, 0) + 1  # 安全获取更新
    
    # 方法2：Counter简化版
    freq = Counter(data)
    
    # 3. 处理结果（如找频率唯一）
    unique_counts = set()
    for count in freq.values():
        if count in unique_counts:
            return False
        unique_counts.add(count)
    
    return True

3. LeetCode

3.1 集合

（1）2215 找出两数组的不同

给你两个下标从 0 开始的整数数组 nums1 和 nums2 ，请你返回一个长度为 2 的列表 answer ，其中：answer[0] 是 nums1 中所有不存在于 nums2 中的不同整数组成的列表；answer[1] 是 nums2 中所有不存在于 nums1 中的不同整数组成的列表。

python 复制代码

class Solution(object):
    def findDifference(self, nums1, nums2):
        """
        :type nums1: List[int]
        :type nums2: List[int]
        :rtype: List[List[int]]
        """
        set1 = set(nums1)
        set2 = set(nums2)

        result1 = list(set1-set2)    ## 求差集
        result2 = list(set2-set1)

        return [result1, result2]

（2）1207 独一无二的出现次数

给你一个整数数组 arr，如果每个数的出现次数都是独一无二的，就返回 true；否则返回 false

python 复制代码

class Solution(object):
    def uniqueOccurrences(self, arr):
        """
        :type arr: List[int]
        :rtype: bool
        """
        count = dict()
        seen = set()
        for num in arr:
            ## 从字典count中获取键num对应的值。如果键num不存在，则返回默认值0
            count[num] = count.get(num,0) + 1  
        for value in count.values():
            if value in seen:
                return False
            seen.add(value)
        return True

（3）128 最长连续序列

给定一个未排序的整数数组 nums ，找出数字连续的最长序列（不要求序列元素在原数组中连续）的长度。请你设计并实现时间复杂度为 O(n)的算法解决此问题。

从集合中的随机num出发，查看其左（lower）右（higher）最多延伸到哪里。

如果lower（num-1）存在，那么：

**①num_set.remove(lower)：**从集合set中移除当前lower，因为如果从新的num开始查找，则说明当前的lower是和之前的num连续，不可能和新num连续

②length++：当前连续序列长度更新

③lower--：继续找，看有没有更小的连续

python 复制代码

class Solution(object):
    def longestConsecutive(self, nums):
        """
        :type nums: List[int]
        :rtype: int
        """
        if not nums:
            return 0
        
        num_set = set(nums)
        max_len = 0

        while num_set:
            num = num_set.pop()  # 随机取一个元素
            length = 1

            lower = num-1       ## 向更小方向扩展
            while lower:
                num_set.remove(lower)   
                length += 1
                lower -= 1
            
            higher = num+1      ## 向更大方向扩展
            while higher:
                num_set.remove(higher)
                length += 1
                higher += 1

            max_len = max(max_len, length)

        return max_len

3.2 字典

（1）49 字母异位词分组

给你一个字符串数组，请你将字母异位词（通过重新排列不同单词或短语的字母而形成的单词或短语，并使用所有原字母一次）组合在一起。可以按任意顺序返回结果列表。

输入: strs = $"eat", "tea", "tan", "ate", "nat", "bat"$

输出: $\["bat"$ , $"nat","tan"$ , $"ate","eat","tea"$ ]

字母异位词：包含的字母及其个数完全相同 → 排序后应是完全相同的字符串

创建列表类型的字典，字典的 key 是排好序的字符串，value 是对应的原字符串组成的

（2）2352 相等行列对

给你一个下标从 0 开始、大小为 n x n 的整数矩阵 grid ，返回满足 Ri行和Cj列相等的行列对(Ri, Cj)的数目*。*如果行和列以相同的顺序包含相同的元素（即相等的数组），则认为二者是相等的。

1.预处理所有行：将每一行转换为可哈希对象（如元组），存储到字典中记录出现次数

2.处理列并匹配：遍历每一列，将其转换为相同格式的可哈希对象，检查是否在行字典中存在，存在则累加次数

python 复制代码

from collections import defaultdict
class Solution(object):
    def equalPairs(self, grid):
        """
        :type grid: List[List[int]]
        :rtype: int
        """
        n=len(grid)
        row_dict = defaultdict(int)

        for i in range(n):
            row_tuple = tuple(grid[i])  ## 将每一行转换为元组（可哈希对象）
            row_dict[row_tuple] += 1    ## 记录每个行元组出现的次数

        
        count = 0
        for j in range(n):
            ## 对每一列：提取各行的第 j 个元素形成元组
            col_tuple = tuple(grid[i][j] for i in range(n))     
            if col_tuple in row_dict:           ## 检查列元组是否在行字典中
                count += row_dict[col_tuple]    ## 如果存在，累加该行元组的出现次数

        return count

（3）1 两数之和

给定一个整数数组 nums 和一个整数目标值 target，请你在该数组中找出 和为目标值 target 的那两个整数，并返回它们的数组下标。你可以假设每种输入只会对应一个答案，并且你不能使用两次相同的元素。你可以按任意顺序返回答案。

两数之和 → 根据当前 num 找其和 target 的差值是否存在于数组中

返回的是下标 → 通过哈希表 {num：index}实现，让数组数值和下标一一对应

python 复制代码

class Solution(object):
    def twoSum(self, nums, target):
        """
        :type nums: List[int]
        :type target: int
        :rtype: List[int]
        """
        num_map = {}    ## 创建哈希表{num：index}
        for i, num in enumerate(nums):
            x = target - nums[i]
            if x in num_map:        ## 要找的数在哈希表里，直接返回对应索引
                return [i, num_map[x]]
            num_map[num] = i        ## 将当前num和其index加入哈希表

3.3 统计元素出现频率

（1）1657 确定两个字符串是否接近

如果可以使用以下操作从一个字符串得到另一个字符串，则认为两个字符串接近：

你可以根据需要对任意一个字符串多次使用这两种操作。给你两个字符串，word1 和 word2 。如果word1和word2接近，就返回 true ；否则，返回false。

两个字符串接近的条件：

两个字符串的长度必须相同（因为操作不会改变长度）。

两个字符串包含的字符种类必须相同（即字符集合相同）。

两个字符串的字符频率排序后必须相同（因为操作2允许我们重新分配字符，所以只要频率的集合相同，就可以通过操作2将一种字符的频率分配给另一种字符）。

python 复制代码

from collections import Counter
class Solution(object):
    def closeStrings(self, word1, word2):
        """
        :type word1: str
        :type word2: str
        :rtype: bool
        """
        if len(word1) != len(word2):
            return False
        
        if set(word1) != set(word2):
            return False

        count1 = Counter(word1)    ## 使用Counter统计字符频率
        count2 = Counter(word2)
        if sorted(count1.values()) == sorted(count2.values()):
            return True
        else: 
            return False