初级数据结构——哈希表

前言

这一期我们一起学习初级数据结构最后一篇内容，初级数据结构------哈希表。数据结构中的哈希表（Hash Table，也叫散列表）是一种高效的数据结构，它提供了快速的插入、删除和查找操作。以下是对哈希表的详细分析：

一、哈希表的基本概念

哈希表是根据关键码值（Key value）而直接进行访问的数据结构。它通过把关键码值映射到表中一个位置来访问记录，以加快查找的速度。这个映射函数叫做散列函数，存放记录的数组叫做散列表。

二、哈希表的构成

哈希表数组：一个固定大小的数组，用于存储数据。数组的大小通常为质数，以减少潜在的哈希冲突。

哈希函数：一个将关键字映射到哈希表数组中某个位置的函数。哈希函数的选择对哈希表的性能至关重要。常见的哈希函数实现方式包括除法散列法、平方散列法和斐波那契散列法等。

存储槽：哈希表数组中的每个元素称为存储槽，用于存储键值对。

三、哈希表的工作原理

插入操作：计算关键字的哈希值，根据哈希值找到对应的存储槽。如果该槽为空，则直接将关键字和值插入到该槽中。如果该槽已被占用，则根据冲突解决策略处理（如使用链表法或开放寻址法）。

查找操作：计算关键字的哈希值，根据哈希值找到对应的存储槽。如果该槽为空，则表示关键字不存在。如果该槽已被占用，则根据冲突解决策略查找并返回对应的关键字和值。

删除操作：计算关键字的哈希值，找到对应的存储槽，并删除该槽中的键值对。但需要注意的是，在删除时可能需要处理冲突，以确保哈希表的正确性。

四、哈希表的冲突解决

哈希冲突（Hash Collision）是指不同的关键字通过同一个哈希函数可能得到同一哈希地址，即key1 ≠ key2，而Hash(key1) = Hash(key2)。为了处理哈希冲突，通常有两种方法：

链表法：在每个存储槽中维护一个链表，所有映射到该槽的关键字都存储在链表中。当发生冲突时，将新的关键字插入到对应槽的链表中。

开放寻址法：寻找空闲位置来存储冲突的元素。开放寻址法包括线性探测、二次探测、双散列和再哈希法等方法。

五、哈希表的优缺点

优点：

哈希表能够快速地处理大量的数据，提供快速的插入、删除和查找操作。

哈希表的实现相对简单，代码易于理解和维护。

缺点：

哈希表是基于数组的，数组创建后扩容成本比较高，因此当哈希表被填满时，性能会严重下降。

哈希表中的数据都是没有顺序的，因此不支持顺序遍历。

哈希函数的选择对哈希表的性能至关重要，但设计一个高效的哈希函数并不容易。

六、哈希表的应用场景

哈希表在许多场景中都有广泛应用，以下是一些常见的应用场景：

数据库索引：哈希表可以用于加速数据库查询操作，提高查询性能。

缓存系统：哈希表可以用于实现缓存系统，如Redis等，提高数据访问速度。

编译器：哈希表可以用于实现编译器的符号表，加速变量查找和替换操作。

字典和集合：哈希表可以用于实现字典和集合数据结构，提高查找、插入和删除操作的性能。

七、代码模版

cpp 复制代码

#include<iostream>
using namespace std;

template<typename KeyType,typename ValueType>
class HashNode {
public:
	KeyType key;
	ValueType value;
	HashNode* next;

	HashNode(const KeyType& key, const ValueType& value) {//有参构造函数
		this->key = key;
		this->value = value;
		this->next = NULL;
	}

};

template<typename KeyType, typename ValueType>
class HashTable {
	int size;
	HashNode<KeyType, ValueType>** table;//相当于一个二维数组

	int hash(const KeyType& key) const {//哈希表的索引位置,默认size参数为256
		int hashkey = key % size;//取余求索引
		if (hashkey < 0) {
			hashkey += size;
		}
		return hashkey;
	}
public:
	HashTable(int size = 256);//代表哈希表的长度为256
	~HashTable();
	void insert(const KeyType& key, const ValueType& value);//插入元素
	void remove(const KeyType& key);//删除元素
	bool find(const KeyType& key, ValueType& value) const;//查找元素

};

template<typename KeyType, typename ValueType>
HashTable<KeyType,ValueType>::HashTable(int size) {
	this->size = size;
	this->table = new HashNode<KeyType, ValueType>* [size];
	for (int i = 0; i < size; i++) {
		this->table[i] = NULL;//哈希表里面的元素要致空
	}
}

template<typename KeyType, typename ValueType>
HashTable<KeyType, ValueType>::~HashTable() {
	for (int i = 0; i < size; i++) {
		if (table[i]) {
			HashNode<KeyType, ValueType>* curr = table[i];
			while (curr) {
				HashNode<KeyType, ValueType>* t = curr->next;
				delete curr;
				curr = t;
			}
			table[i] = NULL;
		}
	}
}

template<typename KeyType, typename ValueType>
void HashTable<KeyType, ValueType>::insert(const KeyType& key, const ValueType& value) {
	int idx = hash(key);//先计算键的索引
	HashNode<KeyType, ValueType>* now = new HashNode<KeyType, ValueType>(key, value);
	if (!table[idx]) {//该索引对应的链表没有节点就把该节点变成头结点
		table[idx] = now;
	}
	else {
		now->next = table[idx];//头插法
		table[idx] = now;
	}
}

template<typename KeyType, typename ValueType>
void HashTable<KeyType, ValueType>::remove(const KeyType& key) {
	int idx = hash(key);
	if (table[idx]) {//如果该索引的链表不为空
		if (table[idx]->key == key) {
			HashNode<KeyType, ValueType>* t = table[idx]->next;
			delete table[idx];
			table[idx] = t;
		}
		else {
			HashNode<KeyType, ValueType>* curr = table[idx];
			while (curr->next && curr->next->key != key) {
				curr = curr->next;
			}
			if (curr->next) {
				HashNode<KeyType, ValueType>* t = curr->next->next;
				delete curr->next;
				curr->next = t;
			}
		}
	}
}

template<typename KeyType, typename ValueType>
bool HashTable<KeyType, ValueType>::find(const KeyType& key, ValueType& value) const {
	int idx = hash(key);
	if (table[idx]) {
		if (table[idx]->key == key) {
			value = table[idx]->value;
			return true;
		}
		else {
			HashNode<KeyType, ValueType>* curr = table[idx];
			while (curr->next && curr->next->key != key) {
				curr = curr->next;
			}
			if (curr->next) {
				value = curr->next->value;
				return true;
			}
		}
	}
	return false;
}

template<typename KeyType>
class hashCounter {//哈希表计数器,记录一个键有多少个元素多少个元素
	int* counter;
	int counterIndex;
	int counterSize;
	HashTable<KeyType, int>* hash;

public:
	hashCounter(int size = 256);
	~hashCounter();
	void reset();
	int add(const KeyType& key);
	int sub(const KeyType& key);
	int get(const KeyType& key);
};

template<typename KeyType>
hashCounter<KeyType>::hashCounter(int size) {
	counterSize = size;
	counterIndex = 0;
	counter = new int[counterSize];
	hash = NULL;
	reset();//重新申请内存空间
}

template<typename KeyType>
hashCounter<KeyType>::~hashCounter() {
	delete[]counter;
	if (hash) {
		delete hash;
		hash = NULL;
	}
}

template<typename KeyType>
void hashCounter<KeyType>::reset() {
	if (hash) {
		delete hash;
		hash = NULL;
	}
	hash = new HashTable<KeyType, int>(counterSize);
	counterIndex = 0;
	for (int i = 0; i < counterSize; i++) {
		counter[i] = 0;
	}
}

template<typename KeyType>
int hashCounter<KeyType>::add(const KeyType& key) {
	int idx;
	if (!hash->find(key, idx)) {//如果在哈希表里找不到这个键说明键的个数加一
		idx = counterIndex++;
		hash->insert(key, idx);//这里要好好理解，用来标记key所以的value的值
	}
	return ++counter[idx];//记录每一个key链表有多少个元素，加元素就加一
}

template<typename KeyType>
int hashCounter<KeyType>::sub(const KeyType& key) {
	int idx;
	if (hash->find(key, idx)) {
		return --counter[idx];//减元素就减一
	}
	return 0;
}

template<typename KeyType>
int hashCounter<KeyType>::get(const KeyType& key) {
	int idx;
	if (hash->find(key, idx)) {
		return counter[idx];
	}
	return 0;
}


int main() {
	hashCounter<long long>h(1000);
	for (int i = 0; i < 5; i++) {
		h.add(1);
	}
	h.add(2);
	h.add(2);
	h.sub(1);
	cout << h.get(1) << endl;
	return 0;
}

八、总结

综上所述，哈希表是一种高效的数据结构，具有快速的插入、删除和查找操作。但需要注意的是，哈希表也有其缺点和局限性，因此在实际应用中需要根据具体场景和需求进行选择和优化。

九、结语

下期我们一起来做有关哈希表的经典例题，一共二十多道，巩固一下知识点，加强我们对哈希表的应用能力，更完下期初级数据结构也算是完结了，我们下期不见不散。

关注我让我们一起学习编程，希望大家能点赞关注支持一下，让我有持续更新的动力，谢谢大家