SoCC '19 ECHash code walkthrough, part 2: hashtable

Preface: this part walks through some concrete implementation details.

What hash_node does


struct hash_node  // rewritten as a linked list
{
    char *key;
    uint64_t  value;

    struct hash_node *next;
};

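hash_node is the hash-table entry that indexes an object: it maps a key to the place where that object is stored.

key is the object's identifier, as a string;

value is 64 bits of packed metadata: the chunk ID, the position within the chunk, the value length, and the key length;

together they let the table locate a key quickly and support the read or delete that follows.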


Now let's look at the hashtable code itself.

Creating an encoded value


uint64_t create_value(uint32_t key_length, uint32_t chunk_id, uint32_t position, uint32_t length)
{
    uint64_t tmp = 0;
    tmp = tmp | length;                          // bits 0-11: value length
    tmp = tmp | ((uint64_t)position << 12);      // bits 12-23: position in the chunk
    tmp = tmp | ((uint64_t)chunk_id << 24);      // bits 24-55: chunk ID
    tmp = tmp | ((uint64_t)key_length << 56);    // bits 56-63: key length
    return tmp;
}

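Packs the key length, chunk ID, position within the chunk, and value length into a single 64-bit integer that is stored as the node's value. From the shift amounts, length occupies bits 0-11, position bits 12-23, chunk_id bits 24-55, and key_length bits 56-63. Nothing is masked, so a length or position of 4096 or more would spill into the neighboring field.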
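To read the fields back out, the packing has to be reversed with shifts and masks. The following is a minimal sketch, assuming the bit widths implied by the shift amounts in create_value (12, 12, 32, and 8 bits). The names pack_value and value_* are hypothetical and are not taken from the ECHash source; pack_value only repeats create_value so the sketch compiles on its own.

```c
#include <stdint.h>

/* Same packing as create_value, repeated here so the sketch is standalone. */
uint64_t pack_value(uint32_t key_length, uint32_t chunk_id,
                    uint32_t position, uint32_t length)
{
    uint64_t tmp = 0;
    tmp |= length;                       /* bits 0-11  */
    tmp |= (uint64_t)position << 12;     /* bits 12-23 */
    tmp |= (uint64_t)chunk_id << 24;     /* bits 24-55 */
    tmp |= (uint64_t)key_length << 56;   /* bits 56-63 */
    return tmp;
}

/* Hypothetical accessors: shift the field down, then mask to its width. */
uint32_t value_length(uint64_t v)     { return (uint32_t)(v & 0xFFFu); }
uint32_t value_position(uint64_t v)   { return (uint32_t)((v >> 12) & 0xFFFu); }
uint32_t value_chunk_id(uint64_t v)   { return (uint32_t)((v >> 24) & 0xFFFFFFFFu); }
uint32_t value_key_length(uint64_t v) { return (uint32_t)(v >> 56); }
```

The round trip only holds while length and position stay below 4096 and key_length below 256, since the packing itself does not mask its inputs.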

Initializing the hash table


void hash_table_init(struct hash_node *hash_table[])
{
    memset(hash_table, 0, sizeof(struct hash_node *) * HASH_MAX_SIZE);
}

Initializes the hash table by zeroing all HASH_MAX_SIZE bucket pointers, so every bucket starts out as NULL.

Destroying the hash table


void hash_table_destory(struct hash_node *hash_table[])
{
    int i = 0;
    for(i = 0; i < HASH_MAX_SIZE; i++)
    {
        struct hash_node *p = hash_table[i], *pre = p;
        while(p)
        {
            pre = p;
            p = p->next;
            free(pre);
        }
    }
}

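Walks every bucket of the hash table and frees each node in its list. As written, it frees the node but not the key buffer that hash_node_init allocates separately, so the key strings leak, and the bucket pointers are left pointing at freed memory.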
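Since each node owns a separately allocated key, a teardown that releases everything would free the key before the node and clear each bucket afterwards. This is a minimal sketch of that idea, not the ECHash implementation. The name hash_table_destroy_all and the HASH_MAX_SIZE value of 1024 are assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

#define HASH_MAX_SIZE 1024  /* assumed value; the real constant is defined elsewhere */

struct hash_node {
    char *key;
    uint64_t value;
    struct hash_node *next;
};

/* Hypothetical variant of the destroy routine: frees key and node, resets buckets. */
void hash_table_destroy_all(struct hash_node *hash_table[])
{
    for (int i = 0; i < HASH_MAX_SIZE; i++) {
        struct hash_node *p = hash_table[i];
        while (p) {
            struct hash_node *next = p->next;  /* save the link before freeing */
            free(p->key);                      /* key was malloc'd separately  */
            free(p);
            p = next;
        }
        hash_table[i] = NULL;                  /* leave no dangling pointer    */
    }
}
```

Resetting each bucket to NULL means the table can be reused or destroyed again without touching freed memory.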

Initializing a hash_node


struct hash_node *hash_node_init(const char *key, const uint64_t value)
{
    struct hash_node *hn = NULL;
    hn = (struct hash_node *)malloc(sizeof(struct hash_node));
    hn->key = (char *)malloc(sizeof(char) * strlen(key) + 1);
    strcpy(hn->key, key);
    hn->key[strlen(key)] = '\0';
    hn->value = value;
    hn->next = NULL;
    return hn;
}

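Allocates and returns a new hash_node that holds its own copy of key and the encoded value (the packed metadata). The malloc results are not checked, and the explicit '\0' write after strcpy is redundant because strcpy already copies the terminator.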

Displaying a hash_node


char *display(struct hash_node *hn)
{
    char *s = (char *)malloc(10000 * sizeof(char));
    int n = sprintf(s, "%c%s%s%llu%c", '[', hn->key, "]:[", hn->value, ']');
    s[n] = '\0';
    return s;
}

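Returns a newly allocated string of the form [key]:[value] for debug printing; the caller has to free it. Two caveats: the buffer is a fixed 10000 bytes written with sprintf, so a long enough key overflows it, and %llu matches uint64_t only on platforms where that type is unsigned long long (PRIu64 from <inttypes.h> is the portable specifier).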
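One way to avoid both the fixed-size buffer and the format-specifier mismatch is to let snprintf measure the output first and then allocate exactly that much. The sketch below assumes a C99 library; display_sized is a hypothetical name, not a function from the ECHash source.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct hash_node {
    char *key;
    uint64_t value;
    struct hash_node *next;
};

/* Hypothetical variant of display: returns a malloc'd "[key]:[value]" string
 * sized to fit, or NULL on failure. The caller frees the result. */
char *display_sized(const struct hash_node *hn)
{
    /* First pass with a NULL buffer only computes the required length. */
    int n = snprintf(NULL, 0, "[%s]:[%" PRIu64 "]", hn->key, hn->value);
    if (n < 0)
        return NULL;
    char *s = malloc((size_t)n + 1);
    if (s == NULL)
        return NULL;
    /* Second pass writes the text; snprintf always adds the terminator. */
    snprintf(s, (size_t)n + 1, "[%s]:[%" PRIu64 "]", hn->key, hn->value);
    return s;
}
```

Formatting twice costs a little extra time, which is usually acceptable for a debug helper.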

Hash function


uint32_t hash_coding(const char *key)
{
    uint32_t hv = hashkit_jenkins(key, strlen(key), 0);
    return hv;
}

Hashes key with the Jenkins hash function from libhashkit.

Looking up a key in the hash table


struct hash_node *find_hash_table(struct hash_node *hash_table[], const char *key, uint32_t hv)
{
    if(key == NULL)
        return NULL;
    struct hash_node *hn = hash_table[hv % HASH_MAX_SIZE];
    while(hn)
    {
        if(strcmp(key, hn->key) == 0)
            return hn;
        hn = hn->next;
    }
    return NULL;
}

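Given a key and its precomputed hash hv, walks the chain in bucket hv % HASH_MAX_SIZE and returns the first node whose key matches, or NULL if there is none.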

Getting the value


uint64_t get_value_hash_table(struct hash_node *hash_table[], const char *key)
{
    if(key == NULL)
        return 0;
    uint32_t hv = hashkit_jenkins(key, strlen(key), 0);
    struct hash_node *hn = hash_table[hv % HASH_MAX_SIZE];
    while(hn)
    {
        if(strcmp(key, hn->key) == 0)
            return hn->value;
        hn = hn->next;
    }
    return 0;
}

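Similar to find_hash_table, but it computes the hash itself and returns only the value instead of the node pointer. A miss returns 0, so the caller cannot tell a missing key from a stored value of 0.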

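Inserting a node (hash already known)

This is a bit like the "chained forward star" (链式前向星) adjacency layout: each new entry is pushed at the head of its list.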


void insert_hash_table_hv(struct hash_node *hash_table[], struct hash_node *hn, uint32_t hv)
{
    if(hn == NULL)
        return;
    hn->next = hash_table[hv % HASH_MAX_SIZE];
    hash_table[hv % HASH_MAX_SIZE] = hn;
}

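Inserts the node at the head of its bucket's list (hash collisions are resolved by chaining). It does not check whether the key is already present, so inserting the same key twice leaves both nodes in the chain, and lookups return the newer one.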

Inserting a node (hash computed from the key)


void insert_hash_table_key(struct hash_node *hash_table[], struct hash_node *hn, const char *key)
{
    uint32_t hv = hashkit_jenkins(key, strlen(key), 0);
    if(hn == NULL)
        return;
    hn->next = hash_table[hv % HASH_MAX_SIZE];
    hash_table[hv % HASH_MAX_SIZE] = hn;
}

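Does the same thing as the previous function, except that it computes the hash from key itself. Unlike the lookup functions, it does not check key for NULL before hashing it.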
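Both insert functions push at the head of the chain without looking for an existing entry. If duplicates are unwanted, one option is an "update or insert" helper that overwrites the value when the key is already present. The sketch below is an illustration under stated assumptions, not ECHash code: upsert is a hypothetical name, HASH_MAX_SIZE is given an assumed value, and toy_hash (32-bit FNV-1a) stands in for hashkit_jenkins so that the example needs no external library.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HASH_MAX_SIZE 1024  /* assumed value */

struct hash_node {
    char *key;
    uint64_t value;
    struct hash_node *next;
};

/* Stand-in for hashkit_jenkins: 32-bit FNV-1a over a NUL-terminated key. */
uint32_t toy_hash(const char *key)
{
    uint32_t h = 2166136261u;
    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical helper: overwrite the value if key exists, otherwise push a
 * new node at the head of the bucket. Returns 0 on success, -1 on failure. */
int upsert(struct hash_node *table[], const char *key, uint64_t value)
{
    uint32_t slot = toy_hash(key) % HASH_MAX_SIZE;
    for (struct hash_node *p = table[slot]; p; p = p->next) {
        if (strcmp(p->key, key) == 0) {
            p->value = value;          /* key already present: update in place */
            return 0;
        }
    }
    struct hash_node *hn = malloc(sizeof *hn);
    if (hn == NULL)
        return -1;
    size_t len = strlen(key) + 1;
    hn->key = malloc(len);
    if (hn->key == NULL) {
        free(hn);
        return -1;
    }
    memcpy(hn->key, key, len);         /* private copy, including terminator */
    hn->value = value;
    hn->next = table[slot];            /* head insertion, as in the original */
    table[slot] = hn;
    return 0;
}
```

The extra chain walk makes each insert slower, which is the price of keeping at most one node per key.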

Deleting the node for a given key


int del_hash_table_key(struct hash_node *hash_table[], const char *key)
{
    if(key == NULL)
        return 1;
    uint32_t hv = hashkit_jenkins(key, strlen(key), 0);
    struct hash_node *hn = hash_table[hv % HASH_MAX_SIZE];

    if(hn && strcmp(key, hn->key) == 0)
    {
        hash_table[hv % HASH_MAX_SIZE] = hn->next;
    }
    else
    {
        struct hash_node *pre = hn;
        while(hn)
        {
            if(strcmp(key, hn->key) == 0)
                break;
            pre = hn;
            hn = hn->next;
        }
        if(hn == NULL)
            return 1;
        pre->next = hn->next;
    }
    return 0;
}

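Unlinks the node for key from its bucket. It returns 0 on success and 1 if key is NULL or not found. The node is only unlinked, not freed, so unless the caller still holds a pointer to it, the node and its key are leaked.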
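The head case and the middle-of-chain case can be merged by walking a pointer to the link instead of a pointer to the node, and the unlinked node can be freed on the spot. This is a minimal sketch of that technique with the same return convention (0 found, 1 not found). The name remove_key is hypothetical, HASH_MAX_SIZE is an assumed value, and the hash is passed in as hv so that the example does not depend on libhashkit.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HASH_MAX_SIZE 1024  /* assumed value */

struct hash_node {
    char *key;
    uint64_t value;
    struct hash_node *next;
};

/* Hypothetical variant of the delete routine: unlinks and frees the node. */
int remove_key(struct hash_node *table[], const char *key, uint32_t hv)
{
    if (key == NULL)
        return 1;
    /* link points at whichever pointer currently leads to the node under
     * inspection: the bucket slot first, then each node's next field. */
    struct hash_node **link = &table[hv % HASH_MAX_SIZE];
    while (*link) {
        struct hash_node *hn = *link;
        if (strcmp(key, hn->key) == 0) {
            *link = hn->next;   /* bypass the node, head or not */
            free(hn->key);
            free(hn);
            return 0;
        }
        link = &hn->next;
    }
    return 1;
}
```

Freeing inside the function is only safe if no caller keeps a pointer to the removed node.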

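Summary

This code implements a basic but functional hash table:

it uses the Jenkins hash function;
it supports insert, lookup, and delete;
each value encodes the position, length, and related fields;
within the ECHash framework it serves as the metadata index;
it tracks which chunk a key lives in, at what offset, and how long it is.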
