Chapter 7: Searching

$$
\text{Searching}\left\{ \begin{array}{l}
\text{Basic concepts: static search, dynamic search}\\
\text{Linear structures}\left\{ \begin{array}{l} \text{Sequential search}\\ \text{Binary search}\\ \text{Block search} \end{array} \right.\\
\text{Tree structures}\left\{ \begin{array}{l} \text{Binary search tree}\\ \text{Balanced binary tree}\\ \text{Red-black tree}\\ \text{B-tree, B+ tree} \end{array} \right.\\
\text{Hash structure: hash table}\left\{ \begin{array}{l} \text{Performance analysis}\\ \text{Collision handling} \end{array} \right.\\
\text{Efficiency metric: average search length}\left\{ \begin{array}{l} \text{Successful search}\\ \text{Unsuccessful search} \end{array} \right.
\end{array} \right.
$$

Contents

- 7.1 Basic Concepts of Searching
- 7.2 Sequential Search and Binary Search
  - 7.2.1 Sequential Search
  - 7.2.2 Binary Search
    - Example 1: Let's Make a Convex!
  - 7.2.3 Block Search
    - Example 2: 数列分块入门 4
- 7.3 Tree-Based Search
  - 7.3.1 Binary Search Trees
  - 7.3.2 Balanced Binary Trees
    - Example 3: 普通平衡树
  - 7.3.3 Red-Black Trees
  - 7.3.4 Cartesian Trees
    - Example 4: 树的序
    - Example 5: 二叉搜索树的结构
- 7.4 B-Trees and B+ Trees
  - 7.4.1 B-Trees and Their Basic Operations
  - 7.4.2 Basic Concepts of B+ Trees
- 7.5 Hash Tables
  - 7.5.1 Basic Concepts of Hash Tables
  - 7.5.2 Constructing Hash Functions
  - 7.5.3 Collision Resolution
  - 7.5.4 Hash Search and Performance Analysis
- References

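7.1 Basic Concepts of Searching

The average search length (ASL) is the core metric for measuring the efficiency of a search algorithm.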
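For reference, the usual textbook definition can be written as follows. The symbols are standard notation rather than something taken from the text above: $n$ is the number of records, $P_i$ is the probability of searching for the $i$-th record, and $C_i$ is the number of key comparisons needed to find it.

$$
ASL = \sum_{i=1}^{n} P_i C_i
$$

With equal probabilities $P_i = 1/n$, this reduces to the mean number of comparisons per lookup.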

7.2 Sequential Search and Binary Search

7.2.1 Sequential Search

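typedef int ElemType;   // assumption: the original leaves ElemType unspecified; int is used here

typedef struct {        // data structure of the lookup table (sequential list)
    ElemType * elem;    // base address of the dynamic array; elem[0] is reserved for the sentinel
    int TableLen;       // length of the table (records occupy elem[1..TableLen])
} SSTable;

int Search_Seq(SSTable ST, ElemType key) {
    int i;                                          // loop index (was undeclared)
    ST.elem[0] = key;                               // "sentinel"
    for (i = ST.TableLen; ST.elem[i] != key; i--);  // scan from back to front
    return i;   // on success, return the element's index; on failure, return 0
}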

7.2.2 Binary Search

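int Binary_Search(SSTable L, ElemType key) {
    // Assumes elem[0..TableLen-1] is sorted in ascending order.
    int low = 0, high = L.TableLen - 1, mid;
    while (low <= high) {
        mid = low + (high - low) / 2;   // middle position (avoids overflow of low + high)
        if (L.elem[mid] == key)
            return mid;                 // found: return its position
        else if (L.elem[mid] > key)
            high = mid - 1;             // continue in the first half
        else
            low = mid + 1;              // continue in the second half
    }
    return -1;                          // not found: return -1
}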

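For a table of $n = 2^h - 1$ records with equal search probabilities, the ASL of a successful search is

$$
ASL = \frac{n + 1}{n} \log_2(n + 1) - 1 \approx \log_2(n + 1) - 1
$$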
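The ASL of binary search can also be checked by brute force. The sketch below counts the probes needed to find each key of a sorted table and sums them; the helper names `probes_to_find` and `total_probes` are made up for this illustration. For $n = 7$ the total is 17, so the exact ASL is $17/7 \approx 2.43$, while the closed form $\log_2(n + 1) - 1$ gives 2, which shows that the closed form is only a large-$n$ approximation.

```cpp
#include <vector>

// Count the probes of a[mid] that binary search performs before it
// finds `key` in the sorted vector `a`. Returns 0 if the key is absent.
int probes_to_find(const std::vector<int>& a, int key) {
    int low = 0, high = static_cast<int>(a.size()) - 1, probes = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        ++probes;
        if (a[mid] == key) return probes;
        if (a[mid] > key) high = mid - 1;
        else low = mid + 1;
    }
    return 0;
}

// Sum of probes over every key of a table holding 0, 1, ..., n-1.
// Dividing the result by n gives the successful-search ASL
// under equal search probabilities.
int total_probes(int n) {
    std::vector<int> a(n);
    for (int i = 0; i < n; ++i) a[i] = i;
    int total = 0;
    for (int key = 0; key < n; ++key) total += probes_to_find(a, key);
    return total;
}
```

Each level $j$ of the decision tree holds $2^{j-1}$ keys that cost $j$ probes each, which is where the total $1 \cdot 1 + 2 \cdot 2 + 3 \cdot 4 = 17$ comes from.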

Example 1: Let's Make a Convex!

Let's Make a Convex! - Problem - QOJ.ac

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

const int N = 200010;

ll n;
ll a[N], b[N], res[N];

ll find(ll id)
{
    ll l = 0, r = id - 2;
    while (l < r)
    {
        ll mid = l + r + 1 >> 1;
        if (b[id - 1] - b[mid - 1] <= a[id]) r = mid - 1;
        else l = mid;
    }
    return l;
}

void solve()
{
    cin >> n;
    for (int i = 1; i <= n; i++) cin >> a[i], res[i] = 0;
    sort(a + 1, a + n + 1);
    for (int i = 1; i <= n; i++) b[i] = b[i - 1] + a[i];
    for (int i = 1; i <= n; i++)
    {
        ll p = find(i);
        if (p == 0) continue;
        res[i - p + 1] = max(res[i - p + 1], p);
    }

    for (int i = 1; i <= n; i++) res[i] = max(res[i], res[i - 1] - 1);
    for (int i = 1; i <= n; i++)
        if (res[i]) cout << b[res[i] + i - 1] - b[res[i] - 1] << ' ';
        else cout << 0 << ' ';
    cout << endl;
}

int main()
{
    cin.tie(0);
    ios::sync_with_stdio(false);

    ll Case = 1;
    cin >> Case;
    while (Case --> 0) solve();

    return 0;
}

7.2.3 Block Search

Divide a lookup table of length $n$ evenly into $b$ blocks, each containing $s$ records ($n = bs$). If both the index and the block are searched sequentially, then

$$
ASL = L_I + L_S = \frac{b + 1}{2} + \frac{s + 1}{2} = \frac{s^2 + 2s + n}{2s}
$$
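Treating $s$ as a continuous variable and substituting $b = n/s$, the block size that minimises this expression follows from a short calculation (a sketch under the same assumption that both the index and the block are scanned sequentially):

$$
ASL(s) = \frac{s}{2} + 1 + \frac{n}{2s}, \qquad \frac{d\,ASL}{ds} = \frac{1}{2} - \frac{n}{2s^2} = 0 \;\Rightarrow\; s = \sqrt{n}, \qquad ASL_{\min} = \sqrt{n} + 1
$$

For example, with $n = 10000$ the best block size is $s = 100$, giving an ASL of $101$, compared with about $5000$ for a plain sequential search.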

Example 2: 数列分块入门 4 (Introduction to Sequence Block Decomposition 4)

#6280. 数列分块入门 4 - Problem - LibreOJ

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

const int N = 50010;

ll n;
ll id[N], len;
ll a[N], b[N], s[N];

void add(ll l, ll r, ll x)
{
    ll sid = id[l], eid = id[r];
    if (sid == eid)
    {
        for (int i = l; i <= r; i++) a[i] += x, s[sid] += x;
        return;
    }
    for (int i = l; id[i] == sid; i++) a[i] += x, s[sid] += x;
    for (int i = sid + 1; i < eid; i++) b[i] += x, s[i] += len * x;
    for (int i = r; id[i] == eid; i--) a[i] += x, s[eid] += x;
}

ll query(ll l, ll r, ll p)
{
    ll sid = id[l], eid = id[r];
    ll ans = 0;
    if (sid == eid)
    {
        for (int i = l; i <= r; i++) ans = (ans + a[i] + b[sid]) % p;
        return ans;
    }
    for (int i = l; id[i] == sid; i++) ans = (ans + a[i] + b[sid]) % p;
    for (int i = sid + 1; i < eid; i++) ans = (ans + s[i]) % p;
    for (int i = r; id[i] == eid; i--) ans = (ans + a[i] + b[eid]) % p;
    return ans;
}

void solve()
{
    cin >> n;
    len = sqrt(n);
    for (int i = 1; i <= n; i++)
    {
        cin >> a[i];
        id[i] = (i - 1) / len + 1;
        s[id[i]] += a[i];
    }
    for (int i = 1; i <= n; i++)
    {
        ll op, l, r, c;
        cin >> op >> l >> r >> c;
        if (op == 0) add(l, r, c);
        else cout << query(l, r, c + 1) << endl;
    }
}

int main()
{
    cin.tie(0);
    ios::sync_with_stdio(false);

    ll Case = 1;
    while (Case --> 0) solve();

    return 0;
}

7.3 Tree-Based Search

7.3.1 Binary Search Trees

Searching a binary search tree

BSTNode * BST_Search(BiTree T, ElemType key) {
    while (T != NULL && key != T->data) {
        if (key < T->data) T = T->lchild;
        else T = T->rchild;
    }
    return T;
}

Inserting into a binary search tree

int BST_Insert(BiTree & T, KeyType k) {
    if (T == NULL) {
        T = (BiTree) malloc(sizeof(BSTNode));
        T->data = k;
        T->lchild = T->rchild = NULL;
        return 1;
    }
    else if (k == T->data) return 0;
    else if (k < T->data) return BST_Insert(T->lchild, k);
    else return BST_Insert(T->rchild, k);
}

Constructing a binary search tree

void Creat_BST(BiTree & T, KeyType str[], int n) {
    T = NULL;
    int i = 0;
    while (i < n) {
        BST_Insert(T, str[i]);
        i++;
    }
}
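The three routines above rely on type definitions that are not shown. The following self-contained sketch puts them together with one plausible set of definitions: `KeyType` as `int` and the `BSTNode`/`BiTree` layout are assumptions, and `InOrder` is a hypothetical helper added only to make the result observable.

```cpp
#include <cstdlib>

typedef int KeyType;                 // assumption: keys are plain ints

typedef struct BSTNode {             // assumed node layout
    KeyType data;
    struct BSTNode *lchild, *rchild;
} BSTNode, *BiTree;

// Insert k; returns 1 on success, 0 if k is already present.
int BST_Insert(BiTree &T, KeyType k) {
    if (T == NULL) {
        T = (BiTree) malloc(sizeof(BSTNode));
        T->data = k;
        T->lchild = T->rchild = NULL;
        return 1;
    }
    if (k == T->data) return 0;
    if (k < T->data) return BST_Insert(T->lchild, k);
    return BST_Insert(T->rchild, k);
}

// Iterative lookup; returns the node holding key, or NULL.
BSTNode *BST_Search(BiTree T, KeyType key) {
    while (T != NULL && key != T->data)
        T = (key < T->data) ? T->lchild : T->rchild;
    return T;
}

// Build a tree by inserting str[0..n-1] in the given order.
void Creat_BST(BiTree &T, const KeyType str[], int n) {
    T = NULL;
    for (int i = 0; i < n; i++) BST_Insert(T, str[i]);
}

// Hypothetical helper: in-order traversal writes the keys into out[]
// in ascending order and returns how many keys have been written.
int InOrder(BiTree T, KeyType out[], int count) {
    if (T == NULL) return count;
    count = InOrder(T->lchild, out, count);
    out[count++] = T->data;
    return InOrder(T->rchild, out, count);
}
```

Building a tree from the keys 45, 24, 53, 12, 37, 93 and calling `InOrder` yields them in sorted order, which is the defining property of a binary search tree. The nodes are never freed here, so a real program would also need a cleanup routine.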

7.3.2 Balanced Binary Trees

Example 3: 普通平衡树 (Ordinary Balanced Tree)

P3369 [Template] 普通平衡树 - Luogu

#include <bits/stdc++.h>
using namespace std;

const int N = 100010, INF = 1e8;

int n;
struct Node
{
    int l, r;
    int key, val;
    int cnt, siz;
} tr[N];

int root, idx;

void pushup(int p)
{
    tr[p].siz = tr[tr[p].l].siz + tr[tr[p].r].siz + tr[p].cnt;
}

int get_node(int key)
{
    tr[++idx].key = key;
    tr[idx].val = rand();
    tr[idx].cnt = tr[idx].siz = 1;
    return idx;
}

void zig(int &p)    // right rotation
{
    int q = tr[p].l;
    tr[p].l = tr[q].r, tr[q].r = p, p = q;
    pushup(tr[p].r), pushup(p);
}

void zag(int &p)    // left rotation
{
    int q = tr[p].r;
    tr[p].r = tr[q].l, tr[q].l = p, p = q;
    pushup(tr[p].l), pushup(p);
}

void build()
{
    get_node(-INF), get_node(INF);
    root = 1, tr[1].r = 2;
    pushup(root);
    
    if (tr[1].val < tr[2].val) zag(root);
}

void insert(int &p, int key)
{
    if (!p) p = get_node(key);
    else if (tr[p].key == key) tr[p].cnt++;
    else if (tr[p].key > key)
    {
        insert(tr[p].l, key);
        if (tr[tr[p].l].val > tr[p].val) zig(p);
    }
    else
    {
        insert(tr[p].r, key);
        if (tr[tr[p].r].val > tr[p].val) zag(p);
    }
    pushup(p);
}

void remove(int &p, int key)
{
    if (!p) return;
    else if (tr[p].key == key)
    {
        if (tr[p].cnt > 1) tr[p].cnt--;
        else if (tr[p].l || tr[p].r)
        {
            if (!tr[p].r || tr[tr[p].l].val > tr[tr[p].r].val)
            {
                zig(p);
                remove(tr[p].r, key);
            }
            else
            {
                zag(p);
                remove(tr[p].l, key);
            }
        }
        else p = 0;
    }
    else if (tr[p].key > key) remove(tr[p].l, key);
    else remove(tr[p].r, key);
    
    pushup(p);
}

int get_rank_by_key(int p, int key)     // find the rank of a given value
{
    if (!p) return 0;
    if (tr[p].key == key) return tr[tr[p].l].siz;
    if (tr[p].key > key) return get_rank_by_key(tr[p].l, key);
    return tr[tr[p].l].siz + tr[p].cnt + get_rank_by_key(tr[p].r, key);
}

int get_key_by_rank(int p, int rank)    // find the value with a given rank
{
    if (!p) return INF;
    if (tr[tr[p].l].siz >= rank) return get_key_by_rank(tr[p].l, rank);
    if (tr[tr[p].l].siz + tr[p].cnt >= rank) return tr[p].key;
    return get_key_by_rank(tr[p].r, rank - tr[tr[p].l].siz - tr[p].cnt);
}

int get_prev(int p, int key)    // find the largest value strictly less than key
{
    if (!p) return -INF;
    if (tr[p].key >= key) return get_prev(tr[p].l, key);
    return max(tr[p].key, get_prev(tr[p].r, key));
}

int get_next(int p, int key)    // find the smallest value strictly greater than key
{
    if (!p) return INF;
    if (tr[p].key <= key) return get_next(tr[p].r, key);
    return min(tr[p].key, get_next(tr[p].l, key));
}

int main()
{
    build();
    
    cin >> n;
    while (n--)
    {
        int opt, x;
        cin >> opt >> x;
        if (opt == 1) insert(root, x);
        else if (opt == 2) remove(root, x);
        else if (opt == 3) cout << get_rank_by_key(root, x) << endl;
        else if (opt == 4) cout << get_key_by_rank(root, x + 1) << endl;
        else if (opt == 5) cout << get_prev(root, x) << endl;
        else cout << get_next(root, x) << endl;
    }
    
    return 0;
}

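7.3.3 Red-Black Trees

- Every node is either red or black.
- The root is black.
- Leaf nodes (the fictitious external nodes, i.e. NULL nodes) are all black.
- No two red nodes are adjacent (both the parent and the children of a red node are black).
- For every node, all simple paths from that node to any leaf contain the same number of black nodes.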
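The colour rules listed above can be checked mechanically. The sketch below is one possible validator, not part of any library: `RBNode`, `black_height`, and `is_red_black` are hypothetical names, `nullptr` stands for the black external leaves, and the binary-search-tree ordering of the keys is deliberately not checked.

```cpp
enum Color { RED, BLACK };

struct RBNode {              // hypothetical node layout for illustration
    int key;
    Color color;
    RBNode *left, *right;    // nullptr stands for a black external (NULL) leaf
};

// Returns the number of black nodes on every path from `node` down to a
// leaf (counting the external leaf itself), or -1 if the subtree contains
// a red node with a red child or paths with different black counts.
int black_height(const RBNode* node) {
    if (node == nullptr) return 1;                 // external leaves are black
    if (node->color == RED) {
        if ((node->left && node->left->color == RED) ||
            (node->right && node->right->color == RED))
            return -1;                             // two adjacent red nodes
    }
    int lh = black_height(node->left);
    int rh = black_height(node->right);
    if (lh == -1 || rh == -1 || lh != rh) return -1;
    return lh + (node->color == BLACK ? 1 : 0);
}

// Checks the root rule, the red-red rule, and the black-count rule.
bool is_red_black(const RBNode* root) {
    if (root != nullptr && root->color != BLACK) return false;
    return black_height(root) != -1;
}
```

A black root with two red children passes the check, while recolouring only one of those children black fails it, because the two sides then carry different numbers of black nodes.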

7.3.4 Cartesian Trees

Example 4: 树的序 (Order of the Tree)

P1377 [TJOI2011] 树的序 - Luogu

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

const int N = 100010;

ll n;
ll root;
ll a[N], r[N], l[N];

ll stk[N];
void build()
{
	for (int i = 1, top = 0, pos = 0; i <= n; i++)
	{
		pos = top;
		while (pos > 0 && a[stk[pos]] > a[i]) pos--;
		if (pos > 0) r[stk[pos]] = i;
		if (pos < top) l[i] = stk[pos + 1];
		stk[++pos] = i;
		top = pos;
	}
	root = stk[1];
}

vector<ll> res;
void dfs(ll u)
{
	res.push_back(u);
	if (l[u] != 0) dfs(l[u]);
	if (r[u] != 0) dfs(r[u]);
}

void solve()
{
	cin >> n;
	for (int i = 1, t; i <= n; i++) cin >> t, a[t] = i;
	build();
	dfs(root);
	for (auto i : res) cout << i << ' ';
	cout << endl;
}

int main()
{
	ios::sync_with_stdio(0);
	cin.tie(0);
	
	ll Case = 1;
	while (Case --> 0) solve();

	return 0;
}

Example 5: 二叉搜索树的结构 (Structure of a Binary Search Tree)

L3-016 二叉搜索树的结构 - 团体程序设计天梯赛-练习集 (Group Programming Ladder Tournament practice set)

#include <bits/stdc++.h>
using namespace std;
typedef long long ll;

const int N = 110;

ll n;
ll root;
ll a[N], b[N], rson[N], lson[N];
ll fa[N], dep[N];

ll stk[N];
void build()
{
	for (int i = 1, top = 0, pos = 0; i <= n; i++)
	{
		pos = top;
		while (pos > 0 && a[stk[pos]] > a[i]) pos--;
		if (pos > 0) rson[stk[pos]] = i;
		if (pos < top) lson[i] = stk[pos + 1];
		stk[++pos] = i;
		top = pos;
	}
	root = stk[1], fa[stk[1]] = -1;
}

void dfs(ll u, ll father)
{
	fa[u] = father;
	if (father != -1) dep[u] = dep[father] + 1;
	if (lson[u] != 0) dfs(lson[u], u);
	if (rson[u] != 0) dfs(rson[u], u);
}

void solve()
{
	cin >> n;
	set<ll> have;
	for (int i = 1; i <= n; i++)
	{
		cin >> b[i];
		have.insert(b[i]);
	}
	ll idx = 0;
	map<ll, ll> mp;
	for (auto i : have) mp[i] = ++idx;
	for (int i = 1; i <= n; i++) a[mp[b[i]]] = i;
	build();
	dfs(root, -1);

    ll q;
    cin >> q;
    cin.ignore(1);
    while (q--)
    {
        string s;
        getline(cin, s);
        stringstream ss(s);
        if (s.find(" is the root") != s.npos)
        {
            ll t;
            ss >> t;
            if (!have.count(t))
            {
                cout << "No" << endl;
                continue;
            }
            if (root != mp[t]) cout << "No" << endl;
            else cout << "Yes" << endl;
        }
        else if (s.find(" are siblings") != s.npos)
        {
            string tmp;
            ll l, r;
            ss >> l >> tmp >> r;
            if (!have.count(l) || !have.count(r))
            {
                cout << "No" << endl;
                continue;
            }
            if (fa[mp[l]] == fa[mp[r]]) cout << "Yes" << endl;
            else cout << "No" << endl;
        }
        else if (s.find(" are on the same level") != s.npos)
        {
            string tmp;
            ll l, r;
            ss >> l >> tmp >> r;
            if (!have.count(l) || !have.count(r))
            {
                cout << "No" << endl;
                continue;
            }
            if (dep[mp[l]] == dep[mp[r]]) cout << "Yes" << endl;
            else cout << "No" << endl;
        }
        else if (s.find(" is the parent of ") != s.npos)
        {
            string tmp;
            ll l, r;
            ss >> l;
            for (int i = 1; i <= 4; i++) ss >> tmp;
            ss >> r;
            if (!have.count(l) || !have.count(r))
            {
                cout << "No" << endl;
                continue;
            }
            if (fa[mp[r]] == mp[l]) cout << "Yes" << endl;
            else cout << "No" << endl;
        }
        else if (s.find(" is the left child of ") != s.npos)
        {
            string tmp;
            ll l, r;
            ss >> l;
            for (int i = 1; i <= 5; i++) ss >> tmp;
            ss >> r;
            if (!have.count(l) || !have.count(r))
            {
                cout << "No" << endl;
                continue;
            }
            if (lson[mp[r]] == mp[l]) cout << "Yes" << endl;
            else cout << "No" << endl;
        }
        else if (s.find(" is the right child of ") != s.npos)
        {
            string tmp;
            ll l, r;
            ss >> l;
            for (int i = 1; i <= 5; i++) ss >> tmp;
            ss >> r;
            if (!have.count(l) || !have.count(r))
            {
                cout << "No" << endl;
                continue;
            }
            if (rson[mp[r]] == mp[l]) cout << "Yes" << endl;
            else cout << "No" << endl;
        }
    }
}

int main()
{
	ios::sync_with_stdio(0);
	cin.tie(0);
	
	ll Case = 1;
	while (Case --> 0) solve();

	return 0;
}

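7.4 B-Trees and B+ Trees

7.4.1 B-Trees and Their Basic Operations

Property 1: Node capacity limits

- Each node has at most $m$ subtrees, where $m$ is the order of the B-tree.
- Each node has at most $m - 1$ keys.
- Number of keys = number of subtrees $- 1$.

Property 2: Special rule for the root

- If the root is not a leaf, it has at least $2$ subtrees.
- That is, a non-leaf root has at least $1$ key.

Property 3: Minimum occupancy of non-leaf nodes

- Every non-leaf node other than the root has at least $\lceil m/2 \rceil$ subtrees.
- That is, it has at least $\lceil m/2 \rceil - 1$ keys.
- This keeps the tree balanced and searches efficient.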
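To make the bounds concrete, the small helper below computes them for a given order $m$. `BTreeLimits` and `btree_limits` are hypothetical names used only for this sketch, and the minimum values apply to non-root, non-leaf nodes as stated above.

```cpp
struct BTreeLimits {
    int max_children;   // at most m subtrees
    int max_keys;       // at most m - 1 keys
    int min_children;   // at least ceil(m / 2) subtrees (non-root, non-leaf)
    int min_keys;       // at least ceil(m / 2) - 1 keys (non-root, non-leaf)
};

// Occupancy bounds for a B-tree of order m (m >= 3 is assumed).
BTreeLimits btree_limits(int m) {
    int min_children = (m + 1) / 2;   // ceil(m / 2) in integer arithmetic
    return BTreeLimits{m, m - 1, min_children, min_children - 1};
}
```

For a B-tree of order 5, this gives at most 5 subtrees and 4 keys per node, and at least 3 subtrees and 2 keys for every non-root, non-leaf node.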

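7.4.2 Basic Concepts of B+ Trees

A B+ tree of order $m$ satisfies the following conditions:

Node capacity limits
- Each non-leaf (branch) node has at most $m$ subtrees.
- Every non-leaf node other than the root has at least $\lceil m/2 \rceil$ subtrees.
Relationship between keys and subtrees
- A non-leaf node with $k$ keys has $k + 1$ subtrees.
- The keys partition the value range, and each subtree corresponds to one of the resulting intervals.
Where data is stored
- All data records (or pointers to the data) are stored in the leaf nodes.
- Non-leaf nodes store only keys, which are used for indexing and navigation; they do not hold data directly.
Sequential structure of the leaves
- All leaf nodes are connected to one another through linked-list pointers.
- This sequential structure supports efficient range queries and in-order traversal.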
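The benefit of the linked leaves can be illustrated with a range scan. The sketch below models only the leaf level: `Leaf` and `range_scan` are hypothetical names, and the descent from the root to the first relevant leaf is assumed to have happened already.

```cpp
#include <vector>

struct Leaf {                   // hypothetical leaf layout for illustration
    std::vector<int> keys;      // sorted keys stored in this leaf
    Leaf* next;                 // link to the next leaf in key order
};

// Collect every key in [lo, hi], walking the leaf chain from `start`.
// `start` is assumed to be the leaf where a root-to-leaf search for
// `lo` would end; no internal node is visited during the scan.
std::vector<int> range_scan(const Leaf* start, int lo, int hi) {
    std::vector<int> out;
    for (const Leaf* leaf = start; leaf != nullptr; leaf = leaf->next) {
        for (int k : leaf->keys) {
            if (k > hi) return out;       // past the range: stop early
            if (k >= lo) out.push_back(k);
        }
    }
    return out;
}
```

Once the first leaf has been located, the scan only follows `next` pointers, so its cost grows with the number of leaves touched rather than with repeated descents from the root.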

7.5 Hash Tables

7.5.1 Basic Concepts of Hash Tables

$$
Hash(key) = Addr \\ H(key) = Addr
$$

7.5.2 Constructing Hash Functions

Direct addressing method

Division (remainder) method

Digit analysis method

Mid-square method
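Two of these methods have simple closed forms in the usual textbook presentation: direct addressing uses $H(key) = a \cdot key + b$ and the division method uses $H(key) = key \bmod p$. The sketch below writes them as functions; the names `hash_direct` and `hash_division` and the parameters `a`, `b`, `p` are illustrative assumptions, and keys are assumed to be non-negative.

```cpp
// Direct addressing: the address is a linear function of the key,
// so distinct keys never collide (for a != 0).
int hash_direct(int key, int a, int b) {
    return a * key + b;
}

// Division method: the address is the remainder of the key modulo p.
// p is typically chosen as a prime no larger than the table length.
int hash_division(int key, int p) {
    return key % p;
}
```

With $p = 13$, the keys 27 and 14 both map to address 1, which is exactly the kind of collision that the resolution strategies in the next section deal with.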

7.5.3 Collision Resolution

Open addressing

- Linear probing
- Quadratic probing
- Double hashing
- Pseudo-random probing
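As a minimal sketch of open addressing, the table below uses linear probing with the probe sequence $H_i = (key \bmod m + i) \bmod m$. `ProbeTable` and `EMPTY` are hypothetical names, keys are assumed to be non-negative, and deletion is left out because it would need tombstone markers.

```cpp
#include <vector>

const int EMPTY = -1;   // assumption: keys are non-negative, so -1 marks a free slot

struct ProbeTable {
    std::vector<int> slot;

    explicit ProbeTable(int m) : slot(m, EMPTY) {}

    // Insert key at the first free slot of its probe sequence.
    // Returns false only when the table is full.
    bool insert(int key) {
        int m = static_cast<int>(slot.size());
        for (int i = 0; i < m; ++i) {
            int pos = (key % m + i) % m;
            if (slot[pos] == key) return true;      // already stored
            if (slot[pos] == EMPTY) { slot[pos] = key; return true; }
        }
        return false;
    }

    // Return the slot index holding key, or -1 if it is absent.
    int find(int key) const {
        int m = static_cast<int>(slot.size());
        for (int i = 0; i < m; ++i) {
            int pos = (key % m + i) % m;
            if (slot[pos] == EMPTY) return -1;      // a free slot ends the search
            if (slot[pos] == key) return pos;
        }
        return -1;
    }
};
```

In a table of length 7, the keys 10, 17, and 24 all hash to address 3, so they end up in slots 3, 4, and 5. This clustering of colliding keys is the main drawback of linear probing.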

Separate chaining
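With separate chaining, every address holds a linked list of the keys that hash to it. The following sketch assumes non-negative integer keys and the division method as the hash function; `ChainTable` is a hypothetical name.

```cpp
#include <algorithm>
#include <list>
#include <vector>

struct ChainTable {
    std::vector<std::list<int>> bucket;   // one chain per address

    explicit ChainTable(int m) : bucket(m) {}

    // Division-method hash; keys are assumed to be non-negative.
    int index(int key) const {
        return key % static_cast<int>(bucket.size());
    }

    // Add key to the front of its chain unless it is already present.
    void insert(int key) {
        std::list<int>& chain = bucket[index(key)];
        if (std::find(chain.begin(), chain.end(), key) == chain.end())
            chain.push_front(key);
    }

    // Search only the chain that the key hashes to.
    bool contains(int key) const {
        const std::list<int>& chain = bucket[index(key)];
        return std::find(chain.begin(), chain.end(), key) != chain.end();
    }
};
```

Colliding keys simply share a chain, so insertion never fails for lack of space, and a lookup compares against only the keys in one chain.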

7.5.4 Hash Search and Performance Analysis

$$
\alpha = \frac{\text{number of records in the table } n}{\text{hash table length } m}
$$
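As a worked example with made-up numbers, a table of length $m = 13$ holding $n = 9$ records has a load factor of $\alpha = 9/13 \approx 0.69$. The commonly cited textbook approximations for the ASL of a successful search, which assume uniformly distributed hash values, depend only on $\alpha$:

$$
ASL_{\text{linear probing}} \approx \frac{1}{2}\left(1 + \frac{1}{1 - \alpha}\right), \qquad ASL_{\text{chaining}} \approx 1 + \frac{\alpha}{2}
$$

For $\alpha = 9/13$ these give roughly $2.13$ comparisons for linear probing and $1.35$ for separate chaining, which illustrates why the load factor, rather than the table size itself, drives hash search performance.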

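References

王道论坛 (Wangdao Forum). 数据结构考研复习指导 (Data Structures: Postgraduate Entrance Exam Review Guide). 电子工业出版社 (Publishing House of Electronics Industry), 2027.

Learning Resources | 计算机考研杂货铺

Programming Problems - Problem List - 团体程序设计天梯赛-练习集

Home - LibreOJ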

luogu.com.cn
