Advanced 1. Huffman Tree
Problem Description
Huffman trees are widely used in coding. Here we are only concerned with the construction process of a Huffman tree.
Given a sequence of numbers {pi} = {p0, p1, ..., pn-1}, a Huffman tree is constructed from this sequence as follows:
1. Find the two smallest numbers in {pi}, say pa and pb. Remove pa and pb from {pi}, then add their sum back into {pi}. The cost of this step is pa + pb.
2. Repeat step 1 until only one number remains in {pi}.
Adding up the costs of all the steps gives the total cost of constructing the Huffman tree.
Task: for a given sequence of numbers, compute the total cost of constructing a Huffman tree from it.
For example, for the sequence {pi} = {5, 3, 8, 2, 9}, the construction proceeds as follows:
1. The two smallest numbers in {5, 3, 8, 2, 9} are 2 and 3. Remove them from {pi} and add their sum 5, giving {5, 8, 9, 5}; the cost is 5.
2. The two smallest numbers in {5, 8, 9, 5} are 5 and 5. Remove them and add their sum 10, giving {8, 9, 10}; the cost is 10.
3. The two smallest numbers in {8, 9, 10} are 8 and 9. Remove them and add their sum 17, giving {10, 17}; the cost is 17.
4. The two smallest numbers in {10, 17} are 10 and 17. Remove them and add their sum 27, giving {27}; the cost is 27.
5. Only one number, 27, remains, so construction ends. The total cost is 5 + 10 + 17 + 27 = 59.
Personal Summary
This problem can be simulated with a priority queue (min-heap): push all the numbers into a min-heap; on each round, pop the two smallest numbers, add their sum to the total cost, and push the sum back into the heap. Repeat until only one number remains.
```cpp
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n;
    cin >> n;
    // Min-heap: priority_queue is a max-heap by default;
    // the greater<int> comparator turns it into a min-heap.
    priority_queue<int, vector<int>, greater<int>> pq;
    for (int i = 0; i < n; i++) {
        int x;
        cin >> x;
        pq.push(x);
    }
    // Use long long: the accumulated cost can overflow int for large inputs.
    long long totalCost = 0;
    while (pq.size() > 1) {
        int a = pq.top(); pq.pop();  // smallest
        int b = pq.top(); pq.pop();  // second smallest
        int sum = a + b;
        totalCost += sum;            // cost of this merge
        pq.push(sum);                // put the merged value back into the heap
    }
    cout << totalCost << endl;
    return 0;
}
```
Computer English Translation
Original:
Overfitting is a common problem during the training of machine learning models. When a model performs extremely well on training data but poorly on test data, it is considered to suffer from overfitting. This usually occurs when the model is too complex or when the amount of training data is insufficient. To reduce overfitting, researchers have proposed various techniques such as regularization, data augmentation, and cross-validation. Regularization methods introduce penalty terms into the loss function to limit the magnitude of model parameters, thereby making the model simpler and more stable. Data augmentation increases the diversity of training data by applying operations such as rotation, cropping, or adding noise to the original data. In addition, cross-validation evaluates the generalization ability of a model by repeatedly splitting the dataset into training and validation sets. These techniques can effectively improve the performance of machine learning models in real-world applications.
Translation (into Chinese):
过拟合是机器学习模型训练过程中常见的问题。当模型在训练数据上表现非常优秀、但在测试数据上表现不佳时,就被认为出现了过拟合。这通常发生在模型过于复杂或训练数据量不足时。为了减少过拟合,研究者们提出了多种技术,例如正则化、数据增强和交叉验证。正则化方法将惩罚项引入损失函数中,以限制模型参数的大小,从而使模型更简单、更稳定。数据增强通过对原始数据进行旋转、裁剪或添加噪声等操作,增加训练数据的多样性。此外,交叉验证通过反复将数据集划分为训练集和验证集来评估模型的泛化能力。这些技术可以有效提升机器学习模型在实际应用中的表现。
Computer English vocabulary: daily check-in on Shanbay
