[Day 57] - Tech Stack

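Given two strings consisting only of uppercase or lowercase letters (each between 1 and 10 characters long), their relationship is one of the following four cases:

1: The two strings have different lengths, e.g. Beijing and Hebei.

2: The two strings have the same length and the characters at every position are identical (case-sensitive), e.g. Beijing and Beijing.

3: The two strings have the same length and match at every position only when case is ignored (that is, case 2 does not hold), e.g. beijing and BEIjing.

4: The two strings have the same length but do not match even when case is ignored, e.g. Beijing and Nanjing.

Write a program that determines which of these four cases the two input strings fall into and outputs the number of that case.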


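#include <iostream>
#include <string>
using namespace std;

int main() {
	string a, b;
	cin >> a >> b;
	int lena = a.size();
	int lenb = b.size();
	// Case 1: the lengths differ.
	if (lena != lenb) {
		cout << 1 << endl;
		return 0;
	}
	int cur = 2;          // assume an exact match (case 2) until a mismatch is found
	int gap = 'a' - 'A';  // distance between a lowercase letter and its uppercase form
	for (int i = 0; i < lena; i++) {
		if (a[i] == b[i]) continue;
		// The input contains only letters, so a difference of exactly `gap`
		// means the same letter in different case (case 3).
		if (a[i] == b[i] + gap || a[i] + gap == b[i]) {
			cur = 3;
		}
		else {
			// Different letters: case 4, no need to look further.
			cur = 4;
			break;
		}
	}
	cout << cur << endl;
	return 0;
}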
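
The same classification can also be sketched as a standalone function, which makes it easier to check against the four examples in the problem statement. This is only an illustrative variant: the name `classify_pair` is hypothetical, and it swaps the `gap` arithmetic for `std::tolower`, so it does not depend on the input being letters only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the original solution): returns the
// case number 1-4 for a pair of strings.
int classify_pair(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return 1;  // case 1: lengths differ
    int result = 2;                      // exact match until a mismatch is found
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == b[i]) continue;
        // std::tolower expects a value representable as unsigned char.
        int la = std::tolower(static_cast<unsigned char>(a[i]));
        int lb = std::tolower(static_cast<unsigned char>(b[i]));
        if (la != lb) return 4;          // case 4: differs even ignoring case
        result = 3;                      // case 3: differs only by case so far
    }
    return result;
}
```

For example, `classify_pair("beijing", "BEIjing")` returns 3 and `classify_pair("Beijing", "Nanjing")` returns 4.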

Reinforcement learning is a machine learning approach that learns optimal strategies through interaction with the environment. In the reinforcement learning framework, an agent observes the state of the environment and takes corresponding actions in order to receive rewards or penalties. The goal of the agent is to find a policy that maximizes long-term cumulative rewards through continuous exploration and learning. Unlike supervised learning, reinforcement learning usually does not rely on large amounts of labeled data but improves decision-making ability through trial and error. Reinforcement learning has achieved success in many complex tasks such as robotic control, autonomous driving, and game artificial intelligence. In the famous Go program AlphaGo, reinforcement learning was combined with deep neural networks, enabling computers to reach or even surpass the level of top human players. However, in practical applications, reinforcement learning still faces challenges such as low sample efficiency and high training costs.
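
The loop described in the paragraph above (observe a state, take an action, receive a reward, improve the policy) can be sketched with tabular Q-learning. Everything here is an assumption chosen for illustration: the five-state chain environment, the function name `learn_chain_policy`, and the values of the learning rate, discount factor, and exploration rate. It is a toy, not how AlphaGo or any production system is trained.

```cpp
#include <algorithm>
#include <array>
#include <random>
#include <vector>

// Toy environment (an assumption for illustration): states 0..4 on a line.
// The agent starts at state 0 and receives reward 1 only on reaching state 4.
constexpr int kNumStates = 5;
constexpr int kGoal = kNumStates - 1;
enum Action { kLeft = 0, kRight = 1 };

// Runs tabular Q-learning and returns the greedy action for states 0..3.
std::vector<int> learn_chain_policy(int episodes, unsigned seed) {
    const double alpha = 0.5;    // learning rate
    const double gamma = 0.9;    // discount factor for long-term reward
    const int epsilon_pct = 10;  // percentage of steps that explore at random

    std::mt19937 rng(seed);
    // One row of action values per state; elements are value-initialized to 0.
    std::vector<std::array<double, 2>> q(kNumStates);

    for (int ep = 0; ep < episodes; ++ep) {
        int s = 0;
        for (int step = 0; step < 100 && s != kGoal; ++step) {
            // Trial and error: explore occasionally, and whenever the two
            // estimates tie; otherwise act greedily on current estimates.
            int a;
            bool explore = static_cast<int>(rng() % 100) < epsilon_pct;
            if (explore || q[s][kLeft] == q[s][kRight]) {
                a = static_cast<int>(rng() % 2);
            } else {
                a = q[s][kRight] > q[s][kLeft] ? kRight : kLeft;
            }
            // Environment transition: moving left from state 0 stays at 0.
            int next = (a == kRight) ? s + 1 : (s > 0 ? s - 1 : 0);
            double reward = (next == kGoal) ? 1.0 : 0.0;
            // The goal is terminal, so no future value is added there.
            double best_next =
                (next == kGoal) ? 0.0 : std::max(q[next][kLeft], q[next][kRight]);
            // Q-learning update: move the estimate toward
            // reward + discounted best future value.
            q[s][a] += alpha * (reward + gamma * best_next - q[s][a]);
            s = next;
        }
    }

    std::vector<int> policy;
    for (int s = 0; s < kGoal; ++s) {
        policy.push_back(q[s][kRight] > q[s][kLeft] ? kRight : kLeft);
    }
    return policy;
}
```

With a few hundred episodes, for instance `learn_chain_policy(500, 42)`, the returned policy should choose `kRight` in every state, since that is the shortest path to the only reward. No labeled data is involved; the agent learns only from the rewards it encounters.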
