【VLN】VLN Paradigm Alg: Reinforcement Learning and Its Details (4)

• Reinforcement learning (RL) studies how an agent learns a policy through interaction with an environment, with the goal of maximizing cumulative reward.
• Supervised learning learns a model from labeled data, with the goal of minimizing a loss function.
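To make the contrast between the two objectives concrete, here is a minimal sketch in Python. It is not taken from the original post: the function names `run_bandit` and `fit_slope`, the two-armed bandit with fixed rewards, and the toy dataset are all hypothetical, chosen only for illustration. The first function learns from reward feedback alone (epsilon-greedy action-value estimation, one simple RL technique among many). The second learns from labeled pairs by gradient descent on a mean squared error loss.

```python
import random


def run_bandit(rewards, steps=500, epsilon=0.1, seed=0):
    """RL side (illustrative): learn action values from interaction only."""
    rng = random.Random(seed)
    q = [0.0] * len(rewards)     # estimated value of each action
    counts = [0] * len(rewards)  # how many times each action was tried
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: take the action with the highest current estimate.
            action = max(range(len(rewards)), key=lambda a: q[a])
        # The environment returns only a reward; nothing tells the agent
        # which action would have been the "correct" one.
        reward = rewards[action]
        counts[action] += 1
        q[action] += (reward - q[action]) / counts[action]  # running mean
        total_reward += reward
    return q, total_reward


def fit_slope(data, steps=200, lr=0.05):
    """Supervised side (illustrative): fit y = w * x by minimizing MSE."""
    w = 0.0
    for _ in range(steps):
        # Gradient of mean((w * x - y) ** 2) with respect to w.
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w


if __name__ == "__main__":
    q_values, total = run_bandit([0.2, 1.0])
    print("estimated action values:", q_values, "cumulative reward:", total)
    print("fitted slope:", fit_slope([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]))
```

In the bandit, the agent has to discover which action is better by trying both and observing rewards, and what it accumulates over time is reward. In the regression, every input comes with its target label, and what the learner drives down is the loss.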