offline 2 online | Importance sampling: convert offline + online data into on-policy samples

Recent advances in deep offline reinforcement learning (RL) have made it possible to train strong robotic agents from offline datasets. However, depending on the quality of the trained agents and the application being considered, it is often desirable to fi