
🎲 Anti-Exploration by Random Network Distillation, Tinkoff Research, ICML 2023

We propose a new ensemble-free offline RL algorithm called SAC-RND. We evaluate our method on the D4RL (Fu et al., 2020) benchmark, and show that SAC-RND achieves performance comparable to ensemble-based methods while outperforming ensemble-free approaches.

Researchers at Tinkoff Research have introduced a new offline RL algorithm that achieves SOTA results comparable to ensemble-based methods (in some cases even better) while requiring up to 20 times less training time.

🖥 Github: https://github.com/tinkoff-ai/sac-rnd

🤓 Paper: https://proceedings.mlr.press/v202/nikulin23a.html

ai_machinelearning_big_data


August 8, 2023