Predictive Information Accelerates Learning in RL

[Submitted on 24 Jul 2020 (v1), last revised 26 Oct 2020 (this version, v2)]

Abstract: The Predictive Information is the mutual information between the past and the
future, I(X_past; X_future). We hypothesize that capturing the predictive
information is useful in RL, since the ability to model what will happen next
is necessary for success on many tasks. To test our hypothesis, we train Soft
Actor-Critic (SAC) agents from pixels with an auxiliary task that learns a
compressed representation of the predictive information of the RL environment
dynamics using a contrastive version of the Conditional Entropy Bottleneck
(CEB) objective. We refer to these as Predictive Information SAC (PI-SAC)
agents. We show that PI-SAC agents can substantially improve sample efficiency
over challenging baselines on tasks from the DM Control suite of continuous
control environments. We evaluate PI-SAC agents by comparing against
uncompressed PI-SAC agents, other compressed and uncompressed agents, and SAC
agents directly trained from pixels. Our implementation is available on GitHub.
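
The core auxiliary objective the abstract describes is a contrastive form of the Conditional Entropy Bottleneck, which minimizes beta * I(X;Z|Y) - I(Y;Z), where X is the past (e.g., stacked frames and actions), Y is the future, and Z is the learned representation. As a rough illustration only (not the paper's actual implementation), here is a minimal NumPy sketch assuming unit-variance diagonal-Gaussian forward and backward encoders; the function name contrastive_ceb_loss and the arguments fwd_mu and bwd_mu are hypothetical:

```python
import numpy as np

def contrastive_ceb_loss(z, fwd_mu, bwd_mu, beta=0.1):
    """Sketch of a contrastive CEB loss over a batch of K examples.

    z      : (K, D) samples z_i ~ e(z|x_i) from the forward encoder
    fwd_mu : (K, D) means of e(z|x_i), conditioned on the past X
    bwd_mu : (K, D) means of b(z|y_i), conditioned on the future Y
    beta   : compression strength; beta = 0 gives the uncompressed variant

    Both encoders are assumed to be unit-variance diagonal Gaussians,
    so log-densities reduce to squared distances up to a shared constant.
    """
    # log e(z_i | x_i), up to a constant shared by all terms
    log_e = -0.5 * np.sum((z - fwd_mu) ** 2, axis=-1)        # (K,)

    # pairwise scores log b(z_i | y_j) for the contrastive bound on I(Y;Z)
    diffs = z[:, None, :] - bwd_mu[None, :, :]               # (K, K, D)
    log_b = -0.5 * np.sum(diffs ** 2, axis=-1)               # (K, K)
    log_b_pos = np.diag(log_b)                               # log b(z_i | y_i)

    # residual-information term: variational upper bound on I(X;Z|Y)
    residual = log_e - log_b_pos

    # InfoNCE-style lower bound on I(Y;Z), using the other batch
    # elements as negatives (numerically stable log-mean-exp)
    m = log_b.max(axis=1, keepdims=True)
    log_mean_b = np.log(np.mean(np.exp(log_b - m), axis=1)) + m[:, 0]
    info_nce = log_b_pos - log_mean_b

    # minimize beta * I(X;Z|Y) - I(Y;Z)
    return np.mean(beta * residual - info_nce)

# Toy usage: a future that is roughly predictable from the past
rng = np.random.default_rng(0)
K, D = 256, 64
fwd_mu = rng.normal(size=(K, D))
bwd_mu = fwd_mu + 0.1 * rng.normal(size=(K, D))
z = fwd_mu + rng.normal(size=(K, D))   # z ~ e(z|x), unit variance
print(contrastive_ceb_loss(z, fwd_mu, bwd_mu))
```

In this sketch, used as an auxiliary loss alongside SAC, the gradient would flow into the encoder that produces fwd_mu, pushing the representation to keep information about the future while discarding past-only detail as beta grows.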

Submission history

From: Kuang-Huei Lee

[v1] Fri, 24 Jul 2020 08:14:41 UTC (2,916 KB)
[v2] Mon, 26 Oct 2020 00:27:00 UTC (7,562 KB)
