memorax.networks.heads

Output heads for different RL objectives.

Policy Heads

Categorical - Categorical policy for discrete actions.

Gaussian - Gaussian policy for continuous actions.

SquashedGaussian - Squashed Gaussian policy (tanh-bounded, used in SAC).
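To illustrate the tanh bounding mentioned for SquashedGaussian, the sketch below computes the log-density of a squashed action in plain Python. It is not memorax code: the function name `squashed_gaussian_log_prob` and its arguments are hypothetical, and it assumes a one-dimensional action drawn from a Gaussian parameterised by a mean and a log standard deviation.

```python
import math


def squashed_gaussian_log_prob(u, mean, log_std):
    """Log-density of a = tanh(u), where u ~ Normal(mean, exp(log_std)).

    Hypothetical helper for illustration; not part of memorax.
    """
    std = math.exp(log_std)
    # Log-density of the unbounded Gaussian sample u.
    gaussian = (
        -0.5 * ((u - mean) / std) ** 2
        - log_std
        - 0.5 * math.log(2.0 * math.pi)
    )
    # Change of variables: subtract log|d tanh(u)/du| = log(1 - tanh(u)^2),
    # computed with the stable identity 2 * (log 2 - u - softplus(-2u)).
    x = -2.0 * u
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    log_det = 2.0 * (math.log(2.0) - u - softplus)
    return gaussian - log_det
```

For a multi-dimensional action with independent components, the per-dimension values would be summed.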

Value Heads

VNetwork - State value function head.

HLGaussVNetwork - HL-Gauss value head with two-hot cross-entropy loss.
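The two-hot cross-entropy loss named for HLGaussVNetwork can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the library's implementation: the helpers `two_hot` and `cross_entropy`, the evenly spaced bin centres, and the `v_min`/`v_max`/`num_bins` parameters are all hypothetical.

```python
import math


def two_hot(target, v_min, v_max, num_bins):
    """Spread a scalar over its two nearest bin centres (hypothetical helper).

    The weights are chosen so that the expected bin centre equals the
    clipped target. Requires num_bins >= 2.
    """
    target = min(max(target, v_min), v_max)
    step = (v_max - v_min) / (num_bins - 1)
    pos = (target - v_min) / step
    # Clamp so that lower + 1 is always a valid index, even at target == v_max.
    lower = min(int(math.floor(pos)), num_bins - 2)
    upper_weight = pos - lower
    probs = [0.0] * num_bins
    probs[lower] = 1.0 - upper_weight
    probs[lower + 1] = upper_weight
    return probs


def cross_entropy(target_probs, logits):
    """Cross-entropy between a target distribution and softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(p * (x - log_z) for p, x in zip(target_probs, logits))
```

A scalar value estimate can be read back from such a head as the probability-weighted sum of the bin centres.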

Q-Network Heads

DiscreteQNetwork - Q-network for discrete actions.

ContinuousQNetwork - Q-network for continuous actions.

TwinContinuousQNetwork - Twin Q-networks for SAC.

C51QNetwork - Categorical DQN with distributional value estimation.
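As a rough picture of distributional value estimation in the C51 style, the sketch below turns per-action logits over a fixed support of atoms into expected Q-values and a greedy action. The names `expected_q_values` and `greedy_action`, the nested-list layout of `atom_logits`, and the evenly spaced support between `v_min` and `v_max` are assumptions made for illustration; they do not describe the C51QNetwork interface.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_q_values(atom_logits, v_min, v_max):
    """Expected return per action from categorical return distributions.

    atom_logits[a][i] is the logit of atom i for action a (hypothetical layout).
    """
    num_atoms = len(atom_logits[0])
    step = (v_max - v_min) / (num_atoms - 1)
    atoms = [v_min + i * step for i in range(num_atoms)]
    return [
        sum(p * z for p, z in zip(softmax(row), atoms))
        for row in atom_logits
    ]


def greedy_action(atom_logits, v_min, v_max):
    """Index of the action with the largest expected Q-value."""
    q = expected_q_values(atom_logits, v_min, v_max)
    return max(range(len(q)), key=q.__getitem__)
```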

General Value Functions

GVF - General Value Function with custom cumulant and discount.

Horde - Collection of GVF demons alongside a main head.
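The idea of a custom cumulant and discount can be made concrete with a one-step bootstrap target computed for several demons at once. The sketch below is illustrative only: the `Demon` dataclass, the `td_targets` function, and the dictionary-based transition are hypothetical and are not taken from the GVF or Horde classes.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Demon:
    """One GVF question: what to accumulate and how to discount it (hypothetical)."""

    cumulant: Callable[[dict], float]
    discount: Callable[[dict], float]


def td_targets(demons: Sequence[Demon], transition: dict, next_values: Sequence[float]):
    """One-step target per demon: cumulant + discount * next-state prediction."""
    return [
        d.cumulant(transition) + d.discount(transition) * v_next
        for d, v_next in zip(demons, next_values)
    ]


# Demon 0 recovers the usual value function (reward as cumulant, termination-aware
# discount); demon 1 predicts the first observation feature over a short horizon.
demons = [
    Demon(cumulant=lambda t: t["reward"], discount=lambda t: 0.0 if t["done"] else 0.99),
    Demon(cumulant=lambda t: t["obs"][0], discount=lambda t: 0.5),
]
transition = {"obs": [2.0], "reward": 1.0, "done": False}
targets = td_targets(demons, transition, next_values=[10.0, 4.0])
```

Each demon's prediction would then be regressed toward its own target, in the same way a value head is regressed toward a reward-based target.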

Temperature

Alpha - Learnable temperature parameter for SAC.

Beta - Learnable temperature parameter.
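For the SAC temperature (Alpha above), a commonly used objective adjusts the temperature so that the policy's entropy tracks a target value. The sketch below shows that objective and one hand-written gradient step in plain Python. It assumes the temperature is stored as `log_alpha`; the function names, the `target_entropy` argument, and the learning rate are hypothetical, and nothing here describes how Alpha or Beta is implemented in memorax.

```python
import math


def alpha_loss(log_alpha, log_probs, target_entropy):
    """Temperature loss: -alpha * mean(log_prob + target_entropy)."""
    alpha = math.exp(log_alpha)
    mean_gap = sum(lp + target_entropy for lp in log_probs) / len(log_probs)
    return -alpha * mean_gap


def alpha_step(log_alpha, log_probs, target_entropy, lr=0.1):
    """One gradient-descent step on log_alpha (hypothetical learning rate).

    Since d/dx [-exp(x) * c] = -exp(x) * c, the gradient with respect to
    log_alpha has the same value as the loss itself.
    """
    grad = alpha_loss(log_alpha, log_probs, target_entropy)
    return log_alpha - lr * grad
```

With this sign convention the temperature rises when the sampled entropy (the negative mean log-probability) is below the target and falls when it is above.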

Other

PredecessorHead - Predecessor representation head.