memorax.networks.heads#
Output heads for different RL objectives.
Policy Heads#
Categorical - Categorical policy for discrete actions.
Gaussian - Gaussian policy for continuous actions.
SquashedGaussian - Tanh-squashed Gaussian policy for bounded continuous actions (used in SAC).
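As a rough illustration of what "tanh-bounded" means for the squashed Gaussian head, the sketch below samples an action and applies the change-of-variables correction to its log-probability. It is written in plain JAX and does not use the memorax API; the function name `squashed_gaussian_sample` and its arguments are hypothetical.

```python
import jax
import jax.numpy as jnp


def squashed_gaussian_sample(key, mean, log_std):
    """Sample a tanh-squashed Gaussian action and its log-probability.

    Hypothetical helper for illustration; not part of memorax.
    """
    std = jnp.exp(log_std)
    # Reparameterised sample from the unbounded diagonal Gaussian.
    u = mean + std * jax.random.normal(key, mean.shape)
    # Log-density of u, summed over the action dimensions.
    log_prob_u = jnp.sum(
        -0.5 * ((u - mean) / std) ** 2 - log_std - 0.5 * jnp.log(2.0 * jnp.pi),
        axis=-1,
    )
    # Squash into (-1, 1).
    action = jnp.tanh(u)
    # log|d tanh(u)/du| = log(1 - tanh(u)^2), in a numerically stable form.
    log_det = jnp.sum(
        2.0 * (jnp.log(2.0) - u - jax.nn.softplus(-2.0 * u)), axis=-1
    )
    return action, log_prob_u - log_det
```

Subtracting `log_det` is what distinguishes the squashed log-probability from that of a plain Gaussian head; without it, entropy estimates used by SAC would be biased.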
Value Heads#
VNetwork - State value function head.
HLGaussVNetwork - HL-Gauss value head with two-hot cross-entropy loss.
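To make the "two-hot cross-entropy" idea concrete, here is a minimal sketch of encoding a scalar value target as a distribution over fixed bins and scoring predicted logits against it. The helper names, the bin range, and the scalar-only interface are assumptions for illustration; how `HLGaussVNetwork` actually builds its targets (for example, whether it smooths them with a Gaussian) is not shown in this listing.

```python
import jax
import jax.numpy as jnp


def two_hot(target, v_min, v_max, num_bins):
    """Spread a scalar target over its two nearest bin centres (hypothetical helper)."""
    centers = jnp.linspace(v_min, v_max, num_bins)
    target = jnp.clip(target, v_min, v_max)
    # Fractional bin position in [0, num_bins - 1].
    pos = (target - v_min) / (v_max - v_min) * (num_bins - 1)
    lower = jnp.clip(jnp.floor(pos).astype(jnp.int32), 0, num_bins - 2)
    upper_weight = pos - lower
    probs = jnp.zeros(num_bins)
    probs = probs.at[lower].add(1.0 - upper_weight)
    probs = probs.at[lower + 1].add(upper_weight)
    return centers, probs


def cross_entropy(logits, target_probs):
    """Cross-entropy between predicted bin logits and a target distribution."""
    return -jnp.sum(target_probs * jax.nn.log_softmax(logits))
```

The encoding is chosen so that the expectation of the bin centres under `probs` recovers the (clipped) scalar target, which lets the head be trained as a classifier while still reporting a scalar value.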
Q-Network Heads#
DiscreteQNetwork - Q-network for discrete actions.
ContinuousQNetwork - Q-network for continuous actions.
TwinContinuousQNetwork - Twin Q-networks for SAC.
C51QNetwork - Categorical DQN (C51) Q-network with distributional value estimation.
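Twin Q-networks are typically used in SAC to form a pessimistic bootstrap target by taking the minimum of the two estimates. The sketch below shows that computation in plain JAX under the usual SAC formulation; `sac_target` and its argument names are hypothetical and do not reflect the `TwinContinuousQNetwork` interface.

```python
import jax.numpy as jnp


def sac_target(reward, done, gamma, q1_next, q2_next, next_log_prob, alpha):
    """Soft Bellman target using the minimum of two Q estimates (illustrative only)."""
    # Taking the minimum counteracts overestimation in either critic.
    min_q = jnp.minimum(q1_next, q2_next)
    # Entropy-regularised value of the next state.
    soft_value = min_q - alpha * next_log_prob
    # No bootstrapping past terminal transitions.
    return reward + gamma * (1.0 - done) * soft_value
```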
General Value Functions#
GVF - General Value Function with custom cumulant and discount.
Horde - Collection of GVF demons alongside a main head.
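A general value function replaces the reward and fixed discount of an ordinary value function with a custom cumulant and discount. The following sketch computes the one-step TD error for several GVFs at once, one array entry per demon, as a Horde might; `gvf_td_error` and the array layout are assumptions, not the memorax interface.

```python
import jax
import jax.numpy as jnp


def gvf_td_error(prediction, next_prediction, cumulant, discount):
    """One-step TD error for a batch of GVFs (hypothetical helper).

    All arguments are arrays with one entry per demon.
    """
    # Bootstrap from the next prediction without backpropagating through it.
    target = cumulant + discount * jax.lax.stop_gradient(next_prediction)
    return target - prediction
```

With `cumulant` set to the environment reward and `discount` set to a constant, this reduces to the familiar TD error of a state value function.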
Temperature#
Alpha - Learnable temperature parameter for SAC.
Beta - Learnable temperature parameter.
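For the SAC temperature, a common approach is to learn the log of the temperature and push the policy entropy toward a target value. The sketch below shows that loss in plain JAX; it assumes a log-parameterisation and a `target_entropy` hyperparameter, neither of which is confirmed by this listing, and it says nothing about how `Beta` is used.

```python
import jax
import jax.numpy as jnp


def alpha_loss(log_alpha, log_prob, target_entropy):
    """SAC-style temperature loss (illustrative; names are hypothetical)."""
    # Parameterising by log_alpha keeps the temperature positive.
    alpha = jnp.exp(log_alpha)
    # The policy's log-probabilities are treated as constants here.
    entropy_gap = jax.lax.stop_gradient(log_prob + target_entropy)
    return -jnp.mean(alpha * entropy_gap)
```

Under gradient descent, the temperature falls when the policy's entropy exceeds the target and rises when it drops below it.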
Other#
PredecessorHead - Predecessor representation head.