memorax.algorithms
Reinforcement learning algorithms for training agents.
PPO
PPO - Proximal Policy Optimization for discrete and continuous action spaces.
PPOConfig - Configuration dataclass for PPO.
PPOState - Training state for PPO.
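
For orientation, the clipped surrogate objective that gives PPO its name can be sketched in a few lines. This is a minimal illustration in plain Python, not the memorax implementation; the function name `ppo_clipped_loss` and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Negative mean clipped surrogate over a batch (hypothetical helper)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    # Negate so that minimising the loss maximises the surrogate.
    return -total / len(advantages)
```

Because the objective depends only on log-probabilities, the same expression applies to discrete and continuous action spaces; only the policy distribution that produces the log-probabilities differs.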
DQN
DQN - Deep Q-Network with double and dueling variants.
DQNConfig - Configuration dataclass for DQN.
DQNState - Training state for DQN.
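
The double and dueling variants mentioned above change two different parts of DQN: how the bootstrap target is computed and how Q-values are assembled from network outputs. The sketch below shows both in plain Python for a single transition. The function names and the `gamma` default are assumptions made for illustration and do not reflect the memorax API.

```python
def double_dqn_target(reward, done, next_q_online, next_q_target, gamma=0.99):
    """Double DQN: the online network picks the action, the target network scores it."""
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # No bootstrapping past a terminal state.
    bootstrap = 0.0 if done else next_q_target[best_action]
    return reward + gamma * bootstrap


def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]
```

Decoupling action selection from action evaluation is what reduces the overestimation bias of the plain max-based target.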
SAC
SAC - Soft Actor-Critic for continuous control.
SACConfig - Configuration dataclass for SAC.
SACState - Training state for SAC.
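
As a reminder of what distinguishes SAC from other actor-critic methods, its critic target adds an entropy bonus to the usual bootstrapped value. The following is a minimal single-transition sketch in plain Python under the standard twin-critic formulation. The name `sac_critic_target` and the `alpha` and `gamma` defaults are hypothetical and not taken from memorax.

```python
def sac_critic_target(reward, done, next_q1, next_q2, next_log_prob,
                      alpha=0.2, gamma=0.99):
    """Soft Bellman target: r + gamma * (min(Q1, Q2) - alpha * log pi(a'|s'))."""
    # Taking the smaller of two critics limits overestimation; subtracting
    # alpha * log-prob rewards the policy for staying stochastic.
    soft_value = min(next_q1, next_q2) - alpha * next_log_prob
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * soft_value
```

Here `next_log_prob` is the log-probability of an action sampled from the current policy at the next state, so the target changes as the policy changes.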
PQN
PQN - Parallelised Q-Network (on-policy Q-learning).
PQNConfig - Configuration dataclass for PQN.
PQNState - Training state for PQN.
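
On-policy Q-learning methods of this kind typically train on freshly collected rollouts rather than a replay buffer, often with multi-step targets. One common choice is a lambda-return computed backwards over the rollout. The sketch below assumes that choice; whether memorax's PQN uses exactly this target is not stated here, and the function name and default hyperparameters are illustrative.

```python
def q_lambda_returns(rewards, dones, next_max_q, gamma=0.99, lam=0.65):
    """Backward recursion for lambda-returns over one rollout (hypothetical helper).

    next_max_q[t] is max over actions of Q(s_{t+1}, a), for each step t.
    """
    returns = [0.0] * len(rewards)
    # Beyond the last step there is no further return, so bootstrap from Q alone.
    next_return = next_max_q[-1]
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # Blend the one-step bootstrap with the longer lambda-return.
        blended = (1.0 - lam) * next_max_q[t] + lam * next_return
        returns[t] = rewards[t] + gamma * not_done * blended
        next_return = returns[t]
    return returns
```

Setting `lam=0` recovers ordinary one-step Q-learning targets, while larger values propagate rewards further back along the rollout.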