exarl.agents.agent_vault.ddpg
Module Contents
Classes
DDPG – Deep deterministic policy gradient agent.
Functions
update_target(target_weights, weights, tau)
Attributes
logger
- exarl.agents.agent_vault.ddpg.logger
- exarl.agents.agent_vault.ddpg.update_target(target_weights, weights, tau)
- class exarl.agents.agent_vault.ddpg.DDPG(env, is_learner)
Bases: exarl.ExaAgent
Deep deterministic policy gradient agent. Inherits from the ExaAgent base class.
DDPG constructor
- Parameters
env (OpenAI Gym environment object) – The RL environment the agent interacts with
is_learner (bool) – Indicates whether the agent is a learner or an actor
- is_learner :bool
- remember(self, state, action, reward, next_state, done)
Add experience to replay buffer
- Parameters
state (list or array) – Current state of the system
action (list or array) – Action taken
reward (list or array) – Environment reward
next_state (list or array) – Next state of the system
done (bool) – Indicates episode completion
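The role of remember and has_data can be illustrated with a minimal sketch. It assumes a bounded FIFO buffer built on the standard library; the ReplayBuffer class and its capacity argument are hypothetical and are not taken from the exarl source.

```python
from collections import deque


class ReplayBuffer:
    """Hypothetical bounded replay buffer; not the exarl implementation."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when full.
        self.memory = deque(maxlen=capacity)

    def remember(self, state, action, reward, next_state, done):
        # Store one transition as a tuple.
        self.memory.append((state, action, reward, next_state, done))

    def has_data(self):
        # True as soon as at least one transition is stored.
        return len(self.memory) > 0


buffer = ReplayBuffer(capacity=2)
buffer.remember([0.0, 0.1], [0.5], 1.0, [0.1, 0.2], False)
buffer.remember([0.1, 0.2], [0.4], 0.0, [0.2, 0.3], False)
buffer.remember([0.2, 0.3], [0.3], -1.0, [0.3, 0.4], True)  # evicts the first entry
```

With a capacity of 2, the third call evicts the oldest transition, so the buffer holds only the two most recent experiences.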
- update_grad(self, state_batch, action_batch, reward_batch, next_state_batch)
Update gradients - training step
- Parameters
state_batch (list) – list of states
action_batch (list) – list of actions
reward_batch (list) – list of rewards
next_state_batch (list) – list of next states
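The page does not describe what update_grad computes. As an illustration only, the sketch below shows the critic target and loss from the standard DDPG formulation, y = r + gamma * Q'(s', mu'(s')), in plain Python. The function names, the gamma parameter, and the precomputed next_q_batch input are assumptions. Because the signature above carries no done batch, the sketch does not mask terminal states.

```python
def td_targets(reward_batch, next_q_batch, gamma=0.99):
    """Critic targets y = r + gamma * Q'(s', mu'(s')).

    next_q_batch is assumed to hold the target critic's value for each
    next state under the target actor's action.
    """
    return [r + gamma * q for r, q in zip(reward_batch, next_q_batch)]


def critic_loss(targets, q_batch):
    """Mean squared error between the targets and the critic's estimates."""
    return sum((y - q) ** 2 for y, q in zip(targets, q_batch)) / len(targets)


targets = td_targets([1.0, 0.0], [2.0, 4.0], gamma=0.5)
loss = critic_loss(targets, [1.0, 2.0])
```

In the standard formulation the actor is then updated to maximize the critic's value of its own actions; whether exarl follows this exactly is not stated here.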
- get_actor(self)
Define actor network
- Returns
model – actor model
- get_critic(self)
Define critic network
- Returns
model – critic network
- has_data(self)
Indicates if the buffer has data
- Returns
bool – True if buffer has data
- generate_data(self)
Generate data for training
- Yields
state_batch (list) – list of states
action_batch (list) – list of actions
reward_batch (list) – list of rewards
next_state_batch (list) – list of next states
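A generator with this yield shape might look like the following sketch. It assumes transitions are stored as (state, action, reward, next_state, done) tuples and sampled uniformly without replacement; the memory, batch_size, and rng arguments are hypothetical, and the real method takes only self.

```python
import random


def generate_data(memory, batch_size, rng):
    """Yield (state_batch, action_batch, reward_batch, next_state_batch) forever."""
    while True:
        # Uniform sampling without replacement within one batch.
        batch = rng.sample(memory, batch_size)
        states, actions, rewards, next_states, _dones = zip(*batch)
        yield list(states), list(actions), list(rewards), list(next_states)


# Toy memory: state [i], reward i, next state [i + 1].
memory = [([i], [0.0], float(i), [i + 1], False) for i in range(6)]
rng = random.Random(0)
state_batch, action_batch, reward_batch, next_state_batch = next(
    generate_data(memory, batch_size=4, rng=rng)
)
```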
- train(self, batch)
Train the NN
- Parameters
batch (list) – sampled batch of experiences
- target_train(self)
Update target model
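The signature update_target(target_weights, weights, tau) suggests a soft (Polyak) update of the target network, which is the usual choice in DDPG. The sketch below assumes that interpretation and that weights are lists of NumPy arrays; the soft_update name is hypothetical, and the exarl function may differ (for example, by updating in place).

```python
import numpy as np


def soft_update(target_weights, weights, tau):
    """Return tau * weights + (1 - tau) * target_weights, layer by layer."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target_weights, weights)]


target = [np.zeros((2, 2)), np.zeros(2)]
online = [np.ones((2, 2)), np.ones(2)]
# With tau = 0.1 the target moves 10% of the way toward the online weights.
new_target = soft_update(target, online, tau=0.1)
```

A small tau makes the target network track the online network slowly, which tends to stabilize the critic's bootstrapped targets.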
- action(self, state)
Returns sampled action with added noise
- Parameters
state (list or array) – Current state of the system
- Returns
action (list or array) – Action to take
policy (int) – random (0) or inference (1)
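One way to produce the (action, policy) pair described above is sketched below. It assumes an epsilon-based choice between a uniformly random action (policy 0) and the actor's output plus Gaussian noise (policy 1), clipped to the action bounds. The function name, all of its parameters, and the noise model are assumptions; the source states only that noise is added and that the policy flag is 0 or 1.

```python
import numpy as np


def noisy_action(actor_fn, state, epsilon, low, high, noise_std, rng):
    """Return (action, policy): policy 0 is random, policy 1 is inference."""
    if rng.random() < epsilon:
        # Exploration: uniform sample inside the action bounds.
        return rng.uniform(low, high, size=np.shape(low)), 0
    # Inference: actor output perturbed by Gaussian noise, then clipped.
    action = np.asarray(actor_fn(state), dtype=np.float64)
    action = action + rng.normal(0.0, noise_std, size=action.shape)
    return np.clip(action, low, high), 1


rng = np.random.default_rng(0)
low, high = np.array([-1.0]), np.array([1.0])
actor_fn = lambda state: [0.9]  # stand-in for a trained actor network

greedy_action, greedy_policy = noisy_action(actor_fn, [0.0, 0.0], 0.0, low, high, 0.5, rng)
random_action, random_policy = noisy_action(actor_fn, [0.0, 0.0], 1.0, low, high, 0.5, rng)
```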
- get_weights(self)
Get weights from target model
- Returns
weights (list) – target model weights
- set_weights(self, weights)
Set model weights
- Parameters
weights (list) – model weights
- update(self)
- load(self, filename)
Load model weights from pickle file
- Parameters
filename (string) – full path of model file
- save(self, filename)
Save model weights to pickle file
- Parameters
filename (string) – full path of model file
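The save and load entries describe a pickle round trip. A minimal sketch of that pattern, assuming the weights are a plain list of nested lists (the real agent presumably stores framework weight arrays), is shown below. The helper names and the file name are hypothetical.

```python
import os
import pickle
import tempfile


def save_weights(weights, filename):
    """Serialize a list of weights to a pickle file."""
    with open(filename, "wb") as f:
        pickle.dump(weights, f)


def load_weights(filename):
    """Read a list of weights back from a pickle file."""
    with open(filename, "rb") as f:
        return pickle.load(f)


weights = [[[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ddpg_weights.pkl")
    save_weights(weights, path)
    restored = load_weights(path)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.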
- monitor(self)
- set_agent(self)
- epsilon_adj(self)
Update epsilon value
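The page does not say how epsilon is updated. A common scheme, shown here purely as an assumption, is multiplicative decay with a lower bound; the decay_epsilon name and the epsilon_decay and epsilon_min parameters are hypothetical.

```python
def decay_epsilon(epsilon, epsilon_decay, epsilon_min):
    """Shrink epsilon by a constant factor, never going below epsilon_min."""
    return max(epsilon_min, epsilon * epsilon_decay)


epsilon = 1.0
history = []
for _ in range(4):
    epsilon = decay_epsilon(epsilon, epsilon_decay=0.5, epsilon_min=0.2)
    history.append(epsilon)
```

Under this scheme the agent explores heavily at first and relies increasingly on the learned policy as epsilon approaches its floor.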