`exarl.agents.agent_vault.ddpg`

Module Contents

Classes

DDPG

Deep deterministic policy gradient agent.

Functions

update_target(target_weights, weights, tau)

Attributes

logger

exarl.agents.agent_vault.ddpg.logger

exarl.agents.agent_vault.ddpg.update_target(target_weights, weights, tau)
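
The page gives no description for update_target. The sketch below assumes it performs the soft (Polyak) target update commonly used in DDPG, where each target weight moves a fraction tau toward the corresponding online weight. The name soft_update and the use of plain Python lists are illustrative assumptions, not the module's implementation.

```python
def soft_update(target_weights, weights, tau):
    """Return target weights blended toward `weights` by factor `tau`.

    Hypothetical stand-in for update_target; assumes the common DDPG rule
    target <- tau * online + (1 - tau) * target, applied element-wise.
    """
    return [tau * w + (1.0 - tau) * t for t, w in zip(target_weights, weights)]


target = [0.0, 1.0]
online = [1.0, 3.0]
new_target = soft_update(target, online, tau=0.5)
```

A small tau keeps the target network changing slowly, which is the usual motivation for this kind of update.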

class exarl.agents.agent_vault.ddpg.DDPG(env, is_learner)

Bases: exarl.ExaAgent

Deep deterministic policy gradient agent. Inherits from ExaAgent base class.

DDPG constructor

Parameters

env (OpenAI Gym environment object) – The RL environment the agent interacts with
is_learner (bool) – Indicates whether the agent is a learner or an actor

is_learner :bool

remember(self, state, action, reward, next_state, done)

Add experience to replay buffer

Parameters

state (list or array) – Current state of the system
action (list or array) – Action taken
reward (list or array) – Environment reward
next_state (list or array) – Next state of the system
done (bool) – Indicates episode completion
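
As a rough illustration of what storing an experience can look like, here is a minimal sketch of a fixed-capacity replay buffer. The ReplayBuffer class, its capacity argument, and the deque-based storage are assumptions for illustration; the actual buffer used by DDPG is not described on this page.

```python
from collections import deque


class ReplayBuffer:
    """Fixed-capacity FIFO store of (state, action, reward, next_state, done)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest entry once it is full
        self._storage = deque(maxlen=capacity)

    def remember(self, state, action, reward, next_state, done):
        self._storage.append((state, action, reward, next_state, done))

    def has_data(self):
        return len(self._storage) > 0

    def __len__(self):
        return len(self._storage)


buffer = ReplayBuffer(capacity=2)
buffer.remember([0.0], [0.1], 1.0, [0.5], False)
buffer.remember([0.5], [0.2], 0.0, [1.0], False)
buffer.remember([1.0], [0.3], 2.0, [1.5], True)  # evicts the first transition
```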

update_grad(self, state_batch, action_batch, reward_batch, next_state_batch)

Update gradients - training step

Parameters

state_batch (list) – list of states
action_batch (list) – list of actions
reward_batch (list) – list of rewards
next_state_batch (list) – list of next states
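
A DDPG training step typically regresses the critic toward a bootstrapped target and then moves the actor along the critic's gradient. The sketch below covers only the target computation, assuming the usual rule y = r + gamma * Q_target(s', mu_target(s')). The function name td_targets, the discount gamma, and the precomputed next_q_values list are assumptions. Since the documented signature carries no done flag, no terminal masking is shown.

```python
def td_targets(reward_batch, next_q_values, gamma=0.99):
    """Compute critic targets y_i = r_i + gamma * Q'(s'_i, mu'(s'_i)).

    next_q_values are assumed to be target-critic outputs evaluated at the
    target actor's action for each next state (computed elsewhere).
    """
    if len(reward_batch) != len(next_q_values):
        raise ValueError("batch sizes must match")
    return [r + gamma * q for r, q in zip(reward_batch, next_q_values)]


targets = td_targets([1.0, 0.0], [2.0, 4.0], gamma=0.5)
```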

get_actor(self)

Define actor network

Returns: model – actor model

get_critic(self)

Define critic network

Returns: model – critic network

has_data(self)

Indicates if the buffer has data

Returns: bool – True if buffer has data

generate_data(self)

Generate data for training

Yields

state_batch (list) – list of states
action_batch (list) – list of actions
reward_batch (list) – list of rewards
next_state_batch (list) – list of next states
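
One way such a generator could be written is sketched below, assuming transitions are stored as 5-tuples and sampled uniformly at random. The function generate_batches, its batch_size and rng parameters, and the endless loop are hypothetical. The done flag is dropped because the documented yield lists only four fields.

```python
import random


def generate_batches(transitions, batch_size, rng=random):
    """Yield (states, actions, rewards, next_states) batches indefinitely.

    transitions is a list of (state, action, reward, next_state, done) tuples.
    """
    if not transitions:
        return  # nothing to sample; callers would check has_data() first
    while True:
        sample = rng.sample(transitions, min(batch_size, len(transitions)))
        states, actions, rewards, next_states, _done = zip(*sample)
        yield list(states), list(actions), list(rewards), list(next_states)


data = [([i], [i * 0.1], float(i), [i + 1], False) for i in range(5)]
batches = generate_batches(data, batch_size=3, rng=random.Random(0))
state_batch, action_batch, reward_batch, next_state_batch = next(batches)
```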

train(self, batch)

Train the neural network

Parameters: batch (list) – sampled batch of experiences

target_train(self): Update target model

action(self, state)

Returns sampled action with added noise

Parameters: state (list or array) – Current state of the system

Returns

action (list or array) – Action to take
policy (int) – random (0) or inference (1)
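
The (action, policy) return pair can be illustrated with a minimal sketch. It assumes a random action is chosen with probability epsilon (policy 0) and otherwise the actor's output is perturbed with noise (policy 1). The function choose_action and all of its parameters are hypothetical. Gaussian noise is used here for brevity, whereas DDPG implementations often use Ornstein-Uhlenbeck noise, and the page does not say which applies.

```python
import random


def choose_action(state, actor, action_dim, epsilon, low, high,
                  noise_std=0.1, rng=random):
    """Return (action, policy): policy is 0 for random, 1 for inference.

    actor is any callable mapping a state to a list of floats; epsilon,
    noise_std and the bounds are illustrative, not ExaRL settings.
    """
    if rng.random() < epsilon:
        # Exploration: draw each component uniformly within the bounds
        return [rng.uniform(low, high) for _ in range(action_dim)], 0
    # Inference: actor output plus Gaussian noise, clipped to the bounds
    noisy = [a + rng.gauss(0.0, noise_std) for a in actor(state)]
    return [min(high, max(low, a)) for a in noisy], 1


def toy_actor(state):
    # Stand-in for a trained actor network
    return [0.5 * x for x in state]


action, policy = choose_action([1.0, -1.0], toy_actor, action_dim=2,
                               epsilon=0.0, low=-1.0, high=1.0,
                               rng=random.Random(0))
```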

get_weights(self)

Get weights from target model

Returns: weights (list) – target model weights

set_weights(self, weights)

Set model weights

Parameters: weights (list) – model weights

update(self)

load(self, filename)

Load model weights from pickle file

Parameters: filename (string) – full path of model file

save(self, filename)

Save model weights to pickle file

Parameters: filename (string) – full path of model file
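
A pickle round trip for a list of weights might look like the sketch below. The helper names save_weights and load_weights, the file name, and the nested-list weights are assumptions for illustration, since the page does not show how the agent serializes its models.

```python
import os
import pickle
import tempfile


def save_weights(weights, filename):
    """Serialize a list of weights to a pickle file."""
    with open(filename, "wb") as f:
        pickle.dump(weights, f)


def load_weights(filename):
    """Read weights back; only unpickle files from a trusted source."""
    with open(filename, "rb") as f:
        return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ddpg_weights.pkl")  # hypothetical file name
    original = [[0.1, 0.2], [0.3]]
    save_weights(original, path)
    restored = load_weights(path)
```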

monitor(self)

set_agent(self)

epsilon_adj(self): Update epsilon value
