exarl.workflows.workflow_vault.async_learner

Module Contents

Classes

ASYNC

Asynchronous workflow class: inherits from the ExaWorkflow base class.

Attributes

logger

exarl.workflows.workflow_vault.async_learner.logger
class exarl.workflows.workflow_vault.async_learner.ASYNC

Bases: exarl.ExaWorkflow

Asynchronous workflow class: inherits from the ExaWorkflow base class. In this approach, the EXARL architecture is separated into a “learner” and “actors”. An actor is the part of the agent that holds only the target network. A simple round-robin scheduling scheme distributes work from the learner to the actors.

The learner consists of a target model that is trained using experiences collected by the actors. Each actor holds a model replica that receives updated weights from the learner. The actor uses this model to infer the next action given a state of the environment; the environment is then rendered/simulated with this action to update the state.

In contrast to other architectures, each actor in EXARL independently stores experiences and runs the Bellman equation to generate training data. Once enough data has been collected, the training data is sent back to the learner. Because the Bellman equations are run locally in each actor, in parallel, the load is distributed equally among all actor processes.

The learner distributes work by parallelizing across episodes, and actors request work in a round-robin fashion. Each actor runs all of the steps in an episode to completion before requesting more work from the learner. This process is repeated until the learner has gathered experiences from all episodes.
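The actor-side target generation and the per-episode hand-out described above can be illustrated with a small simulation that uses no MPI. This is only a sketch under stated assumptions: the one-step Q-learning form of the Bellman target, the discount factor GAMMA, and the names bellman_targets and dispatch_episodes are hypothetical and are not taken from the EXARL source.

```python
from collections import deque

GAMMA = 0.99  # hypothetical discount factor, not an EXARL default


def bellman_targets(experiences, q_next_max, gamma=GAMMA):
    """Compute one-step targets r + gamma * max_a' Q(s', a') on the actor.

    experiences: list of (reward, done) pairs stored locally by one actor.
    q_next_max:  list of max_a' Q(s', a') values from the actor's model replica.
    """
    targets = []
    for (reward, done), q_max in zip(experiences, q_next_max):
        # A terminal transition has no future value to bootstrap from.
        targets.append(reward if done else reward + gamma * q_max)
    return targets


def dispatch_episodes(n_episodes, n_actors):
    """Simulate the learner handing out whole episodes to actors in turn.

    Assumes every episode takes the same time, so requests arrive in
    strict round-robin order.
    """
    assignment = {actor: [] for actor in range(n_actors)}
    waiting = deque(range(n_actors))  # actors currently asking for work
    for episode in range(n_episodes):
        actor = waiting.popleft()
        assignment[actor].append(episode)
        # The actor runs the full episode, then queues up for more work.
        waiting.append(actor)
    return assignment


if __name__ == "__main__":
    print(dispatch_episodes(5, 2))
    print(bellman_targets([(1.0, False), (2.0, True)], [10.0, 5.0], gamma=0.5))
```

In a real run, episodes differ in length, so the order in which actors come back for work would depend on when each one finishes rather than on a fixed rotation.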

Async class constructor.

run(self, workflow)

This function implements the asynchronous workflow in EXARL using two-sided point-to-point MPI communication.

Parameters
  • workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.

Returns

None
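The two-sided point-to-point pattern mentioned in the description of run can be sketched with mpi4py as a blocking send/recv exchange between one learner rank and several actor rank(s). This is a minimal illustration, not the EXARL implementation: the function name run_sketch, the choice of rank 0 as learner, the use of None as a stop signal, and the placeholder batch contents are all assumptions.

```python
def run_sketch(n_episodes=8):
    """Learner/actor exchange using blocking point-to-point messages."""
    # Imported here so the module can be loaded on machines without MPI.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    learner = 0  # assumption: rank 0 plays the learner
    if size < 2:
        raise RuntimeError("need one learner rank and at least one actor rank")

    if rank == learner:
        next_episode = 0
        active_actors = size - 1
        status = MPI.Status()
        while active_actors > 0:
            # Each message is a work request, optionally carrying a batch.
            batch = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
            actor = status.Get_source()
            if batch is not None:
                pass  # placeholder: train the learner's model on this batch
            if next_episode < n_episodes:
                comm.send(next_episode, dest=actor)  # hand out one episode
                next_episode += 1
            else:
                comm.send(None, dest=actor)  # no work left: tell actor to stop
                active_actors -= 1
    else:
        batch = None  # the first request carries no training data
        while True:
            comm.send(batch, dest=learner)
            episode = comm.recv(source=learner)
            if episode is None:
                break
            # Placeholder for running every step of the episode locally.
            batch = {"episode": episode, "actor": rank}
```

To try it, call run_sketch() at the bottom of a script and launch that script with an MPI launcher, for example mpiexec -n 4 python script.py. A fuller version would also ship updated model weights to the actor together with each episode number.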