exarl.workflows.workflow_vault

Submodules

Package Contents

Classes

SYNC

Synchronous workflow class: inherits from the ExaWorkflow base class.

ASYNC

Asynchronous workflow class: inherits from the ExaWorkflow base class.

RMA

RMA workflow class: inherits from the ExaWorkflow base class.

RANDOM

Random workflow class: inherits from the ExaWorkflow base class.

class exarl.workflows.workflow_vault.SYNC

Bases: exarl.ExaWorkflow

Synchronous workflow class: inherits from the ExaWorkflow base class. It features a single learner and multiple actors. The MPI processes are launched statically and split into multiple groups. The environment processes can be set at launch time as a CANDLE parameter; they run multiple multi-process environments. The experiences generated by the environments are gathered and sent to the learner for training.
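The gather-then-train pattern described above can be illustrated with a minimal sketch in plain Python. This is not EXARL's implementation: `actor_rollout`, `gather`, and `sync_iteration` are hypothetical names, and `gather` is only a stand-in for an MPI collective (in an mpi4py program, a call such as `comm.gather(experiences, root=0)` would play that role).

```python
import random


def actor_rollout(rank, steps, seed=0):
    """Hypothetical actor: returns a list of (step, action, reward) tuples."""
    rng = random.Random(seed + rank)
    return [(step, rng.randrange(2), rng.random()) for step in range(steps)]


def gather(per_rank_data):
    """Stand-in for an MPI gather: the root receives one entry per rank, in rank order."""
    return list(per_rank_data)


def sync_iteration(num_actors, steps):
    # Every actor finishes its rollout before the learner sees any data;
    # this implicit barrier is what makes the iteration synchronous.
    per_actor = [actor_rollout(rank, steps) for rank in range(num_actors)]
    gathered = gather(per_actor)
    # The single learner trains on the flattened batch of experiences.
    return [exp for experiences in gathered for exp in experiences]


batch = sync_iteration(num_actors=4, steps=5)
print(len(batch))
```

The learner cannot start training until the slowest actor has finished, which is the main trade-off of a synchronous design.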

run(self, workflow)

This function implements the synchronous workflow in EXARL and uses MPI collective communication.

Parameters
  • workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.

Returns

None

class exarl.workflows.workflow_vault.ASYNC

Bases: exarl.ExaWorkflow

Asynchronous workflow class: inherits from the ExaWorkflow base class. In this approach, the EXARL architecture is separated into a “learner” and “actors”. An actor is the part of the agent that holds only the target network. A simple round-robin scheduling scheme distributes work from the learner to the actors. The learner holds a target model that is trained on experiences collected by the actors. Each actor holds a replica of the model and receives updated weights from the learner. The replica is used to infer the next action given a state of the environment, and the environment is then rendered/simulated with this action to update the state.

In contrast to other architectures, each actor in EXARL stores experiences independently and runs the Bellman equation to generate training data. Once enough data has been collected, the training data is sent back to the learner. Because the Bellman equation is evaluated locally and in parallel in each actor, the load is distributed equally among all actor processes.

The learner distributes work by parallelizing across episodes, and actors request work in a round-robin fashion. Each actor runs all of the steps in an episode to completion before requesting more work from the learner. This process is repeated until the learner has gathered experiences from all episodes.
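As an illustration of the actor-side Bellman step, the sketch below computes training targets from locally stored transitions. It assumes the common one-step Q-learning form, target = reward + gamma * max over actions of Q(next_state, action); the description above does not say which form EXARL's agents use. The names `bellman_targets` and `toy_q` are hypothetical.

```python
def bellman_targets(transitions, q_next, gamma=0.99):
    """Compute one-step Q-learning targets for a list of transitions.

    transitions: list of (state, action, reward, next_state, done)
    q_next: callable mapping a state to a list of action values
            (hypothetical stand-in for the actor's target network)
    """
    targets = []
    for state, action, reward, next_state, done in transitions:
        if done:
            # Terminal transition: there is no future value to bootstrap from.
            target = reward
        else:
            target = reward + gamma * max(q_next(next_state))
        targets.append((state, action, target))
    return targets


def toy_q(state):
    # Toy value function used only for this example.
    return [0.0, float(state)]


transitions = [(0, 1, 1.0, 1, False), (1, 0, 2.0, 2, True)]
training_data = bellman_targets(transitions, toy_q, gamma=0.5)
print(training_data)
```

Because each actor computes its own targets, only the finished training data needs to travel back to the learner.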

Async class constructor.

run(self, workflow)

This function implements the asynchronous workflow in EXARL using two-sided point-to-point MPI communication.

Parameters
  • workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.

Returns

None
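The episode-level work distribution of the asynchronous workflow can be modelled without MPI. The sketch below only simulates the order in which a learner hands out whole episodes to requesting actors; in a real program the requests and replies would be point-to-point messages (for example mpi4py's `comm.send` and `comm.recv`). The function name, the choice of rank 0 as learner, and the fixed episode length are assumptions made for illustration.

```python
from collections import deque


def dispatch_episodes(num_episodes, num_actors, steps_per_episode):
    """Simulate a learner handing out whole episodes to actors on request."""
    pending = deque(range(num_episodes))  # episodes not yet assigned
    # Assumption: rank 0 is the learner, ranks 1..num_actors are actors.
    assignments = {rank: [] for rank in range(1, num_actors + 1)}
    ready = deque(assignments)  # actors waiting for work, in rank order
    experiences_received = 0

    while pending:
        rank = ready.popleft()
        episode = pending.popleft()
        assignments[rank].append(episode)
        # The actor runs every step of the episode before asking again,
        # then reports its experiences back to the learner.
        experiences_received += steps_per_episode
        ready.append(rank)

    return assignments, experiences_received


assignments, received = dispatch_episodes(
    num_episodes=5, num_actors=2, steps_per_episode=3
)
print(assignments, received)
```

With equal episode lengths the actors are served in strict rotation; with variable lengths, faster actors would simply request work more often.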

class exarl.workflows.workflow_vault.RMA

Bases: exarl.ExaWorkflow

RMA workflow class: inherits from the ExaWorkflow base class. The RMA workflow uses one-sided MPI communication to exchange data between learners and actors. The data is written into an RMA window, or “memory pool”, and the learners and actors can read from and write to this pool independently of each other.

RMA class constructor. Contains a list of the different data structures that can be used for the “memory pool”.
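The idea of a “memory pool” that either side can access without the other's participation can be sketched with a toy class. This is a single-process stand-in, not an RMA window: the class name `MemoryPool` and its one-slot-per-actor layout are assumptions for illustration. In mpi4py, one-sided access is provided by `MPI.Win` with operations such as `Put` and `Get`.

```python
class MemoryPool:
    """Toy stand-in for an RMA window: one slot per actor rank."""

    def __init__(self, num_actors):
        self._slots = [None] * num_actors

    def put(self, rank, data):
        # One-sided write: only the writer is involved in the operation.
        self._slots[rank] = data

    def get(self, rank):
        # One-sided read: the owner of the slot does not participate.
        return self._slots[rank]


pool = MemoryPool(num_actors=3)
pool.put(0, [("s0", "a0", 1.0)])
pool.put(2, [("s2", "a1", 0.5)])

# The learner reads whatever is available, without waiting for actor 1.
available = [pool.get(rank) for rank in range(3) if pool.get(rank) is not None]
print(len(available))
```

A real window also needs synchronization (locks or fences) around each access, which this sketch leaves out.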

run(self, workflow)

This function implements the RMA workflow in EXARL using one-sided MPI communication.

Parameters
  • workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.

Returns

None

class exarl.workflows.workflow_vault.RANDOM

Bases: exarl.ExaWorkflow

Random workflow class: inherits from the ExaWorkflow base class. Used for testing inference against random actions.

Random workflow class constructor. The weight file gets loaded for inference.
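A comparison between inference and random actions might look like the following sketch. The toy environment, `trained_policy` (a stand-in for a model restored from a weight file), and `random_policy` are all hypothetical and do not reflect EXARL's API.

```python
import random


def run_episode(choose_action, steps=20):
    """Toy environment: the state is a bit, and matching it earns reward 1."""
    rng = random.Random(0)  # fixed seed so both runs see the same states
    total = 0
    for _ in range(steps):
        state = rng.randrange(2)
        if choose_action(state) == state:
            total += 1
    return total


def trained_policy(state):
    # Hypothetical stand-in for inference with loaded weights.
    return state


action_rng = random.Random(1)


def random_policy(state):
    # Baseline: ignores the state and picks an action uniformly at random.
    return action_rng.randrange(2)


trained_return = run_episode(trained_policy)
random_return = run_episode(random_policy)
print(trained_return, random_return)
```

If the loaded model does not clearly outperform the random baseline on the real environment, the weights or the inference path are worth checking.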

run(self, workflow)

This function implements the random workflow in EXARL.

Parameters
  • workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.

Returns

None