exarl.workflows.workflow_vault
Submodules
Package Contents
Classes
SYNC | Synchronous workflow class: inherits from the ExaWorkflow base class.
ASYNC | Asynchronous workflow class: inherits from the ExaWorkflow base class.
RMA | RMA workflow class: inherits from the ExaWorkflow base class.
RANDOM | Random workflow class: inherits from the ExaWorkflow base class.
- class exarl.workflows.workflow_vault.SYNC
Bases: exarl.ExaWorkflow
Synchronous workflow class: inherits from the ExaWorkflow base class. It features a single learner and multiple actors. The MPI processes are launched statically and split into multiple groups. The environment processes can be configured at launch time through a candle parameter, and the workflow runs multiple multi-process environments. The experiences generated by the environments are gathered and sent to the learner for training.
- run(self, workflow)
This function implements the synchronous workflow in EXARL and uses MPI collective communication.
- Parameters
workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.
- Returns
None
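The gather-then-train pattern described for SYNC can be sketched without MPI. The following is a minimal, single-process illustration and not the EXARL implementation: `run_actor`, `gather`, `train`, and `sync_workflow` are hypothetical names, the "model" is a single float, and the `gather` helper only stands in for an MPI gather collective (for example `comm.gather` in mpi4py).

```python
import random

def run_actor(actor_id, weights, steps=4, seed=0):
    """Hypothetical actor: produce `steps` experiences using the current weights."""
    rng = random.Random(seed + actor_id)
    return [(actor_id, step, rng.random() * weights) for step in range(steps)]

def gather(per_actor_batches):
    """Stand-in for a gather collective: concatenate all batches in actor order."""
    return [exp for batch in per_actor_batches for exp in batch]

def train(weights, experiences):
    """Hypothetical learner update: move the weight toward the mean reward."""
    mean_reward = sum(reward for _, _, reward in experiences) / len(experiences)
    return weights + 0.1 * (mean_reward - weights)

def sync_workflow(num_actors=3, num_rounds=2):
    weights = 1.0
    history = []
    for _ in range(num_rounds):
        # Every actor uses the same weights; the learner waits for all batches.
        batches = [run_actor(actor, weights) for actor in range(num_actors)]
        experiences = gather(batches)
        weights = train(weights, experiences)
        history.append(len(experiences))
    return weights, history

final_weights, history = sync_workflow()
```

The point of the sketch is the barrier-like structure: no training step starts until every actor has contributed its batch for that round.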
- class exarl.workflows.workflow_vault.ASYNC
Bases: exarl.ExaWorkflow
Asynchronous workflow class: inherits from ExaWorkflow base class. In this approach, the EXARL architecture is separated into “learner” and “actors”. Actor refers to the part of the agent with only the target network. A simple round-robin scheduling scheme is used to distribute work from the learner to the actors.
The learner consists of a target model that is trained using experiences collected by the actors. Each actor consists of a model replica that receives the updated weights from the learner. This model is used to infer the next action given a state of the environment. The environment can be rendered/simulated to update the state using this action.
In contrast to other architectures, each actor in EXARL independently stores experiences and runs the Bellman equation to generate training data. The training data is sent back to the learner (once enough data is collected). By locally running the Bellman equations in each actor in parallel, the load is equally distributed among all actor processes.
The learner distributes work by parallelizing across episodes, and actors request work in a round-robin fashion. Each actor runs all of the steps in an episode to completion before requesting more work from the learner. This process is repeated until the learner gathers experiences from all episodes.
Async class constructor.
- run(self, workflow)
This function implements the asynchronous workflow in EXARL using two-sided point-to-point MPI communication.
- Parameters
workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.
- Returns
None
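The two ideas in the ASYNC description, episode-level round-robin scheduling and actor-side Bellman targets, can be sketched in plain Python. This is an illustration under stated assumptions, not EXARL code: all function names are hypothetical, the discount factor is an assumed value, rewards and next-state Q-values are placeholders, and the loop runs in one process where the real workflow uses two-sided point-to-point MPI messages.

```python
from collections import deque

GAMMA = 0.9  # discount factor (assumed value for illustration)

def bellman_targets(transitions, q_next_max, gamma=GAMMA):
    """Return r + gamma * max_a' Q(s', a') per transition; terminal steps keep only r."""
    return [
        reward if done else reward + gamma * q_max
        for (reward, done), q_max in zip(transitions, q_next_max)
    ]

def run_episode(episode_id, steps=3):
    """Hypothetical actor rollout: constant rewards, terminal on the last step."""
    transitions = [(1.0, step == steps - 1) for step in range(steps)]
    q_next_max = [2.0] * steps  # placeholder for the actor's model replica output
    return episode_id, bellman_targets(transitions, q_next_max)

def async_workflow(num_actors=2, num_episodes=5):
    pending = deque(range(num_episodes))  # episodes the learner has not handed out yet
    assignments = {actor: [] for actor in range(num_actors)}
    training_data = []
    while pending:
        # Actors are served in turn; each completes its episode before asking again.
        for actor in range(num_actors):
            if not pending:
                break
            episode = pending.popleft()
            assignments[actor].append(episode)
            training_data.append(run_episode(episode))
    return assignments, training_data

assignments, training_data = async_workflow()
```

Because the targets are computed inside `run_episode`, the learner in this sketch receives ready-to-train data, which mirrors the load-balancing argument made in the class description.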
- class exarl.workflows.workflow_vault.RMA
Bases: exarl.ExaWorkflow
RMA workflow class: inherits from ExaWorkflow base class. The RMA workflow uses one-sided MPI communication for exchanging data between learners and actors. The data is written into an RMA window or “memory pool”, and the learners and actors can read from and write to this pool independently of each other.
RMA class constructor. Contains a list of different data structures that can be used for the “memory pool”.
- run(self, workflow)
This function implements the RMA workflow in EXARL using one-sided MPI communication.
- Parameters
workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.
- Returns
None
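The “memory pool” idea can be illustrated with threads sharing a buffer. This is only an analogy for one-sided communication, not the EXARL implementation: `MemoryPool`, `actor`, and `learner` are hypothetical names, threads stand in for MPI ranks, and the lock plays the role that window locking plays for a real RMA window (for example `MPI.Win` in mpi4py).

```python
import threading

class MemoryPool:
    """Stand-in for an RMA window: fixed slots that any participant may read or write."""

    def __init__(self, num_slots):
        self._slots = [None] * num_slots
        self._lock = threading.Lock()  # analogous to locking the window for access

    def put(self, slot, value):
        with self._lock:
            self._slots[slot] = value

    def get(self, slot):
        with self._lock:
            return self._slots[slot]

def actor(rank, pool, num_steps):
    # Each actor overwrites its own slot without waiting for the learner.
    for step in range(num_steps):
        pool.put(rank, {"actor": rank, "step": step})

def learner(pool, num_actors):
    # The learner reads whatever is in the pool; no matching send is required.
    return [pool.get(rank) for rank in range(num_actors)]

num_actors = 3
pool = MemoryPool(num_actors)
threads = [
    threading.Thread(target=actor, args=(rank, pool, 5))
    for rank in range(num_actors)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
snapshot = learner(pool, num_actors)
```

In this sketch the writer never calls into the reader and the reader never calls into the writer, which is the property that distinguishes the RMA workflow from the two-sided ASYNC workflow.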
- class exarl.workflows.workflow_vault.RANDOM
Bases: exarl.ExaWorkflow
Random workflow class: inherits from ExaWorkflow base class. Used for testing inference against random actions.
Random workflow class constructor. The weight file gets loaded for inference.
- run(self, workflow)
This function implements the random workflow in EXARL.
- Parameters
workflow (ExaLearner type object) – The ExaLearner object is used to access different members of the base class.
- Returns
None
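Comparing inference against random actions amounts to running the same environment under two policies and comparing returns. The sketch below uses a toy environment and hypothetical names throughout (`episode_return`, `trained_policy`, `random_policy`); `trained_policy` is a stand-in for a model whose weights were loaded from a file, and nothing here reflects the actual EXARL environment or agent API.

```python
import random

def episode_return(policy, num_steps=20, seed=0):
    """Toy environment: reward 1 when the action equals the parity of the state."""
    rng = random.Random(seed)
    total = 0
    for _ in range(num_steps):
        state = rng.randrange(10)
        action = policy(state, rng)
        total += 1 if action == state % 2 else 0
    return total

def trained_policy(state, rng):
    # Stand-in for inference with loaded weights: always picks the rewarded action.
    return state % 2

def random_policy(state, rng):
    # Baseline: ignores the state and picks one of the two actions at random.
    return rng.randrange(2)

trained_score = episode_return(trained_policy)
random_score = episode_return(random_policy)
```

A trained model that does not clearly beat `random_score` on such a comparison has learned little, which is the kind of check the random workflow is described as supporting.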