gflownet.envs.composite.base
Base class for composite environments.
A composite environment is an environment made up of multiple sub-environments.
Classes
CompositeBase | Initializes the CompositeBase environment.
Module Contents
- class gflownet.envs.composite.base.CompositeBase(**kwargs)[source]
Bases: gflownet.envs.base.GFlowNetEnv
Initializes the CompositeBase environment.
- property n_unique_envs: int[source]
Returns the number of unique environments.
- Returns:
int – The number of unique environments.
- Raises:
RuntimeError – If self.subenvs is not an attribute of the environment.
- Return type:
int
- get_action_space()[source]
Constructs a list with all possible actions, including EOS.
By default, the action space of a Composite environment consists of the concatenation of the actions of all unique environments.
Certain composite environments may make use of additional actions, for example to toggle specific sub-environments.
Sub-classes with additional actions should override this method.
In order to make all actions the same length (required to construct batches of actions as a tensor), the actions are zero-padded from the back.
In order to make all actions unique, the unique environment index is added as the first element of the action.
Note that the actions of unique environments are only added once to the action space, regardless of how many elements of the unique environment (sub-environments) there are in the composite environment. In other words, identical environments that are part of the composite environment share the actions and a given action will have an effect on the sub-environment that is next or active.
See:
- _pad_action()
- _depad_action()
- Return type:
List[Tuple]
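The padding and indexing scheme described above can be illustrated with a minimal, stand-alone sketch. It is not the library's implementation: the helper names pad_action, depad_action and build_action_space, and the two toy action spaces, are hypothetical and chosen only for illustration. The sketch assumes each unique environment exposes its actions as a list of tuples.

```python
from typing import List, Sequence, Tuple


def pad_action(action: Tuple, idx_unique: int, max_len: int) -> Tuple:
    """Prefix the action with its unique-environment index, then zero-pad the back."""
    n_pad = max_len - len(action)
    return (idx_unique,) + tuple(action) + (0,) * n_pad


def depad_action(action: Tuple, original_len: int) -> Tuple:
    """Drop the index prefix and the trailing zero-padding."""
    return tuple(action[1 : 1 + original_len])


def build_action_space(unique_action_spaces: Sequence[List[Tuple]]) -> List[Tuple]:
    """Concatenate the action spaces of the unique environments.

    Each unique environment contributes its actions exactly once, no matter
    how many sub-environments of that type the composite contains.
    """
    max_len = max(len(a) for space in unique_action_spaces for a in space)
    action_space = []
    for idx, space in enumerate(unique_action_spaces):
        action_space.extend(pad_action(a, idx, max_len) for a in space)
    return action_space


# Two hypothetical unique environments with actions of different lengths.
space_a = [(0,), (1,), (-1,)]
space_b = [(0, 1), (1, 0), (-1, -1)]
composite_space = build_action_space([space_a, space_b])
```

In this sketch every composite action has length 3 (one index element plus the longest sub-environment action), so the actions can be stacked into a single tensor, and the index prefix keeps actions from different unique environments distinct even when their original tuples coincide.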
- set_state(state, done=False)[source]
Sets the state and the done flag of the environment.
The state and done flag of each sub-environment are set accordingly.
- Parameters:
state (dict) – A state of the global composite environment.
done (bool) – Whether the trajectory of the environment is done or not.
- reset(env_id=None)[source]
Resets the environment by resetting the sub-environments.
- Parameters:
env_id (Union[int, str])
- get_policy_output(params)[source]
Defines the structure of the output of the policy model.
By default, the policy output is the concatenation of the policy outputs of the unique environments.
Sub-classes should override this method if the structure of the policy outputs changes, for example, if meta-actions are added.
- Parameters:
params (list) – A list of distribution parameters. This list has as many elements as there are unique environments, since all sub-environments of the same environment type are expected to be identical.
- Return type:
torchtyping.TensorType[policy_output_dim]
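As a rough illustration of the default behaviour, the sketch below concatenates the policy outputs of the unique environments and records where each segment starts and ends. It uses plain Python lists instead of tensors to stay dependency-free; the function concat_policy_outputs, the segment bookkeeping and the example values are assumptions made for this sketch, not part of the documented API.

```python
from typing import List, Sequence, Tuple


def concat_policy_outputs(
    outputs: Sequence[Sequence[float]],
) -> Tuple[List[float], List[Tuple[int, int]]]:
    """Concatenate per-environment policy outputs into one flat vector.

    Returns the flat vector and the (start, end) bounds of each unique
    environment's segment, so a segment can be sliced back out later.
    """
    flat: List[float] = []
    segments: List[Tuple[int, int]] = []
    start = 0
    for out in outputs:
        flat.extend(out)
        segments.append((start, start + len(out)))
        start += len(out)
    return flat, segments


# Hypothetical policy outputs of two unique environments.
out_a = [1.0, 1.0, 1.0]
out_b = [0.5, 0.5]
policy_output, segments = concat_policy_outputs([out_a, out_b])

# Recover the part of the policy output that belongs to the second environment.
start_b, end_b = segments[1]
part_b = policy_output[start_b:end_b]
```

The length of the concatenated vector plays the role of policy_output_dim here. A sub-class that adds meta-actions would need an additional segment for them, which is why such sub-classes are expected to override the method.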