gflownet.envs.composite.setbase
===============================

.. py:module:: gflownet.envs.composite.setbase

.. autoapi-nested-parse::

   Classes implementing the family of Set meta-environments, which allow combining
   multiple sub-environments without any specific order.


Classes
-------

.. autoapisummary::

   gflownet.envs.composite.setbase.BaseSet


Module Contents
---------------

.. py:class:: BaseSet(can_alternate_subenvs=True, **kwargs)

   Bases: :py:obj:`gflownet.envs.composite.base.CompositeBase`

   Initializes the BaseSet.

   :param can_alternate_subenvs: If True, actions of different sub-environments can
       alternate and each sub-environment action is preceded and followed by a
       meta-action to toggle the sub-environment. If False, once a sub-environment is
       activated, only actions of that sub-environment can be performed until it gets
       done (its EOS action is performed).
   :type can_alternate_subenvs: bool

   .. py:attribute:: can_alternate_subenvs
      :value: True

   .. py:property:: n_toggle_actions
      :type: int

      Returns the number of actions to toggle sub-environments or unique
      environments.

      If the Set allows alternating actions between sub-environments, the number of
      toggle actions is the number of sub-environments. Otherwise, toggle actions
      activate unique environments and the number of unique environments is returned.

   .. py:method:: get_action_space()

      Constructs the list of all possible actions, including EOS.

      The action space of a Set environment consists of:

      - The actions to activate specific sub-environments or unique environments.
      - The EOS action.
      - The concatenation of the actions of all unique environments.

      In order to make all actions the same length (required to construct batches of
      actions as a tensor), the actions are zero-padded from the back.

      In order to make all actions unique, the unique environment index is added as
      the first element of the action.
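
      The padding scheme above can be illustrated with a minimal sketch. The helper
      names ``pad_action``, ``depad_action`` and ``toggle_action`` are hypothetical
      and chosen for illustration only; they are not the library's own
      ``_pad_action`` and ``_depad_action``, whose signatures and behavior may differ.

      .. code-block:: python

         def pad_action(action, idx_unique, length):
             """Prefix the unique-environment index and zero-pad from the back."""
             padded = (idx_unique,) + tuple(action)
             if len(padded) > length:
                 raise ValueError("Action does not fit in the requested length")
             return padded + (0,) * (length - len(padded))


         def depad_action(action, action_dim):
             """Drop the leading index and keep the first ``action_dim`` entries.

             The sub-environment's action length must be known, because a trailing
             zero may be either padding or a genuine action value.
             """
             return tuple(action[1 : 1 + action_dim])


         def toggle_action(idx_subenv, length):
             """Build a toggle action: (-1, sub-environment index, zero padding)."""
             return (-1, idx_subenv) + (0,) * (length - 2)


         padded = pad_action((2, 1), idx_unique=0, length=4)  # (0, 2, 1, 0)
         original = depad_action(padded, action_dim=2)        # (2, 1)
         toggle = toggle_action(1, length=4)                  # (-1, 1, 0, 0)

      In this sketch the first element distinguishes toggle actions (``-1``) from
      sub-environment actions (a non-negative unique environment index).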
      Note that the actions of unique environments are only added once to the action
      space, regardless of how many elements of the unique environment
      (sub-environments) there are in the set. In other words, identical environments
      that are part of the Set share the actions and a given action will have an
      effect on the sub-environment that is active.

      The actions to activate a specific sub-environment are represented as
      ``(-1, subenv index, ZERO-PADDING)``.

      See:

      - :py:meth:`~gflownet.envs.composite.setbase.BaseSet._pad_action`
      - :py:meth:`~gflownet.envs.composite.setbase.BaseSet._depad_action`

   .. py:method:: action_produces_permutation(action, is_backward = False)

      Determines whether an action produces permutations in the resulting state.

      The Set introduces actions that produce permutations, in particular in the key
      ``_keys`` of the state. These actions are introduced if
      ``self.can_alternate_subenvs`` is False. In particular, the actions that produce
      permutations are backward actions that toggle a sub-environment.

      Note that this method does not check whether all relevant substates are
      identical, in which case there is effectively no more than one permutation.
      Instead, True is returned if the action *could* produce permutations in the
      resulting state.

      :param action: An action of the environment.
      :type action: tuple
      :param is_backward: Whether the transition to consider is backward (True) or
          forward (False).
      :type is_backward: bool

      :returns: *bool* -- Whether the input action produces permutations in the
                resulting state, in the direction indicated by ``is_backward``.

   .. py:method:: get_mask_invalid_actions_forward(state = None, done = None)

      Computes the forward actions mask of the state.

      The mask of the Set environment is the concatenation of the following:

      - A one-hot encoding of the index of the sub-environment or unique environment
        (True at the index of the active environment). All False if the only valid
        actions are meta-actions.
      - Actual (main) mask of invalid actions:

        - The mask of the actions to activate a sub-environment or unique
          environment, OR
        - The mask of the active sub-environment.

      The mask is False-padded from the back up to mask_dim.

   .. py:method:: get_mask_invalid_actions_backward(state = None, done = None)

      Computes the backward actions mask of the state.

      The mask of the Set environment is the concatenation of the following:

      - A one-hot encoding of the index of the subenv (True at the index of the
        active environment). All False if no sub-environment is active.
      - Actual (main) mask of invalid actions:

        - The mask of the actions to activate a sub-environment, OR
        - The mask of the active sub-environment.

      The mask is False-padded from the back up to mask_dim.

   .. py:method:: mask_conditioning(mask, env_cond, backward)

      Conditions the input mask based on the restrictions imposed by a conditioning
      environment, env_cond.

      This method is overridden because the base mask_conditioning would change the
      mask unaware of the special Stack format. Therefore, this method calls the
      mask_conditioning() method of the currently relevant sub-environment and
      returns the mask with the correct Stack format.

   .. py:method:: step(action, skip_mask_check = False)

      Executes forward step given an action.

      Actions may be either sub-environment actions or set actions. If the former,
      the action is performed by the corresponding sub-environment and then the set
      state is updated accordingly. If the latter, no sub-environment is involved and
      the changes are in the meta-data of the state (active subenv and toggle flag).

      Because the same action may correspond to multiple sub-environments, the action
      will always be performed on the active sub-environment.

      - Toggle actions:

        - Activate the corresponding sub-environment if no sub-environment is
          currently active.

          - If can_alternate_subenvs is True, the toggle flag is set to 1.

        - Reset the active sub-environment flag to -1 if a sub-environment is
          currently active.


          - The toggle flag is expected to be 0 and it remains 0.

      - Environment actions:

        - Updates the corresponding sub-environment as well as the set state.
        - If can_alternate_subenvs is True, the toggle flag is set to 0.

      :param action: Action to be executed. The input action is global, that is,
          padded.
      :type action: tuple

      :returns: * **self.state** (*dict*) -- The state after executing the action.
                * **action** (*int*) -- Action executed.
                * **valid** (*bool*) -- False if the action is not allowed for the
                  current state. True otherwise.

   .. py:method:: step_backwards(action, skip_mask_check = False)

      Executes backward step given an action.

      Actions may be either sub-environment actions or set actions. If the former,
      the action is performed by the corresponding sub-environment and then the set
      state is updated accordingly. If the latter, no sub-environment is involved and
      the changes are in the meta-data of the state (active subenv and toggle flag).

      Because the same action may correspond to multiple sub-environments, the action
      will always be performed on the active sub-environment.

      - Toggle actions:

        - Activate the corresponding sub-environment if no sub-environment is
          currently active.
        - Reset the active sub-environment flag to -1 if a sub-environment is
          currently active.
        - Set the toggle flag to 0.

      - Environment actions:

        - Updates the corresponding sub-environment as well as the set state.
        - If can_alternate_subenvs is True, the toggle flag is set to 1.

      :param action: Action to be executed. The input action is global, that is,
          padded.
      :type action: tuple

      :returns: * **self.state** (*dict*) -- The state after executing the action.
                * **action** (*int*) -- Action executed.
                * **valid** (*bool*) -- False if the action is not allowed for the
                  current state. True otherwise.

   .. py:method:: get_parents(state = None, done = None, action = None)

      Determines all parents and actions that lead to state.

      :param state: State in environment format. If None, self.state is used.
      :type state: dict
      :param done: Whether the trajectory is done. If None, self.done is used.
      :type done: bool
      :param action: Ignored.
      :type action: tuple

      :returns: * **parents** (*list*) -- List of parents in state format
                * **actions** (*list*) -- List of actions that lead to state for each
                  parent in parents

   .. py:method:: sample_actions_batch(policy_outputs, mask = None, states_from = None, is_backward = False, random_action_prob = 0.0, temperature_logits = 1.0)

      Samples a batch of actions from a batch of policy outputs.

      This method calls the sample_actions_batch() method of the sub-environment
      corresponding to each state in the batch, or samples the actions to activate a
      sub-environment for the environments with no active environment.

      Note that in order to call sample_actions_batch() of the sub-environments, we
      need to first extract the part of the policy outputs, the masks and the states
      that correspond to the sub-environment.

   .. py:method:: get_logprobs(policy_outputs, actions, mask, states_from, is_backward)

      Computes log probabilities of actions given policy outputs and actions.

      :param policy_outputs: The output of the GFlowNet policy model.
      :type policy_outputs: tensor
      :param mask: The mask containing information about invalid actions and special
          cases.
      :type mask: tensor
      :param actions: The actions (global) from each state in the batch for which to
          compute the log probability.
      :type actions: list or tensor
      :param states_from: The states originating the actions, in environment format.
      :type states_from: tensor
      :param is_backward: True if the actions are backward, False if the actions are
          forward (default).
      :type is_backward: bool

   .. py:method:: action2representative(action)

      Replaces the part of the action associated with a sub-environment by its
      representative.

      The part of the action that identifies the sub-environment concerned by the
      action remains unaffected.
      :param action: An action of the Set environment (padded).
      :type action: tuple

      :returns: *tuple* -- A representative of the action, re-padded as a Set action
                that should be in the action space.

   .. py:method:: get_valid_actions(mask = None, state = None, done = None, backward = False)

      Returns the list of non-invalid (valid, for short) actions according to the
      mask of invalid actions.

      This method is overridden because the mask of a Set of environments does not
      cover the entire action space, but only the relevant sub-environment or the
      toggle actions, depending on the state. Therefore, this method calls the
      get_valid_actions() method of the active sub-environment or retrieves the valid
      toggle actions and returns the padded actions.

   .. py:method:: get_policy_output(params)

      Defines the structure of the output of the policy model.

      This method is overridden to add the policy outputs corresponding to the Set
      actions. These are concatenated to the policy outputs of the unique
      environments, obtained from the parent's method.

      The policy output is the concatenation of the policy outputs corresponding to
      the Set actions (actions to activate a sub-environment and EOS) and the policy
      outputs of the unique environments.

      :param params: A list of distribution parameters. This list has as many
          elements as there are unique environments, since all sub-environments of the
          same environment type are expected to be identical.
      :type params: list

   .. py:method:: is_source(state = None)

      Returns True if the environment's state or the state passed as parameter (if
      not None) is the source state of the environment.

      This method is overridden for efficiency (for example, it would return False
      immediately if the meta-data part of the state is not the source's) and to
      cover special uses of the Set.

      :param state: None, or a state in environment format.
      :type state: dict

      :returns: *bool* -- Whether the state is the source state of the environment.

   .. py:method:: equal(state_x, state_y)

      Checks whether the two input states are equal.

      This method is overridden in order to account for the fact that states with
      permuted substates must be considered equal if the permutations are indeed
      equivalent.

      The permutation of substates is not done by permuting the substates directly
      but by permuting the list of keys in ``state["_keys"]``. Thus, this method
      returns True if all keys of the state dictionary are equal (except ``_keys``,
      which is ignored) and the substates are equal, after accounting for the
      permutation.

      This method uses the parent method in order to compare the substates. If a
      substate is a dictionary containing the key ``_keys``, then it is assumed it is
      a Set state and the current method is used. If Set states appear deeper in the
      substates, the comparison is not guaranteed to behave correctly.

      :param state_x: One of the Set states to be compared.
      :type state_x: dict
      :param state_y: The other Set state to be compared.
      :type state_y: dict

      :returns: *bool* -- True if the two input states are equal; False otherwise.

   .. py:method:: __eq__(other, ignored_keys = [])

      Checks whether the current environment instance is equal to the input
      environment instance.

      This method is overridden to ignore the keys:

      - ``envs_unique_cache``

      :param other: The environment instance to be compared.
      :type other: GFlowNetEnv
      :param ignored_keys: A list of keys (strings) to be ignored in the comparison.
          This parameter may be used by subclasses that may need to ignore certain
          keys.
      :type ignored_keys: list

      :returns: *bool* -- True if the environments' attributes are considered equal;
                False otherwise.
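
The permutation-aware comparison described for ``BaseSet.equal`` can be illustrated
with a toy sketch. It assumes a simplified, hypothetical state layout in which
substates are stored under integer keys and ``_keys`` lists the order in which those
keys are read; the actual state format and the implementation of ``equal`` in the
library may differ. The function name ``equal_up_to_permutation`` is also
hypothetical.

.. code-block:: python

   def equal_up_to_permutation(state_x, state_y):
       """Compare two toy Set states, reading substates in the order of ``_keys``."""
       keys_x, keys_y = state_x["_keys"], state_y["_keys"]
       if len(keys_x) != len(keys_y):
           return False
       # Meta-data entries: everything except ``_keys`` and the substates.
       skip_x = set(keys_x) | {"_keys"}
       skip_y = set(keys_y) | {"_keys"}
       meta_x = {k: v for k, v in state_x.items() if k not in skip_x}
       meta_y = {k: v for k, v in state_y.items() if k not in skip_y}
       if meta_x != meta_y:
           return False
       # Substates are compared pairwise after following the permuted keys.
       return all(state_x[kx] == state_y[ky] for kx, ky in zip(keys_x, keys_y))


   x = {"_active": -1, "_keys": [0, 1], 0: [1, 2], 1: [3]}
   y = {"_active": -1, "_keys": [1, 0], 0: [3], 1: [1, 2]}
   z = {"_active": 0, "_keys": [0, 1], 0: [1, 2], 1: [3]}

   same = equal_up_to_permutation(x, y)       # True: same substates, permuted keys
   different = equal_up_to_permutation(x, z)  # False: meta-data differs

The sketch ignores ``_keys`` itself when comparing meta-data, mirroring the
description above, and does not handle Set states nested inside substates.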