base
====

.. py:module:: base

.. autoapi-nested-parse::

   Represent sequence-like environments.

   Sequences are constructed by adding tokens from a dictionary, from left to right.


Classes
-------

.. autoapisummary::

   base.SequenceBase


Module Contents
---------------

.. py:class:: SequenceBase(tokens = [0, 1], min_length = 1, max_length = 5, pad_token = -1, **kwargs)

   Bases: :py:obj:`gflownet.envs.base.GFlowNetEnv`

   Initialize a sequence environment.

   :param tokens: Vocabulary of tokens used to build sequences.
   :type tokens: Iterable
   :param min_length: Minimum sequence length required before the EOS action is allowed.
   :type min_length: int
   :param max_length: Maximum sequence length.
   :type max_length: int
   :param pad_token: Token used to pad incomplete sequences.
   :type pad_token: int, float, str
   :param \*\*kwargs: Additional keyword arguments forwarded to :class:`GFlowNetEnv`.


   .. py:attribute:: device


   .. py:attribute:: tokens
      :value: (0, 1)


   .. py:attribute:: pad_token
      :value: -1


   .. py:attribute:: n_tokens
      :value: 2


   .. py:attribute:: min_length
      :value: 1


   .. py:attribute:: max_length
      :value: 5


   .. py:attribute:: eos_idx
      :value: -1


   .. py:attribute:: pad_idx
      :value: 0


   .. py:attribute:: dtype


   .. py:attribute:: idx2token


   .. py:attribute:: token2idx


   .. py:attribute:: source


   .. py:attribute:: eos


   .. py:method:: get_action_space()

      Construct the list of all possible actions, including EOS.

      An action is represented by a single-element tuple indicating the index of the
      token to be added to the current sequence (state).

      The action space of this parent class is:

      action_space: [(1,), (2,), (-1,)]


   .. py:method:: get_mask_invalid_actions_forward(state = None, done = None)

      Return the mask of invalid forward actions.

      The returned list has one entry per action:

      - True if the forward action is invalid from the current state.
      - False otherwise.

      :param state: Input state. If None, self.state is used.
      :type state: tensor
      :param done: Whether the trajectory is done. If None, self.done is used.
      :type done: bool

      :returns: *A list of boolean values.*


   .. py:method:: get_parents(state = None, done = None, action = None)

      Determine all parents and actions that lead to a state.

      The GFlowNet graph is a tree, so there is only one parent per state.

      :param state: Input state. If None, self.state is used.
      :type state: tensor
      :param done: Whether the trajectory is done. If None, self.done is used.
      :type done: bool
      :param action: Ignored.
      :type action: None

      :returns: * **parents** (*list*) -- List of parents in state format. This environment has a single parent per state.
                * **actions** (*list*) -- List of actions that lead to state for each parent in parents. This environment has a single parent per state.


   .. py:method:: step(action, skip_mask_check = False)

      Execute a step for the given action.

      :param action: Action to be executed. An action is represented by a single-element
                     tuple indicating the index of the token to be added to the current
                     sequence (state).
      :type action: tuple
      :param skip_mask_check: If True, skip computing the forward mask of invalid actions
                              that is used to check whether the action is valid.
      :type skip_mask_check: bool

      :returns: * **self.state** (*list*) -- The sequence after executing the action.
                * **action** (*tuple*) -- Action executed.
                * **valid** (*bool*) -- False if the action is not allowed for the current state; True otherwise.


   .. py:method:: states2proxy(states)

      Prepare a batch of states for a proxy.

      States are represented by the tokens instead of the indices, with padding up to
      max_length.

      Important: by default, the output of states2proxy() is a list of lists, instead of
      a tensor as in most environments. This is to allow for string tokens.

      Example, with max_length = 5:

      - Sequence (tokens): 0100
      - state: [1, 2, 1, 1, 0]
      - proxy format: [0, 1, 0, 0, -1]

      :param states: A batch of states in environment format, either as a list of states
                     or as a single tensor.
      :type states: list or tensor

      :returns: *A list containing all the states in the batch, represented themselves as lists.*


   .. py:method:: states2policy(states)

      Prepare a batch of states for the policy model.

      States are one-hot encoded.

      Example, with max_length = 5:

      - Sequence (tokens): 0100
      - state: [1, 2, 1, 1, 0]
      - policy format: [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]

      Each consecutive group of three values one-hot encodes one position of the
      sequence, in order: 0 | 1 | 0 | 0 | PAD.

      :param states: A batch of states in environment format, either as a list of states
                     or as a single tensor.
      :type states: list or tensor

      :returns: *A tensor containing all the states in the batch.*


   .. py:method:: state2readable(state = None)

      Convert a state into a human-readable string.

      The output string contains the token corresponding to each index in the state,
      separated by spaces.

      Example, with max_length = 5:

      - state: [1, 2, 1, 1, 0]
      - readable: "0 1 0 0"

      :param state: A state in environment format. If None, self.state is used.
      :type state: tensor

      :returns: *A string of space-separated tokens.*


   .. py:method:: readable2state(readable)

      Convert a readable state into environment format.

      Example, with max_length = 5:

      - readable: "0 1 0 0"
      - state: [1, 2, 1, 1, 0]

      :param readable: A state in readable format: space-separated tokens.
      :type readable: str

      :returns: *A tensor containing the indices of the tokens.*


   .. py:method:: get_all_terminating_states()

      Construct a batch with all terminating states in the sample space.


   .. py:method:: get_uniform_terminating_states(n_states, seed = None)

      Construct a batch of states sampled uniformly from the sample space.

      :param n_states: The number of states to sample.
      :type n_states: int
      :param seed: Random seed.
      :type seed: int
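
The index conventions used in the examples above (token indices starting at 1, index 0 for padding, -1 for EOS) can be illustrated with a small standalone sketch. It uses plain Python lists rather than tensors, it does not import ``gflownet``, and every function name in it (``action_space``, ``state_to_proxy``, ``state_to_policy``, ``state_to_readable``, ``readable_to_state``) is hypothetical and chosen only for illustration. It is not the ``SequenceBase`` implementation; it only reproduces the documented input/output pairs for the default vocabulary.

```python
# Standalone sketch of the documented conventions. Assumption: none of these
# helpers exist in gflownet; they only mirror the examples in the docstrings.
TOKENS = (0, 1)
PAD_TOKEN = -1
PAD_IDX = 0
EOS_IDX = -1
MAX_LENGTH = 5

# Token indices start at 1 because index 0 is reserved for padding.
idx2token = {i + 1: token for i, token in enumerate(TOKENS)}
idx2token[PAD_IDX] = PAD_TOKEN


def action_space():
    """One single-element tuple per token index, followed by the EOS action."""
    return [(i + 1,) for i in range(len(TOKENS))] + [(EOS_IDX,)]


def state_to_proxy(state):
    """Replace every index by its token; padding becomes PAD_TOKEN."""
    return [idx2token[i] for i in state]


def state_to_policy(state):
    """One-hot encode each position over (PAD, token 0, token 1), flattened."""
    width = len(TOKENS) + 1
    flat = []
    for i in state:
        one_hot = [0] * width
        one_hot[i] = 1
        flat.extend(one_hot)
    return flat


def state_to_readable(state):
    """Join the tokens of the non-padding positions with spaces."""
    return " ".join(str(idx2token[i]) for i in state if i != PAD_IDX)


def readable_to_state(readable):
    """Parse space-separated tokens and pad with PAD_IDX up to MAX_LENGTH."""
    str2idx = {str(token): i for i, token in idx2token.items()}
    indices = [str2idx[s] for s in readable.split()]
    return indices + [PAD_IDX] * (MAX_LENGTH - len(indices))


state = [1, 2, 1, 1, 0]  # the sequence 0100, padded to MAX_LENGTH
print(action_space())
print(state_to_proxy(state))
print(state_to_policy(state))
print(state_to_readable(state))
print(readable_to_state("0 1 0 0"))
```

Keeping index 0 for padding means a state that starts as all zeros is an empty sequence, and the one-hot width is ``n_tokens + 1``, which is why the policy format above has 15 values for ``max_length = 5``.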