base
====

.. py:module:: base

.. autoapi-nested-parse::

   Represent sequence-like environments.

   Sequences are constructed by adding tokens from a dictionary, from left to
   right.


Classes
-------

.. autoapisummary::

   base.SequenceBase


Module Contents
---------------

.. py:class:: SequenceBase(tokens = [0, 1], min_length = 1, max_length = 5, pad_token = -1, **kwargs)

   Bases: :py:obj:`gflownet.envs.base.GFlowNetEnv`


   Initialize a sequence environment.

   :param tokens: Vocabulary of tokens used to build sequences.
   :type tokens: Iterable
   :param min_length: Minimum sequence length required before the EOS action is allowed.
   :type min_length: int
   :param max_length: Maximum sequence length.
   :type max_length: int
   :param pad_token: Token used to pad incomplete sequences.
   :type pad_token: int, float, str
   :param \*\*kwargs: Additional keyword arguments forwarded to :class:`GFlowNetEnv`.


   .. py:attribute:: device


   .. py:attribute:: tokens
      :value: (0, 1)


   .. py:attribute:: pad_token
      :value: -1


   .. py:attribute:: n_tokens
      :value: 2


   .. py:attribute:: min_length
      :value: 1


   .. py:attribute:: max_length
      :value: 5


   .. py:attribute:: eos_idx
      :value: -1


   .. py:attribute:: pad_idx
      :value: 0


   .. py:attribute:: dtype


   .. py:attribute:: idx2token


   .. py:attribute:: token2idx


   .. py:attribute:: source


   .. py:attribute:: eos


   .. py:method:: get_action_space()

      Construct the list of all possible actions, including EOS.

      An action is represented by a single-element tuple indicating the index of the
      token to be added to the current sequence (state).

      With the default tokens ``[0, 1]``, the action space of this parent class is:
          action_space: [(1,), (2,), (-1,)]
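
      The construction described above can be sketched in plain Python. This is an
      illustration only: ``build_action_space`` is a hypothetical helper, not part of
      the class, and it assumes that token indices start at 1 because index 0 is
      reserved for padding (``pad_idx``) and EOS uses ``eos_idx = -1``.

      .. code-block:: python

         def build_action_space(tokens, eos_idx=-1):
             """Return one single-element tuple per token index, then the EOS action."""
             # Assumption: index 0 is reserved for padding, so tokens map to 1..n_tokens.
             actions = [(idx,) for idx in range(1, len(tokens) + 1)]
             # The EOS action is appended last.
             actions.append((eos_idx,))
             return actions


         action_space = build_action_space([0, 1])
         print(action_space)  # [(1,), (2,), (-1,)]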


   .. py:method:: get_mask_invalid_actions_forward(state = None, done = None)

      Return the mask of invalid forward actions.

      The returned list has one entry per action:
          - True if the forward action is invalid from the current state.
          - False otherwise.

      :param state: Input state. If None, self.state is used.
      :type state: tensor
      :param done: Whether the trajectory is done. If None, self.done is used.
      :type done: bool

      :returns: *A list of boolean values.*
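
      A minimal sketch of how such a mask could be computed is shown here. The rules
      are assumptions inferred from the ``min_length`` and ``max_length`` parameter
      descriptions, not a copy of the actual implementation, and
      ``mask_invalid_forward`` is a hypothetical stand-alone function that takes the
      state as a plain list of indices.

      .. code-block:: python

         def mask_invalid_forward(state, done, n_tokens=2, min_length=1,
                                  max_length=5, pad_idx=0):
             """Return one boolean per action (tokens first, EOS last); True = invalid."""
             # Assumption: no action is valid once the trajectory is done.
             if done:
                 return [True] * (n_tokens + 1)
             # The current length is the number of non-padding entries.
             length = sum(1 for idx in state if idx != pad_idx)
             # Assumption: tokens cannot be added once max_length is reached.
             tokens_invalid = length >= max_length
             # Assumption: EOS is not allowed before min_length is reached.
             eos_invalid = length < min_length
             return [tokens_invalid] * n_tokens + [eos_invalid]


         mask_source = mask_invalid_forward([0, 0, 0, 0, 0], done=False)
         mask_full = mask_invalid_forward([1, 2, 1, 1, 2], done=False)
         print(mask_source)  # [False, False, True]
         print(mask_full)    # [True, True, False]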


   .. py:method:: get_parents(state = None, done = None, action = None)

      Determine all parents and actions that lead to a state.

      The GFlowNet graph is a tree and there is only one parent per state.

      :param state: Input state. If None, self.state is used.
      :type state: tensor
      :param done: Whether the trajectory is done. If None, self.done is used.
      :type done: bool
      :param action: Ignored
      :type action: None

      :returns: * **parents** (*list*) -- List of parents in state format. This environment has a single parent per
                  state.
                * **actions** (*list*) -- List of actions that lead to the state, one per parent in ``parents``.
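
      Because sequences grow from left to right, the single parent can be obtained by
      removing the last token. The sketch below illustrates this on plain lists.
      ``single_parent`` is a hypothetical helper, and its handling of done states
      (the parent is the state itself, reached through EOS) and of the source state
      (no parents) is an assumption rather than documented behavior.

      .. code-block:: python

         def single_parent(state, done, pad_idx=0, eos_idx=-1):
             """Return (parents, actions) for a padded list of token indices."""
             # Assumption: a done state was reached from itself via the EOS action.
             if done:
                 return [list(state)], [(eos_idx,)]
             length = sum(1 for idx in state if idx != pad_idx)
             # Assumption: the empty (source) state has no parents.
             if length == 0:
                 return [], []
             parent = list(state)
             last_token_idx = parent[length - 1]
             # Undo the last action by replacing the last token with padding.
             parent[length - 1] = pad_idx
             return [parent], [(last_token_idx,)]


         parents, actions = single_parent([1, 2, 1, 1, 0], done=False)
         print(parents, actions)  # [[1, 2, 1, 0, 0]] [(1,)]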


   .. py:method:: step(action, skip_mask_check = False)

      Execute a step for the given action.

      :param action: Action to be executed. An action is represented by a single-element tuple
                     indicating the index of the token to be added to the current sequence
                     (state).
      :type action: tuple
      :param skip_mask_check: If True, skip computing the forward mask of invalid actions that is used to
                              check whether the action is valid.
      :type skip_mask_check: bool

      :returns: * **self.state** (*list*) -- The sequence after executing the action
                * **action** (*tuple*) -- Action executed
                * **valid** (*bool*) -- False if the action is not allowed for the current state, True otherwise.


   .. py:method:: states2proxy(states)

      Prepare a batch of states for a proxy.

      States are represented by the tokens instead of the indices, with
      padding up to the max_length.

      Important: by default, the output of states2proxy() is a list of lists, instead
      of a tensor as in most environments. This is to allow for string tokens.

      Example, with max_length = 5:
        - Sequence (tokens): 0100
        - state: [1, 2, 1, 1, 0]
        - proxy format: [0, 1, 0, 0, -1]

      :param states: A batch of states in environment format, either as a list of states or as a
                     single tensor.
      :type states: list or tensor

      :returns: *A list containing all the states in the batch, represented themselves as lists.*
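
      The conversion in the example above can be sketched as follows. ``to_proxy`` is
      a hypothetical stand-alone function operating on plain lists, and the explicit
      ``idx2token`` argument is an assumption made for illustration. It may not match
      the layout of the ``idx2token`` attribute.

      .. code-block:: python

         def to_proxy(states, idx2token, pad_token=-1, pad_idx=0):
             """Map each index to its token and each padding index to pad_token."""
             lookup = dict(idx2token)
             # Make sure the padding index maps to the padding token.
             lookup[pad_idx] = pad_token
             return [[lookup[idx] for idx in state] for state in states]


         # Tokens 0 and 1 are assumed to have indices 1 and 2, respectively.
         proxy_states = to_proxy([[1, 2, 1, 1, 0]], idx2token={1: 0, 2: 1})
         print(proxy_states)  # [[0, 1, 0, 0, -1]]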


   .. py:method:: states2policy(states)

      Prepare a batch of states for the policy model.

      States are one-hot encoded.

      Example, with max_length = 5:
        - Sequence (tokens): 0100
        - state: [1, 2, 1, 1, 0]
        - policy format: [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]
                         |   0   |    1   |    0   |    0   |   PAD  |

      :param states: A batch of states in environment format, either as a list of states or as a
                     single tensor.
      :type states: list or tensor

      :returns: *A tensor containing all the states in the batch.*
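
      The one-hot layout in the example above can be reproduced with a short sketch.
      ``to_policy`` is a hypothetical helper that returns nested lists rather than a
      tensor, and it assumes ``n_tokens + 1`` classes per position, with class 0
      standing for padding.

      .. code-block:: python

         def to_policy(states, n_tokens=2):
             """One-hot encode each position and concatenate the encodings per state."""
             n_classes = n_tokens + 1  # assumption: padding (index 0) plus the tokens
             batch = []
             for state in states:
                 row = []
                 for idx in state:
                     one_hot = [0] * n_classes
                     one_hot[idx] = 1
                     row.extend(one_hot)
                 batch.append(row)
             return batch


         policy_states = to_policy([[1, 2, 1, 1, 0]])
         print(policy_states)
         # [[0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]]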


   .. py:method:: state2readable(state = None)

      Convert a state into a human-readable string.

      Example, with max_length = 5:
        - state: [1, 2, 1, 1, 0]
        - readable: "0 1 0 0"

      The output string contains the token corresponding to each index in the state,
      separated by spaces.

      :param state: A state in environment format. If None, self.state is used.
      :type state: tensor

      :returns: *A string of space-separated tokens.*


   .. py:method:: readable2state(readable)

      Convert a readable state into environment format.

      Example, with max_length = 5:
        - readable: "0 1 0 0"
        - state: [1, 2, 1, 1, 0]

      :param readable: A state in readable format, that is, a string of space-separated tokens.
      :type readable: str

      :returns: *A tensor containing the indices of the tokens.*
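
      The round trip between the readable and environment formats can be sketched on
      plain lists. Both functions below are hypothetical helpers. They assume that
      tokens are indexed from 1 in vocabulary order, that padding uses index 0, and
      that tokens are compared through their string form.

      .. code-block:: python

         def readable_to_state(readable, tokens=(0, 1), max_length=5, pad_idx=0):
             """Parse space-separated tokens into a padded list of indices."""
             token2idx = {str(token): idx + 1 for idx, token in enumerate(tokens)}
             indices = [token2idx[tok] for tok in readable.split()]
             return indices + [pad_idx] * (max_length - len(indices))


         def state_to_readable(state, tokens=(0, 1), pad_idx=0):
             """Join the tokens of all non-padding positions with spaces."""
             idx2token = {idx + 1: token for idx, token in enumerate(tokens)}
             return " ".join(str(idx2token[idx]) for idx in state if idx != pad_idx)


         state = readable_to_state("0 1 0 0")
         print(state)                     # [1, 2, 1, 1, 0]
         print(state_to_readable(state))  # 0 1 0 0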


   .. py:method:: get_all_terminating_states()

      Construct a batch with all terminating states in the sample space.
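
      For small vocabularies, the sample space can be enumerated directly. The sketch
      below is one possible approach, not necessarily how the method is implemented.
      ``all_terminating_states`` is a hypothetical function, and it assumes that
      every sequence with a length between ``min_length`` and ``max_length`` is a
      terminating state.

      .. code-block:: python

         from itertools import product


         def all_terminating_states(n_tokens=2, min_length=1, max_length=5, pad_idx=0):
             """Enumerate all padded index sequences of each allowed length."""
             states = []
             for length in range(min_length, max_length + 1):
                 # Token indices are assumed to run from 1 to n_tokens.
                 for seq in product(range(1, n_tokens + 1), repeat=length):
                     states.append(list(seq) + [pad_idx] * (max_length - length))
             return states


         terminating_states = all_terminating_states()
         print(len(terminating_states))  # 62 = 2 + 4 + 8 + 16 + 32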


   .. py:method:: get_uniform_terminating_states(n_states, seed = None)

      Construct a batch of states sampled uniformly from the sample space.

      :param n_states: The number of states to sample.
      :type n_states: int
      :param seed: Random seed.
      :type seed: int
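
      One way to sample uniformly over all sequences, sketched here under stated
      assumptions, is to draw the length with probability proportional to the number
      of sequences of that length and then draw each token uniformly.
      ``uniform_terminating_states`` below is a hypothetical stand-alone function
      based on the standard library and may differ from the actual implementation.

      .. code-block:: python

         import random


         def uniform_terminating_states(n_states, seed=None, n_tokens=2,
                                        min_length=1, max_length=5, pad_idx=0):
             """Sample padded index sequences uniformly over all allowed sequences."""
             rng = random.Random(seed)
             lengths = list(range(min_length, max_length + 1))
             # There are n_tokens ** length sequences of each length.
             weights = [n_tokens ** length for length in lengths]
             states = []
             for length in rng.choices(lengths, weights=weights, k=n_states):
                 seq = [rng.randint(1, n_tokens) for _ in range(length)]
                 states.append(seq + [pad_idx] * (max_length - length))
             return states


         samples = uniform_terminating_states(10, seed=0)
         print(len(samples))  # 10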