gflownet.envs.scrabble
======================

.. py:module:: gflownet.envs.scrabble

.. autoapi-nested-parse::

   Scrabble environment: starting from an empty sequence, letters are added one by one up to a
   maximum length.


Attributes
----------

.. autoapisummary::

   gflownet.envs.scrabble.LETTERS


Classes
-------

.. autoapisummary::

   gflownet.envs.scrabble.Scrabble


Module Contents
---------------

.. py:data:: LETTERS
   :value: ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',...

.. py:class:: Scrabble(letters = None, max_length = 7, pad_token = '0', **kwargs)

   Bases: :py:obj:`gflownet.envs.base.GFlowNetEnv`

   Scrabble environment: sequences are constructed starting from an empty sequence and adding one
   letter at a time.

   States are represented by a list of indices, one per letter, with letter indices starting
   from 1. States shorter than the maximum length are padded with index 0.

   Actions are represented by a single-element tuple with the index of the letter to be added.
   The EOS action is represented by (-1,).

   .. attribute:: letters

      A tuple containing the letters to form words. By default, LETTERS is used.

      :type: tuple

   .. attribute:: max_length

      Maximum length of the sequences. Default is 7, like in the standard game.

      :type: int

   .. attribute:: pad_token

      PAD token. Default: "0".

      :type: str


   .. py:attribute:: pad_token
      :value: '0'


   .. py:attribute:: n_letters


   .. py:attribute:: max_length
      :value: 7


   .. py:attribute:: eos_idx
      :value: -1


   .. py:attribute:: pad_idx
      :value: 0


   .. py:attribute:: idx2token


   .. py:attribute:: token2idx


   .. py:attribute:: source
      :value: [0, 0, 0, 0, 0, 0, 0]


   .. py:attribute:: eos


   .. py:method:: get_action_space()

      Constructs a list with all possible actions, including EOS.

      An action is represented by a single-element tuple indicating the index of the letter to be
      added to the current sequence (state).

      The action space of this parent class is:
      action_space: [(0,), (1,), (-1,)]


   .. py:method:: get_mask_invalid_actions_forward(state = None, done = None)

      Returns a list with one boolean per action in the action space:

      - True if the forward action is invalid from the current state.
      - False otherwise.

      :param state: Input state. If None, self.state is used.
      :type state: tensor
      :param done: Whether the trajectory is done. If None, self.done is used.
      :type done: bool

      :returns: *A list of boolean values.*


   .. py:method:: get_parents(state = None, done = None, action = None)

      Determines all parents and actions that lead to state.

      The GFlowNet graph is a tree, so there is only one parent per state.

      :param state: Input state. If None, self.state is used.
      :type state: tensor
      :param done: Whether the trajectory is done. If None, self.done is used.
      :type done: bool
      :param action: Ignored
      :type action: None

      :returns: * **parents** (*list*) -- List of parents in state format. This environment has a single
                  parent per state.
                * **actions** (*list*) -- List of actions that lead to state for each parent in parents. This
                  environment has a single parent per state.


   .. py:method:: step(action, skip_mask_check = False)

      Executes step given an action.

      :param action: Action to be executed. An action is a single-element tuple with the index of the
                     letter to be added, or (-1,) for EOS.
      :type action: tuple
      :param skip_mask_check: If True, skip computing the forward mask of invalid actions to check if the
                              action is valid.
      :type skip_mask_check: bool

      :returns: * **self.state** (*list*) -- The sequence after executing the action.
                * **action** (*tuple*) -- Action executed.
                * **valid** (*bool*) -- False if the action is not allowed for the current state.


   .. py:method:: states2proxy(states)

      Prepares a batch of states in "environment format" for a proxy: the batch is simply
      converted into a tensor of indices.

      :param states: A batch of states in environment format, either as a list of states or as a list
                     of tensors.
      :type states: list or tensor

      :returns: *A list containing all the states in the batch, represented themselves as lists.*


   .. py:method:: states2policy(states)

      Prepares a batch of states in "environment format" for the policy model: states
      are one-hot encoded.

      :param states: A batch of states in environment format, either as a list of states or as a list
                     of tensors.
      :type states: list or tensor

      :returns: *A tensor containing all the states in the batch.*


   .. py:method:: state2readable(state = None)

      Converts a state into a human-readable string.

      The output string contains the letter corresponding to each index in the state, separated
      by spaces.

      :param state: A state in environment format. If None, self.state is used.
      :type state: tensor

      :returns: *A string of space-separated letters.*


   .. py:method:: readable2state(readable)

      Converts a state in readable format into the "environment format" (tensor).

      :param readable: A state in readable format, that is, space-separated letters.
      :type readable: str

      :returns: *A tensor containing the indices of the letters.*


   .. py:method:: get_uniform_terminating_states(n_states, seed = None)

      Constructs a batch of n_states states uniformly sampled in the sample space of the
      environment.

      :param n_states: The number of states to sample.
      :type n_states: int
      :param seed: Random seed.
      :type seed: int
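
The state and action conventions documented for ``Scrabble`` (letter indices starting at 1, padding with index 0, single-element action tuples, EOS as ``(-1,)``, one parent per state) can be illustrated with a minimal standalone sketch. The helper names below (``readable_to_state``, ``state_to_readable``, ``apply_action``, ``parent_of``) and the three-letter toy alphabet are assumptions made for illustration. They are not part of ``gflownet``, they use plain lists instead of tensors, and they ignore the ``done`` flag.

```python
# Hypothetical standalone helpers that mirror the documented conventions.
# They are NOT the gflownet implementation.
TOY_LETTERS = ("A", "B", "C")  # toy alphabet, for illustration only
MAX_LENGTH = 7                 # documented default of max_length
PAD_IDX = 0                    # documented padding index
EOS = (-1,)                    # documented EOS action

# Letter indices start at 1 because 0 is reserved for padding.
token2idx = {letter: i + 1 for i, letter in enumerate(TOY_LETTERS)}
idx2token = {i: letter for letter, i in token2idx.items()}


def readable_to_state(readable):
    """Convert space-separated letters into a padded list of indices."""
    indices = [token2idx[tok] for tok in readable.split()]
    if len(indices) > MAX_LENGTH:
        raise ValueError("sequence longer than MAX_LENGTH")
    return indices + [PAD_IDX] * (MAX_LENGTH - len(indices))


def state_to_readable(state):
    """Convert a padded list of indices into space-separated letters."""
    return " ".join(idx2token[i] for i in state if i != PAD_IDX)


def apply_action(state, action):
    """Return the state after adding one letter; EOS leaves it unchanged."""
    if action == EOS:
        return list(state)
    position = state.index(PAD_IDX)  # first free slot; ValueError if full
    new_state = list(state)
    new_state[position] = action[0]
    return new_state


def parent_of(state):
    """Return (parent, action) by removing the last letter, or None at the source."""
    length = sum(1 for i in state if i != PAD_IDX)
    if length == 0:
        return None
    parent = list(state)
    action = (parent[length - 1],)
    parent[length - 1] = PAD_IDX
    return parent, action


state = readable_to_state("C A B")
print(state)                     # [3, 1, 2, 0, 0, 0, 0]
print(state_to_readable(state))  # C A B
print(parent_of(state))          # ([3, 1, 0, 0, 0, 0, 0], (2,))
```

Because letters are only ever appended, removing the last non-padding index always recovers the unique parent, which is why the sketch returns a single ``(parent, action)`` pair.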