gflownet.envs.scrabble
Scrabble environment: starting from an empty sequence, letters are added one by one up to a maximum length.
Attributes
Classes
Scrabble – Scrabble environment: sequences are constructed starting from an empty sequence and adding one letter at a time.
Module Contents
- gflownet.envs.scrabble.LETTERS = ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',...[source]
- class gflownet.envs.scrabble.Scrabble(letters=None, max_length=7, pad_token='0', **kwargs)[source]
Bases: gflownet.envs.base.GFlowNetEnv
Scrabble environment: sequences are constructed starting from an empty sequence and adding one letter at a time.
States are represented by a list of indices corresponding to each letter, starting from 1, and are padded with index 0.
Actions are represented by a single-element tuple with the index of the letter to be added. The EOS action is represented by (-1,).
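The state and action representation described above can be illustrated with a minimal standalone sketch. It does not use the gflownet package: the names `encode`, `letter2idx`, `PAD_IDX` and `EOS_ACTION` are hypothetical, and the three-letter alphabet is shortened for readability.

```python
# Standalone illustration; these names are hypothetical, not the gflownet API.
LETTERS = ("A", "B", "C")  # shortened alphabet, for illustration only
MAX_LENGTH = 7
PAD_IDX = 0
EOS_ACTION = (-1,)

# Letters are indexed from 1 so that index 0 can serve as padding.
letter2idx = {letter: idx for idx, letter in enumerate(LETTERS, start=1)}


def encode(word):
    """Return the padded list of letter indices for a word."""
    if len(word) > MAX_LENGTH:
        raise ValueError("word is longer than MAX_LENGTH")
    indices = [letter2idx[letter] for letter in word]
    return indices + [PAD_IDX] * (MAX_LENGTH - len(indices))


state = encode("CAB")        # [3, 1, 2, 0, 0, 0, 0]
action = (letter2idx["A"],)  # (1,): the action that appends the letter "A"
```

The empty sequence (the source state) is then a list of `MAX_LENGTH` zeros.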
- Parameters:
letters (Iterable)
max_length (int)
pad_token (str)
- letters
A tuple containing the letters used to form words. By default, LETTERS is used.
- Type:
tuple
- max_length[source]
Maximum length of the sequences. Default is 7, like in the standard game.
- Type:
int
- get_action_space()[source]
Constructs a list with all possible actions, including EOS.
An action is represented by a single-element tuple indicating the index of the letter to be added to the current sequence (state).
- The action space of this parent class is:
action_space: [(0,), (1,), (-1,)]
- Return type:
List[Tuple]
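As a rough sketch of how such an action space could be enumerated, assuming letters are indexed from 1 and EOS is (-1,) as stated in the class description. The function `build_action_space` is hypothetical and is not the library's implementation.

```python
def build_action_space(n_letters, eos_idx=-1):
    """One single-element tuple per letter index, followed by the EOS action."""
    return [(idx,) for idx in range(1, n_letters + 1)] + [(eos_idx,)]


# With a two-letter alphabet:
action_space = build_action_space(2)  # [(1,), (2,), (-1,)]
```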
- get_mask_invalid_actions_forward(state=None, done=None)[source]
- Returns a list with one entry per action in the action space, with values:
True if the forward action is invalid from the current state.
False otherwise.
- Parameters:
state (tensor) – Input state. If None, self.state is used.
done (bool) – Whether the trajectory is done. If None, self.done is used.
- Returns:
A list of boolean values.
- Return type:
List[bool]
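A minimal sketch of how such a mask might be computed. It makes several assumptions that the documentation above does not confirm: EOS is the last action in the action space, EOS is allowed at any sequence length, and every action is invalid once the trajectory is done. The function name and signature are hypothetical.

```python
def mask_invalid_actions_forward(state, done, n_letters, pad_idx=0):
    """Return one boolean per action (letters first, EOS last); True = invalid."""
    n_actions = n_letters + 1
    if done:
        # Assumption: no action is valid after EOS.
        return [True] * n_actions
    if pad_idx not in state:
        # The sequence is full: only EOS remains valid.
        return [True] * n_letters + [False]
    # Assumption: any letter, and EOS, can be chosen otherwise.
    return [False] * n_actions
```

Under these assumptions, a full state such as `[1, 2, 3]` with three letters yields `[True, True, True, False]`.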
- get_parents(state=None, done=None, action=None)[source]
Determines all parents and actions that lead to state.
The GFlowNet graph is a tree and there is only one parent per state.
- Parameters:
state (tensor) – Input state. If None, self.state is used.
done (bool) – Whether the trajectory is done. If None, self.done is used.
action (None) – Ignored
- Returns:
parents (list) – List of parents in state format. This environment has a single parent per state.
actions (list) – List of actions that lead to state for each parent in parents. This environment has a single parent per state.
- Return type:
Tuple[List, List]
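Because each state has a single parent, the parent can be obtained by removing the last letter. The sketch below is a standalone illustration under two assumptions not stated above: a done state is treated as its own parent via the EOS action, and the empty sequence has no parents. The function signature is hypothetical.

```python
def get_parents(state, done, eos_action=(-1,), pad_idx=0):
    """Return (parents, actions) for a padded list of letter indices."""
    if done:
        # Assumption: a finished trajectory was reached from the same state via EOS.
        return [list(state)], [eos_action]
    length = sum(1 for idx in state if idx != pad_idx)
    if length == 0:
        # The empty sequence is the source state and has no parent.
        return [], []
    parent = list(state)
    last_letter = parent[length - 1]
    parent[length - 1] = pad_idx  # remove the last letter
    return [parent], [(last_letter,)]


parents, actions = get_parents([3, 1, 0, 0], done=False)
# parents == [[3, 0, 0, 0]], actions == [(1,)]
```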
- step(action, skip_mask_check=False)[source]
Executes step given an action.
- Parameters:
action (tuple) – Action to be executed. An action is a single-element tuple with the index of the letter to be added, or the EOS action.
skip_mask_check (bool) – If True, skip computing forward mask of invalid actions to check if the action is valid.
- Returns:
self.state (list) – The sequence after executing the action
action (tuple) – Action executed
valid (bool) – False if the action is not allowed for the current state, True otherwise.
- Return type:
[List[int], Tuple[int], bool]
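The following toy class sketches the step logic under simplifying assumptions: the class `TinyScrabble` is hypothetical, it performs its own validity check instead of using a mask, and it is not the gflownet implementation. It returns the same three values as described above.

```python
class TinyScrabble:
    """Toy stand-in for the environment; hypothetical, for illustration only."""

    EOS = (-1,)
    PAD = 0

    def __init__(self, n_letters=3, max_length=2):
        self.n_letters = n_letters
        self.max_length = max_length
        self.state = [self.PAD] * max_length
        self.done = False

    def step(self, action):
        """Apply the action if allowed and return (state, action, valid)."""
        if self.done:
            return self.state, action, False
        if action == self.EOS:
            self.done = True
            return self.state, action, True
        length = sum(1 for idx in self.state if idx != self.PAD)
        letter = action[0]
        if length >= self.max_length or not 1 <= letter <= self.n_letters:
            return self.state, action, False
        self.state[length] = letter  # write the letter into the first padded slot
        return self.state, action, True


env = TinyScrabble()
state, action, valid = env.step((2,))  # state == [2, 0], valid is True
```

An invalid action leaves the state unchanged and is reported through `valid` rather than by raising an exception.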
- states2proxy(states)[source]
Prepares a batch of states in “environment format” for a proxy: the batch is simply converted into a tensor of indices.
- Parameters:
states (list or tensor) – A batch of states in environment format, either as a list of states or as a list of tensors.
- Returns:
A tensor of indices containing all the states in the batch.
- Return type:
torchtyping.TensorType[batch, state_dim]
- states2policy(states)[source]
Prepares a batch of states in “environment format” for the policy model: states are one-hot encoded.
- Parameters:
states (list or tensor) – A batch of states in environment format, either as a list of states or as a list of tensors.
- Returns:
A tensor containing all the states in the batch.
- Return type:
torchtyping.TensorType[batch, policy_input_dim]
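One-hot encoding of a batch of states can be sketched in plain Python as follows. The layout is an assumption: each position is expanded into `n_letters + 1` classes, with class 0 reserved for padding, and the positions are concatenated. The actual method returns a tensor and may use a different layout; `one_hot_states` is a hypothetical name.

```python
def one_hot_states(states, n_letters):
    """One-hot encode each position of each state and flatten per state."""
    n_classes = n_letters + 1  # assumption: class 0 encodes padding
    batch = []
    for state in states:
        row = []
        for idx in state:
            one_hot = [0.0] * n_classes
            one_hot[idx] = 1.0
            row.extend(one_hot)
        batch.append(row)
    return batch


encoded = one_hot_states([[2, 0]], n_letters=2)
# encoded == [[0.0, 0.0, 1.0, 1.0, 0.0, 0.0]]
```

Under this layout, the policy input dimension would be `max_length * (n_letters + 1)`.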
- state2readable(state=None)[source]
Converts a state into a human-readable string.
The output string contains the letter corresponding to each index in the state, separated by spaces.
- Parameters:
state (List[int]) – A state in environment format. If None, self.state is used.
- Returns:
A string of space-separated letters.
- Return type:
str
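A minimal sketch of this conversion, assuming letters are indexed from 1 and that padding entries are omitted from the output (the documentation above does not say how padding is rendered). The standalone function below is hypothetical and takes the alphabet explicitly.

```python
def state2readable(state, letters, pad_idx=0):
    """Map each non-padding index to its letter and join with spaces."""
    return " ".join(letters[idx - 1] for idx in state if idx != pad_idx)


readable = state2readable([3, 1, 2, 0, 0], letters=("A", "B", "C"))
# readable == "C A B"
```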