gflownet.envs.scrabble

Scrabble environment: starting from an empty sequence, letters are added one by one up to a maximum length.

Attributes

LETTERS

Classes

Scrabble

Scrabble environment: sequences are constructed starting from an empty sequence and adding one letter at a time.

Module Contents

gflownet.envs.scrabble.LETTERS = ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',...[source]
class gflownet.envs.scrabble.Scrabble(letters=None, max_length=7, pad_token='0', **kwargs)[source]

Bases: gflownet.envs.base.GFlowNetEnv

Scrabble environment: sequences are constructed starting from an empty sequence and adding one letter at a time.

States are represented by a list of indices corresponding to each letter, starting from 1, and are padded with index 0.

Actions are represented by a single-element tuple with the index of the letter to be added. The EOS action is represented by (-1,).
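The state and action formats described above can be illustrated with a minimal standalone sketch. It does not import gflownet; the three-letter alphabet and the `encode` helper are hypothetical and serve only to show the indexing and padding convention.

```python
# Hypothetical small alphabet for illustration; not the module's LETTERS.
letters = ("A", "B", "C")
max_length = 7
pad_idx = 0
eos_action = (-1,)

# Letters are indexed from 1 so that 0 can be reserved for padding.
token2idx = {token: idx + 1 for idx, token in enumerate(letters)}


def encode(word):
    """Return the padded list of indices for a word (hypothetical helper)."""
    indices = [token2idx[token] for token in word]
    return indices + [pad_idx] * (max_length - len(indices))


source = encode("")   # the empty sequence: all padding
state = encode("CAB")

# One action per letter added, followed by the EOS action.
actions_taken = [(idx,) for idx in state if idx != pad_idx] + [eos_action]
```

Under these assumptions, the word CAB is built by three letter actions and closed by the EOS action.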

Parameters:
  • letters (Iterable)

  • max_length (int)

  • pad_token (str)

letters

A tuple containing the letters to form words. By default, LETTERS is used.

Type:

tuple

max_length[source]

Maximum length of the sequences. Default is 7, like in the standard game.

Type:

int

pad_token[source]

PAD token. Default: “0”.

Type:

str

pad_token = '0'[source]
n_letters[source]
max_length = 7[source]
eos_idx = -1[source]
pad_idx = 0[source]
idx2token[source]
token2idx[source]
source = [0, 0, 0, 0, 0, 0, 0][source]
eos[source]
get_action_space()[source]

Constructs a list with all possible actions, including EOS.

An action is represented by a single-element tuple indicating the index of the letter to be added to the current sequence (state).

For example, with two letters the action space is:

action_space: [(1,), (2,), (-1,)]

Return type:

List[Tuple]
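As a rough sketch of how such an action space could be assembled, assuming letter indices start at 1 (as in the state representation) and EOS is appended last. The function name `build_action_space` is hypothetical and not part of the library.

```python
def build_action_space(letters, eos_idx=-1):
    """Return one single-element tuple per letter, plus the EOS action.

    Assumption: letters are indexed from 1 and EOS comes last.
    """
    token2idx = {token: idx + 1 for idx, token in enumerate(letters)}
    return [(token2idx[token],) for token in letters] + [(eos_idx,)]


# Hypothetical three-letter alphabet.
action_space = build_action_space(("A", "B", "C"))
```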

get_mask_invalid_actions_forward(state=None, done=None)[source]
Returns a list with the same length as the action space, with values:
  • True if the forward action is invalid from the current state.

  • False otherwise.

Parameters:
  • state (tensor) – Input state. If None, self.state is used.

  • done (bool) – Whether the trajectory is done. If None, self.done is used.

Returns:

A list of boolean values.

Return type:

List[bool]
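The masking logic might look roughly like the sketch below. The rules are assumptions, not taken from the source: every action is invalid once the trajectory is done, only EOS is valid when the sequence is full, and everything is valid otherwise. EOS is also assumed to be the last entry of the action space.

```python
def sketch_mask_invalid_actions_forward(state, done, n_actions, max_length=7, pad_idx=0):
    """Return a list of booleans, True where the forward action is invalid.

    Hypothetical helper; assumes EOS is the last action in the action space.
    """
    if done:
        # Nothing can follow a finished trajectory.
        return [True] * n_actions
    length = sum(1 for idx in state if idx != pad_idx)
    if length < max_length:
        # There is still room: any letter, or EOS, may be chosen.
        return [False] * n_actions
    # The sequence is full: only EOS remains valid.
    return [True] * (n_actions - 1) + [False]


mask_partial = sketch_mask_invalid_actions_forward([1, 2, 0, 0, 0, 0, 0], False, n_actions=4)
mask_full = sketch_mask_invalid_actions_forward([1] * 7, False, n_actions=4)
mask_done = sketch_mask_invalid_actions_forward([1, 2, 0, 0, 0, 0, 0], True, n_actions=4)
```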

get_parents(state=None, done=None, action=None)[source]

Determines all parents and actions that lead to state.

The GFlowNet graph is a tree and there is only one parent per state.

Parameters:
  • state (tensor) – Input state. If None, self.state is used.

  • done (bool) – Whether the trajectory is done. If None, self.done is used.

  • action (None) – Ignored

Returns:

  • parents (list) – List of parents in state format. This environment has a single parent per state.

  • actions (list) – List of actions that lead to state for each parent in parents. This environment has a single parent per state.

Return type:

Tuple[List, List]
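Because each sequence can only be reached by appending its last letter, the parent lookup can be sketched as follows. This is an illustration under assumptions: the parent of a done state is taken to be the same state reached through EOS, and the source state is taken to have no parents. The helper name is hypothetical.

```python
def sketch_get_parents(state, done, eos=(-1,), pad_idx=0):
    """Return (parents, actions) for a padded state (hypothetical helper)."""
    if done:
        # Assumed: a finished trajectory was reached by EOS from the same state.
        return [state], [eos]
    length = sum(1 for idx in state if idx != pad_idx)
    if length == 0:
        # The empty sequence is the source and has no parent.
        return [], []
    parent = list(state)
    last_letter = parent[length - 1]
    parent[length - 1] = pad_idx  # undo the last added letter
    return [parent], [(last_letter,)]


parents, actions = sketch_get_parents([3, 1, 2, 0, 0, 0, 0], done=False)
```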

step(action, skip_mask_check=False)[source]

Executes step given an action.

Parameters:
  • action (tuple) – Action to be executed. An action is a single-element tuple with the index of the letter to be added, or (-1,) for EOS.

  • skip_mask_check (bool) – If True, skip computing the forward mask of invalid actions to check whether the action is valid.

Returns:

  • self.state (list) – The sequence after executing the action

  • action (tuple) – Action executed

  • valid (bool) – False if the action is not allowed for the current state; True otherwise.

Return type:

[List[int], Tuple[int], bool]
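The return convention (state, action, valid) can be illustrated with a stripped-down stand-in. `MiniScrabble` is hypothetical and is not the library class. It assumes that EOS is valid from any state that is not done, and that letter actions are rejected once the sequence is full.

```python
class MiniScrabble:
    """Hypothetical, simplified stand-in used only to illustrate step()."""

    def __init__(self, n_letters=3, max_length=7):
        self.n_letters = n_letters
        self.max_length = max_length
        self.eos = (-1,)
        self.state = [0] * max_length
        self.done = False

    def step(self, action):
        """Apply the action and return (state, action, valid)."""
        if self.done:
            return self.state, action, False
        if action == self.eos:
            self.done = True
            return self.state, action, True
        length = sum(1 for idx in self.state if idx != 0)
        is_letter = 1 <= action[0] <= self.n_letters
        if not is_letter or length >= self.max_length:
            return self.state, action, False
        # Write the letter into the first padding position.
        self.state[length] = action[0]
        return self.state, action, True


env = MiniScrabble()
env.step((3,))
env.step((1,))
state, action, valid = env.step((-1,))
_, _, valid_after_done = env.step((2,))
```

In this sketch an invalid action leaves the state unchanged and is reported through the `valid` flag rather than by raising an exception.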

states2proxy(states)[source]

Prepares a batch of states in “environment format” for a proxy: the batch is simply converted into a tensor of indices.

Parameters:

states (list or tensor) – A batch of states in environment format, either as a list of states or as a list of tensors.

Returns:

A tensor containing all the states in the batch, each represented by its letter indices.

Return type:

torchtyping.TensorType[batch, state_dim]

states2policy(states)[source]

Prepares a batch of states in “environment format” for the policy model: states are one-hot encoded.

Parameters:

states (list or tensor) – A batch of states in environment format, either as a list of states or as a list of tensors.

Returns:

A tensor containing all the states in the batch.

Return type:

torchtyping.TensorType[batch, policy_input_dim]
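A plain-Python sketch of the one-hot encoding idea follows. The actual method returns a tensor, whereas this illustration uses nested lists. It also assumes that each position is encoded over n_letters + 1 classes (the extra class being the padding index 0) and that positions are concatenated into one flat vector; neither detail is confirmed by the text above.

```python
def sketch_one_hot_states(states, n_letters):
    """One-hot encode each position of each state and flatten (hypothetical helper)."""
    n_classes = n_letters + 1  # assumed: one extra class for the padding index 0
    batch = []
    for state in states:
        encoded = []
        for idx in state:
            row = [0.0] * n_classes
            row[idx] = 1.0
            encoded.extend(row)
        batch.append(encoded)
    return batch


# A toy batch with one state of length 3 over a two-letter alphabet.
policy_input = sketch_one_hot_states([[2, 1, 0]], n_letters=2)
```

Under these assumptions the policy input dimension would be max_length * (n_letters + 1).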

state2readable(state=None)[source]

Converts a state into a human-readable string.

The output string contains the letter corresponding to each index in the state, separated by spaces.

Parameters:
  • state (List[int]) – A state in environment format. If None, self.state is used.

Returns:

A string of space-separated letters.

Return type:

str

readable2state(readable)[source]

Converts a state in readable format into the “environment format” (a list of indices).

Parameters:

readable (str) – A state in readable format - space-separated letters.

Returns:

A list containing the indices of the letters, padded to the maximum length.

Return type:

List[int]
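The two conversions between environment format and readable format can be sketched as a round trip. The helper names and the three-letter alphabet are hypothetical. The sketch also assumes that padding positions are omitted from the readable string and restored when converting back.

```python
# Hypothetical small alphabet for illustration; not the module's LETTERS.
letters = ("A", "B", "C")
max_length = 7
pad_idx = 0
token2idx = {token: idx + 1 for idx, token in enumerate(letters)}
idx2token = {idx: token for token, idx in token2idx.items()}


def to_readable(state):
    """Join the letters of a padded state with spaces, skipping padding."""
    return " ".join(idx2token[idx] for idx in state if idx != pad_idx)


def from_readable(readable):
    """Parse space-separated letters back into a padded list of indices."""
    indices = [token2idx[token] for token in readable.split()]
    return indices + [pad_idx] * (max_length - len(indices))


readable = to_readable([3, 1, 2, 0, 0, 0, 0])
restored = from_readable(readable)
```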

get_uniform_terminating_states(n_states, seed=None)[source]

Constructs a batch of n_states states sampled uniformly from the sample space of the environment.

Parameters:
  • n_states (int) – The number of states to sample.

  • seed (int) – Random seed.

Return type:

List[List[int]]
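One way to sample uniformly over all sequences is to draw the length first, weighted by the number of sequences of that length, and then draw each letter uniformly. The sketch below follows that idea. It assumes that the sample space is every sequence of length 1 to max_length (the empty sequence is excluded); the library may implement this differently. The function name is hypothetical.

```python
import random


def sketch_uniform_terminating_states(n_states, n_letters=3, max_length=7, seed=None):
    """Sample padded states uniformly over all non-empty sequences (hypothetical helper)."""
    rng = random.Random(seed)
    lengths = list(range(1, max_length + 1))
    # There are n_letters**length sequences of each length, so lengths are
    # weighted accordingly to make every individual sequence equally likely.
    weights = [n_letters**length for length in lengths]
    states = []
    for length in rng.choices(lengths, weights=weights, k=n_states):
        sequence = [rng.randint(1, n_letters) for _ in range(length)]
        states.append(sequence + [0] * (max_length - length))
    return states


samples = sketch_uniform_terminating_states(5, seed=0)
```

Drawing the length uniformly instead would over-represent short sequences, since there are far fewer of them.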