gflownet.envs.scrabble

Scrabble environment: starting from an empty sequence, letters are added one by one up to a maximum length.

Attributes

LETTERS

Classes

Scrabble

Scrabble environment: sequences are constructed starting from an empty sequence and adding one letter at a time.

Module Contents

gflownet.envs.scrabble.LETTERS = ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S',...[source]

class gflownet.envs.scrabble.Scrabble(letters=None, max_length=7, pad_token='0', **kwargs)[source]

Bases: gflownet.envs.base.GFlowNetEnv

Scrabble environment: sequences are constructed starting from an empty sequence and adding one letter at a time.

States are represented by a list of indices corresponding to each letter, starting from 1, and are padded with index 0.

Actions are represented by a single-element tuple with the index of the letter to be added. The EOS action is represented by (-1,).
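The state and action encodings described above can be illustrated with a small standalone sketch. It does not import gflownet. The helper names `encode_word` and `build_action_space` are hypothetical, the alphabet is assumed to be the 26 uppercase letters, and the ordering of the action list (letters first, EOS last) is an assumption.

```python
import string

# Assumption: the default alphabet is the 26 uppercase letters.
LETTERS = tuple(string.ascii_uppercase)
MAX_LENGTH = 7
PAD_IDX = 0
EOS_IDX = -1

# Letter indices start from 1; index 0 is reserved for padding.
token2idx = {letter: idx + 1 for idx, letter in enumerate(LETTERS)}


def encode_word(word):
    """Hypothetical helper: map a word to a padded list of letter indices."""
    if len(word) > MAX_LENGTH:
        raise ValueError("word is longer than MAX_LENGTH")
    indices = [token2idx[letter] for letter in word]
    return indices + [PAD_IDX] * (MAX_LENGTH - len(indices))


def build_action_space():
    """Hypothetical helper: one single-element tuple per letter, plus EOS."""
    return [(idx,) for idx in token2idx.values()] + [(EOS_IDX,)]


state = encode_word("CAT")
actions = build_action_space()
print(state)         # [3, 1, 20, 0, 0, 0, 0]
print(len(actions))  # 27
```

With this encoding, the empty sequence is the all-zero list, which matches the `source` attribute listed below.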

Parameters:

letters (Iterable)
max_length (int)
pad_token (str)

letters

A tuple containing the letters used to form words. By default, LETTERS is used.

Type:: tuple

max_length[source]

Maximum length of the sequences. Default is 7, like in the standard game.

Type:: int

pad_token[source]

PAD token. Default: “0”.

Type:: str

pad_token = '0'[source]

n_letters[source]

max_length = 7[source]

eos_idx = -1[source]

pad_idx = 0[source]

idx2token[source]

token2idx[source]

source = [0, 0, 0, 0, 0, 0, 0][source]

eos[source]

get_action_space()[source]

Constructs a list with all possible actions, including EOS.

An action is represented by a single-element tuple indicating the index of the letter to be added to the current sequence (state).

The action space of this parent class is:: action_space: [(0,), (1,), (-1,)]

Return type:: List[Tuple]

get_mask_invalid_actions_forward(state=None, done=None)[source]

Returns a list with the same length as the action space, with values:

True if the forward action is invalid from the current state.
False otherwise.

Parameters:

state (tensor) – Input state. If None, self.state is used.
done (bool) – Whether the trajectory is done. If None, self.done is used.

Returns:

A list of boolean values.

Return type:

List[bool]
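A minimal standalone sketch of how such a mask could be built is given here. It is not the library's implementation: the function name is hypothetical, the action order (letters first, EOS last) is assumed, and the rules marked as assumptions in the comments are guesses about sensible behaviour rather than documented facts.

```python
def mask_invalid_actions_forward(state, done, n_letters, pad_idx=0):
    """Hypothetical sketch: one boolean per action, True meaning invalid.

    Assumed action order: the n_letters letter actions first, EOS last.
    """
    n_actions = n_letters + 1
    if done:
        # Assumption: no action is valid once the trajectory is done.
        return [True] * n_actions
    if pad_idx not in state:
        # Assumption: a full sequence can only be terminated with EOS.
        return [True] * n_letters + [False]
    # Assumption: otherwise any letter can be added and EOS is allowed.
    return [False] * n_actions


mask_open = mask_invalid_actions_forward([3, 1, 0], done=False, n_letters=3)
mask_full = mask_invalid_actions_forward([3, 1, 2], done=False, n_letters=3)
mask_done = mask_invalid_actions_forward([3, 1, 0], done=True, n_letters=3)
print(mask_open)  # [False, False, False, False]
print(mask_full)  # [True, True, True, False]
print(mask_done)  # [True, True, True, True]
```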

get_parents(state=None, done=None, action=None)[source]

Determines all parents and actions that lead to state.

The GFlowNet graph is a tree and there is only one parent per state.

Parameters:

state (tensor) – Input state. If None, self.state is used.
done (bool) – Whether the trajectory is done. If None, self.done is used.
action (None) – Ignored

Returns:

parents (list) – List of parents in state format. This environment has a single parent per state.
actions (list) – List of actions that lead to state for each parent in parents. This environment has a single parent per state.

Return type:

Tuple[List, List]
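Because letters are only ever appended, the single parent of a state can be recovered by removing its last letter. The sketch below illustrates that idea in plain Python. The function name is hypothetical, and the handling of done states and of the empty source state are assumptions, not documented behaviour.

```python
def get_single_parent(state, done, pad_idx=0, eos_idx=-1):
    """Hypothetical sketch: return ([parent], [action]) for a padded state."""
    if done:
        # Assumption: a terminated state is reached from the same
        # sequence through the EOS action.
        return [list(state)], [(eos_idx,)]
    letters = [idx for idx in state if idx != pad_idx]
    if not letters:
        # Assumption: the empty source state has no parent.
        return [], []
    last_letter = letters[-1]
    # Drop the last letter and pad back to the original length.
    parent = letters[:-1] + [pad_idx] * (len(state) - len(letters) + 1)
    return [parent], [(last_letter,)]


parents, parent_actions = get_single_parent([3, 1, 20, 0, 0, 0, 0], done=False)
print(parents)         # [[3, 1, 0, 0, 0, 0, 0]]
print(parent_actions)  # [(20,)]
```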

step(action, skip_mask_check=False)[source]

Executes step given an action.

Parameters:

action (tuple) – Action to be executed. An action is a single-element tuple containing the index of the letter to be added, or (-1,) for EOS.
skip_mask_check (bool) – If True, skip computing forward mask of invalid actions to check if the action is valid.

Returns:

self.state (list) – The sequence after executing the action
action (tuple) – Action executed
valid (bool) – False, if the action is not allowed for the current state.

Return type:

[List[int], Tuple[int], bool]
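The step semantics can be sketched with a toy stand-in class. `ToyScrabble` is hypothetical and is not the gflownet class: it only mirrors the documented return triple (state, action, valid), and its validity rules are assumptions made for illustration.

```python
class ToyScrabble:
    """Hypothetical stand-in for the environment; not the gflownet class."""

    def __init__(self, n_letters=26, max_length=7):
        self.n_letters = n_letters
        self.max_length = max_length
        self.state = [0] * max_length  # the empty sequence, padded with 0
        self.done = False

    def step(self, action):
        (idx,) = action
        n_filled = sum(1 for i in self.state if i != 0)
        if self.done:
            # Assumption: no action is valid after EOS.
            return self.state, action, False
        if idx == -1:
            # EOS ends the trajectory without changing the sequence.
            self.done = True
            return self.state, action, True
        if not 1 <= idx <= self.n_letters or n_filled >= self.max_length:
            # Unknown letter index, or the sequence is already full.
            return self.state, action, False
        self.state[n_filled] = idx
        return self.state, action, True


env = ToyScrabble()
results = [env.step(action) for action in [(3,), (1,), (20,), (-1,)]]
print(env.state, env.done)           # [3, 1, 20, 0, 0, 0, 0] True
print([valid for _, _, valid in results])  # [True, True, True, True]
print(env.step((5,))[2])             # False
```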

states2proxy(states)[source]

Prepares a batch of states in “environment format” for a proxy: the batch is simply converted into a tensor of indices.

Parameters:: states (list or tensor) – A batch of states in environment format, either as a list of states or as a list of tensors.
Returns:: A tensor of indices containing all the states in the batch.
Return type:: torchtyping.TensorType[batch, state_dim]

states2policy(states)[source]

Prepares a batch of states in “environment format” for the policy model: states are one-hot encoded.

Parameters:: states (list or tensor) – A batch of states in environment format, either as a list of states or as a list of tensors.
Returns:: A tensor containing all the states in the batch.
Return type:: torchtyping.TensorType[batch, policy_input_dim]
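One-hot encoding of a batch of states can be sketched as follows, using plain Python lists instead of tensors so that the example has no dependencies. The function name is hypothetical, and the layout (one block of n_letters + 1 slots per position, with slot 0 standing for padding) is an assumption; the actual policy input dimension may differ.

```python
def states_to_one_hot(states, n_letters):
    """Hypothetical sketch: flatten each state into a one-hot vector.

    Assumption: each position gets n_letters + 1 slots, slot 0 being padding.
    """
    n_tokens = n_letters + 1
    batch = []
    for state in states:
        row = [0.0] * (len(state) * n_tokens)
        for position, idx in enumerate(state):
            # Set the slot of this position's token to 1.
            row[position * n_tokens + idx] = 1.0
        batch.append(row)
    return batch


encoded = states_to_one_hot([[2, 1, 0]], n_letters=2)
print(encoded)
# [[0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]]
```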

state2readable(state=None)[source]

Converts a state into a human-readable string.

The output string contains the letter corresponding to each index in the state, separated by spaces.

Parameters:

state (List[int]) – A state in environment format. If None, self.state is used.

Returns:

A string of space-separated letters.

Return type:

str

readable2state(readable)[source]

Converts a state in readable format into the “environment format” (a list of letter indices).

Parameters:: readable (str) – A state in readable format - space-separated letters.
Returns:: A list containing the indices of the letters.
Return type:: List[int]
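The conversion between the two formats can be sketched as a round trip in plain Python. The function names below are hypothetical stand-ins, the alphabet is assumed to be A to Z, and two behaviours are assumptions: padding is dropped from the readable string, and the result of the reverse conversion is padded back to the maximum length.

```python
import string

# Assumption: the default alphabet is the 26 uppercase letters.
LETTERS = tuple(string.ascii_uppercase)
MAX_LENGTH = 7
PAD_IDX = 0

token2idx = {letter: idx + 1 for idx, letter in enumerate(LETTERS)}
idx2token = {idx: letter for letter, idx in token2idx.items()}


def state_to_readable(state):
    """Hypothetical sketch: space-separated letters, padding dropped."""
    return " ".join(idx2token[idx] for idx in state if idx != PAD_IDX)


def readable_to_state(readable):
    """Hypothetical sketch: parse space-separated letters and pad with 0."""
    indices = [token2idx[token] for token in readable.split()]
    return indices + [PAD_IDX] * (MAX_LENGTH - len(indices))


readable = state_to_readable([3, 1, 20, 0, 0, 0, 0])
restored = readable_to_state(readable)
print(readable)  # C A T
print(restored)  # [3, 1, 20, 0, 0, 0, 0]
```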

get_uniform_terminating_states(n_states, seed=None)[source]

Constructs a batch of n_states states sampled uniformly from the sample space of the environment.

Parameters:

n_states (int) – The number of states to sample.
seed (int) – Random seed.

Return type:

List[List[int]]
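One way to draw sequences uniformly is sketched below. It is not the library's implementation: the function name is hypothetical, the sample space is assumed to be all sequences of length 1 to max_length (excluding the empty sequence), and Python's `random` module is used in place of whatever random number generator the environment relies on.

```python
import random


def sample_uniform_terminating_states(n_states, n_letters=26, max_length=7, seed=None):
    """Hypothetical sketch: sample padded sequences uniformly at random."""
    rng = random.Random(seed)
    lengths = list(range(1, max_length + 1))
    # There are n_letters ** length sequences of each length, so weighting
    # the length by that count makes every sequence equally likely.
    weights = [n_letters**length for length in lengths]
    states = []
    for _ in range(n_states):
        length = rng.choices(lengths, weights=weights, k=1)[0]
        letters = [rng.randint(1, n_letters) for _ in range(length)]
        states.append(letters + [0] * (max_length - length))
    return states


batch = sample_uniform_terminating_states(5, seed=0)
print(len(batch), len(batch[0]))  # 5 7
```

Most samples have the maximum length under this scheme, since long sequences vastly outnumber short ones.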