gflownet.envs.tetris
====================

.. py:module:: gflownet.envs.tetris

.. autoapi-nested-parse::

   An environment inspired by the game of Tetris.


Attributes
----------

.. autoapisummary::

   gflownet.envs.tetris.PIECES
   gflownet.envs.tetris.PIECES_COLORS


Classes
-------

.. autoapisummary::

   gflownet.envs.tetris.Tetris


Module Contents
---------------

.. py:data:: PIECES

.. py:data:: PIECES_COLORS

.. py:class:: Tetris(width = 10, height = 20, pieces = ['I', 'J', 'L', 'O', 'S', 'T', 'Z'], rotations = [0, 90, 180, 270], allow_redundant_rotations = False, allow_eos_before_full = False, **kwargs)

   Bases: :py:obj:`gflownet.envs.base.GFlowNetEnv`


   Tetris environment: an environment inspired by the game of Tetris.

   It is not meant to be a playable game, but rather a toy environment with an
   intuitive state and action space.

   The state space is the set of 2D boards with all the combinations of pieces on
   them. Pieces that are added to the board are identified by a number that starts
   from piece_idx * max_pieces_per_type and is incremented by 1 with each new piece
   of the same type. This number fills in the cells of the board where the piece is
   located, which makes it possible to tell apart pieces of the same type.

   The action space is the choice of a piece, its rotation and the horizontal
   location at which to drop the piece. The action space may be constrained
   according to needs.

   .. attribute:: width

      Width of the board.

      :type: int

   .. attribute:: height

      Height of the board.

      :type: int

   .. attribute:: pieces

      Pieces to use, identified by [I, J, L, O, S, T, Z]

      :type: list

   .. attribute:: rotations

      Valid rotations, from [0, 90, 180, 270]

      :type: list


   .. py:attribute:: device


   .. py:attribute:: int


   .. py:attribute:: width
      :value: 10


   .. py:attribute:: height
      :value: 20


   .. py:attribute:: pieces
      :value: ['I', 'J', 'L', 'O', 'S', 'T', 'Z']


   .. py:attribute:: rotations
      :value: [0, 90, 180, 270]


   .. py:attribute:: allow_redundant_rotations
      :value: False


   .. py:attribute:: allow_eos_before_full
      :value: False


   .. py:attribute:: max_pieces_per_type
      :value: 100


   .. py:attribute:: piece2idx


   .. py:attribute:: idx2piece


   .. py:attribute:: piece2mat


   .. py:attribute:: rot2idx


   .. py:attribute:: source


   .. py:attribute:: eos


   .. py:attribute:: piece_rotation_mat


   .. py:attribute:: piece_rotation_mask_mat


   .. py:method:: get_action_space()

      Constructs a list with all possible actions, including eos.

      An action is represented by a tuple of length 3 (piece, rotation, col). The
      piece is represented by its index, the rotation by the integer rotation in
      degrees and the location by the horizontal cell of the board occupied by the
      left-most part of the piece.


   .. py:method:: get_mask_invalid_actions_forward(state = None, done = None)

      Returns a list with one entry per action in the action space, with values:

      - True if the forward action is invalid from the current state.
      - False otherwise.


   .. py:method:: states2proxy(states)

      Prepares a batch of states in "environment format" for a proxy: simply
      converts non-zero (non-empty) cells into 1s.

      :param states: A batch of states in environment format, either as a list of
                     states or as a single tensor.
      :type states: list of 2D tensors or 3D tensor

      :returns: *A tensor containing all the states in the batch.*


   .. py:method:: states2policy(states)

      Prepares a batch of states in "environment format" for the policy model.

      See states2proxy().

      :param states: A batch of states in environment format, either as a list of
                     states or as a single tensor.
      :type states: list of 2D tensors or 3D tensor

      :returns: *A tensor containing all the states in the batch.*


   .. py:method:: state2readable(state = None)

      Converts a state (board) into a human-friendly string.


   .. py:method:: readable2state(readable, alphabet={})

      Converts a human-readable string representing a state into a state (board).


   .. py:method:: get_parents(state = None, done = None, action = None)

      Determines all parents and actions that lead to state.

      See: _is_parent_action()

      :param state: Representation of a state (board).
      :type state: list
      :param done: Whether the trajectory is done. If None, done is taken from the
                   instance.
      :type done: bool
      :param action: Ignored
      :type action: None

      :returns: * **parents** (*list*) -- List of parents in state format
                * **actions** (*list*) -- List of actions that lead to state for
                  each parent in parents


   .. py:method:: step(action, skip_mask_check = False)

      Executes a step given an action.

      :param action: Action to be executed. An action is a tuple of three int
                     values (piece, rotation, col); see get_action_space().
      :type action: tuple
      :param skip_mask_check: If True, skip computing the forward mask of invalid
                              actions to check whether the action is valid.
      :type skip_mask_check: bool

      :returns: * **self.state** (*list*) -- The state after executing the action
                * **action** (*tuple*) -- Action executed
                * **valid** (*bool*) -- False if the action is not allowed for the
                  current state.


   .. py:method:: set_state(state, done = False)

      Sets the state and done.

      If done is True but incompatible with state (that is, done is True,
      allow_eos_before_full is False and the state is not full), done is forced to
      False and a warning is printed. The state is also converted to a tensor if
      needed.


   .. py:method:: plot_samples_topk(samples, rewards, k_top = 10, n_rows = 2, dpi = 150, **kwargs)

      Plots the Tetris boards of the top K samples.

      :param samples: List of terminating states sampled from the policy.
      :type samples: list
      :param rewards: Rewards of the samples.
      :type rewards: list
      :param k_top: The number of samples that will be included in the plot. The
                    k_top samples with the highest reward are selected.
      :type k_top: int
      :param n_rows: Number of rows in the plot. The number of columns is
                     calculated from n_rows and k_top.
      :type n_rows: int
      :param dpi: DPI (dots per inch) of the figure, which determines the
                  resolution.
      :type dpi: int
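
The piece numbering described for the Tetris class and the binarization described for states2proxy() can be illustrated with a minimal, library-free sketch. It is not the code of gflownet.envs.tetris: the helper names (piece_id, piece_type_of, to_proxy) are hypothetical, boards are plain nested lists rather than tensors, and the sketch assumes that piece indices start at 1, so that no piece is ever assigned the ID 0 reserved for empty cells.

```python
# Illustrative helpers only; none of these names exist in gflownet.envs.tetris.
PIECES = ["I", "J", "L", "O", "S", "T", "Z"]
MAX_PIECES_PER_TYPE = 100  # documented default of max_pieces_per_type

# Assumption: piece indices start at 1 so that no piece gets ID 0 (empty cell).
PIECE2IDX = {piece: idx for idx, piece in enumerate(PIECES, start=1)}


def piece_id(piece, n_already_placed):
    """ID written into the cells of the next piece of the given type."""
    return PIECE2IDX[piece] * MAX_PIECES_PER_TYPE + n_already_placed


def piece_type_of(cell_id):
    """Recover the piece letter from a non-zero cell ID."""
    return PIECES[cell_id // MAX_PIECES_PER_TYPE - 1]


def to_proxy(board):
    """Binarize a board: non-zero (non-empty) cells become 1."""
    return [[1 if cell != 0 else 0 for cell in row] for row in board]


# A 3x4 toy board holding two O pieces side by side on the bottom rows.
first_o = piece_id("O", 0)   # 400
second_o = piece_id("O", 1)  # 401
board = [
    [0, 0, 0, 0],
    [first_o, first_o, second_o, second_o],
    [first_o, first_o, second_o, second_o],
]
binary_board = to_proxy(board)
```

Under these assumptions, the two O pieces remain distinguishable on the board (400 versus 401), while the binarized board only records which cells are occupied.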