gflownet.utils.molecule.featurizer
Attributes
Classes
MolDGLFeaturizer: A class for converting an RDKit molecule into a DGL graph, featurizing chemical properties as node and edge features
Module Contents
- class gflownet.utils.molecule.featurizer.MolDGLFeaturizer(atom_types)[source]
A class for converting an RDKit molecule into a DGL graph, featurizing chemical properties as node and edge features.
- one_hot_encode(value, choices)[source]
Creates a one-hot encoding vector.
- Parameters:
value – The value whose position in the encoding should be set to one
choices – A list of possible values
- Returns:
A one-hot encoding of the value as a 1D torch float tensor of length len(choices)
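The encoding described above can be illustrated in plain Python. This is only a sketch of the idea, not the method's implementation: `one_hot_sketch` is a hypothetical name, it returns a list of floats rather than a torch tensor, and returning all zeros for a value that is not in `choices` is an assumption, since the docstring does not say how that case is handled.

```python
def one_hot_sketch(value, choices):
    """Return a list with 1.0 at the position of `value` in `choices` and 0.0 elsewhere."""
    # Assumption: a value missing from `choices` yields an all-zero vector.
    return [1.0 if value == choice else 0.0 for choice in choices]


encoding = one_hot_sketch("N", ["C", "N", "O"])
print(encoding)  # [0.0, 1.0, 0.0]
```

The result always has length len(choices), which is what makes the per-atom and per-bond feature sizes fixed.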
- get_node_features(mol)[source]
Simple atom featurization. Considered features:
- one-hot of the atom type index from the self.atom_types tuple
- atomic number (as a float)
- one-hot of the atom degree
- one-hot of the atom hybridization
- Other atom features to add in the future:
.GetIsAromatic(), .GetImplicitValence(), .GetFormalCharge(), .IsAtomInRingOfSize(…) with various sizes, some other information about rings (see the torsional diff code), and possibly some others from Chenghao’s code
- Parameters:
mol – The rdkit.Chem.rdchem.Mol object
- Returns:
a torch.Tensor of node features of shape [number of atoms, node feature size]
(the ordering of the nodes is given by the RDKit atom order in mol)
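As a rough illustration of how the listed features combine into one row per atom, here is a plain-Python sketch. It does not use RDKit or torch. The names `node_features_sketch`, `ATOM_TYPES`, `DEGREES` and `HYBRIDIZATIONS` are hypothetical, as are the choice lists and the hand-written atom tuples. Whether self.atom_types holds element symbols or something else is also an assumption here.

```python
def one_hot_sketch(value, choices):
    """Plain-list one-hot: 1.0 where `value` matches, 0.0 elsewhere."""
    return [1.0 if value == choice else 0.0 for choice in choices]


# Hypothetical choice lists; the real featurizer may use different ones.
ATOM_TYPES = ("C", "N", "O")
DEGREES = (0, 1, 2, 3, 4)
HYBRIDIZATIONS = ("SP", "SP2", "SP3")


def node_features_sketch(atoms):
    """Build one feature row per atom, keeping the input atom order."""
    rows = []
    for symbol, atomic_number, degree, hybridization in atoms:
        row = (
            one_hot_sketch(symbol, ATOM_TYPES)               # atom type
            + [float(atomic_number)]                         # atomic number as a float
            + one_hot_sketch(degree, DEGREES)                # atom degree
            + one_hot_sketch(hybridization, HYBRIDIZATIONS)  # hybridization
        )
        rows.append(row)
    return rows


# Heavy atoms of ethanol (C-C-O), written out by hand instead of read from RDKit.
ethanol_atoms = [("C", 6, 1, "SP3"), ("C", 6, 2, "SP3"), ("O", 8, 1, "SP3")]
features = node_features_sketch(ethanol_atoms)
print(len(features), len(features[0]))  # 3 12
```

Under these assumptions the node feature size is 3 + 1 + 5 + 3 = 12, so the result corresponds to the documented shape [number of atoms, node feature size] = [3, 12].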
- get_edges_and_edge_features(mol)[source]
Simple edge extraction and featurization. Considered features:
- one-hot of the bond type
Edges are directional (because of the DGL framework): each bond in mol gives rise to two directional edges, which appear one after the other in the output.
- Parameters:
mol – The rdkit.Chem.rdchem.Mol object
- Returns:
edges and edge_features, where
edges is a tuple of two lists (source nodes and destination nodes), each of length 2 * number of bonds in mol
edge_features is a torch.Tensor of the considered edge features (shape [2 * number of bonds, edge feature size])
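The doubling of bonds into directional edges can be sketched in plain Python. This is an illustration under stated assumptions, not the method's code: `edges_sketch` and `BOND_TYPES` are hypothetical names, bonds are given as hand-written (begin, end, bond type) tuples rather than read from an RDKit molecule, and placing the begin-to-end edge before the end-to-begin edge is an assumption about the ordering within each pair.

```python
# Hypothetical list of bond types used for the one-hot encoding.
BOND_TYPES = ("SINGLE", "DOUBLE", "TRIPLE", "AROMATIC")


def edges_sketch(bonds):
    """Turn undirected bonds into paired directional edges with one-hot features."""
    sources, destinations, edge_features = [], [], []
    for begin, end, bond_type in bonds:
        feature = [1.0 if bond_type == choice else 0.0 for choice in BOND_TYPES]
        # Each bond yields two consecutive edges: begin -> end, then end -> begin.
        sources += [begin, end]
        destinations += [end, begin]
        # Both directions carry the same bond-type feature.
        edge_features += [feature, list(feature)]
    return (sources, destinations), edge_features


bonds = [(0, 1, "SINGLE"), (1, 2, "DOUBLE")]
(edge_src, edge_dst), edge_feats = edges_sketch(bonds)
print(edge_src)  # [0, 1, 1, 2]
print(edge_dst)  # [1, 0, 2, 1]
```

With two bonds, both lists have length 4 and `edge_feats` has 4 rows, matching the documented 2 * number of bonds.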
- mol2dgl(mol)[source]
Converts rdkit.Chem.rdchem.Mol to dgl.heterograph.DGLHeteroGraph. Takes into account the chemical properties of atoms and bonds, without considering their 3D positions (conformers of the molecule are not used here).
- Parameters:
mol – The rdkit.Chem.rdchem.Mol object
- Returns:
dgl.heterograph.DGLHeteroGraph with .ndata and .edata containing atom and bond features