gflownet.evaluator.abstract
Abstract evaluator class for GFlowNetAgent.
Warning
Should not be used directly, but subclassed to implement specific evaluators for different tasks and environments.
See BaseEvaluator for a default,
concrete implementation of this abstract class.
This class handles some logic that will be the same for all evaluators.
The only requirements for a subclass are to implement the
eval() and
plot() methods
which will be called by the
eval_and_log() method:
def eval_and_log(self, it, metrics=None):
    results = self.eval(metrics=metrics)
    for m, v in results["metrics"].items():
        setattr(self.gfn, m, v)
    metrics_to_log = {
        METRICS[k]["display_name"]: v for k, v in results["metrics"].items()
    }
    figs = self.plot(**results["data"])
    self.logger.log_metrics(metrics_to_log, it, self.gfn.use_context)
    self.logger.log_plots(figs, it, use_context=self.gfn.use_context)
See gflownet.evaluator for a full-fledged example and
gflownet.evaluator.base for a concrete implementation of this abstract class.
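The contract above can be illustrated without the library. The sketch below is a simplified stand-in, not the gflownet classes: `SketchAbstractEvaluator`, `ToyEvaluator`, `RecordingLogger` and the `mean_reward` metric are hypothetical names, the logger only records what it receives, and the `setattr` / `use_context` handling of the real `eval_and_log()` is left out. It shows how the `"metrics"` and `"data"` keys returned by `eval()` flow into the logger and into `plot()`.

```python
from abc import ABC, abstractmethod

# Hypothetical registry in the documented METRICS layout.
METRICS = {
    "mean_reward": {"display_name": "Mean reward", "requirements": ["samples"]},
}


class RecordingLogger:
    """Stand-in logger that stores calls instead of sending them anywhere."""

    def __init__(self):
        self.metrics = []
        self.plots = []

    def log_metrics(self, metrics, it):
        self.metrics.append((it, metrics))

    def log_plots(self, figs, it):
        self.plots.append((it, figs))


class SketchAbstractEvaluator(ABC):
    """Simplified stand-in: keeps only the eval / plot / eval_and_log contract."""

    def __init__(self, logger):
        self.logger = logger

    @abstractmethod
    def eval(self, metrics=None):
        ...

    @abstractmethod
    def plot(self, **kwargs):
        ...

    def eval_and_log(self, it, metrics=None):
        results = self.eval(metrics=metrics)
        # Translate metric names into display names before logging.
        metrics_to_log = {
            METRICS[k]["display_name"]: v for k, v in results["metrics"].items()
        }
        figs = self.plot(**results["data"])
        self.logger.log_metrics(metrics_to_log, it)
        self.logger.log_plots(figs, it)


class ToyEvaluator(SketchAbstractEvaluator):
    def eval(self, metrics=None):
        rewards = [0.2, 0.4, 0.9]  # made-up data for illustration
        return {
            "metrics": {"mean_reward": sum(rewards) / len(rewards)},
            "data": {"rewards": rewards},
        }

    def plot(self, rewards):
        # A real evaluator would return figures; a string avoids a plotting dependency.
        return {"reward_summary": f"{len(rewards)} rewards"}


logger = RecordingLogger()
ToyEvaluator(logger).eval_and_log(it=10)
```

Whatever `eval()` places under `"data"` must match the keyword arguments that `plot()` accepts, since the dict is unpacked directly into the call.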
Attributes
METRICS | All metrics that can be computed by a BaseEvaluator.
ALL_REQS | Union of all requirements of all metrics in METRICS.
Classes
AbstractEvaluator | Abstract evaluator class for GFlowNetAgent.
Module Contents
- gflownet.evaluator.abstract.METRICS
All metrics that can be computed by a BaseEvaluator.
Structured as a dict with the metric names as keys and the metric display names and requirements as values.
Requirements are used to decide which kind of data / samples is required to compute the metric.
Display names are used to log the metrics and to display them in the console.
Implementations of AbstractEvaluator can add new metrics to this dict by implementing the method AbstractEvaluator.define_new_metrics().
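As a rough illustration of this structure, the sketch below builds a small dict in the documented shape and derives the union of requirements from it. The display names and the `mean_reward` / `samples` entries are made up for the example; only the `display_name` and `requirements` keys follow the description above, and the real contents of `METRICS` are defined in the module source.

```python
# Hypothetical entries in the documented shape: name -> display name + requirements.
EXAMPLE_METRICS = {
    "l1": {"display_name": "L1 error", "requirements": ["density"]},
    "kl": {"display_name": "KL divergence", "requirements": ["density"]},
    "mean_reward": {"display_name": "Mean reward", "requirements": ["samples"]},
}

# Union of the requirements of every metric, as a set of strings.
EXAMPLE_ALL_REQS = {
    req for spec in EXAMPLE_METRICS.values() for req in spec["requirements"]
}

print(sorted(EXAMPLE_ALL_REQS))
```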
- class gflownet.evaluator.abstract.AbstractEvaluator(gfn_agent=None, **config)
Abstract evaluator class for GFlowNetAgent.
In charge of evaluating the GFlowNetAgent, computing metrics, plotting figures and optionally logging results using the GFlowNetAgent’s Logger.
You can use the from_dir() or from_agent() class methods to easily instantiate this class from a run directory or an existing in-memory GFlowNetAgent.
Use set_agent() to set the evaluator’s GFlowNetAgent after initialization if it was not provided at instantiation as GflowNetEvaluator(gfn_agent=...).
This __init__ function will call, in order:
    - update_all_metrics_and_requirements(), which uses new metrics defined in the define_new_metrics() method to update the global METRICS and ALL_REQS variables in classes inheriting from AbstractEvaluator.
    - self.metrics = self.make_metrics(self.config.metrics), using make_metrics()
    - self.reqs = self.make_requirements(), using make_requirements()
- Parameters:
gfn_agent (GFlowNetAgent, optional) – The GFlowNetAgent to evaluate. By default None. Should be set using the from_dir() or from_agent() class methods.
config (dict) – The configuration of the evaluator. Will be converted to an OmegaConf instance and stored in the self.config attribute.
- metrics
Dictionary of metrics to compute, with the metric names as keys and the metric display names and requirements as values.
- Type:
dict
- reqs
The set of requirements for the metrics. Used to decide which kind of data / samples is required to compute the metric.
- Type:
set[str]
- logger
The logger to use to log the results of the evaluation. Will be set to the GFlowNetAgent’s logger.
- Type:
- property gfn
Get the GFlowNetAgent to evaluate.
This is a read-only property. Use the set_agent() method to set the GFlowNetAgent.
- Returns:
GFlowNetAgent – The GFlowNetAgent to evaluate.
- Raises:
ValueError – If the GFlowNetAgent has not been set.
- set_agent(gfn_agent)
Set the GFlowNetAgent to evaluate after initialization.
It is then accessible through the self.gfn property.
- Parameters:
gfn_agent (GFlowNetAgent) – The GFlowNetAgent to evaluate.
- define_new_metrics()
Method to be implemented by subclasses to define new metrics.
Example
def define_new_metrics(self):
    return {
        "my_custom_metric": {
            "display_name": "My custom metric",
            "requirements": ["density", "new_req"],
        }
    }
- Returns:
dict – Dictionary of new metrics to add to the global METRICS dict.
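One way the returned dict could be folded into a metric registry is sketched below. This is an assumption about the mechanism, not the library's code: `merge_new_metrics` is a hypothetical helper, the `l1` display name is invented, and the registry is a local dict rather than the module-level `METRICS`.

```python
def merge_new_metrics(registry, new_metrics):
    """Return (merged registry, union of all requirements) without mutating inputs."""
    merged = {**registry, **new_metrics}
    all_reqs = {req for spec in merged.values() for req in spec["requirements"]}
    return merged, all_reqs


# Hypothetical starting registry with a single metric.
registry = {"l1": {"display_name": "L1 error", "requirements": ["density"]}}

# Same shape as the dict returned by define_new_metrics() in the example above.
new_metrics = {
    "my_custom_metric": {
        "display_name": "My custom metric",
        "requirements": ["density", "new_req"],
    }
}

merged, all_reqs = merge_new_metrics(registry, new_metrics)
```

A new requirement such as `"new_req"` only becomes useful once the subclass's `eval()` knows how to produce the corresponding data.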
- update_all_metrics_and_requirements()
Method to be implemented by subclasses to update the global dict of metrics and requirements.
- classmethod from_dir(path, no_wandb=True, print_config=False, device='cuda', load_final_ckpt=True)
Instantiate a BaseEvaluator from a run directory.
- Parameters:
cls (BaseEvaluator) – Class to instantiate.
path (Union[str, os.PathLike]) – Path to the run directory from which to load the GFlowNetAgent.
no_wandb (bool, optional) – Prevent wandb initialization, by default True
print_config (bool, optional) – Whether or not to print the resulting (loaded) config, by default False
device (str, optional) – Device to use for the instantiated GFlowNetAgent, by default “cuda”
load_final_ckpt (bool, optional) – Use the latest possible checkpoint available in the path, by default True
- Returns:
BaseEvaluator – Instance of BaseEvaluator with the GFlowNetAgent loaded from the run.
- classmethod from_agent(gfn_agent)
Instantiate a BaseEvaluator from a GFlowNetAgent.
- Parameters:
cls (BaseEvaluator) – Evaluator class to instantiate.
gfn_agent (GFlowNetAgent) – Instance of GFlowNetAgent to use for the BaseEvaluator.
- Returns:
BaseEvaluator – Instance of BaseEvaluator with the provided GFlowNetAgent.
- make_metrics(metrics=None)
Parse metrics from a dict, a list, a string or None.
    - If None, all metrics are selected.
    - If a string, it can be a comma-separated list of metric names, with or without spaces.
    - If a list, it should be a list of metric names (keys of METRICS).
    - If a dict, its keys should be metric names and its values will be ignored: they will be assigned from METRICS.
All metrics must be in METRICS.
- Parameters:
metrics (Union[str, List[str]], optional) – Metrics to compute when running the eval() method. Defaults to None, i.e. all metrics in METRICS are computed.
- Returns:
dict – Dictionary of metrics to compute, with the metric names as keys and the metric display names and requirements as values.
- Raises:
ValueError – If a metric name is not in METRICS.
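The four parsing rules above can be sketched as a standalone function. This is an illustrative reimplementation under stated assumptions, not the method's source: `parse_metrics` and the small `registry` are hypothetical, and only the rules listed in this entry are covered.

```python
def parse_metrics(metrics, registry):
    """Normalize None / str / list / dict into {name: registry entry}."""
    if metrics is None:
        names = list(registry)  # None selects every known metric
    elif isinstance(metrics, str):
        # Comma-separated names, with or without surrounding spaces.
        names = [m.strip() for m in metrics.split(",") if m.strip()]
    elif isinstance(metrics, dict):
        names = list(metrics)  # values are ignored, only the keys matter
    else:
        names = list(metrics)

    unknown = [n for n in names if n not in registry]
    if unknown:
        raise ValueError(f"Unknown metrics: {unknown}")
    # Values always come from the registry, whatever the input held.
    return {n: registry[n] for n in names}


# Hypothetical registry in the documented shape.
registry = {
    "l1": {"display_name": "L1 error", "requirements": ["density"]},
    "kl": {"display_name": "KL divergence", "requirements": ["density"]},
}

selected = parse_metrics("kl, l1", registry)
```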
- make_requirements(reqs=None, metrics=None)
Make requirements for the metrics to compute.
If metrics is provided, it must be a dict of metrics. The requirements are computed from the requirements attribute of the metrics.
Otherwise, the requirements are computed from the reqs argument:
    - If reqs is "all", all requirements of all metrics are computed.
    - If reqs is None, the evaluator’s self.reqs attribute is used.
    - If reqs is a list, it is used as the requirements.
- Parameters:
reqs (Union[str, List[str]], optional) – The metrics requirements. Either "all", a list of requirements or None to use the evaluator’s self.reqs attribute. By default None.
metrics (Union[str, List[str], dict], optional) – The metrics to compute requirements for. If not a dict, will be passed to make_metrics(). By default None.
- Returns:
set[str] – The set of requirements for the metrics.
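The decision order described for this method can be sketched as a plain function. The names `collect_requirements`, `registry` and `current_reqs` are hypothetical stand-ins for the method, the global `METRICS` dict and the evaluator's `self.reqs`; the sketch assumes `metrics` is already a dict.

```python
def collect_requirements(registry, current_reqs, reqs=None, metrics=None):
    """Return the set of requirements following the documented precedence."""
    if metrics is not None:
        # A metrics dict takes precedence over the reqs argument.
        return {r for spec in metrics.values() for r in spec["requirements"]}
    if isinstance(reqs, str) and reqs == "all":
        return {r for spec in registry.values() for r in spec["requirements"]}
    if reqs is None:
        return set(current_reqs)  # fall back to what the evaluator already holds
    return set(reqs)  # an explicit list is used as given


# Hypothetical registry in the documented shape.
registry = {
    "l1": {"display_name": "L1 error", "requirements": ["density"]},
    "mean_reward": {"display_name": "Mean reward", "requirements": ["samples"]},
}

from_metrics = collect_requirements(registry, set(), metrics={"l1": registry["l1"]})
from_all = collect_requirements(registry, set(), reqs="all")
```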
- should_log_train(step)
Check if training logs should be done at the current step. The decision is based on the self.config.train.period attribute.
Set self.config.train.period to None or a negative value to disable training logs.
- Parameters:
step (int) – Current iteration step.
- Returns:
bool – True if train logging should be done at the current step, False otherwise.
- should_eval(step)
Check if testing should be done at the current step. The decision is based on the self.config.test.period attribute.
Set self.config.test.first_it to True if testing should be done at the first iteration step. Otherwise, testing will be done after self.config.test.period steps.
Set self.config.test.period to None or a negative value to disable testing.
- Parameters:
step (int) – Current iteration step.
- Returns:
bool – True if testing should be done at the current step, False otherwise.
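A periodic check of this kind might look like the sketch below. It is written as a free function with explicit arguments rather than reading `self.config.test`, and it makes two assumptions that the description does not state: steps are counted from 1, and a period of zero is treated as disabled alongside `None` and negative values.

```python
def should_eval(step, period, first_it=False):
    """Return True if testing should run at this step (illustrative sketch)."""
    # Assumption: None, zero and negative periods all disable testing.
    if period is None or period <= 0:
        return False
    # Assumption: the first iteration step is numbered 1.
    if first_it and step == 1:
        return True
    # Otherwise test once every `period` steps.
    return step % period == 0


# Which of the first 10 steps would trigger testing with period=5 and first_it=True.
steps_tested = [s for s in range(1, 11) if should_eval(s, period=5, first_it=True)]
```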
- should_eval_top_k(step)
Check if top k plots and metrics should be done at the current step. The decision is based on the self.config.test.top_k and self.config.test.top_k_period attributes.
Set self.config.test.top_k to None or a negative value to disable top k plots and metrics.
- Parameters:
step (int) – Current iteration step.
- Returns:
bool – True if top k plots and metrics should be done at the current step, False otherwise.
- should_checkpoint(step)
Check if checkpoints should be done at the current step. The decision is based on the self.checkpoints.period attribute.
Set self.checkpoints.period to None or a negative value to disable checkpoints.
- Parameters:
step (int) – Current iteration step.
- Returns:
bool – True if checkpoints should be done at the current step, False otherwise.
- abstract plot(**kwargs)
The main method to plot results.
Will be called by the eval_and_log() method to plot the results of the evaluation. Will be passed the results of the eval() method:
# in eval_and_log
results = self.eval(metrics=metrics)
figs = self.plot(**results["data"])
- Returns:
dict – Dictionary of figures to log, with the figure names as keys and the figures as values.
- abstract eval(metrics=None, **plot_kwargs)
The main method to compute metrics and intermediate results.
This method should return a dict with two keys: "metrics" and "data".
The "metrics" key should contain the new metric(s) and the "data" key should contain the intermediate results that can be used to plot the new metric(s).
Example
>>> metrics = None # use the default metrics from the config file
>>> results = gfne.eval(metrics=metrics)
>>> plots = gfne.plot(**results["data"])
>>> metrics = "all" # compute all metrics, regardless of the config
>>> results = gfne.eval(metrics=metrics)
>>> metrics = ["l1", "kl"] # compute only the L1 and KL metrics
>>> results = gfne.eval(metrics=metrics)
>>> metrics = "l1,kl" # alternative syntax
>>> results = gfne.eval(metrics=metrics)
See Basic concepts for more details about metrics.
- Parameters:
metrics (Union[str, dict, list], optional) – Which metrics to compute, by default None.
- abstract eval_top_k(it)
Evaluate the GFlowNetAgent’s top k samples performance.
Classes extending this abstract class should implement this method.
- Parameters:
it (int) – Current iteration step.
- Returns:
dict – Dictionary with the following keys schema:
{
    "metrics": {str: float},
    "figs": {str: plt.Figure},
    "summary": {str: float},
}
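To make the return schema concrete, the sketch below assembles a dict with the three documented keys from a list of rewards. `summarize_top_k`, the metric names and the reward values are all hypothetical, and `"figs"` is left empty so the example does not depend on a plotting library; a real implementation would fill it with figures.

```python
from statistics import mean


def summarize_top_k(rewards, k):
    """Build a result dict in the documented eval_top_k shape (illustrative only)."""
    top = sorted(rewards, reverse=True)[:k]
    return {
        "metrics": {f"mean_top_{k}_reward": mean(top)},
        "figs": {},  # would map figure names to figures; empty in this sketch
        "summary": {"max_reward": max(top), "min_top_k_reward": min(top)},
    }


result = summarize_top_k([0.1, 0.9, 0.5, 0.7], k=2)
```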
- eval_and_log(it, metrics=None)
Evaluate the GFlowNetAgent and log the results with its logger.
Will call self.eval() and log the results using the GFlowNetAgent’s logger log_metrics() and log_plots() methods.
- Parameters:
it (int) – Current iteration step.
metrics (Union[str, List[str]], optional) – List of metrics to compute, by default the evaluator’s metrics attribute.