gflownet.evaluator.abstract
===========================

.. py:module:: gflownet.evaluator.abstract

.. autoapi-nested-parse::

   Abstract evaluator class for GFlowNetAgent.

   .. warning::

       Should not be used directly, but subclassed to implement specific evaluators for
       different tasks and environments.

   See :class:`~gflownet.evaluator.base.BaseEvaluator` for a default,
   concrete implementation of this abstract class.

   This class handles some logic that will be the same for all evaluators.
   The only requirements for a subclass are to implement the
   :meth:`~gflownet.evaluator.abstract.AbstractEvaluator.eval` and
   :meth:`~gflownet.evaluator.abstract.AbstractEvaluator.plot` methods
   which will be called by the
   :meth:`~gflownet.evaluator.abstract.AbstractEvaluator.eval_and_log` method:

   .. code-include :: :meth:`gflownet.evaluator.abstract.AbstractEvaluator.eval_and_log`

   .. code-include :: :func:`gflownet.evaluator.abstract.AbstractEvaluator.eval_and_log`

   .. code-include :: :class:`gflownet.gflownet.abstract.AbstractEvaluator`

   .. code-include :: :func:`gflownet.utils.common.gflownet_from_config`

   .. code-block:: python

           def eval_and_log(self, it, metrics=None):
               results = self.eval(metrics=metrics)
               for m, v in results["metrics"].items():
                   setattr(self.gfn, m, v)

               metrics_to_log = {
                   METRICS[k]["display_name"]: v for k, v in results["metrics"].items()
               }

               figs = self.plot(**results["data"])

               self.logger.log_metrics(metrics_to_log, it, self.gfn.use_context)
               self.logger.log_plots(figs, it, use_context=self.gfn.use_context)

   See :mod:`gflownet.evaluator` for a full-fledged example and
   :mod:`gflownet.evaluator.base` for a concrete implementation of this abstract class.
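
   The contract described above can be sketched without any GFlowNet dependency. The
   snippet below is a simplified, illustrative stand-in and not the library's code:
   ``ToyEvaluator``, ``ConstantEvaluator`` and ``RecordingLogger`` are hypothetical
   names, the logger only records its calls, and the ``setattr`` on the agent, the
   display-name lookup and the ``use_context`` argument are left out for brevity.

   .. code-block:: python

       from abc import ABC, abstractmethod


       class RecordingLogger:
           """Hypothetical logger that records calls instead of logging them."""

           def __init__(self):
               self.calls = []

           def log_metrics(self, metrics, it):
               self.calls.append(("metrics", it, metrics))

           def log_plots(self, figs, it):
               self.calls.append(("plots", it, figs))


       class ToyEvaluator(ABC):
           """Hypothetical stand-in mirroring the eval -> plot -> log flow."""

           def __init__(self, logger):
               self.logger = logger

           @abstractmethod
           def eval(self, metrics=None):
               """Return a dict with the keys "metrics" and "data"."""

           @abstractmethod
           def plot(self, **kwargs):
               """Return a dict mapping figure names to figures."""

           def eval_and_log(self, it, metrics=None):
               results = self.eval(metrics=metrics)
               figs = self.plot(**results["data"])
               self.logger.log_metrics(results["metrics"], it)
               self.logger.log_plots(figs, it)


       class ConstantEvaluator(ToyEvaluator):
           """Subclass that only implements the two abstract methods."""

           def eval(self, metrics=None):
               return {"metrics": {"l1": 0.5}, "data": {"values": [1, 2, 3]}}

           def plot(self, values=None, **kwargs):
               # A real implementation would return figures; a string stands in here.
               return {"values_summary": f"{len(values)} points"}


       evaluator = ConstantEvaluator(RecordingLogger())
       evaluator.eval_and_log(it=10)

   The subclass never overrides ``eval_and_log``: implementing ``eval`` and ``plot``
   is enough for the shared logging logic to work.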


Attributes
----------

.. autoapisummary::

   gflownet.evaluator.abstract.METRICS
   gflownet.evaluator.abstract.ALL_REQS


Classes
-------

.. autoapisummary::

   gflownet.evaluator.abstract.AbstractEvaluator


Module Contents
---------------

.. py:data:: METRICS

   All metrics that can be computed by a ``BaseEvaluator``.

   Structured as a dict with the metric names as keys and the metric display
   names and requirements as values.

   Requirements are used to decide which kinds of data or samples are needed to
   compute the metric.

   Display names are used to log the metrics and to display them in the console.

   Implementations of :class:`AbstractEvaluator` can add new metrics to
   this dict by implementing the method
   :meth:`AbstractEvaluator.define_new_metrics`.

.. py:data:: ALL_REQS

   Union of all requirements of all metrics in :const:`METRICS`.

.. py:class:: AbstractEvaluator(gfn_agent=None, **config)

   Abstract evaluator class for :class:`GFlowNetAgent`.

   In charge of evaluating the :class:`GFlowNetAgent`, computing metrics,
   plotting figures and optionally logging results using the
   :class:`GFlowNetAgent`'s :class:`Logger`.

   You can use the :meth:`from_dir` or :meth:`from_agent` class methods
   to easily instantiate this class from a run directory or an existing
   in-memory :class:`GFlowNetAgent`.

   Use
   :meth:`~gflownet.evaluator.abstract.AbstractEvaluator.set_agent`
   to set the evaluator's :class:`GFlowNetAgent` after initialization if it was
   not provided at instantiation through the ``gfn_agent`` argument.

   This ``__init__`` function will call, in order:

   1. :meth:`update_all_metrics_and_requirements` which uses new metrics defined in
      the :meth:`define_new_metrics` method to update the global :const:`METRICS`
      and :const:`ALL_REQS` variables in classes inheriting from
      :class:`AbstractEvaluator`.

   2. ``self.metrics = self.make_metrics(self.config.metrics)`` using
      :meth:`make_metrics`

   3. ``self.reqs = self.make_requirements()`` using :meth:`make_requirements`

   :param gfn_agent: The GFlowNetAgent to evaluate. By default None. Should be set using the
                     :meth:`from_dir` or :meth:`from_agent` class methods.
   :type gfn_agent: GFlowNetAgent, optional
   :param config: The configuration of the evaluator. Will be converted to an OmegaConf
                  instance and stored in the ``self.config`` attribute.
   :type config: dict

   .. attribute:: config

      The configuration of the evaluator.

      :type: :class:`omegaconf.OmegaConf`

   .. attribute:: metrics

      Dictionary of metrics to compute, with the metric names as keys and the
      metric display names and requirements as values.

      :type: dict

   .. attribute:: reqs

      The set of requirements for the metrics. Used to decide which kinds of data or
      samples are needed to compute the metrics.

      :type: set[str]

   .. attribute:: logger

      The logger to use to log the results of the evaluation. Will be set to the
      GFlowNetAgent's logger.

      :type: Logger


   .. py:attribute:: config


   .. py:attribute:: metrics


   .. py:attribute:: reqs


   .. py:property:: gfn

      Get the ``GFlowNetAgent`` to evaluate.

      This is a read-only property. Use the :meth:`set_agent` method to set
      the ``GFlowNetAgent``.
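
      The read-only behaviour can be illustrated with a small, self-contained sketch.
      ``AgentHolder`` and its private ``_gfn_agent`` attribute are hypothetical names
      used for illustration only; the actual implementation may store the agent
      differently.

      .. code-block:: python

          class AgentHolder:
              """Hypothetical class illustrating a read-only ``gfn`` property."""

              def __init__(self, gfn_agent=None):
                  self._gfn_agent = gfn_agent

              @property
              def gfn(self):
                  # Fail loudly instead of returning None when no agent was set.
                  if self._gfn_agent is None:
                      raise ValueError("The GFlowNetAgent has not been set.")
                  return self._gfn_agent

              def set_agent(self, gfn_agent):
                  self._gfn_agent = gfn_agent


          holder = AgentHolder()
          try:
              holder.gfn
              raised = False
          except ValueError:
              raised = True

          # Any object stands in for a GFlowNetAgent in this sketch.
          holder.set_agent("agent")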

      :returns: :class:`GFlowNetAgent` -- The ``GFlowNetAgent`` to evaluate.

      :raises ValueError: If the ``GFlowNetAgent`` has not been set.


   .. py:method:: set_agent(gfn_agent)

      Set the ``GFlowNetAgent`` to evaluate after initialization.

      It is then accessible through the ``self.gfn`` property.

      :param gfn_agent: The ``GFlowNetAgent`` to evaluate.
      :type gfn_agent: :class:`GFlowNetAgent`


   .. py:method:: define_new_metrics()

      Method to be implemented by subclasses to define new metrics.

      .. admonition:: Example

         .. code-block:: python

             def define_new_metrics(self):
                 return {
                     "my_custom_metric": {
                         "display_name": "My custom metric",
                         "requirements": ["density", "new_req"],
                     }
                 }

      :returns: *dict* -- Dictionary of new metrics to add to the global :const:`METRICS` dict.


   .. py:method:: update_all_metrics_and_requirements()

      Update the global :const:`METRICS` and :const:`ALL_REQS` with the new metrics
      returned by :meth:`define_new_metrics`.
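
      A minimal sketch of such an update, assuming the new metrics are merged into
      the existing dict and the requirements are recomputed as the union over all
      metrics. ``merge_metrics`` is a hypothetical helper, and the ``"l1"`` entry is
      an illustrative placeholder rather than the library's actual value.

      .. code-block:: python

          def merge_metrics(all_metrics, new_metrics):
              """Return the merged metrics and the union of their requirements."""
              merged = {**all_metrics, **(new_metrics or {})}
              all_reqs = {req for m in merged.values() for req in m["requirements"]}
              return merged, all_reqs


          # Illustrative entries only.
          base = {"l1": {"display_name": "L1 error", "requirements": ["density"]}}
          new = {
              "my_custom_metric": {
                  "display_name": "My custom metric",
                  "requirements": ["density", "new_req"],
              }
          }
          merged, all_reqs = merge_metrics(base, new)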


   .. py:method:: from_dir(path, no_wandb = True, print_config = False, device = 'cuda', load_final_ckpt = True)
      :classmethod:


      Instantiate a BaseEvaluator from a run directory.

      :param cls: Class to instantiate.
      :type cls: BaseEvaluator
      :param path: Path to the run directory from which to load the GFlowNetAgent.
      :type path: Union[str, os.PathLike]
      :param no_wandb: Prevent wandb initialization, by default True
      :type no_wandb: bool, optional
      :param print_config: Whether or not to print the resulting (loaded) config, by default False
      :type print_config: bool, optional
      :param device: Device to use for the instantiated GFlowNetAgent, by default "cuda"
      :type device: str, optional
      :param load_final_ckpt: Use the latest possible checkpoint available in the path, by default True
      :type load_final_ckpt: bool, optional

      :returns: *BaseEvaluator* -- Instance of BaseEvaluator with the GFlowNetAgent loaded from the run.


   .. py:method:: from_agent(gfn_agent)
      :classmethod:


      Instantiate a BaseEvaluator from a GFlowNetAgent.

      :param cls: Evaluator class to instantiate.
      :type cls: BaseEvaluator
      :param gfn_agent: Instance of GFlowNetAgent to use for the BaseEvaluator.
      :type gfn_agent: GFlowNetAgent

      :returns: *BaseEvaluator* -- Instance of BaseEvaluator with the provided GFlowNetAgent.


   .. py:method:: make_metrics(metrics=None)

      Parse metrics from a dict, a list, a string or ``None``.

      - If ``None``, all metrics are selected.
      - If a string, it can be a comma-separated list of metric names, with or without
        spaces.
      - If a list, it should be a list of metric names (keys of :const:`METRICS`).
      - If a dict, its keys should be metric names and its values will be ignored:
        they will be assigned from :const:`METRICS`.

      All metrics must be in :const:`METRICS`.
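
      The parsing rules above can be sketched as a standalone function.
      ``parse_metrics`` is a hypothetical helper that takes the dict of known metrics
      as an explicit argument; it only covers the four cases listed above and is not
      the library's implementation.

      .. code-block:: python

          def parse_metrics(metrics, all_metrics):
              """Select entries of ``all_metrics`` from None, a str, a list or a dict."""
              if metrics is None:
                  names = list(all_metrics)
              elif isinstance(metrics, str):
                  # Comma-separated names, with or without spaces.
                  names = [name.strip() for name in metrics.split(",") if name.strip()]
              else:
                  # Lists give names directly; for dicts only the keys are used.
                  names = list(metrics)
              unknown = [name for name in names if name not in all_metrics]
              if unknown:
                  raise ValueError(f"Unknown metrics: {unknown}")
              return {name: all_metrics[name] for name in names}


          # Illustrative entries only.
          known = {
              "l1": {"display_name": "L1 error", "requirements": ["density"]},
              "kl": {"display_name": "KL divergence", "requirements": ["density"]},
          }
          from_string = parse_metrics("l1, kl", known)
          from_dict = parse_metrics({"kl": None}, known)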

      :param metrics: Metrics to compute when running the
                      :meth:`.eval` method. Defaults to ``None``, i.e. all metrics in
                      :const:`METRICS` are computed.
      :type metrics: Union[str, List[str], dict], optional

      :returns: *dict* -- Dictionary of metrics to compute, with the metric names as keys and the
                metric display names and requirements as values.

      :raises ValueError: If a metric name is not in :const:`METRICS`.


   .. py:method:: make_requirements(reqs=None, metrics=None)

      Make requirements for the metrics to compute.

      1. If ``metrics`` is provided, the requirements are collected from the
         ``requirements`` entry of each metric. If ``metrics`` is not already a dict
         of metrics, it is first passed to :meth:`make_metrics`.

      2. Otherwise, the requirements are computed from the ``reqs`` argument:

         - If ``reqs`` is ``"all"``, all requirements of all metrics are computed.
         - If ``reqs`` is ``None``, the evaluator's ``self.reqs`` attribute is used.
         - If ``reqs`` is a list, it is used as the requirements.
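
      The resolution order above can be sketched as follows. ``resolve_requirements``
      is a hypothetical helper: ``all_metrics`` stands in for :const:`METRICS`,
      ``default_reqs`` stands in for the evaluator's ``self.reqs``, and ``metrics``
      is assumed to already be a dict of metrics.

      .. code-block:: python

          def resolve_requirements(reqs=None, metrics=None, all_metrics=None,
                                   default_reqs=None):
              """Return the set of requirements following the rules above."""
              if metrics is not None:
                  # Case 1: requirements come from the provided metrics.
                  return {req for m in metrics.values() for req in m["requirements"]}
              if reqs == "all":
                  return {req for m in all_metrics.values() for req in m["requirements"]}
              if reqs is None:
                  return set(default_reqs)
              return set(reqs)


          # Illustrative entries only.
          known = {
              "l1": {"display_name": "L1 error", "requirements": ["density"]},
              "my_custom_metric": {
                  "display_name": "My custom metric",
                  "requirements": ["density", "new_req"],
              },
          }
          from_metrics = resolve_requirements(metrics={"l1": known["l1"]})
          from_all = resolve_requirements(reqs="all", all_metrics=known)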

      :param reqs: The metrics requirements. Either ``"all"``, a list of requirements or
                   ``None`` to use the evaluator's ``self.reqs`` attribute.
                   By default ``None``.
      :type reqs: Union[str, List[str]], optional
      :param metrics: The metrics to compute requirements for. If not a dict, will be passed to
                      :meth:`make_metrics`. By default None.
      :type metrics: Union[str, List[str], dict], optional

      :returns: *set[str]* -- The set of requirements for the metrics.


   .. py:method:: should_log_train(step)

      Check if training logs should be done at the current step. The decision is based
      on the ``self.config.train.period`` attribute.

      Set ``self.config.train.period`` to ``None`` or a negative value to disable
      training logs.

      :param step: Current iteration step.
      :type step: int

      :returns: *bool* -- True if train logging should be done at the current step, False otherwise.


   .. py:method:: should_eval(step)

      Check if testing should be done at the current step. The decision is based on
      the ``self.config.test.period`` attribute.

      Set ``self.config.test.first_it`` to ``True`` if testing should be done at the
      first iteration step. Otherwise, testing will be done after
      ``self.config.test.period`` steps.

      Set ``self.config.test.period`` to ``None`` or a negative value to disable
      testing.
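
      A minimal sketch of this decision, written as a standalone function. It rests
      on assumptions that are not stated above: the first iteration is ``step == 1``,
      periodic testing means ``step`` is a multiple of the period, and a period of
      ``0`` is treated as disabled.

      .. code-block:: python

          def should_eval(step, period, first_it=False):
              """Hypothetical stand-in for the period check described above."""
              # None or a non-positive period disables testing (0 is an assumption).
              if period is None or period <= 0:
                  return False
              # Assumption: the first iteration step is numbered 1.
              if first_it and step == 1:
                  return True
              return step % period == 0


          decisions = [should_eval(step, period=10, first_it=True) for step in (1, 5, 10)]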

      :param step: Current iteration step.
      :type step: int

      :returns: *bool* -- True if testing should be done at the current step, False otherwise.


   .. py:method:: should_eval_top_k(step)

      Check if top k plots and metrics should be done at the current step. The
      decision is based on the ``self.config.test.top_k`` and
      ``self.config.test.top_k_period`` attributes.

      Set ``self.config.test.top_k`` to ``None`` or a negative value to disable top k
      plots and metrics.
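
      As a rough sketch, the two settings could be combined as shown below. The
      handling of ``top_k_period`` is an assumption made for illustration (disabled
      when ``None`` or non-positive, otherwise triggering on multiples of the
      period); only the ``top_k`` rule is stated above.

      .. code-block:: python

          def should_eval_top_k(step, top_k, top_k_period):
              """Hypothetical stand-in combining ``top_k`` and ``top_k_period``."""
              if top_k is None or top_k <= 0:
                  return False
              # Assumption: an unset or non-positive period also disables top k.
              if top_k_period is None or top_k_period <= 0:
                  return False
              return step % top_k_period == 0


          enabled = should_eval_top_k(step=100, top_k=10, top_k_period=50)
          disabled = should_eval_top_k(step=100, top_k=None, top_k_period=50)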

      :param step: Current iteration step.
      :type step: int

      :returns: *bool* -- True if top k plots and metrics should be done at the current step, False
                otherwise.


   .. py:method:: should_checkpoint(step)

      Check if checkpoints should be done at the current step. The decision is based
      on the ``self.checkpoints.period`` attribute.

      Set ``self.checkpoints.period`` to ``None`` or a negative value to disable
      checkpoints.

      :param step: Current iteration step.
      :type step: int

      :returns: *bool* -- True if checkpoints should be done at the current step, False otherwise.


   .. py:method:: plot(**kwargs)
      :abstractmethod:


      The main method to plot results.

      Will be called by the :meth:`eval_and_log` method to plot the results
      of the evaluation.
      Will be passed the results of the :meth:`eval` method:

      .. code-block:: python

          # in eval_and_log
          results = self.eval(metrics=metrics)
          figs = self.plot(**results["data"])

      :returns: *dict* -- Dictionary of figures to log, with the figure names as keys and the figures
                as values.


   .. py:method:: eval(metrics=None, **plot_kwargs)
      :abstractmethod:


      The main method to compute metrics and intermediate results.

      This method should return a dict with two keys: ``"metrics"`` and ``"data"``.

      The ``"metrics"`` key should contain the new metric(s) and the ``"data"`` key
      should contain the intermediate results that can be used to plot the new metric(s).
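
      The expected return structure can be illustrated with a toy function.
      ``toy_eval`` and the metric names ``"mean"`` and ``"max"`` are hypothetical and
      only serve to show the shape of the returned dict.

      .. code-block:: python

          def toy_eval(samples, metrics=("mean", "max")):
              """Return a dict with the two keys expected from ``eval``."""
              computed = {}
              if "mean" in metrics:
                  computed["mean"] = sum(samples) / len(samples)
              if "max" in metrics:
                  computed["max"] = max(samples)
              # "data" carries the intermediate results that plotting can reuse.
              return {"metrics": computed, "data": {"samples": list(samples)}}


          results = toy_eval([1.0, 2.0, 3.0])
          only_max = toy_eval([1.0, 2.0, 3.0], metrics=("max",))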

      .. admonition:: Example

         >>> metrics = None # use the default metrics from the config file
         >>> results = gfne.eval(metrics=metrics)
         >>> plots = gfne.plot(**results["data"])

         >>> metrics = "all" # compute all metrics, regardless of the config
         >>> results = gfne.eval(metrics=metrics)

         >>> metrics = ["l1", "kl"] # compute only the L1 and KL metrics
         >>> results = gfne.eval(metrics=metrics)

         >>> metrics = "l1,kl" # alternative syntax
         >>> results = gfne.eval(metrics=metrics)

         See :ref:`evaluator basic concepts` for more details about ``metrics``.

      :param metrics: Which metrics to compute, by default ``None``.
      :type metrics: Union[str, dict, list], optional


   .. py:method:: eval_top_k(it)
      :abstractmethod:


      Evaluate the ``GFlowNetAgent``'s top k samples performance.

      Classes extending this abstract class should implement this method.

      :param it: Current iteration step.
      :type it: int

      :returns: *dict* -- Dictionary with the following key schema:

                .. code-block:: python

                    {
                        "metrics": {str: float},
                        "figs": {str: plt.Figure},
                        "summary": {str: float},
                    }


   .. py:method:: eval_and_log(it, metrics=None)

      Evaluate the GFlowNetAgent and log the results with its logger.

      Calls :meth:`eval` to compute the metrics and :meth:`plot` to create the figures,
      then logs both using the ``log_metrics()`` and ``log_plots()`` methods of the
      GFlowNetAgent's logger.

      :param it: Current iteration step.
      :type it: int
      :param metrics: List of metrics to compute, by default the evaluator's ``metrics``
                      attribute.
      :type metrics: Union[str, List[str]], optional


   .. py:method:: eval_and_log_top_k(it)

      Evaluate the GFlowNetAgent's top k samples performance and log the results with
      its logger.

      :param it: Current iteration step.
      :type it: int