ML Metrics Dashboard
====================
RapidFire AI offers a browser-based dashboard that automatically visualizes all ML metrics and lets
you control runs on the fly from there.
Our current default dashboard is a fork of the popular OSS tool `MLflow <https://mlflow.org>`__,
and it inherits many of MLflow's native features.
As of this writing, we support only MLflow for visualization and run metadata management.
But the execution engine and API are *not* tied to MLflow; they can be extended
to support other dashboards such as Trackio, TensorBoard, and Weights & Biases as well. We
will add such support in due course and also welcome community contributions on this front.
Tabs in the Dashboard
---------------------
The main "Experiments" page on the dashboard has 4 main tabs:
* Table
* Chart
* Experiment Log
* Interactive Control (IC) Log
The screenshot below shows the "Table" view of an experiment with all its runs.
Each run represents one model with one set of config knob values, in line with standard dashboard semantics.
.. placeholder for screenshot: "Table" view of an experiment and its runs
Metrics Plots
-------------
The screenshot below shows the "Chart" view of an experiment with all its runs.
Each plot corresponds to a metric, spanning :code:`loss` on the training set and evaluation set,
as well as all named metrics returned by your :func:`compute_metrics()` function in the trainer config
(a sketch of such a function appears below).
We call attention to three key aspects of these visualizations:
* The x-axis "Step" for the mini batch-level plots represents absolute number of minibatches seen by that run. So, if the :code:`batch_size` is different for different runs in your experiment, they will take different numbers of steps and the curves will not line up till the end. This is not a bug but the expected correct behavior.
* The x-axis "Step" for the epoch-level plots represents absolute number of epochs seen by that run. So, if the :code:`epochs` is different for different runs in your experiment, they will take different numbers of steps again as above.
* Please refresh the browser page to get RapidFire AI's metrics reader to pull the latest data entries from the metrics files.
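
To make the first two points concrete, here is a back-of-the-envelope sketch in plain Python
(not a RapidFire AI API; the dataset size is made up purely for illustration) of why runs with
different :code:`batch_size` values end at different "Step" values:

.. code-block:: python

   import math

   # Hypothetical dataset size, purely for illustration.
   num_train_examples = 10_000

   def total_steps(batch_size: int, epochs: int) -> int:
       """Total minibatch steps a run takes: steps per epoch times epochs."""
       steps_per_epoch = math.ceil(num_train_examples / batch_size)
       return steps_per_epoch * epochs

   # Two runs that differ only in batch_size finish at different step counts,
   # so their minibatch-level curves do not line up on the x-axis.
   print(total_steps(batch_size=16, epochs=3))  # 1875
   print(total_steps(batch_size=32, epochs=3))  # 939
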
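Similarly, here is a minimal sketch of the kind of :func:`compute_metrics()` function whose named
metrics each get their own plot. The signature shown (a Hugging Face-style tuple of predictions and
labels) is an illustrative assumption; see the trainer config documentation for the exact interface
RapidFire AI expects:

.. code-block:: python

   import numpy as np

   def compute_metrics(eval_pred):
       """Return a dict of named metrics; each key shows up as its own chart.

       The (predictions, labels) tuple structure assumed here follows the common
       Hugging Face convention and may differ from the actual trainer config API.
       """
       predictions, labels = eval_pred
       preds = np.argmax(predictions, axis=-1)
       accuracy = float((preds == labels).mean())
       return {"accuracy": accuracy}
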
.. placeholder for screenshot: "Chart" view of an experiment with per-metric plots
The dashboard picks some default colors for all runs, but you can change a run's color by
clicking the "color circle" next to the run number in the "Run Name" column.
A color palette will pop up, as shown in the screenshot below.
.. placeholder for screenshot: color palette pop-up for changing a run's color
Message Logs
------------
There are two continually appended message logs on the third and fourth tabs: "Experiment Log" and
"Interactive Control (IC) Log", respectively.
All operations you run with RapidFire AI's API are displayed in the former.
The latter specifically displays all the Interactive Control (IC) operations you perform via the IC Ops
panels, as shown in the screenshot below.
.. placeholder for screenshot: "Experiment Log" and "Interactive Control (IC) Log" tabs
The full experiment log is also available as a text file saved in your local directory under the name "rapidfire.log".
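
If you want to inspect that file outside the dashboard, here is a minimal sketch, assuming
"rapidfire.log" sits in your current working directory (adjust the path if your setup writes it elsewhere):

.. code-block:: python

   from pathlib import Path

   # Assumed location of the experiment log; change this if your setup differs.
   log_path = Path("rapidfire.log")

   # Print the last 20 entries appended to the log.
   for line in log_path.read_text().splitlines()[-20:]:
       print(line)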