ExperimentTracker

Use the ExperimentTracker class to connect to the trackserver and report progress and metrics of experiments.

Example:

from traintrack.client import ExperimentTracker

# `batches`, `n_batches`, and the metric values below come from your own training code.
tracker = ExperimentTracker()

for epoch in range(1, 11):
    tracker.begin_epoch(epoch)

    for i, batch in enumerate(batches):
        # train on a batch
        # ...
        tracker.progress(i+1, n_batches)

    # report metrics for the epoch
    tracker.metric('loss/train', loss_train)
    tracker.metric('loss/valid', loss_valid)
    tracker.metric('acc/valid', acc_valid)
    tracker.end_epoch()
class traintrack.client.ExperimentTracker(experiment_id=None, host='0.0.0.0', port=4242, first_epoch=1, default_log_level='INFO', async_=False)

Experiment tracker client.

The experiment tracker client is used to communicate with a trackserver over ZeroRPC to report experiment configuration, metrics, and progress. The server then sends this information to configured backend services.

Args:
experiment_id (str, optional): identifier for the experiment being tracked. The server uses it to uniquely identify the experiment, and backend services often use it to name log files, database entries, etc. If unspecified, it is generated from the current date and time.
host (str, optional): the host name the server is running on. Default: '0.0.0.0'
port (int, optional): TCP port number that the server is running on. Default: 4242
first_epoch (int, optional): the epoch number sent to the server the first time begin_epoch is called. Useful for resuming stopped experiments. Default: 1
default_log_level (str, optional): logging level used when none is specified in calls to log. Default: 'INFO'
async_ (bool, optional): whether to send messages to the server asynchronously. If enabled, method calls return immediately without waiting for a response from the server. Enable this if you are worried about communication with the server slowing down your experiments. Default: False
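
If you would rather control the identifier than rely on the generated one, a date-and-time based ID can be built by hand. The sketch below is only an illustration: the function name make_experiment_id, the prefix argument, and the timestamp format are assumptions, and the format the library itself generates is not documented here.

```python
from datetime import datetime


def make_experiment_id(prefix="exp", now=None):
    """Build an ID such as 'exp-20240131-142500' (hypothetical format)."""
    # Allow a fixed timestamp to be injected, which keeps the helper testable.
    now = now or datetime.now()
    return f"{prefix}-{now:%Y%m%d-%H%M%S}"
```

The returned string could then be passed as experiment_id when constructing the tracker.
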
begin_epoch(epoch=None)

Start a new training epoch.

Args:
epoch (int, optional): if specified, the given epoch number is sent to the server. Otherwise, the last epoch number is incremented and sent to the server.
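
The numbering rules described for first_epoch and begin_epoch can be modelled with a small counter. This is a sketch of the documented behaviour, not the library's code: EpochCounter is a hypothetical stand-in, and the assumption that an explicit epoch also resets the counter is mine.

```python
class EpochCounter:
    """Hypothetical stand-in that mimics the documented epoch numbering."""

    def __init__(self, first_epoch=1):
        # Store one less than first_epoch so the first increment yields it.
        self.last_epoch = first_epoch - 1

    def begin_epoch(self, epoch=None):
        # No explicit epoch: continue from the last one.
        if epoch is None:
            epoch = self.last_epoch + 1
        # Assumption: an explicit epoch becomes the new starting point.
        self.last_epoch = epoch
        return epoch
```

With first_epoch=5, successive calls without arguments yield 5, 6, 7, and so on, which is the pattern you would want when resuming a stopped experiment.
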
begin_task(name=None)

Start a new subtask (e.g. train, validation, etc).

Args:
name (str, optional): task name.
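
Because every begin_task call should be matched by an end_task call, it can be convenient to wrap the pair in a context manager. The helper below is a sketch and not part of the library: the name task is hypothetical, and it relies only on the documented begin_task(name) and end_task() methods. Calling end_task even when the body raises is a design choice made here.

```python
from contextlib import contextmanager


@contextmanager
def task(tracker, name=None):
    """Run a block of code as a named subtask (hypothetical helper)."""
    tracker.begin_task(name)
    try:
        yield tracker
    finally:
        # Close the subtask even if the body raised an exception.
        tracker.end_task()
```

Usage would look like `with task(tracker, 'train'): ...` around the training loop of an epoch.
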
critical(text)

Convenience method to send a logging message with CRITICAL log level to the server.

Args:
text (str): the text to log.
debug(text)

Convenience method to send a logging message with DEBUG log level to the server.

Args:
text (str): the text to log.
description(text)

Report a description of the current experiment.

Args:
text (str): the description text.

end_epoch()

End the current epoch.

end_task()

End the current subtask.

error(text)

Convenience method to send a logging message with ERROR log level to the server.

Args:
text (str): the text to log.
image(name, image, pixel_order=None)

Report an image (e.g. a set of learned filters or the outputs of a segmentation algorithm). The image will automatically be associated with the current training epoch.

Args:

name (str): name of the image (e.g. 'filters').

image (np.ndarray or PIL.Image): the image to report.

pixel_order (str, optional): the order of the pixels in the ndarray. Can be 'CHW' for channels, height, width, or 'HWC' for height, width, channels. By default, the image encoding algorithm will attempt to guess based on the dimensions of the ndarray.
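
To see why guessing the pixel order from the dimensions can fail, consider one plausible heuristic: treat an axis of size 1, 3, or 4 as the channel axis. The sketch below is an assumption about how such a guess might work, not the library's actual algorithm, and guess_pixel_order is a hypothetical name. It operates on a shape tuple such as the one returned by ndarray.shape.

```python
def guess_pixel_order(shape):
    """Guess 'CHW' or 'HWC' from a 3-D shape; return None if ambiguous."""
    if len(shape) != 3:
        raise ValueError("expected a 3-D shape, got %r" % (shape,))
    channel_counts = (1, 3, 4)  # grayscale, RGB, RGBA
    first_is_channel = shape[0] in channel_counts
    last_is_channel = shape[2] in channel_counts
    if first_is_channel and not last_is_channel:
        return "CHW"
    if last_is_channel and not first_is_channel:
        return "HWC"
    # Both or neither axis looks like a channel axis.
    return None
```

When a shape is ambiguous under a heuristic like this (for example very small images), passing pixel_order explicitly avoids relying on the guess.
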
info(text)

Convenience method to send a logging message with INFO log level to the server.

Args:
text (str): the text to log.
log(text, level=None)

Send a logging message to the server.

Args:

text (str): the text to log.

level (str, optional): the log level. If unspecified, defaults to self.default_log_level.
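
If your training code already uses Python's standard logging module, its records can be forwarded to the tracker through a custom handler. This is a sketch under assumptions: TrackerLogHandler is a hypothetical class, and it assumes the server accepts Python's standard level names ('DEBUG', 'INFO', 'WARNING', 'ERROR', 'CRITICAL') as the level argument. It uses only the documented log(text, level) method.

```python
import logging


class TrackerLogHandler(logging.Handler):
    """Hypothetical handler that forwards log records to tracker.log."""

    def __init__(self, tracker):
        super().__init__()
        self.tracker = tracker

    def emit(self, record):
        try:
            # record.levelname is e.g. 'INFO' or 'WARNING'.
            self.tracker.log(self.format(record), level=record.levelname)
        except Exception:
            # Follow the logging convention: never let logging crash the caller.
            self.handleError(record)
```

Attaching it with `logging.getLogger().addHandler(TrackerLogHandler(tracker))` would send every emitted record to the server in addition to any existing handlers.
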
metric(name, value)

Report a (scalar) metric like training loss or validation accuracy. The metric will automatically be associated with the current training epoch.

Args:

name (str): name of the metric (e.g. 'loss/train').

value (float): the value of the metric.
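
Since metric expects a single scalar per call, per-batch values usually need to be reduced before reporting. The helper below sketches one way to do that by averaging. The name report_epoch_mean is hypothetical, and it relies only on the documented metric(name, value) method.

```python
def report_epoch_mean(tracker, name, batch_values):
    """Average per-batch values and report the mean as one metric."""
    # float() also converts NumPy scalars and similar numeric types.
    values = [float(v) for v in batch_values]
    if not values:
        raise ValueError("no values to average for %r" % name)
    mean = sum(values) / len(values)
    tracker.metric(name, mean)
    return mean
```

For example, `report_epoch_mean(tracker, 'loss/train', batch_losses)` would be called once per epoch, before end_epoch.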

parameter(name, value)

Report an experiment parameter or hyperparameter (e.g. learning rate).

Args:

name (str): name of the parameter (e.g. 'lr').

value: value of the parameter being used in the experiment.
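
Hyperparameters are often kept in a dictionary, so reporting them in a loop saves repetition. The sketch below assumes nothing beyond the documented parameter(name, value) method. The helper name report_parameters and the choice to flatten nested dictionaries with '/' (mirroring names like 'loss/train') are illustrative assumptions.

```python
def report_parameters(tracker, params, prefix=""):
    """Report every entry of a (possibly nested) dict as a parameter."""
    # Sort keys so the reporting order is deterministic.
    for key in sorted(params):
        value = params[key]
        name = prefix + key
        if isinstance(value, dict):
            # Flatten nested dicts: {'optimizer': {'name': ...}} -> 'optimizer/name'.
            report_parameters(tracker, value, prefix=name + "/")
        else:
            tracker.parameter(name, value)
```

Calling `report_parameters(tracker, {'lr': 0.01, 'optimizer': {'name': 'sgd'}})` would report 'lr' and 'optimizer/name'.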

progress(completed, total, info=None)

Report progress on the current epoch.

Args:

completed (int): number of items (e.g. batches) completed.

total (int): number of items (e.g. batches) in total.

info (str, optional): extra information to be shown.
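
When batches are small and fast, reporting progress on every batch may add noticeable communication overhead. One option is to report only every few items and always on the last one. The helper below is a sketch: report_progress and its every argument are hypothetical, and it calls only the documented progress(completed, total, info) method.

```python
def report_progress(tracker, completed, total, every=10, info=None):
    """Call tracker.progress every `every` items and on the final item."""
    if completed % every == 0 or completed == total:
        tracker.progress(completed, total, info)
        return True
    # Skipped: nothing was sent to the server for this item.
    return False
```

Inside a training loop this would replace the direct call, for example `report_progress(tracker, i + 1, n_batches)`.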

warn(text)

Convenience method to send a logging message with WARNING log level to the server.

Args:
text (str): the text to log.