BLOCKS Reference

Machine Learning

Start training (classification/regression Model Generator)

This BLOCK is currently in beta and may become unusable after the official BLOCK is released. Please use the official BLOCK once released.

As a beta version, there is the possibility that some functions may not work as intended. We appreciate your feedback regarding bugs and ways to improve BLOCKS.

Overview

This BLOCK starts a training for a classification or regression type Model Generator.

Properties

Property Explanation
BLOCK name

Configure the name displayed on this BLOCK.

Model

Select the Model Generator that will run the new training.

Training name

Designate a name for the new training.

Training CSV URL

Designate the GCS URL for the training set CSV file.

If you leave the Validation CSV URL property blank, BLOCKS will automatically split the training CSV at an 8:2 ratio into a training set and a validation set.
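
The 8:2 split described above can be pictured with a minimal Python sketch. This is an illustration only, not how BLOCKS implements it: the function name split_rows is hypothetical, and shuffling before the split is an assumption, since this page does not say whether rows are split randomly or in file order.

```python
import random


def split_rows(rows, train_parts=8, validation_parts=2, seed=0):
    """Split rows into (training, validation) lists at a train:validation ratio.

    Hypothetical helper for illustration; not part of BLOCKS.
    """
    shuffled = list(rows)
    # Assumption: rows are shuffled before splitting. The reference does not
    # state whether the real split is random or sequential.
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + validation_parts)
    return shuffled[:cut], shuffled[cut:]


rows = [[i, i * 2] for i in range(10)]
train, validation = split_rows(rows)
print(len(train), len(validation))  # 8 2
```

To control exactly which rows are used for validation, supply your own file in the Validation CSV URL property instead.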

Validation CSV URL

Designate the GCS URL for the validation set CSV file.

Max. time until timeout (minutes)

In order to achieve accurate results, the Model Generator can run multiple trials within a training.

With this property, you can set the maximum amount of time the Model Generator can spend per trial before it times out.

You can set this to 0 for no maximum time limit.

The Model Generator will stop a trial before it reaches the max. time until timeout if its results (accuracy/RMSE) worsen.

Max. number of trials

Configure the maximum number of trials to run within the training. Set this to at least 1 for the training to run.

The total time for the training will be approximately the max. time until timeout times the max. number of trials, though additional processing may slightly increase the required time.
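
As a rough worked example of the estimate above, the following Python sketch multiplies the two properties. The function name is hypothetical, and the result ignores the additional processing time mentioned above as well as trials that stop early.

```python
def estimated_training_minutes(max_minutes_per_trial, max_trials):
    """Approximate total training time in minutes, or None when unbounded.

    Hypothetical helper for illustration; not part of BLOCKS.
    """
    if max_trials < 1:
        raise ValueError("at least 1 trial is required for the training to run")
    if max_minutes_per_trial == 0:
        # 0 means no per-trial time limit, so no estimate can be given.
        return None
    return max_minutes_per_trial * max_trials


print(estimated_training_minutes(60, 3))  # 180
```

With a 60-minute timeout and 3 trials, the training should take roughly 180 minutes, plus some overhead.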

Machine type

Select the type of machine that will run the training.

  • BASIC:

    Uses the basic type of machine to run the training.

  • BASIC_GPU:

    Uses a machine with a GPU (Graphics Processing Unit) to run the training. Trainings are generally faster, but your GCP fees will be approximately three times higher compared to the BASIC machine type. Depending on the training data, there may be cases when this type is only marginally faster or slower than the BASIC machine type.

  • STANDARD:

    Uses multiple machines to run the training.

BLOCK memos

Make notes about this BLOCK.

Skip header lines

Configure the number of header lines to skip in the training and validation CSV files. Set this to 0 if there are no header lines in the CSV files that need to be skipped.
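
To preview which rows remain after skipping header lines, a local check along these lines may help. It is a sketch using Python's standard csv module, not the parser BLOCKS uses; the function name and sample data are made up for illustration.

```python
import csv
import io
from itertools import islice


def read_data_rows(csv_text, skip_header_lines=0):
    """Return the CSV rows that remain after skipping the leading header rows.

    Hypothetical helper for illustration; not part of BLOCKS.
    """
    reader = csv.reader(io.StringIO(csv_text))
    # islice drops the first `skip_header_lines` rows and yields the rest.
    return list(islice(reader, skip_header_lines, None))


sample = "sepal_length,label\n5.1,0\n4.9,1\n"
print(read_data_rows(sample, skip_header_lines=1))
# [['5.1', '0'], ['4.9', '1']]
```

With skip_header_lines=0, the header row would be returned as if it were data, which is why the property should match the actual number of header lines in the files.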

Ignore error lines

Configure the number of rows with errors to ignore.
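
One way to think about this setting is as an error budget. The sketch below assumes that a row counts as an error when its column count is wrong and that loading fails once the budget is exceeded; this page does not define what BLOCKS treats as an error row, so both points are assumptions, and the function name is hypothetical.

```python
def load_rows(rows, expected_columns, max_error_rows=0):
    """Keep well-formed rows, tolerating up to max_error_rows bad rows.

    Hypothetical helper for illustration; not part of BLOCKS.
    """
    good_rows = []
    error_count = 0
    for row in rows:
        # Assumption: a row is an "error row" if its column count is wrong.
        if len(row) != expected_columns:
            error_count += 1
            if error_count > max_error_rows:
                raise ValueError(
                    f"{error_count} error rows exceed the limit of {max_error_rows}"
                )
            continue
        good_rows.append(row)
    return good_rows


rows = [["1", "2"], ["3"], ["4", "5"]]
print(load_rows(rows, expected_columns=2, max_error_rows=1))
# [['1', '2'], ['4', '5']]
```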

Epochs

Configure the number of epochs (the number of training cycles on the training set).

Batch size

Configure the number of records of training data to use per training loop step.
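
Epochs and batch size together determine how many training loop steps run. The following sketch shows the usual arithmetic, assuming the final, smaller batch of each epoch is still processed; the function name is hypothetical and BLOCKS may count steps differently.

```python
import math


def training_steps(num_records, batch_size, epochs):
    """Return (steps per epoch, total steps) for a training run.

    Hypothetical helper for illustration; not part of BLOCKS.
    """
    # Assumption: a final partial batch still counts as one step.
    steps_per_epoch = math.ceil(num_records / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


print(training_steps(num_records=1000, batch_size=32, epochs=10))  # (32, 320)
```

Here 1,000 records with a batch size of 32 give 32 steps per epoch (31 full batches and one partial batch), or 320 steps over 10 epochs.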

Patience epochs

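Configure the number of epochs used as the patience for early stopping (typically, the number of epochs the training may continue without improvement before it is stopped).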
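
The general idea of patience-based early stopping can be sketched as follows. This shows the common technique of stopping after a number of consecutive epochs without a new best validation loss; it is an assumption that BLOCKS follows the same rule, and the function name and loss values are made up for illustration.

```python
def stopping_epoch(validation_losses, patience):
    """Return the 1-based epoch at which training would stop.

    Hypothetical helper for illustration; not part of BLOCKS.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses, start=1):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # patience exhausted: stop early
    return len(validation_losses)  # ran through every epoch


losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
print(stopping_epoch(losses, patience=3))  # 6
print(stopping_epoch(losses, patience=5))  # 7
```

In this example a patience of 3 stops at epoch 6 and never reaches the better result at epoch 7, while a patience of 5 does. A larger value trades longer training for a lower risk of stopping too soon.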

Save checkpoints epochs

Configure the interval, in epochs, at which checkpoints are saved (a checkpoint stores the model's parameters at that point in the training).
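
If the Save checkpoints epochs property is read as an interval, the epochs at which parameters are saved could be listed as in the sketch below. The interval interpretation and the function name are assumptions; this page does not specify whether a checkpoint is also saved at the final epoch.

```python
def checkpoint_epochs(epochs, save_every):
    """Return the 1-based epochs at which a checkpoint would be saved.

    Hypothetical helper for illustration; not part of BLOCKS.
    """
    if save_every < 1:
        raise ValueError("save_every must be at least 1")
    # Assumption: a checkpoint is saved every `save_every` epochs.
    return [epoch for epoch in range(1, epochs + 1) if epoch % save_every == 0]


print(checkpoint_epochs(epochs=10, save_every=3))  # [3, 6, 9]
```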