ML Board Help

Introduction

This page explains each of the ML Board creation and training screens. Clicking a help link on any of those screens opens the corresponding section of this page.

For more detailed information on how to use the ML Board, refer to the following pages:

What is an ML Board?

ML Boards are a MAGELLAN BLOCKS feature designed to make machine learning simple and accessible for everyone.

There are two basic steps to machine learning with BLOCKS: training and prediction.

  1. During the training step, past data is used to train the predictive model.

  2. In the prediction step, the training results are used to make predictions for future data.

The ML Board takes care of the training step, while the ML Board Predict BLOCK on a Bigdata Board takes care of the prediction step.

The training step is referred to simply as “training” within the ML Board. The training data used during this step consists of a training set and a validation set. These are used to train and then optimize the predictive model, leading to more accurate results (the “training results”).

The training set and validation set are created by splitting the prepared data into two parts.
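As a rough illustration of such a split, the following sketch shuffles a list of rows and sets part of it aside as the validation set. The 80/20 ratio, the fixed seed, and the function name are assumptions made for this example; this page does not prescribe a particular ratio.

```python
import random


def split_rows(rows, validation_fraction=0.2, seed=0):
    """Shuffle rows and split them into (training_set, validation_set)."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical data: two input variables followed by the results value.
rows = [[i, i * 0.5, i % 2] for i in range(10)]
training_set, validation_set = split_rows(rows)
print(len(training_set), len(validation_set))
```

Each of the two lists can then be written to its own CSV file in the same format.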

The ML Board uses Google Cloud Machine Learning Engine (Cloud ML Engine) to implement its machine learning functions.

GCP service account settings help

This section applies to Self-Service Plan users only.

Set the GCP service account and enable required APIs on this screen.

[Image: GCP service account settings screen]

Select a GCP service account

Since the ML Board works by creating an environment within the user’s GCP project, it needs permission to use that project. A GCP service account key file provides this permission.

Select which GCP service account to use with the ML Board from the drop-down list.

Enable APIs

Do the following if no checkmark appears next to an API’s “Check” button:

  1. Click on the API’s link.
  2. Click the “Enable” button at the top of the Google API Console screen.
  3. Once the “Enable” button changes to “Disable”, close the Google API Console and return to the ML Board setup screen.

Once finished with the above, click the API’s “Check” button and confirm that a checkmark appears for the API.

The checkmark may take some time to appear. If it does not appear, click the “Check” button again and wait.

Google Cloud Machine Learning Engine Settings

This section applies to Self-Service Plan users only.

This screen contains settings required for using the Google Cloud Machine Learning Engine service. These only need to be configured once per GCP project that uses ML Boards.

[Image: Google Cloud Machine Learning Engine settings screen]

  1. Click the “Google Cloud Console” link.

  2. Click the “Activate Google Cloud Shell” button in the upper right of the Google Cloud Platform dashboard.

  3. If you get the following screen, click the “Start Cloud Shell” button.

    [Image: Google Cloud Shell start screen]

  4. Enter gcloud ml-engine init-project at the prompt at the bottom of the Google Cloud Shell screen (the black area) and press Return.

    Respond Y to the message that appears.

  5. Click the X button in the upper-right of the Google Cloud Shell to close it.

  6. Close the Google Cloud Console and return to BLOCKS.

Storage Settings

This section applies to Self-Service Plan users only.

This screen contains settings for the Google Cloud Storage (GCS) bucket and directory that will contain the training results.

[Image: Storage settings screen]

Select GCS bucket

Prepare a bucket for exclusive use by ML Boards, then select this bucket from the menu on this screen. For best results, use the following settings when creating buckets for ML Boards:

| Option | Value |
| --- | --- |
| Default storage class | Regional |
| Location | us-central1 |

GCS directory settings

One bucket can support multiple ML Boards by using a separate directory within it for each Board. Directories do not need to be created in advance; a new one is created automatically with the name entered in this setting.

Training Data Settings

Configure settings for the training data on this screen.

[Image: Training data settings screen]

Training data should be prepared as CSV files with commas as field separators.

  • To be used for training, the data must consist of at least one “input variable” and one “results value”. For a classification model, the results value is the “answer value”. For a regression model, it is the “actual results”.
  • Each row must hold one set of input data together with its results value.
  • Order each row with the input data first, followed by the results value.
  • The results values must be of the numerical value type.
  • The training set and validation set must be formatted identically.
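The formatting rules above can be checked before uploading a file. The sketch below is one possible check, not part of BLOCKS: it verifies that every row has the same number of fields and that the last field (the results value) parses as a number. The function name and the sample data are hypothetical.

```python
import csv
import io


def check_training_csv(text, n_columns):
    """Return a list of problems found in comma-separated training data.

    n_columns is the expected number of fields per row: every input
    field plus the results value in the last position.
    """
    problems = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != n_columns:
            problems.append(
                f"line {line_no}: expected {n_columns} fields, got {len(row)}"
            )
            continue
        try:
            float(row[-1])
        except ValueError:
            problems.append(
                f"line {line_no}: results value {row[-1]!r} is not numerical"
            )
    return problems


# Hypothetical sample: lines 3 and 4 break the rules on purpose.
sample = "5.1,3.5,A,0\n4.9,3.0,B,1\n6.2,oops\n5.0,3.4,AB,yes\n"
for problem in check_training_csv(sample, n_columns=4):
    print(problem)
```

Running the same check on both the training set and the validation set with the same n_columns helps confirm that the two files share one format.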

Input variable settings

Enter information for each input variable. The data for each input variable is referred to (from left to right) as “Item 1”, “Item 2”, “Item 3”, etc.

Do not enter information for the results value.

Click the “Add another item” button and enter a name and the type information for each input variable.

| Setting | Explanation |
| --- | --- |
| Item name | Enter a name for each item. Letters, numbers, and the underscore symbol (_) may be used. |
| Type | Designate each item’s data type. The four supported types are: numerical value, month, day, and string. Refer to the table below for details about each type. |
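The item name rule can be checked with a short pattern match. This is a minimal sketch that assumes “letters” means ASCII letters; this page does not say whether other alphabets are accepted, and the function name is hypothetical.

```python
import re

# Assumption: "letters" means ASCII letters only.
ITEM_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_item_name(name):
    """Return True if name uses only letters, digits, and underscores."""
    return ITEM_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["temperature", "day_of_week", "item 1", "price-usd"]:
    print(candidate, is_valid_item_name(candidate))
```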

Supported types:

| Type | Explanation |
| --- | --- |
| Numerical value | Integers or decimal numbers. When the numerical value type is selected, the number of dimensions can also be configured. If a single item contains several numerical values in a row, the dimensions setting is the count of those values. For example, given the data 98,1.3,0,"A", choose 2 for the dimensions setting to treat the 98,1.3 portion as one item. |
| Month | Integers indicating the month. The range can be either 0–11 or 1–12. |
| Day | Integers indicating the day of the week. The range is 0–6. |
| String | String-type data. Select either “Keyword list” or “Approximate number”. |

| String type-specific settings | Explanation |
| --- | --- |
| Keyword list | Select this option when the strings take a known set of values. Enter these values as a list separated by commas. For example, if the item contains blood types, you might enter “A, B, O, AB” for this setting. |
| Approximate number | Select this option when the number of distinct values in the strings is not known. Enter the approximate number of distinct values you expect. When the values are known, you can still enter a number here instead of writing out a keyword list, if preferred. |
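To make the dimensions setting concrete, the sketch below groups the fields of the example row 98,1.3,0,"A" into items. The item names and the ITEMS structure are hypothetical, and the code only illustrates how fields are counted; it does not describe how the ML Board parses data internally.

```python
import csv
import io

# Hypothetical item settings: (item name, type, dimensions).
# Only numerical items are given more than one dimension here.
ITEMS = [
    ("reading", "numerical", 2),
    ("flag", "numerical", 1),
    ("blood_type", "string", 1),
]


def group_fields(fields, items):
    """Group the leading CSV fields of one row into named items."""
    grouped = {}
    position = 0
    for name, item_type, dimensions in items:
        values = fields[position:position + dimensions]
        if len(values) != dimensions:
            raise ValueError(f"row is too short for item {name!r}")
        if item_type == "numerical":
            values = [float(v) for v in values]
        # Single-dimension items are stored as a bare value.
        grouped[name] = values if dimensions > 1 else values[0]
        position += dimensions
    return grouped


row = next(csv.reader(io.StringIO('98,1.3,0,"A"')))
print(group_fields(row, ITEMS))
```

With these settings, the first two fields form one two-dimensional item, so the row holds three items rather than four.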

Number of classes

When creating a classification model, enter the number of classes in the results values here.
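If the number of classes is not known in advance, it can be estimated from the training data. The following sketch counts the distinct results values in the last field of each row. It assumes each distinct value is one class; this page does not state whether the ML Board expects specific label values, and the function name is hypothetical.

```python
import csv
import io


def count_classes(text):
    """Count the distinct results values in the last field of each row."""
    classes = set()
    for row in csv.reader(io.StringIO(text)):
        if row:
            classes.add(row[-1].strip())
    return len(classes)


# Hypothetical sample: two input variables followed by a class label.
sample = "5.1,3.5,0\n4.9,3.0,1\n6.2,2.9,2\n5.0,3.4,0\n"
print(count_classes(sample))
```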

Trainings Screen

The following actions can be performed on this screen:

  • Start a training
  • See a list of trainings
  • Check a training’s details
  • Check information about the ML Board
  • Delete the ML Board

[Image: ML Board details screen]

Start a training

Click the “Start Training” button to create a new training.

Here you can configure the required information for a new training and start it.

Training list

A list will be displayed when there is at least one training. The following information is contained in the training list:

| Information | Explanation |
| --- | --- |
| Training name | Configured in the Start Training settings. |
| Started | Shows the date and time when the training began. |
| Finished | Shows the date and time when the training finished. |
| Status | Shows the training’s status. Possible statuses are: Preparing, Running, Succeeded, Failed, and Stopped. |
| RMSE/Accuracy | Shows the results of evaluating the training. For the classification model, this refers to its accuracy in selecting the correct class. For the regression model, it refers to its RMSE (Root Mean Square Error), a measure of the difference between the actual results and the predicted results. |
| Details | Displays a screen where the training’s details can be confirmed. |
| Actions | Apply a training (set it to be used for making predictions), or stop a training that is currently running. Only one training can be applied at a time. |
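For reference, the two evaluation measures can be computed as follows. This sketch uses the standard textbook definitions of accuracy and RMSE. This page does not state exactly how or on which data the ML Board computes them, so treat the code as an illustration of the concepts only.

```python
import math


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual class."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical values: 3 of 4 classes correct; errors of 1, 0, and 2.
print(accuracy([0, 1, 2, 1], [0, 1, 1, 1]))
print(rmse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))
```

A higher accuracy is better for classification, while a lower RMSE is better for regression.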

Show training details

You can check a training’s details by clicking its “Show details” link.

| Item | Explanation |
| --- | --- |
| Training name | The name configured in the Start Training screen’s settings. |
| Started | The date and time when the training began. |
| Finished | The date and time when the training finished. |
| Status | Shows the training’s status. Possible statuses are: Preparing, Training, Succeeded, and Failed. |
| Accuracy/RMSE | Shows the results of evaluating the training. For the classification model, this refers to its accuracy in selecting the correct class. For the regression model, it refers to its RMSE (Root Mean Square Error), a measure of the difference between the actual results and the predicted results. |
| Training description | Shows the explanation entered in the Start Training screen’s settings. |
| Settings | Shows details of what was configured in the training data settings. |

Confirm the ML Board’s settings

Click the “Setting info” button to bring up information about the ML Board’s settings.

| Item | Explanation |
| --- | --- |
| Board name | The Board name configured on the Create ML Board screen. |
| Type | The type selected on the Create ML Board screen. |
| GCP service account | The ID of the GCP service account associated with the ML Board. |
| Model name | The model name of the training results. |
| Training data settings | The information configured on the Training Data Settings screen. |

Delete Board

Delete the ML Board by clicking the “Delete Board” button at the bottom of the screen.

Start Training

Configure settings necessary to create and start a new training on this screen.

[Image: Start Training screen]

The settings are as follows:

| Setting | Explanation |
| --- | --- |
| Training name | Assign a name to the new training. |
| Upload training data | The GCS location where the training data will be saved is shown in the following format: “gs://BUCKETNAME”. Clicking the link opens the Google Cloud Console in another tab, where you can access this GCS location. You will need to sign in using a Google account registered in the GCP access section of the Project settings screen. This is only displayed for Full Service Plan users. |
| Training set URL | Designate the GCS URL for the training set (Example: gs://bucketname/filename.csv). The training set must be stored in GCS. |
| Validation set URL | Designate the GCS URL for the validation set (Example: gs://bucketname/filename.csv). The validation set must be stored in GCS. |
| Max. time until timeout | Sets the maximum amount of time that the ML Board will spend on one trial. Set this to 0 for no limit. |
| Max. number of trials | Sets the maximum number of trials that the ML Board will run. Enter a number of 1 or greater. |
| Explanation | Write an explanation for this training. (optional) |
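Before entering the training set and validation set URLs, it can help to confirm that they follow the gs://bucketname/filename.csv form shown above. The following sketch uses a simplified pattern chosen for this example. Google Cloud Storage has additional bucket naming rules that are not checked here, and the function name is hypothetical.

```python
import re

# Assumption: a simplified pattern for gs://BUCKET/OBJECT URLs. Google
# Cloud Storage has further bucket naming rules not checked here.
GCS_URL_PATTERN = re.compile(r"gs://([a-z0-9][a-z0-9._-]*)/(.+)")


def parse_gcs_url(url):
    """Split a gs:// URL into (bucket, object_path) or raise ValueError."""
    match = GCS_URL_PATTERN.fullmatch(url)
    if match is None:
        raise ValueError(f"not a GCS object URL: {url!r}")
    return match.group(1), match.group(2)


print(parse_gcs_url("gs://bucketname/filename.csv"))
```

A URL that names only a bucket, or that uses a scheme other than gs://, raises ValueError in this sketch.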

GCP service charges

The ML Board creates an environment in the user’s GCP project that utilizes various GCP services.

As such, GCP service charges will apply separately from MAGELLAN BLOCKS fees. Applicable charges will vary depending on the service. For details, refer to the pricing page for each of the services used by the ML Board.