Image classification with BLOCKS Machine Learning

This page explains how to use BLOCKS Machine Learning’s image classification to build a system that judges whether a photo shows a cat or a dog.

Overview of dog or cat image classification

To do image classification with BLOCKS Machine Learning, we’ll use an ML Board and a Big Data Board. We train the machine learning model on the ML Board and make predictions (judging whether images show cats or dogs) on the Big Data Board. Images must be in JPEG format for BLOCKS Machine Learning.

Image classification Boards overview

This document will not explain the basics of using BLOCKS. Please refer to the Basic Guide for more details on getting started with BLOCKS.

Getting started

If you do not yet have a BLOCKS account, refer to the Trial Guide to register for a free trial.

We’ve prepared all the data you need to try out BLOCKS Machine Learning right away with this guide. Each item is explained in the table below.

Data Explanation
Sample data

This download contains the image files to be used for our Machine Learning image classification. There are separate files for both training and prediction.

  1. Download the sample data

    Click the link on the left to download the sample data. The sample data is a ZIP file containing multiple folders and images.

  2. Extract the downloaded file

    Extract the ZIP file. The extracted folders and files will be organized as shown in the image below.

    Sample data folder structure

    The image files used for training are sorted into a separate folder for each class. Each folder’s name is used as the identifier for its class. All image files must be in JPEG format.

    Since this example classifies images as dogs or cats, the files are separated by type into a folder named dog and a folder named cat. These folder names (dog and cat) will be used as the identifier for the classes.

  3. Upload to Google Cloud Storage (GCS)

    For image classification, BLOCKS Machine Learning reads files from GCS. Refer to Uploading files to GCS to upload the blocks_ml_image_example folder into GCS.
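Before uploading, it can help to confirm locally which class identifiers the folder layout will produce. The following is a minimal sketch, not part of BLOCKS: the function name count_images_per_class is hypothetical, and it assumes that files ending in .jpg or .jpeg are the JPEG images. It lists each class folder and how many JPEG files sit directly inside it.

```python
from pathlib import Path

# Assumption: JPEG files are recognized by these extensions (case-insensitive).
JPEG_SUFFIXES = {".jpg", ".jpeg"}


def count_images_per_class(training_dir):
    """Return {class folder name: number of JPEG files directly inside it}."""
    counts = {}
    for class_dir in sorted(Path(training_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # only folders define classes
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in JPEG_SUFFIXES
        )
    return counts
```

Calling count_images_per_class on the extracted training folder should return one entry per class folder (here, cat and dog). The local path to pass in depends on where you extracted the ZIP file.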

Sample Flow

This sample Flow uses the ML Board prediction (online) BLOCK to make predictions in the simplest way possible.

  1. Download the sample Flow

    Click the link to the left to download the sample Flow.

  2. Import into a Big Data Board

    Import the downloaded sample Flow into a Big Data Board.

    Refer to the Import page of the Basic Guide for details on how to import the sample Flow.

Training

We’ll first train our image classification model using 100 images each of cats and dogs. In BLOCKS, we perform these trainings on an ML Board.

Follow the steps below to create the ML Board.

  1. Click the Create Board button from the Board list screen.

    Create Board
  2. Select ML Board.

    Select ML Board
  3. Select Image classification type.

    Select image classification type
  4. Enter a name for the Board.

    Set a name for the image classification Board.

Those using the free trial or the Self-Service Plan will see the following screens. Follow the instructions on each.

  1. GCP service account settings
  2. Storage settings

Finally, review the confirmation screen and click Finish to create the ML Board.

We’ll now use our new ML Board to carry out a training. Click the “Start Training” button from the ML Board’s details screen.

Image classification start training screen

We will fill out the settings for our training with the information shown in the chart below.

Item Details
Training name

Assign a name for this training.

For example, first training.

Training data upload
(Full Service Plan users only)

For Full Service Plan users, the Google Cloud Storage (GCS) location to which the training data should be uploaded is shown here in the format gs://BUCKETNAME.

Clicking the link will open the Google Cloud Console in another tab where you can access GCS using the Google account you have registered in the GCP access section of the Project settings screen.

For this sample, we already uploaded the training data in the Getting started section, so we won’t use this link now.

Image folder

Designate the folder containing images for training that we uploaded.

We’ll enter gs://my-bucket/blocks_ml_image_example/training/ here.

Please replace the my-bucket portion of the URL with the bucket name from your own GCS environment.

The image files used for training are sorted into a separate folder for each class. Each folder’s name is used as the identifier for its class. All image files must be in JPEG format.

Since this example classifies images as dogs or cats, the files are separated by type into a folder named dog and a folder named cat. These folder names (dog and cat) will be used as the identifier for the classes.

Max. time until timeout (minutes)

Set the maximum amount of time that the training is allowed to take.

We’ll leave this at the default value of 180 (3 hours).

Max. number of trials

Set the maximum number of trials to run as either 1 or 10.

We’ll leave this as the default value, 10.

The approximate training time can be calculated as (Max. time until timeout) × (Max. number of trials). The actual time may be a bit longer due to additional/indirect processing times.
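As a quick worked example of the estimate above, the sketch below plugs in this guide’s default values (180 minutes and 10 trials). The function name approx_training_minutes is hypothetical, and the result is only the arithmetic of the stated formula, not a measured duration.

```python
def approx_training_minutes(max_minutes_until_timeout, max_trials):
    """Approximate training time: (Max. time until timeout) x (Max. number of trials)."""
    return max_minutes_until_timeout * max_trials


# Defaults used in this guide: 180 minutes and 10 trials.
minutes = approx_training_minutes(180, 10)
print(minutes, "minutes =", minutes / 60, "hours")
```

With the defaults, this works out to 1800 minutes (30 hours). Setting the maximum number of trials to 1 gives 180 minutes.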

Machine type

Select the type of machine to use for the training.

We’ll leave this as the default value, BASIC.

  • BASIC: Uses the standard machine to run the training.
  • BASIC GPU: Runs the training using a GPU (Graphics Processing Unit) for generally faster results than the BASIC type. However, GCP fees will be approximately three times higher. Depending on the training data, the speed may not be significantly faster, or may be slower in some cases, compared to the BASIC type.

Explanation (optional)

Enter an explanation for this training (optional).

We’ll leave this blank this time.

Click the Start button and the new training will begin. You can confirm the training’s status from the training list.

Image classification ML Board details screen

Once complete, the training’s status will change to “succeeded” and an Apply button will be shown on the right side. Click this Apply button to use the training to make predictions on a Big Data Board.

Predictions

We’ll perform predictions on a Big Data Board. The Flow we’ll use for this example is as shown below.

Image classification predictive Flow

Predictions are made using the ML Board prediction (online) BLOCK. This BLOCK makes predictions with an ML Board, using prediction input data that has been stored in a variable. For this Flow, we’ll use a Construct Object BLOCK to store information about the image files we want to classify in a variable.

Besides the Construct Object BLOCK, there are several other ways to set prediction input data. Some of these are explained at Methods for making predictions with the ML Board prediction (online) BLOCK. Select whichever method is best for your situation when actually using this BLOCK.

The table below shows each BLOCK’s property settings (only those that are important or changed from the default). The sample Flow we’ve provided also explains each BLOCK’s operation in its BLOCK memos property, but those memos are left out of the table.

BLOCK (Category) | Property | Value
Construct Object (Basic) | Results variable | _
Construct Object (Basic) | Data | Image classification prediction data sample. Please replace the my-bucket portion of the URL with the bucket name from your own GCS environment.
ML Board prediction (online) (Machine Learning) | GCP service account | If you have multiple GCP service accounts, select the service account you would like to use with this BLOCK here.
ML Board prediction (online) (Machine Learning) | ML Board | Select the ML Board that we just created.
ML Board prediction (online) (Machine Learning) | Input variable | _.data
ML Board prediction (online) (Machine Learning) | Output variable | _
Output to log (Basic) | Output variable | _

Click the button within the Start of Flow BLOCK’s properties to execute the Flow.

If the Flow executes successfully, results similar to those shown below will be output to the Logs section at the bottom of the screen. To see this log, click the gray bar titled Logs at the bottom of the screen, then click the Succeeded status link.

{
  "predictions": [
    {
      "labels": [
        "cat",
        "dog"
      ],
      "score": [
        1.0,
        3.764146683238323e-08
      ],
      "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_01.jpg",
      "label": "cat"
    },
    {
      "labels": [
        "cat",
        "dog"
      ],
      "score": [
        7.805689392625936e-07,
        0.9999991655349731
      ],
      "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_02.jpg",
      "label": "dog"
    }
  ]
}

Each set of prediction results contains the following: "labels", "score", "key", and "label". The BLOCK returns one set of prediction results for each image. The example above includes the set of results for the first image from lines 3 through 14, and the second image from lines 15 through 26.

See below for explanations of "labels", "score", "key", and "label":

Name Explanation
"labels"

A list of the possible types for the classification. This example has two possible types: "cat" and "dog".

The order of the "labels" list matches with the order of the "score" list.

"score"

The certainty for predicting each type. The following explains the scores for the first set:

  • "cat" corresponds with the "score" 1.0. The model is nearly 100% certain that the image is of a cat.
  • "dog" corresponds with the "score" 3.764146683238323e-08. The model is about 0.000004% certain that the image is of a dog.
"key" The GCS URL of the image used for the prediction. If you designated a "key" value when making the prediction, it will be output here.
"label" The predicted result.

The above results can also be expressed as follows:

Image file Prediction results
Image Classification example cat
sample_01.jpg
  • About 100% certainty that the image is a cat.
  • About 0.000004% certainty that the image is a dog.
Image Classification example dog
sample_02.jpg
  • About 99.99991655349731% certainty that the image is a dog.
  • About 0.00008% certainty that the image is a cat.
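The percentages for each image can be derived from the raw output by pairing each entry in "labels" with the entry at the same position in "score" and multiplying by 100. The sketch below does this for the sample output shown earlier. It is an illustration only: the function name summarize is hypothetical, and it assumes you have copied the logged JSON into a string.

```python
import json

# The sample output from the Logs section, copied into a string.
RESULT_JSON = """
{"predictions": [
  {"labels": ["cat", "dog"],
   "score": [1.0, 3.764146683238323e-08],
   "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_01.jpg",
   "label": "cat"},
  {"labels": ["cat", "dog"],
   "score": [7.805689392625936e-07, 0.9999991655349731],
   "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_02.jpg",
   "label": "dog"}
]}
"""


def summarize(result):
    """Return (key, predicted label, {label: percentage}) for each prediction."""
    rows = []
    for prediction in result["predictions"]:
        # "labels" and "score" are in the same order, so zip pairs them up.
        percentages = {
            label: score * 100
            for label, score in zip(prediction["labels"], prediction["score"])
        }
        rows.append((prediction["key"], prediction["label"], percentages))
    return rows


for key, label, percentages in summarize(json.loads(RESULT_JSON)):
    print(key.rsplit("/", 1)[-1], "->", label)
    for name, pct in percentages.items():
        print(f"  {name}: {pct:.6f}%")
```

In both sample results, the value of "label" is the entry in "labels" with the highest score.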

Summary

Training an image classification model with BLOCKS is as simple as separating images into folders.

A few final notes regarding the image files used with BLOCKS image classification:

  • Images with extreme aspect ratios may not be classified properly.
  • Images used for training should be separated by type into folders.
  • All images within the folders should be JPEG format. Do not create subfolders within the class folders.
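The folder and format notes above can be checked locally before uploading. The following is a rough sketch, not a BLOCKS feature: find_layout_problems is a hypothetical helper, it judges JPEG format by file extension only (.jpg or .jpeg, an assumption), and it does not check aspect ratios because that would require reading the image data.

```python
from pathlib import Path

# Assumption: JPEG files are recognized by these extensions (case-insensitive).
JPEG_SUFFIXES = {".jpg", ".jpeg"}


def find_layout_problems(training_dir):
    """Return a list of messages for subfolders and non-JPEG files in class folders."""
    problems = []
    for class_dir in sorted(Path(training_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # only class folders are inspected
        for entry in sorted(class_dir.iterdir()):
            relative = f"{class_dir.name}/{entry.name}"
            if entry.is_dir():
                problems.append(f"subfolder inside class folder: {relative}")
            elif entry.suffix.lower() not in JPEG_SUFFIXES:
                problems.append(f"not a JPEG by extension: {relative}")
    return problems
```

An empty list means that no subfolders or non-JPEG extensions were found. It does not guarantee that every file is a valid JPEG image.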