Basic Guide

Image classification with BLOCKS Machine Learning

This page demonstrates how to use BLOCKS to create an image classification Machine Learning model that judges if photos are of cats or dogs.

Overview of dog or cat image classification

This guide uses a Model Generator and a Flow Designer. We’ll train our model with the Model Generator, then use the Flow Designer to make predictions, that is, to judge whether images are of cats or dogs. All images must be in JPEG format for BLOCKS Machine Learning.

Image classification Boards overview

This document does not explain the basics of using BLOCKS. Please refer to the Basic Guide for more details on getting started with BLOCKS.

Getting started

Refer to the Trial Guide to register for the BLOCKS free trial if you don’t have an account yet.

We’ve prepared sample data you can use to follow this guide. Each item is explained in the following chart:

Data Explanation
Sample data

This download contains folders of image files for training our model and making predictions.

  1. Download the sample data

    Click the link on the left to download the sample data. The sample data is a ZIP file containing multiple folders and images.

  2. Extract the downloaded file

    Extract the ZIP file. The extracted folders and files will be organized as shown in the image below.

    Sample data folder structure

    The image files for training the model are separated into one folder per class (dogs and cats). All image files must be in JPEG format.

    Since this example classifies images as dogs or cats, the files are separated by type into a folder named dog and a folder named cat. BLOCKS will use these folder names (dog and cat) as the identifiers for the classes.

  3. Upload to Google Cloud Storage (GCS)

    For image classification, BLOCKS reads training files from Google Cloud Storage (GCS). Refer to GCS Explorer for instructions on using the GCS Explorer to upload the blocks_ml_image_example folder into GCS.
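If you would rather script the upload than use the GCS Explorer, the mapping from local files to GCS locations can be sketched in Python. This is only an illustration under stated assumptions: the function names plan_upload and upload are hypothetical, my-bucket is the placeholder bucket name used in this guide, and the optional upload step assumes the google-cloud-storage client library is installed and authenticated. None of this is part of BLOCKS itself.

```python
from pathlib import Path


def plan_upload(local_root: Path, bucket: str) -> dict[Path, str]:
    """Map each local file under local_root to the gs:// URL it would get.

    The top-level folder name is kept, so files land under
    gs://<bucket>/<folder name>/...
    """
    plan = {}
    for path in sorted(local_root.rglob("*")):
        if path.is_file():
            object_name = (Path(local_root.name) / path.relative_to(local_root)).as_posix()
            plan[path] = f"gs://{bucket}/{object_name}"
    return plan


def upload(plan: dict[Path, str], bucket: str) -> None:
    """Upload the planned files (assumes google-cloud-storage is installed)."""
    from google.cloud import storage  # optional dependency, imported lazily

    bucket_obj = storage.Client().bucket(bucket)
    prefix = f"gs://{bucket}/"
    for path, url in plan.items():
        # The object name is the URL without the gs://<bucket>/ prefix.
        bucket_obj.blob(url[len(prefix):]).upload_from_filename(str(path))


if __name__ == "__main__":
    root = Path("blocks_ml_image_example")  # the extracted sample data folder
    if root.is_dir():
        for local_path, url in plan_upload(root, "my-bucket").items():
            print(local_path, "->", url)
```

Printing the plan first makes it easy to confirm that the training images will end up under gs://my-bucket/blocks_ml_image_example/training/ before anything is uploaded.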

Sample Flow

This sample Flow uses the Model Generator prediction (online) BLOCK to make predictions.

  1. Download the sample Flow

    Click the link on the left to download the sample Flow.

  2. Import into a Flow Designer

    Import the downloaded sample Flow into a Flow Designer.

    Refer to Importing and exporting Flows for details on how to import the sample Flow into a Flow Designer.

Training

We’ll train our image classification model using 100 images each of cats and dogs. In BLOCKS, we can train Machine Learning models using the Model Generator service.

Follow the steps below to create a Model Generator.

  1. Select the Model Generator service. If you haven’t created any Model Generators yet, click Start.

    What is a Model Generator screen

  2. If you already have at least one Model Generator, click Add.

    Model Generator list screen

  3. Select the Image classification type.

    Select image classification type

  4. Enter a name for your Model Generator.

    Set a name for the image classification Model Generator.

Those using the free trial or the Self-Service Plan will see the following screens. Follow the instructions on each.

  1. GCP service account settings
  2. Storage settings

Finally, review the confirmation screen and click Finish to create your Model Generator.

We’ll now use our new Model Generator to train a model. Click Start Training in your Model Generator.

Image classification start training screen

Fill out the settings for the training as explained in the following chart:

Item Details
Training name

Assign a name for this training.

For example, first training.

Training data upload
(Full Service Plan users only)

For Full Service Plan users, this shows the location in Google Cloud Storage (GCS) that the training data will be uploaded to. It is shown with the format gs://BUCKETNAME.

Clicking the link will open the Google Cloud Console in another tab where you can access GCS using the Google account you’ve registered in the GCP access section of your project settings.

For this sample, we already uploaded the training data in the Getting started section, so we won’t use this link now.

Image folder

Designate the folder you uploaded that contains the training images.

We’ll enter gs://my-bucket/blocks_ml_image_example/training/ here.

Replace the my-bucket portion of the URL with the bucket name from your own GCS environment.

The training images are separated into one folder per class. Each folder’s name will be used as the identifier for its class. All image files must be in JPEG format.

Since this example classifies images as dogs or cats, the files are separated into a folder named dog and a folder named cat.

Training type

Select the training type from the following:

  • Prioritize speed

    This type takes less time to train the model, but may not get the most accurate results.

  • Prioritize accuracy

    This type takes a longer time to train the model, but usually gets more accurate results.

Max. time until timeout (minutes)

Set the maximum amount of time, in minutes, that the training is allowed to run before it times out.

We’ll leave this at the default value of 180 (3 hours).

Max. number of trials

Set the maximum number of trials to run as either 1 or 10.

We’ll leave this as the default value, 10.

The approximate training time can be calculated as (Max. time until timeout) × (Max. number of trials).
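As a quick check of the arithmetic above, here is a minimal Python sketch. The function name estimated_training_minutes is hypothetical, and the result should be read as a rough figure only, since options such as early stopping may shorten the actual run.

```python
def estimated_training_minutes(timeout_minutes: int, max_trials: int) -> int:
    """Apply the guide's rule: (max. time until timeout) x (max. number of trials)."""
    if max_trials not in (1, 10):
        raise ValueError("the guide only offers 1 or 10 trials")
    return timeout_minutes * max_trials


# Default values used in this guide: 180 minutes and 10 trials.
minutes = estimated_training_minutes(180, 10)
print(f"{minutes} minutes (about {minutes / 60:.0f} hours)")  # 1800 minutes (about 30 hours)
```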

Enable early stopping

Select whether to Enable or Disable the early stopping option.

Enable this option to have the training stop before the end of the specified training time if it determines that accuracy is unlikely to improve. This can reduce unnecessary training time, but be aware that the early stopping decision is not guaranteed to be accurate.

This option is only available if you have selected Prioritize accuracy for your Training type.

Machine type

Select the type of machine that will run the training.

  • Single node (GPU):

    Runs the training using a single machine with a GPU.

  • Distributed nodes (GPU):

    Runs the training using multiple machines with GPUs. This type usually has faster processing speed than the Single node (GPU) type, but GCP usage fees will also increase.

    The Distributed nodes (GPU) machine type is only available if you have selected Prioritize accuracy for your Training type.

Explanation (optional)

Enter an explanation for this training (optional).

We’ll leave this blank this time.
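The dependencies between these settings (early stopping and distributed nodes both require Prioritize accuracy, and the number of trials is either 1 or 10) can be summarized in a short Python sketch. The TrainingSettings class, its field names, and the short option strings are all hypothetical; they only mirror the items in the chart above and do not correspond to any BLOCKS API.

```python
from dataclasses import dataclass


@dataclass
class TrainingSettings:
    """Hypothetical container mirroring the items in the settings chart."""
    training_name: str
    image_folder: str            # e.g. gs://my-bucket/blocks_ml_image_example/training/
    training_type: str           # "speed" or "accuracy"
    machine_type: str            # "single" or "distributed"
    early_stopping: bool
    timeout_minutes: int = 180   # default value stated in the guide
    max_trials: int = 10         # default value stated in the guide
    explanation: str = ""


def check_settings(s: TrainingSettings) -> list[str]:
    """Return a list of problems; an empty list means the rules above hold."""
    problems = []
    if not s.training_name:
        problems.append("Training name is empty.")
    if not s.image_folder.startswith("gs://"):
        problems.append("Image folder should be a gs:// URL.")
    if s.training_type not in ("speed", "accuracy"):
        problems.append("Training type must be 'speed' or 'accuracy'.")
    if s.machine_type not in ("single", "distributed"):
        problems.append("Machine type must be 'single' or 'distributed'.")
    if s.max_trials not in (1, 10):
        problems.append("Max. number of trials must be 1 or 10.")
    if s.early_stopping and s.training_type != "accuracy":
        problems.append("Early stopping requires Prioritize accuracy.")
    if s.machine_type == "distributed" and s.training_type != "accuracy":
        problems.append("Distributed nodes (GPU) requires Prioritize accuracy.")
    return problems


settings = TrainingSettings(
    training_name="first training",
    image_folder="gs://my-bucket/blocks_ml_image_example/training/",
    training_type="accuracy",
    machine_type="single",
    early_stopping=True,
)
print(check_settings(settings))  # []
```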

Click Start to start the training. You can view your training’s status in the training list.

Image classification Model Generator details screen

Once complete, the training’s status will change to “Succeeded” and an Apply button will be shown on the right side. Click Apply to set the training to be usable for predictions from the Flow Designer.

Predictions

We’ll make predictions from a Flow Designer with the following Flow:

Image classification predictive Flow

The Model Generator prediction (online) BLOCK will make the predictions. This BLOCK makes predictions using the model we applied in our Model Generator and input data that has been stored in a variable. For this Flow, we’ll use a Construct object BLOCK to store the information about our prediction images in a variable.

Besides the Construct object BLOCK, there are several other ways to set the prediction input data. Refer to Predicting with the Model Generator prediction (online) BLOCK for more details on some of these methods, and use whichever method best suits your situation.

The chart below shows each BLOCK’s property settings (only those that are important/changed from the default setting). The sample Flow we’ve provided contains explanations for each BLOCK’s operations within the BLOCK memos property, but these will be left out of the chart below.

BLOCK (Category), followed by each Property and its Value:

Construct object (Basic)

  • Results variable: _
  • Data: Image classification prediction data sample

    Please replace the my-bucket portion of the URL with the bucket name from your own GCS environment.

Model Generator prediction (online) (Machine Learning)

  • GCP service account: If you have multiple GCP service accounts, select the service account you would like to use with this BLOCK here.
  • Model: Select the Model Generator that you just created.
  • Input variable: _.data
  • Output variable: _

Output to log (Basic)

  • Output variable: _

Click the button within the Start of Flow BLOCK’s properties to execute the Flow.

If the Flow executes successfully, results similar to those shown below will be output to the logs section at the bottom of the screen. To see this log, click the gray bar titled Logs at the bottom of the screen, then click the Succeeded status link.

{
  "predictions": [
    {
      "labels": [
        "cat",
        "dog"
      ],
      "score": [
        1.0,
        3.764146683238323e-08
      ],
      "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_01.jpg",
      "label": "cat"
    },
    {
      "labels": [
        "cat",
        "dog"
      ],
      "score": [
        7.805689392625936e-07,
        0.9999991655349731
      ],
      "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_02.jpg",
      "label": "dog"
    }
  ]
}

Each set of prediction results contains the following: "labels", "score", "key", and "label". The BLOCK returns one set of prediction results for each image. In the example above, lines 3 through 14 hold the results for the first image and lines 15 through 26 hold the results for the second image.

See below for explanations of "labels", "score", "key", and "label":

Name Explanation
"labels"

A list of the possible types for the classification. This example has two possible types: "cat" and "dog".

The order of the "labels" list matches with the order of the "score" list.

"score"

The certainty for predicting each type. The following explains the scores for the first set:

  • "cat" corresponds with the "score" 1.0. The model is nearly 100% certain that the image is of a cat.
  • "dog" corresponds with the "score" 3.764146683238323e-08. The model is about 0.000004% certain (the score multiplied by 100) that the image is of a dog.

"key"

The GCS URL of the image used for the prediction. If you designated a "key" value when making the prediction, it will be output here.

"label"

The predicted result.

The above results can also be expressed as follows:

Image file Prediction results
Image Classification example cat
sample_01.jpg
  • About 100% certainty that the image is a cat.
  • About 0.000004% certainty that the image is a dog.
Image Classification example dog
sample_02.jpg
  • About 99.9999% certainty that the image is a dog.
  • About 0.00008% certainty that the image is a cat.
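
To read these results programmatically, each "score" can be paired with the entry at the same position in "labels" and multiplied by 100 to get a percentage. The sketch below does this in Python for the sample output shown earlier; the summarize function and the row layout it returns are hypothetical and only meant as an illustration.

```python
import json

# The sample log output from this guide, copied as-is.
RESULT = """
{
  "predictions": [
    {
      "labels": ["cat", "dog"],
      "score": [1.0, 3.764146683238323e-08],
      "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_01.jpg",
      "label": "cat"
    },
    {
      "labels": ["cat", "dog"],
      "score": [7.805689392625936e-07, 0.9999991655349731],
      "key": "gs://my-bucket/blocks_ml_image_example/prediction/sample_02.jpg",
      "label": "dog"
    }
  ]
}
"""


def summarize(result: dict) -> list[dict]:
    """Pair each label with its score and express the score as a percentage."""
    rows = []
    for pred in result["predictions"]:
        scores = dict(zip(pred["labels"], pred["score"]))
        rows.append({
            "file": pred["key"].rsplit("/", 1)[-1],
            "label": pred["label"],
            "percent": {name: f"{value * 100:.6f}%" for name, value in scores.items()},
        })
    return rows


for row in summarize(json.loads(RESULT)):
    print(row["file"], row["label"], row["percent"])
```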

Summary

Training an image classification model with BLOCKS is as simple as separating images into folders.

A few final notes regarding the image files used with BLOCKS image classification:

  • Images with extreme aspect ratios may not be classified properly.
  • Images used for training should be separated by type into folders.
  • All images within the folders must be in JPEG format. Do not create subfolders within the class folders.
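
Before uploading your own training data, the folder rules above can be checked locally with a short Python sketch. The function name check_training_folder is hypothetical, and the check only looks at file extensions (.jpg or .jpeg), not at the actual file contents. Checking aspect ratios would need an image library and is left out.

```python
from pathlib import Path

JPEG_SUFFIXES = {".jpg", ".jpeg"}


def check_training_folder(training_dir: Path) -> list[str]:
    """Return a list of layout problems; an empty list means none were found."""
    problems = []
    entries = sorted(training_dir.iterdir())
    class_dirs = [p for p in entries if p.is_dir()]
    if not class_dirs:
        problems.append("No class folders found.")
    for entry in entries:
        if entry.is_file():
            problems.append(f"{entry.name}: file is not inside a class folder")
    for class_dir in class_dirs:
        for entry in sorted(class_dir.iterdir()):
            if entry.is_dir():
                problems.append(f"{class_dir.name}/{entry.name}: subfolder inside a class folder")
            elif entry.suffix.lower() not in JPEG_SUFFIXES:
                problems.append(f"{class_dir.name}/{entry.name}: not a JPEG file")
    return problems


if __name__ == "__main__":
    folder = Path("blocks_ml_image_example/training")  # sample data location
    if folder.is_dir():
        for problem in check_training_folder(folder):
            print(problem)
```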