Image classification with BLOCKS Machine Learning

Introduction

This page explains how to use the BLOCKS machine learning service for image classification. Some use cases for image classification systems include analyzing products for faults and diagnosing diseases from medical images.

In this tutorial, we will train a model to classify images of cats and dogs, then test using this model to classify new images.

Dog or cat classification overview

General overview of steps

We’ll mainly be using the Model Generator and Flow Designer services. The following image shows a general overview of how each of these services will be used for image classification:

Overview of BLOCKS services for image classification

We’ll start by training a model in the Model Generator. During this training, we’ll give the Model Generator lots of pictures of cats and dogs. It then “learns” the characteristics of the images that determine which are of cats and which are of dogs. We call the results of this training a model or trained model.

Once we’ve trained a model, we’ll use a Flow Designer to make predictions (that is, judge whether images are of cats or dogs) for new images that weren’t used in the training.

BLOCKS supports JPEG, PNG, GIF, and BMP image files. However, Model Generator trainings with the “prioritize speed” setting only support JPEG files.

For more information about training types, refer to Model Generator Help.

Trying out image classification

Before starting

We recommend using Google Chrome for this tutorial. You can also use Firefox, but one feature (explained in more detail later) is only available in Google Chrome.

We’ve prepared data that you can use for this tutorial. As mentioned previously, you will need data for the Model Generator to train a model that can classify images of cats and dogs. Once the model is trained, you will also need images for predictions. You can download the sample data for this tutorial from the following link:

Data Explanation
Sample Data This download contains folders of image files for training our model and making predictions.

  1. Download the sample data

    Click the link above to download the sample data as a ZIP file. The ZIP file contains multiple folders and files.

  2. Extract the files

    Extract the data from the ZIP file. The extracted data should be organized as shown in the image below:

    Overview of the sample data folders

    The training data images are separated into folders by class or label (in this case, dog and cat). The Model Generator uses the folder names from the training data as the class labels, so we’ve named the folders with easily understood names.

    Since this tutorial will create a model that classifies images as either cats or dogs, we’ve organized our training images into a folder named dog and a folder named cat and placed 100 images of each into the folders.

    All of the images are JPEG files.

  3. Upload the data to Google Cloud Storage (GCS)

    During the training, the Model Generator will read the training data from GCS. We’ll use the GCS Explorer tool in BLOCKS to upload the folder we extracted (blocks_ml_image_example) to GCS by doing the following.

    First, sign in to BLOCKS.

    Opening the GCS Explorer
    1. Click the menu icon in the global navigation bar.
    2. Click GCS Explorer (beta).
    Selecting the GCS service account and bucket in the GCS Explorer
    1. Select a GCP service account.
    2. Select the bucket to upload the data into. We’re using a bucket that ends with -data.

    Users on the Self-Service Plan (including the free trial) can choose to have default buckets, including a -data bucket, created automatically when creating a BLOCKS project.

    If you do not have any buckets, you can create one from the GCP service accounts section of the project settings menu.

    Creating a bucket from the project settings menu

    You can select another bucket if you have already created one. However, its Storage Class must be Regional and its Location must be us-central1.

    Uploading a folder to GCS
    1. Click Upload Folder.

    The upload folder function of the GCS Explorer is not available in Firefox. If you are using Firefox, use the create folder function of the GCS Explorer to recreate the folder structure of the downloaded data in GCS, then upload the image files into those folders. For more information on uploading files in the GCS Explorer, refer to Uploading files to GCS.

    Selecting the folder to upload
    1. Select the blocks_ml_image_example folder you extracted.
    2. Click Upload.
    Confirming your upload
    1. Click Upload.
    The data uploaded into GCS

    It will take a bit of time for the sample data folder to finish uploading.

    Once finished, your training and prediction data will be ready to use.
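
The folder layout described in step 2 can be checked locally before uploading. The following is a minimal sketch and not part of BLOCKS: the function name summarize_training_folder and the set of extensions treated as JPEG are assumptions made for illustration. It counts the images in each label folder and raises an error for anything the tutorial's guidelines rule out (a subfolder inside a label folder, or a non-JPEG file, which matters for trainings that use the “prioritize speed” setting).

```python
from pathlib import Path

# Extensions treated as JPEG in this sketch (an assumption, not a BLOCKS rule).
JPEG_EXTENSIONS = {".jpg", ".jpeg"}


def summarize_training_folder(training_dir):
    """Return {label: image_count} for a folder laid out as one subfolder per label.

    Raises ValueError if a label folder contains a subfolder or a non-JPEG file.
    """
    counts = {}
    for label_dir in sorted(Path(training_dir).iterdir()):
        if not label_dir.is_dir():
            continue  # ignore stray files next to the label folders
        count = 0
        for entry in label_dir.iterdir():
            if entry.is_dir():
                raise ValueError(f"subfolder not allowed in label folder: {entry}")
            if entry.suffix.lower() not in JPEG_EXTENSIONS:
                raise ValueError(f"not a JPEG file: {entry}")
            count += 1
        # The folder name is what the Model Generator uses as the class label.
        counts[label_dir.name] = count
    return counts


if __name__ == "__main__":
    # Assumes the ZIP file was extracted into the current directory.
    root = Path("blocks_ml_image_example") / "training"
    if root.is_dir():
        for label, n in summarize_training_folder(root).items():
            print(f"{label}: {n} images")
```

For the sample data, this should list the cat and dog folders with 100 images each, and the number of labels it reports is the value to enter later for the Number of labels setting.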

Creating a Model Generator

We’ll now create a Model Generator where we can train a model using the images we just uploaded:

Switching to the Model Generator
  1. Click the menu icon in the global navigation bar.
  2. Click Model Generator.

A screen titled What is the Model Generator? will appear if you haven’t created any Model Generators.

What is the Model Generator?
  1. Click Start.

A message will appear if you do not have enough licenses to create the Model Generator. If you are an admin for your organization, you will see the license purchase screen, where you can purchase an additional Model Generator license to continue. If you are not an admin, you will need to contact your organization’s admins to request that they purchase a license.

The list of Model Generators in your project will appear if any have already been created.

The Model Generator list
  1. Click Add.

A message will appear if you do not have enough licenses to create the Model Generator. If you are an admin for your organization, you will see the license purchase screen, where you can purchase an additional Model Generator license to continue. If you are not an admin, you will need to contact your organization’s admins to request that they purchase a license.

Selecting the type of Model Generator
  1. Click Image classification model.
  2. Click Next.
Naming the Model Generator
  1. Enter a name for the Model Generator.
  2. Click Next.

Free Trial and Self-Service Plan users should follow the directions on the screen to complete the following two steps:

  1. GCP service account settings
  2. Storage settings
Confirming your Model Generator settings
  1. Confirm that your settings are correct and click Finish.

Training a model

Now that the Model Generator is ready, we’ll use our training data to train a model.

The Model Generator details page
  1. Click Start Training.
Starting a new training for an image classification model
  1. Enter a name for the training.
  2. Click the folder icon.
Selecting the training data folder
  1. Click the arrow icon next to the bucket that contains your training data. We’ve used a bucket that ends in -data.
  2. Click the arrow icon next to blocks_ml_image_example.
  3. Click training/.
  4. Click Select.
Starting the training
  1. Click Start.
Checking the status of an in-progress training

You can check on the status of a training while it is running by looking at the training list.

The training for this tutorial should take about two hours, depending on server circumstances. The status will change to Successful if it finishes successfully.

Once this occurs, we need to set our trained model to be available for making predictions. To do this, we will Apply the training.

Applying a trained model to use for predictions
  1. Click the drop-down arrow.
  2. Click Production.
  3. Click Apply.

For more details about applying to production or testing, refer to the Model Generator Help page’s Training list section.

If your training fails, please try running it again. For help determining the reason for a training’s failure, refer to In case of an error.

If the training fails due to a shortage of GCP resources, try running a new training with the Machine type setting set to Single node.

Creating a Flow Designer

With the trained model ready, we can now use it to make predictions in a Flow Designer. We’ll use the Flow Templates feature of the Flow Designer to quickly create a Flow for image classification predictions.

Opening the Flow Designer
  1. Click the menu icon in the global navigation bar.
  2. Click Flow Designer.
What is a Flow Designer?
  1. Click Start.

If you have already created a Flow Designer in your project, you will see the Flow Designer list instead of the “What is a Flow Designer?” screen. In this case, you can click on the name of an existing Flow Designer and use it for the rest of this tutorial. If you have enough licenses and want to use a new Flow Designer, you can click Add in the upper-left corner of the Flow Designer list.

A message will appear if you do not have enough licenses to create the Flow Designer. If you are an admin for your organization, you will see the license purchase screen, where you can purchase an additional Flow Designer license to continue. If you are not an admin, you will need to contact your organization’s admins to request that they purchase a license.

Naming a Flow Designer
  1. Enter a name for the Flow Designer.
  2. Configure the language setting (for logs).
  3. Configure the time zone setting.
  4. Click Create.

Creating a Flow for making predictions

With the Flow Designer ready, we’ll use the Flow Templates menu to create a Flow for making predictions with the model we trained.

Opening a Flow Designer
  1. Click the name of the Flow Designer you will use.

Your Flow Designer will open in a new tab.

Using the Flow Templates button
  1. Click Flow Templates.
Selecting the type of template
  1. Click Image classification prediction.
  2. Click Next.
Naming the Flow that will be created
  1. Enter a name for the Flow.
  2. Click Next.
Configuring settings for the Model Generator prediction BLOCK
  1. Click on the Model Generator that you created for this tutorial. Ours was named Dog or Cat Classification.
  2. Select Batch prediction.
  3. Click Next.
Input data settings for Flow Templates
  1. Click the folder icon.
Selecting the prediction data from GCS
  1. Click the arrow icon for the bucket that contains your prediction data. We used a bucket that ends with -data.
  2. Click the arrow icon for blocks_ml_image_example.
  3. Click prediction.
  4. Click Select.
Finishing configuring input data settings
  1. Click Next.
Configuring where to store the results
  1. Select DataEditor for the storage location.
  2. Select Register as new.
  3. Enter a name for identifying the data in the DataEditor. (Example shown above: Dog or cat prediction results)
  4. Enter the dataset that will store the results. (Example shown above: tutorials)
  5. Enter the table that will store the results. (Example shown above: image_classification_results)

DataEditor is a tool for visualizing data stored in BigQuery. You don’t need any specialized knowledge of BigQuery to use the DataEditor, but you do need to configure the dataset and table that your data will be stored in. If you are familiar with spreadsheet software, you can compare a dataset to a workbook and a table to a single sheet.

Setting the number of labels

Scroll down to see the Number of labels setting.

  1. Set the number of labels to 2.
  2. Click Next.
Selecting the tab for the new Flow
  1. Click Create.
Saving the new Flow
  1. Click Save.

Make sure to click Save after creating the Flow. You won’t be able to execute the Flow to make predictions unless you save. If you close the Flow Designer tab or your web browser without saving, the Flow will be lost.

Making predictions

We can now use the Flow to make predictions.

Executing the Flow
  1. Click the menu icon on your Start of Flow BLOCK (we named this BLOCK Dog or cat prediction).
  2. Click Execute Flow.

We can view the Logs section to check on the status of our Flow as it executes.

Viewing logs from the BLOCK menu
  1. Click View Logs.
Checking on the status of a Flow while it executes
  1. Confirm that the Flow’s status is Running.

The Flow will take a bit of time to run.

Confirming that the Flow executed successfully
  1. Wait until the status changes to Finished.

Once this happens, the Flow has successfully executed.

If the Flow fails to execute successfully, refer to In case of an error for help determining the cause of the error.

Checking the prediction results

When we used the Flow Template to create the Flow, we configured the prediction results to be sent to the DataEditor. To check the results in the DataEditor, switch back from the Flow Designer tab to the BLOCKS tab, which we left on the Flow Designer list page.

Opening the DataEditor
  1. Click the menu icon in the global navigation bar.
  2. Click DataEditor.
The DataEditor table list
  1. Click the name you configured for the results. We used Dog or cat prediction results.
Switching to the table tab
  1. Click the Table tab.
Viewing the data
  1. Click View data.
The results of the prediction in the DataEditor

The following chart explains the meaning of each column in the results data:

Name Explanation
key The GCS URL for the prediction image file.
label The predicted label. Our labels were dog and cat.
score The confidence level for predicting the label shown in the label column. This is shown as a number between 0 and 1, with 1 signifying 100% confidence.
score_(ascending numbers from 0) The confidence level for each specific label. This is shown as a number between 0 and 1, with 1 signifying 100% confidence.

The above results can also be expressed as follows:

Row Image file Prediction results
1 sample_01.jpg (sample cat image for predictions)
  • About 100% certainty that the image is a cat.
  • The possibility that the image is a dog is about 0.000000000007%.
2 sample_02.jpg (sample dog image for predictions)
  • About 100% certainty that the image is a dog.
  • The possibility that the image is a cat is about 0.000000008%.
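
The conversion from result columns to percentages shown above can be sketched in a few lines. This is an illustration only: the function name summarize_prediction, the bucket name in the sample row, and the sample scores are all hypothetical, and the row is assumed to be a plain dictionary with the columns described in the chart (key, label, score, and the numbered score_ columns). The sketch does not assume which numbered score column belongs to which label, since that mapping is not stated here.

```python
def summarize_prediction(row):
    """Summarize one result row, with scores converted from 0-1 to percentages."""
    # The key column is a GCS URL; the last path segment is the image file name.
    file_name = row["key"].rsplit("/", 1)[-1]
    # Collect the per-label columns (score_0, score_1, ...), skipping plain "score".
    per_label = {
        name: value * 100
        for name, value in row.items()
        if name.startswith("score_")
    }
    return {
        "file": file_name,
        "label": row["label"],
        "confidence_percent": row["score"] * 100,
        "per_label_percent": per_label,
    }


if __name__ == "__main__":
    # Hypothetical row; the bucket name and score values are made up.
    sample_row = {
        "key": "gs://example-data/blocks_ml_image_example/prediction/sample_01.jpg",
        "label": "cat",
        "score": 0.98,
        "score_0": 0.98,
        "score_1": 0.02,
    }
    summary = summarize_prediction(sample_row)
    print(summary["file"], summary["label"], summary["confidence_percent"])
```

A score very close to 1 for the predicted label, with the remaining score_ columns near 0, corresponds to the “about 100% certainty” rows in the results above.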

In case of an error

If an error occurs during the Model Generator’s training, you can find the error logs by doing the following:

Training list example with an error
  1. Click on the name of the training whose status is Failed.
Checking the error logs of a training
  1. Click Error logs.
  2. If needed, you can click Copy error logs to copy the logs to your clipboard.

If an error occurs while using the Flow Designer, you can find the error logs by checking the Logs panel.

Checking for errors in the log panel of a Flow Designer

Error messages are shown in red.

It’s often helpful to read the logs before and after an error when attempting to determine its cause (click Show error log details).

If you encounter an error that you cannot solve after several attempts, you can contact the BLOCKS support team by clicking your user icon on the right side of the global navigation bar and selecting Contact Us. For errors in a Model Generator, please copy the entire contents of the error logs and include them as a text file when you send your message.

For errors in a Flow Designer, click the Show error log details checkbox in the Logs panel of the Flow Designer, then copy the logs. You should also export your Flow as a JSON file and include this as an attachment in your message to BLOCKS support.

For more details on contacting BLOCKS support, refer to the Basic Guide: Contact Us page.

Summary

With BLOCKS, you only need to organize image files into labeled folders to get started with image classification.

As a final note, please keep in mind the following about the kinds of images that can be used for image classification in BLOCKS:

  • Images with extreme aspect ratios may not be classified properly.
  • Images used for training should be placed into separate folders by label.
  • Images within the folder can be JPEG, PNG, GIF, or BMP. However, trainings that prioritize speed only support JPEG files.
  • Do not create subfolders within the class folders.