Predicting with the Model Generator prediction (online) BLOCK

There are several ways to provide input data to the Model Generator prediction (online) BLOCK for making predictions. The Model Generator How-To guides (Classification/Regression) explain a method that uses the Construct object BLOCK.

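This page explains four other methods: providing input data as files, from BigQuery, and when externally executing Flows (as application/json or as x-www-form-urlencoded). Each example uses the scenario from the Making predictions section of the Model Generator How-To: Classification guide.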

The various names used on this page (filenames, bucket names, dataset names, table names, etc.) can be freely configured. However, the input data's item names (including "key") should be set as shown.

Providing input data as files

This method uses input data prepared as YAML or JSON format text files.

To use the Model Generator prediction (online) BLOCK, this data must be set to a variable and formatted according to certain specifications.

We will set our data to a variable using the Load to Variable from GCS BLOCK (GCP category). This BLOCK sets a variable as the contents of a file stored in Google Cloud Storage (GCS). When the file's format is YAML or JSON, the variable's data will also be constructed according to that format. This process is shown in the image below.

Input data as a file example

The following example shows how to make predictions using input data prepared as a JSON format text file.

First, create a JSON format text file named iris_predict_data.json as shown below, then upload it to a GCS bucket (bucket name: magellan-sample).

{
  "data": [
    {
      "key": "1",
      "sepal_length": 5.9,
      "sepal_width": 3.0,
      "petal_length": 4.2,
      "petal_width": 1.5
    },
    {
      "key": "2",
      "sepal_length": 6.9,
      "sepal_width": 3.1,
      "petal_length": 5.4,
      "petal_width": 2.1
    },
    {
      "key": "3",
      "sepal_length": 5.1,
      "sepal_width": 3.3,
      "petal_length": 1.7,
      "petal_width": 0.5
    }
  ]
}
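If you would rather generate this file than type it by hand, the following Python sketch builds the same structure from a list of rows. The names ROWS, FIELDS, build_document, and write_input_file are hypothetical helpers introduced only for this illustration; the only requirement taken from this page is that the records sit in a list under "data" and use the item names shown above.

```python
import json

# Feature values copied from the example file on this page.
FIELDS = ("key", "sepal_length", "sepal_width", "petal_length", "petal_width")
ROWS = [
    ("1", 5.9, 3.0, 4.2, 1.5),
    ("2", 6.9, 3.1, 5.4, 2.1),
    ("3", 5.1, 3.3, 1.7, 0.5),
]


def build_document(rows):
    """Turn each row into an object and place the list under "data"."""
    return {"data": [dict(zip(FIELDS, row)) for row in rows]}


def write_input_file(path, rows):
    """Write the document to a local file as indented JSON text."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_document(rows), f, indent=2)


document = build_document(ROWS)
print(json.dumps(document, indent=2))
```

Calling write_input_file("iris_predict_data.json", ROWS) would create the local file, which you could then upload to your GCS bucket by whatever means you normally use.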

Create the following Flow in the Flow Designer:

[Figure: Example Flow for input data as a file]

The Load to Variable from GCS BLOCK reads the contents of the text file. Then, the Model Generator prediction (online) BLOCK makes the predictions. In this example, we send the results to the logs section with an Output to log BLOCK.

Set the properties for each BLOCK as shown below. The table lists only the properties that need to be changed from their default values.

BLOCK                                Property             Value
Load to Variable from GCS            Source file GCS URL  gs://magellan-sample/iris_predict_data.json
                                     File format          JSON
                                     Results variable     _
Model Generator prediction (online)  Model Generator      Select the Model Generator that will be used for the prediction
                                     Input variable       _.content.data
                                     Output variable      _
Output to log                        Variable to output   _

For convenience, the Output variable is the same variable, _, that holds the input data. As a result, the prediction results overwrite the input data.

Once you’ve created the Flow, save your Flow Designer and execute the Flow by clicking the button in the Start of Flow BLOCK's properties menu.

The following is a portion of the resulting log:

{
  "predictions": [
    {
      "score": [
        0.017455093562602997,
        0.7145982980728149,
        0.26794660091400146
      ],
      "key": "1",
      "label": 1
    },
    {
      "score": [
        0.0007236730307340622,
        0.40548425912857056,
        0.5937920212745667
      ],
      "key": "2",
      "label": 2
    },
    {
      "score": [
        0.9445222616195679,
        0.05332513898611069,
        0.0021526541095227003
      ],
      "key": "3",
      "label": 0
    }
  ]
}
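A short Python sketch can help when reading this output. It pairs each key with its label and its highest score, using the values copied from the log above. In this sample, every label equals the position of the largest value in score; treating score as a list of per-class scores is an assumption drawn from that observation, not something this page states. The function names are hypothetical.

```python
# Prediction results copied from the log excerpt above.
result = {
    "predictions": [
        {"score": [0.017455093562602997, 0.7145982980728149, 0.26794660091400146],
         "key": "1", "label": 1},
        {"score": [0.0007236730307340622, 0.40548425912857056, 0.5937920212745667],
         "key": "2", "label": 2},
        {"score": [0.9445222616195679, 0.05332513898611069, 0.0021526541095227003],
         "key": "3", "label": 0},
    ]
}


def summarize(result):
    """Return a (key, label, highest score) tuple for each prediction."""
    return [(p["key"], p["label"], max(p["score"])) for p in result["predictions"]]


def label_matches_top_score(prediction):
    """Check whether the label is the index of the largest score."""
    scores = prediction["score"]
    return scores.index(max(scores)) == prediction["label"]


for key, label, top in summarize(result):
    print(f"key={key} label={label} top_score={top:.3f}")
```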

Providing input data from BigQuery

This method makes predictions using input data prepared as a BigQuery table.

As explained in the previous example, the Model Generator prediction (online) BLOCK requires input data that is stored in a variable. This data must be formatted according to certain specifications.

In this example, we’ll use an Execute query BLOCK to get data from BigQuery and set it as a variable formatted according to the Model Generator prediction (online) BLOCK's specifications.

Preparing input data as a BigQuery table

First, prepare the following table in BigQuery:

Dataset/Table information

Item          Value
Dataset name  samples
Table name    iris_predict_data

Schema

Name          Type    Mode
key           STRING  NULLABLE
sepal_length  FLOAT   NULLABLE
sepal_width   FLOAT   NULLABLE
petal_length  FLOAT   NULLABLE
petal_width   FLOAT   NULLABLE

Data

key  sepal_length  sepal_width  petal_length  petal_width
1    5.9           3.0          4.2           1.5
2    6.9           3.1          5.4           2.1
3    5.1           3.3          1.7           0.5
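Before loading rows into the table, it can be useful to check them against the schema above, for example to catch a key that was entered as a number instead of a string. The following Python sketch is one way to do that locally. SCHEMA and validate_row are hypothetical names for this illustration, and the check is deliberately stricter than the NULLABLE mode: it reports missing values as problems.

```python
# Column names and types taken from the schema on this page.
# Integers are accepted for FLOAT columns; the key must be a string.
SCHEMA = {
    "key": ("STRING", str),
    "sepal_length": ("FLOAT", (int, float)),
    "sepal_width": ("FLOAT", (int, float)),
    "petal_length": ("FLOAT", (int, float)),
    "petal_width": ("FLOAT", (int, float)),
}


def validate_row(row):
    """Return a list of problems found in one row (empty if the row is fine)."""
    problems = []
    for name, (type_name, python_types) in SCHEMA.items():
        if name not in row:
            problems.append(f"missing column: {name}")
        elif not isinstance(row[name], python_types):
            problems.append(f"{name} should be {type_name}")
    for name in row:
        if name not in SCHEMA:
            problems.append(f"unexpected column: {name}")
    return problems


rows = [
    {"key": "1", "sepal_length": 5.9, "sepal_width": 3.0, "petal_length": 4.2, "petal_width": 1.5},
    {"key": "2", "sepal_length": 6.9, "sepal_width": 3.1, "petal_length": 5.4, "petal_width": 2.1},
    {"key": "3", "sepal_length": 5.1, "sepal_width": 3.3, "petal_length": 1.7, "petal_width": 0.5},
]
for row in rows:
    print(row["key"], validate_row(row) or "ok")
```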

Create the following Flow in the Flow Designer:

[Figure: Input data from BigQuery example]

The Execute query BLOCK takes the input data from BigQuery and stores it in a variable. Next, the Model Generator prediction (online) BLOCK uses the data in this variable to make predictions. In this example, we send the results to the logs section using an Output to log BLOCK.

Set each BLOCK's properties as shown below. The chart only contains properties that need to be changed from their default value:

BLOCK                                Property            Value
Execute query                        SQL Syntax          Legacy SQL
                                     Query               (see the query below)
                                     Results variable    _
Model Generator prediction (online)  Model Generator     Select the Model Generator that will be used to make predictions.
                                     Input variable      _
                                     Output variable     _
Output to log                        Variable to output  _

Query:

SELECT
  key,
  sepal_length,
  sepal_width,
  petal_length,
  petal_width
FROM
  samples.iris_predict_data

For convenience, the Output variable is the same variable, _, that holds the input data. As a result, the prediction results overwrite the input data.

Once the Flow is ready, save it and execute it by clicking the button in the Start of Flow BLOCK's properties menu.

The following is a portion of the resulting log:

{
  "predictions": [
    {
      "score": [
        0.017455093562602997,
        0.7145982980728149,
        0.26794660091400146
      ],
      "key": "1",
      "label": 1
    },
    {
      "score": [
        0.0007236730307340622,
        0.40548425912857056,
        0.5937920212745667
      ],
      "key": "2",
      "label": 2
    },
    {
      "score": [
        0.9445222616195679,
        0.05332513898611069,
        0.0021526541095227003
      ],
      "key": "3",
      "label": 0
    }
  ]
}

Providing input data when externally executing Flows (application/json)

This method uses input data sent when externally executing a Flow to make predictions.

External Flow execution refers to using a Web API to execute a Flow. In this method, the HTTP request's Content-Type header is set to application/json and the input data is passed in the request body.

When externally executing a Flow, you can pass data as JSON text and set BLOCKS variables to various values from this data.

For example, you could set the variable var to the value 100 by using a request with its Content-Type header set to application/json and the body of the HTTP request set as the following JSON text:

{"var": 100}

The following JSON text would set the variable _ as prediction input data for Machine Learning:

{
  "_": [
    {
      "key": "1",
      "sepal_length": 5.9,
      "sepal_width": 3.0,
      "petal_length": 4.2,
      "petal_width": 1.5
    },
    {
      "key": "2",
      "sepal_length": 6.9,
      "sepal_width": 3.1,
      "petal_length": 5.4,
      "petal_width": 2.1
    },
    {
      "key": "3",
      "sepal_length": 5.1,
      "sepal_width": 3.3,
      "petal_length": 1.7,
      "petal_width": 0.5
    }
  ]
}

The Flow for this example looks like the following in the Flow Designer:

[Figure: Providing input data when externally executing a flow (new version): Example Flow]

This simple Flow uses a Model Generator prediction (online) BLOCK to make predictions, then outputs the results using an Output to log BLOCK.

The following chart shows the properties for each BLOCK. It only includes properties that need to be changed from their default values.

BLOCK                                Property            Value
Start of Flow                        ID                  predict_iris
Model Generator prediction (online)  Model Generator     Select the Model Generator that will be used for the prediction
                                     Input variable      _
                                     Output variable     _
Output to log                        Variable to output  _

The Output variable stores the results of the prediction. Since we are using _ again, the results overwrite the input data that was previously stored in the variable.

The following is an example Unix curl command for executing the Flow:

curl -H 'Authorization: Bearer 951***a16' \
  -H 'Content-Type: application/json' \
  -d '{"_": [{"key": "1", "sepal_length": 5.9, "sepal_width": 3.0, "petal_length": 4.2, "petal_width": 1.5}, {"key": "2", "sepal_length": 6.9, "sepal_width": 3.1, "petal_length": 5.4, "petal_width": 2.1}, {"key": "3", "sepal_length": 5.1, "sepal_width": 3.3, "petal_length": 1.7, "petal_width": 0.5}]}' \
  https://***.magellanic-clouds.net/flows/predict_iris.json
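For readers who call the API from a program rather than a shell, the curl command above can be approximated with Python's standard library. This is a minimal sketch under assumptions: build_flow_request is a hypothetical helper, and the host and token in the usage lines are placeholders that you would replace with your own values. The request is only built here, not sent.

```python
import json
import urllib.request


def build_flow_request(base_url, flow_id, token, records):
    """Build (but do not send) a POST request that executes a Flow.

    The records are placed under the "_" key so that they arrive in
    the variable _ that the example Flow reads.
    """
    body = json.dumps({"_": records}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/flows/{flow_id}.json",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


records = [
    {"key": "1", "sepal_length": 5.9, "sepal_width": 3.0, "petal_length": 4.2, "petal_width": 1.5},
    {"key": "2", "sepal_length": 6.9, "sepal_width": 3.1, "petal_length": 5.4, "petal_width": 2.1},
    {"key": "3", "sepal_length": 5.1, "sepal_width": 3.3, "petal_length": 1.7, "petal_width": 0.5},
]

# Placeholder host and token (assumptions); substitute your own values.
request = build_flow_request("https://example.invalid", "predict_iris", "YOUR_API_TOKEN", records)
print(request.get_method(), request.full_url)
```

To actually execute the Flow, you would pass the request to urllib.request.urlopen(request) and read the response body.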

If the Flow executes successfully, a response like the following is returned. The "job_id" value (1 here) changes each time the Flow is executed, so your value may differ from the one shown.

{"result":true,"job_id":1}
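A calling program will usually want to check this response before looking for results in the log. The following Python sketch does so for the response shown above. The shape of a failure response is not described on this page, so treating anything other than "result": true as a failure is an assumption, and parse_execution_response is a hypothetical name.

```python
import json


def parse_execution_response(text):
    """Return the job ID if the Flow was accepted, otherwise raise."""
    response = json.loads(text)
    # Assumption: any value other than true means the execution failed.
    if response.get("result") is not True:
        raise RuntimeError(f"Flow execution was not accepted: {response}")
    return response["job_id"]


job_id = parse_execution_response('{"result":true,"job_id":1}')
print(job_id)
```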

The following is a portion of the resulting Flow execution log:

{
  "predictions": [
    {
      "score": [
        0.017606934532523155,
        0.9281915426254272,
        0.05420156940817833
      ],
      "key": "1",
      "label": 1
    },
    {
      "score": [
        0.0002792126906570047,
        0.2558387219905853,
        0.7438820004463196
      ],
      "key": "2",
      "label": 2
    },
    {
      "score": [
        0.9730234146118164,
        0.026976602151989937,
        6.675784813836572e-09
      ],
      "key": "3",
      "label": 0
    }
  ]
}

Providing input data when externally executing Flows (x-www-form-urlencoded)

This method sends input data as parameters when externally executing a Flow to make predictions.

External Flow execution refers to using a Web API to execute a Flow. The method introduced here passes the input data as parameters, with the HTTP request's Content-Type header set to application/x-www-form-urlencoded.

You can reference parameters that have been sent to a Flow by using variables with the same names as the parameters. We can prepare our data using these variables within the Construct object BLOCK.

For example, see the following Construct object BLOCK and its Data property:

[Figure: Example of using variables in the Construct object BLOCK]

If we send the parameters var_x=5.9 and var_y=3.0 when we execute this Flow from an external source, the values will change as follows:

[Figure: Example of using variables with the Construct object BLOCK]
  • ${var_x} becomes "5.9", the value of the parameter var_x.
  • ${var_y} becomes "3.0", the value of the parameter var_y.

Parameter values are all handled as strings.

By referencing variables with parameters, we can change the data created each time the Flow executes.
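As an analogy only, Python's string.Template happens to use the same ${name} placeholder syntax, so it can illustrate the expansion described above. This sketch does not reproduce how BLOCKS implements variable expansion; expand is a hypothetical helper.

```python
from string import Template


def expand(value_template, parameters):
    """Replace each ${name} with the parameter of the same name."""
    return Template(value_template).substitute(parameters)


# Parameters as they would arrive from the request: all strings.
parameters = {"var_x": "5.9", "var_y": "3.0"}

print(expand("${var_x}", parameters))
print(expand("${var_y}", parameters))
```

Sending different parameter values changes the expanded result without any change to the template itself, which mirrors how the same Flow can build different input data on each execution.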

Some types in the BLOCK's Data property (numerical value, month, day) expect numbers rather than strings. However, passing numbers that have been converted to strings, such as "5.9" or "0", will not cause any problems.

The Model Generator prediction (online) BLOCK converts strings into numerical values for items where it expects a numerical value. However, strings like "one" or "abc" that cannot be converted into a number will not work and will cause the prediction to fail.
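Because a non-numeric string makes the prediction fail, a client may want to check its parameters before executing the Flow. The Python sketch below shows one such check. It uses Python's float() as a stand-in for the conversion, which is an assumption: the BLOCK's exact conversion rules are not described here and may differ (for example, float() also accepts strings such as "nan"). The helper names are hypothetical.

```python
# Items for which the example model expects numerical values.
NUMERIC_FIELDS = ("sepal_length", "sepal_width", "petal_length", "petal_width")


def to_number(text):
    """Convert a parameter string to a float, or return None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def non_numeric_fields(parameters):
    """Return the names of numeric items whose values cannot be converted."""
    return [
        name for name in NUMERIC_FIELDS
        if to_number(parameters.get(name, "")) is None
    ]


print(to_number("5.9"))
print(to_number("abc"))
print(non_numeric_fields({
    "key": "1",
    "sepal_length": "5.9",
    "sepal_width": "3.0",
    "petal_length": "one",
    "petal_width": "1.5",
}))
```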

We’ll create the following Flow in the Flow Designer for our Machine Learning prediction:

[Figure: Example of sending input data when externally executing Flow]

The Construct object BLOCK will use parameters sent to the Flow when it was externally executed to create the input data. The Model Generator prediction (online) BLOCK then uses this input data to make predictions. The Output to log BLOCK prints a log containing the prediction results.

The following chart shows each BLOCK’s property settings. It only contains properties that must be changed from their default values:

BLOCK                                Property            Value
Start of Flow                        ID                  predict_iris
Construct object                     Results variable    _
                                     Data                (see the Data table below)
Model Generator prediction (online)  Model Generator     Select the Model Generator that will be used to make predictions.
                                     Input variable      _
                                     Output variable     _
Output to log                        Variable to output  _

Data property of the Construct object BLOCK:

Key           Type    Value
key           String  ${key}
sepal_length  String  ${sepal_length}
sepal_width   String  ${sepal_width}
petal_length  String  ${petal_length}
petal_width   String  ${petal_width}

This example uses variable expansion to set each value from the parameters sent to the Flow. To keep things simple, the parameters have the same names as the input data's items (the Key column).

For convenience, the Output variable is the same variable, _, that holds the input data. As a result, the prediction results overwrite the input data.

The following Unix-style curl command executes a Flow and passes the input data as parameters (the -d "xxx=#" portions):

curl -H "Authorization: Bearer 951***a16" \
  -d "key=1"            \
  -d "sepal_length=5.9" \
  -d "sepal_width=3.0"  \
  -d "petal_length=4.2" \
  -d "petal_width=1.5"  \
  https://***.magellanic-clouds.net/flows/predict_iris.json
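When curl receives several -d options, it joins them with & and sends them as a form-encoded body. The Python sketch below produces the same body with the standard library, which may help if you send the request from a program. Only the parameter names and values come from this page; the variable names are illustrative.

```python
from urllib.parse import parse_qsl, urlencode

# The same parameters as the -d options in the curl command above.
params = {
    "key": "1",
    "sepal_length": "5.9",
    "sepal_width": "3.0",
    "petal_length": "4.2",
    "petal_width": "1.5",
}

# Encode the parameters as an application/x-www-form-urlencoded body.
body = urlencode(params)
print(body)
# prints: key=1&sepal_length=5.9&sepal_width=3.0&petal_length=4.2&petal_width=1.5

# Decoding the body gives back the original name/value pairs, all as strings.
decoded = dict(parse_qsl(body))
```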

A response similar to the one below will come after the Flow successfully executes. The 1 for the "job_id" portion changes each time the Flow executes, so it's not a problem if you try out the example on this page and receive a different number.

{"result":true,"job_id":1}

The following is a portion of the resulting log:

{
  "predictions": [
    {
      "score": [
        0.017455093562602997,
        0.7145982980728149,
        0.26794660091400146
      ],
      "key": "1",
      "label": 1
    }
  ]
}