BLOCKS Reference

ML Board Batch Predict

Trainings made prior to 2017/4/12 will become unusable after 2017/6/1. Please use trainings made after 2017/4/12 for making predictions.

This BLOCK makes predictions using an ML Board's training with input variable data. It is designed for predictions with large amounts of input data.

Batch predictions are done by reading input variable data from text files stored in Google Cloud Storage (GCS). Prediction results are output as text files to a GCS folder. As a general rule, the results are split into several files.

(Figure: Batch prediction overview diagram)

Batch predictions are slower than those performed by the ML Board Predict BLOCKS. However, the ML Board Batch Predict BLOCK is better suited to making predictions on large amounts of data.

This BLOCK is currently in beta. Be aware that the beta version of this BLOCK will become unavailable after the official release.
*Please switch to the official BLOCK once it is released.

Because this is a beta release, some functions may not work properly. We appreciate feedback from users, through the BLOCKS Forum or direct contact, regarding bugs or ways to improve BLOCKS.

You must apply a training on an ML Board to make predictions with this BLOCK.


Prepare input variable data as JSON format text files such as the one shown below:

{"key": "1", "sepal_length": 5.9, "sepal_width": 3.0, "petal_length": 4.2, "petal_width": 1.5}
{"key": "2", "sepal_length": 6.9, "sepal_width": 3.1, "petal_length": 5.4, "petal_width": 2.1}
{"key": "3", "sepal_length": 5.1, "sepal_width": 3.3, "petal_length": 1.7, "petal_width": 0.5}
  • Each line is one JSON object (format: {...}).
  • Objects are separated by new lines.
  • Gather input variable data for an item into a single JSON object.
  • JSON objects are organized as multiple “Name” and “Value” pairs.
    • In each pair, the name and value are separated by a :
    • The left of the : is the name, while the right side of the : is the value (name:value).
  • Values can be set as the following three types:
    • Numerical values: 1, 23.45, etc. (Numerical value, month, and day data)
    • Strings: "abc", "xyz", etc. Strings are enclosed by " marks. (String data)
    • Arrays: Sets of data enclosed within [ and ], such as [1, 2, 3] or [4, 5.6, 7.0]. (Numerical value data that has been assigned multiple dimensions)
  • Each JSON object (input variable data) should include the name "key". Its value should be set as a string that uniquely identifies each object.
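
The formatting rules above can be sketched in Python. This is only an illustration, not part of BLOCKS: the helper name to_json_lines is hypothetical, and it assumes the records are already available as Python dictionaries. It writes one JSON object per line and rejects records whose "key" is missing, is not a string, or is repeated.

```python
import json


def to_json_lines(records):
    """Serialize dicts as one JSON object per line (hypothetical helper)."""
    seen_keys = set()
    lines = []
    for record in records:
        key = record.get("key")
        # Each object needs a "key" whose value is a string.
        if not isinstance(key, str):
            raise ValueError(f'"key" must be a string: {record!r}')
        # The "key" value must uniquely identify the object.
        if key in seen_keys:
            raise ValueError(f'duplicate "key": {key!r}')
        seen_keys.add(key)
        lines.append(json.dumps(record))
    # Objects are separated by new lines.
    return "".join(line + "\n" for line in lines)


rows = [
    {"key": "1", "sepal_length": 5.9, "sepal_width": 3.0,
     "petal_length": 4.2, "petal_width": 1.5},
    {"key": "2", "sepal_length": 6.9, "sepal_width": 3.1,
     "petal_length": 5.4, "petal_width": 2.1},
]
text = to_json_lines(rows)
```

The resulting text could then be saved to a file and uploaded to GCS by whatever means you normally use.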

Prediction results are output as JSON format text files to a GCS folder. These are named as shown below (the XXXXX and YYYYY portions change based on the number of files output):

prediction.results-XXXXX-of-YYYYY
  • XXXXX: A number starting with 0 that represents the file’s index number (example: 00000, 00001, etc.)
  • YYYYY: The total number of files (00001, 00003, etc.)
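
As a rough sketch of this naming scheme, the following Python builds and parses such file names. It assumes both numbers are always zero-padded to five digits, as in the examples above; the function names are hypothetical.

```python
import re

# Assumes five-digit, zero-padded index and total, as in the examples.
RESULT_NAME = re.compile(r"prediction\.results-(\d{5})-of-(\d{5})")


def parse_result_name(name):
    """Return (index, total) for a result file name, or None if it does not match."""
    match = RESULT_NAME.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def expected_result_names(total):
    """List the file names expected when the results are split into `total` files."""
    return [f"prediction.results-{i:05d}-of-{total:05d}" for i in range(total)]
```

For example, expected_result_names(3) lists three names, from prediction.results-00000-of-00003 to prediction.results-00002-of-00003.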

The following example shows prediction results for a classification-type model:

{"score": [9.230815578575857e-08, 0.007054927293211222, 0.9929450154304504], "key": "2", "label": 2}
  • Results are output with one JSON object (format: {...}) on each line.
  • Objects are separated by new lines.
  • Each item’s prediction results are contained within separate JSON objects.
    • "score": The prediction’s level of certainty for each class. In this example, the certainty for class 0 is 0.000009231%, class 1 is 0.705492729%, and class 2 is 99.294501543%.
    • "key": The "key" value given during the prediction.
    • "label": The predicted class.

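A classification result line like the one above can be read with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and the "best_index" field is computed here for illustration only. In this example it happens to match "label", which this document does not state as a general guarantee.

```python
import json


def parse_classification_line(line):
    """Parse one classification result line into a small summary dict."""
    result = json.loads(line)
    scores = result["score"]
    return {
        "key": result["key"],
        "label": result["label"],
        # Convert each certainty to a percentage.
        "percentages": [100.0 * score for score in scores],
        # Index of the highest score (computed locally, for comparison with "label").
        "best_index": max(range(len(scores)), key=scores.__getitem__),
    }


line = ('{"score": [9.230815578575857e-08, 0.007054927293211222, '
        '0.9929450154304504], "key": "2", "label": 2}')
parsed = parse_classification_line(line)
```
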
The following example shows prediction results for a regression-type model:

{"output": 10304.1962890625, "key": "20170103"}
  • Results are output with one JSON object (format: {...}) on each line.
  • Objects are separated by new lines.
  • Each item’s prediction results are contained within separate JSON objects.
    • "output": The predicted value.
    • "key": The "key" value given during the prediction.

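Regression results can be collected in a similar way. The sketch below, with a hypothetical function name, assumes the lines of one or more result files have already been read into memory and maps each "key" to its predicted value.

```python
import json


def load_regression_results(lines):
    """Map each "key" to its "output" value, skipping blank lines."""
    outputs = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        result = json.loads(line)
        outputs[result["key"]] = result["output"]
    return outputs


sample_lines = ['{"output": 10304.1962890625, "key": "20170103"}', ""]
predictions = load_regression_results(sample_lines)
```

Because results are generally split into several files, the same function can be called on the concatenated lines of all of them.
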
Property Explanation
BLOCK name: Configure the name displayed on this BLOCK.
GCP service account: Select the GCP service account to use with this BLOCK.
ML Board: Select the ML Board to use for this prediction.
Input GCS URL

Designate the GCS URL for the text file containing the input variable data.

For example, if the bucket blocks-sample contains a text file with input variable data named sample.json, this would be set as gs://blocks-sample/sample.json

Output GCS URL

Designate a GCS URL for the folder that will contain the results of the prediction.

For example, for the results to be stored into a folder named results within a bucket named blocks-sample, this would be set as gs://blocks-sample/results/

A new folder will be created automatically if the assigned folder does not already exist.
When using an existing folder, new files will overwrite older ones if they have the same name.
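
To check GCS URLs like the ones in the examples above before entering them, a small Python sketch such as the following can split a URL into its bucket and path. The function name is hypothetical, and the check only covers the gs://bucket/path shape; it does not verify that the bucket or file exists.

```python
from urllib.parse import urlsplit


def split_gcs_url(url):
    """Split a gs:// URL into (bucket, path); raise ValueError if malformed."""
    parts = urlsplit(url)
    if parts.scheme != "gs" or not parts.netloc:
        raise ValueError(f"not a GCS URL: {url!r}")
    # Drop the leading slash so the path is relative to the bucket.
    return parts.netloc, parts.path.lstrip("/")


input_bucket, input_path = split_gcs_url("gs://blocks-sample/sample.json")
output_bucket, output_path = split_gcs_url("gs://blocks-sample/results/")
```

Here the input URL splits into the bucket blocks-sample and the file sample.json, while the output URL splits into the same bucket and the folder results/.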