BLOCKS Reference

BigQuery

Split table by column values

Overview

This BLOCK uses splits a BigQuery table into multiple tables based on the values in a column that you specify.

It starts by getting the types of values in the Split-basis column (col) by using the query SELECT Split-basis column FROM Source table GROUP BY col LIMIT 1000;. Next, it uses the query SELECT * FROM Source table WHERE col=val; to export each row of the source table into a new table whose names equals the value (val) in the Split-basis column.

  • The value you enter into the Split-basis column property will be converted into a string that can be used as a table ID. Any characters that cannot be used in table IDs will be removed, and the resulting string will be used as the suffix for the Tables produced from the split. Serial numbers will be added to any duplicate results.
  • An error will occur if there are over 999 different values in the Split-basis column.
  • Pre-existing tables are overwritten.

Properties

Property Explanation
BLOCK name Configure the name displayed on this BLOCK.
GCP service account

Select the GCP service account to use with this BLOCK.

Source dataset

Designate the ID of the dataset containing the table that will be split.

Supports % format characters and variable expansion.

Source table

Designate the ID of the table that will be split based on a column's values.

Supports % format characters and variable expansion.

Split-basis column

Designate the name (STRING type) of the column to be used as the basis for splitting the table.

Dataset to store resulting tables

Designate the ID of the dataset that will store the resulting tables.

Supports % format characters and variable expansion.

Tables produced from the split

Designate a prefix for the IDs of the tables that will be produced from splitting the Source table. The complete table ID will be determined using this prefix and the value of the Split-basis column property (unusable characters will be excluded).

Supports % format characters and variable expansion.

Query priority level

Select the query priority level from the following:

  • Interactive: Executes immediately, limited simultaneous queries.
  • Batch: Executes when resources are available, unlimited simultaneous queries.
BLOCK memos Make notes about this BLOCK.
Reattempts in case of errors Configure the number of attempts to try in case of a request error.
Minimum timeout interval Set the number of seconds to wait for results. If results are not returned during this interval, the time will be doubled for each reattempt until the time set in the Maximum timeout interval property is reached.
Maximum timeout interval Indicate the maximum number of seconds to wait for results. The timeout interval will start with the value set in the Minimum timeout interval property and double with each reattempt until reaching the value set here.

この情報は役に立ちましたか?