BLOCKS Reference

Block Reference

BigQuery

Split table by column values

This BLOCK uses the values in a specified column to split one BigQuery table into multiple tables.

In concrete terms, it starts by getting the types of values in the Split-basis column (col) by using SELECT Split-basis column FROM Source table GROUP BY col LIMIT 1000;. Next, it uses SELECT * FROM Source table WHERE col=val; to export each row of the source table into a new table whose names equals the value (val) in the Split-basis column.

  • The value entered into the Split-basis column property will be converted into a string that can be used as a table ID. Any characters that cannot be used in table IDs will be removed, and the resulting string will be used as the suffix for the Tables produced from the split. Serial numbers are also added to any duplicate results.
  • An error will occur if there are over 999 different values in the Split-basis column.
  • Pre-existing tables are overwritten.
Property Explanation
BLOCK name Configure the name displayed on this BLOCK.
GCP service account Select the GCP service account to use with this BLOCK.
Source dataset

Designate the ID of the dataset containing the table that will be split.

[% format character addressable] [variable expansion addressable]

Source table

Designate the ID of the table that will be split based on a column's values.

[% format character addressable] [variable expansion addressable]

Split-basis column Designate the name (STRING type) of the column to be used as the basis for splitting the table.
Dataset to store resulting tables

Designate the ID of the dataset that will store the resulting tables.

[% format character addressable] [variable expansion addressable]

Tables produced from the split

Designate a prefix for the IDs of the tables that will be produced from splitting the "Source table". The complete table ID will be determined using this prefix and the value of the "Split-basis column" property (unusable characters will be excluded).

[% format character addressable] [variable expansion addressable]

BLOCK memos Make notes about this BLOCK.
Query priority level

Select the query priority level.

  • Interactive (executes immediately, limited simultaneous queries)
  • Batch (executes when resources are available, unlimited simultaneous queries)
Reattempts in case of errors Configure the number of attempts to try in case of a BigQuery error or time out.
Minimum timeout interval Set the number of seconds to wait for results. If results are not returned during this interval, the time will be doubled for each reattempt until the time set in the “Maximum timeout interval” property is reached.
Maximum timeout interval Indicate the maximum number of seconds to wait for results. The timeout interval will start with the value set in the "Minimum timeout interval" property and double with each reattempt until reaching the value set here.