BLOCKS Reference

BigQuery

Load to single table from GCS

Overview

This BLOCK loads data from a file stored in Google Cloud Storage (GCS) into a BigQuery table.

Properties

Property Explanation
BLOCK name

Configure the name displayed on this BLOCK.

GCP service account

Select the GCP service account to use with this BLOCK.

Source file GCS URL

Designate the GCS URL of the file containing the data that will be sent to a BigQuery table. This URL should be formatted as gs://bucketname/objectname.

You can designate multiple files by using an asterisk in the GCS URL (gs://bucketname/objectname*.csv). The asterisk stands for any string of one or more characters, so every file whose name matches the pattern is read.
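As an illustration of the wildcard rule, the following Python sketch splits a GCS URL into its bucket and object name and tests candidate object names against the pattern. It is not part of BLOCKS: `split_gcs_url` and `matches` are hypothetical helpers, the candidate file names are made up, and the asterisk is treated as "one or more characters", taking the description above literally.

```python
import re


def split_gcs_url(url):
    """Split gs://bucketname/objectname into (bucket, object name)."""
    prefix = "gs://"
    if not url.startswith(prefix):
        raise ValueError("GCS URLs must start with gs://")
    bucket, _, object_name = url[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError("expected gs://bucketname/objectname")
    return bucket, object_name


def matches(pattern, object_name):
    """Return True if object_name fits a pattern containing asterisks.

    Assumption: each asterisk stands for one or more characters.
    """
    regex = ".+".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, object_name) is not None


bucket, pattern = split_gcs_url("gs://bucketname/objectname*.csv")
candidates = ["objectname_001.csv", "objectname_002.csv", "other.csv"]
selected = [name for name in candidates if matches(pattern, name)]
print(bucket, selected)
```

Only the two `objectname_...` files are selected here; `other.csv` does not share the prefix and is skipped.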

Destination dataset

Designate the ID of the BigQuery dataset containing the table that data will be sent to.

Supports % format characters and variable expansion.

Destination table

Designate the ID of the BigQuery table that data will be sent to.

Supports % format characters and variable expansion.
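The set of supported % format characters is not listed in this section. As a rough illustration only, the sketch below assumes they behave like strftime-style date codes such as %Y%m%d. That behavior is an assumption, and both the helper `expand_table_id` and the table name `sales_%Y%m%d` are hypothetical.

```python
from datetime import datetime


def expand_table_id(template, when):
    """Expand strftime-style % codes in a table ID template.

    Assumption: BLOCKS % format characters resemble strftime codes.
    """
    return when.strftime(template)


table_id = expand_table_id("sales_%Y%m%d", datetime(2024, 1, 31))
print(table_id)
```

Under that assumption, a single BLOCK could write to a differently named table on each run without its settings being edited.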

Schema settings

Designate the schema for the destination table. This property can be skipped when the source file contains JSON-format data.

You can set your schema directly in JSON by clicking Edit as JSON.
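For reference, BigQuery describes a schema in JSON as a list of field definitions, each with a name, a type, and optionally a mode. The sketch below builds such a list in Python and prints it as JSON. The field names are hypothetical, and whether the Edit as JSON editor expects exactly this layout is an assumption.

```python
import json

# Hypothetical fields; replace with the columns of your own source file.
schema = [
    {"name": "user_id", "type": "INTEGER", "mode": "REQUIRED"},
    {"name": "user_name", "type": "STRING", "mode": "NULLABLE"},
    {"name": "signed_up_at", "type": "TIMESTAMP", "mode": "NULLABLE"},
]

schema_json = json.dumps(schema, indent=2)
print(schema_json)
```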

In cases of non-empty tables

Select the action to perform when the destination table already contains data.

  • Append: Appends new data to the table.
  • Overwrite: Overwrites the table with the new data.
  • Error: An error occurs if the table is not empty.
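These three choices resemble the write dispositions of a BigQuery load job (WRITE_APPEND, WRITE_TRUNCATE, WRITE_EMPTY). The mapping below is an assumption about how the BLOCK translates its setting, and `apply_load` is a hypothetical in-memory simulation of the three behaviors, not a BigQuery call.

```python
# Assumed mapping from the BLOCK setting to BigQuery write dispositions.
WRITE_DISPOSITIONS = {
    "Append": "WRITE_APPEND",
    "Overwrite": "WRITE_TRUNCATE",
    "Error": "WRITE_EMPTY",
}


def apply_load(existing_rows, new_rows, action):
    """Simulate loading new_rows into a table that holds existing_rows."""
    disposition = WRITE_DISPOSITIONS[action]
    if disposition == "WRITE_APPEND":
        return list(existing_rows) + list(new_rows)
    if disposition == "WRITE_TRUNCATE":
        return list(new_rows)
    # WRITE_EMPTY: only succeeds when the table has no data yet.
    if existing_rows:
        raise RuntimeError("destination table is not empty")
    return list(new_rows)


print(apply_load(["a", "b"], ["c"], "Append"))
print(apply_load(["a", "b"], ["c"], "Overwrite"))
```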
BLOCK memos

Make notes about this BLOCK.

Reattempts in case of errors

Configure the number of times to retry when a BigQuery error or timeout occurs.

Minimum timeout interval

Set the number of seconds to wait for results. If results are not returned within this interval, the wait time is doubled for each reattempt until it reaches the value set in the Maximum timeout interval property.

Maximum timeout interval

Set the maximum number of seconds to wait for results. The timeout interval starts at the value set in the Minimum timeout interval property and doubles with each reattempt until it reaches the value set here.

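The interplay of the two timeout properties and the reattempt count can be sketched as a small calculation. This is an illustration under stated assumptions, not BLOCKS code: `timeout_schedule` is a hypothetical helper, the first attempt is assumed to use the minimum interval, the reattempt count is assumed to exclude that first attempt, and a doubled value that would exceed the maximum is assumed to be capped at the maximum.

```python
def timeout_schedule(minimum, maximum, reattempts):
    """Return the wait time in seconds for the first attempt and each reattempt."""
    waits = []
    current = minimum
    for _ in range(reattempts + 1):
        # Never wait longer than the maximum timeout interval.
        waits.append(min(current, maximum))
        current *= 2
    return waits


print(timeout_schedule(minimum=10, maximum=60, reattempts=4))
# -> [10, 20, 40, 60, 60]
```

With a minimum of 10 seconds and a maximum of 60, the wait grows 10, 20, 40 and then stays at 60 for any further reattempts.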
File format

Choose the format of the source file stored in GCS. The permissible formats are as follows.

  • CSV
  • NEWLINE_DELIMITED_JSON
  • DATASTORE_BACKUP
CSV delimiter character

Select the delimiter character used for CSV files.

  • Comma
  • Tab
  • Pipe
  • Other

If you choose Other, specify a delimiter character in the accompanying field.

Number of skipped rows

Configure the number of leading rows to skip for CSV files.

Permit rows with insufficient fields

Select whether or not to permit rows with insufficient fields for CSV files.

Designate quotation marks

Designate the character used as the quotation mark in CSV files.

Allow line breaks within quoted fields

Select whether or not to allow quoted fields to contain line breaks.

Max number of bad rows

Configure how many bad rows to allow before the load results in an error.

Ignore extra fields

Select whether or not to ignore extra fields.

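The file format, CSV, and error-tolerance properties line up closely with options of a BigQuery load job. The sketch below assembles a load configuration dictionary using BigQuery REST API field names. Which BLOCK property feeds which field is an assumption, `build_load_config` is a hypothetical helper, and the project, dataset, and table names are placeholders.

```python
import json

# Delimiter choices offered by the BLOCK, mapped to literal characters.
DELIMITERS = {"Comma": ",", "Tab": "\t", "Pipe": "|"}


def build_load_config(project, dataset, table, source_url,
                      file_format="CSV", delimiter="Comma",
                      other_delimiter=None, skipped_rows=0,
                      permit_insufficient_fields=False, quote='"',
                      allow_quoted_line_breaks=False, max_bad_rows=0,
                      ignore_extra_fields=False):
    """Build a BigQuery-style load configuration (assumed property mapping)."""
    if delimiter == "Other":
        if not other_delimiter:
            raise ValueError("Other requires a delimiter character")
        field_delimiter = other_delimiter
    else:
        field_delimiter = DELIMITERS[delimiter]
    return {
        "load": {
            "sourceUris": [source_url],
            "destinationTable": {
                "projectId": project,
                "datasetId": dataset,
                "tableId": table,
            },
            "sourceFormat": file_format,
            "fieldDelimiter": field_delimiter,
            "skipLeadingRows": skipped_rows,
            "allowJaggedRows": permit_insufficient_fields,
            "quote": quote,
            "allowQuotedNewlines": allow_quoted_line_breaks,
            "maxBadRecords": max_bad_rows,
            "ignoreUnknownValues": ignore_extra_fields,
        }
    }


config = build_load_config(
    "my-project", "my_dataset", "my_table",
    "gs://bucketname/objectname*.csv",
    delimiter="Tab", skipped_rows=1, max_bad_rows=10,
)
print(json.dumps(config, indent=2))
```

In this example a tab-delimited file with one header row is loaded, and up to 10 bad rows are tolerated before the job would fail.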
Trigger file URL

Designate a URL that BLOCKS will use to check if a file has been saved before starting the data transfer. If left blank, the data transfer will start without checking if a file has been saved.

Number of checks

Configure the maximum number of times to check if a file has been saved to the trigger file URL.

Time between checks

Configure how many seconds to wait between checks on whether a file has been saved to the trigger file URL.
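The three trigger file properties describe a bounded polling loop. A minimal sketch follows, assuming the check is a simple "does the file exist" test. `wait_for_trigger_file` and its `file_exists` callback are hypothetical, and what BLOCKS does when the checks run out is not stated here, so the function only reports whether the file was found.

```python
import time


def wait_for_trigger_file(file_exists, number_of_checks,
                          seconds_between_checks, sleep=time.sleep):
    """Poll file_exists() up to number_of_checks times.

    Returns True as soon as the file is found, False if every check fails.
    """
    for attempt in range(1, number_of_checks + 1):
        if file_exists():
            return True
        # Wait only when another check is still to come.
        if attempt < number_of_checks:
            sleep(seconds_between_checks)
    return False


# Simulated checker: the file appears on the third check.
results = iter([False, False, True])
found = wait_for_trigger_file(lambda: next(results),
                              number_of_checks=5,
                              seconds_between_checks=0)
print(found)
```

Passing the `sleep` function as a parameter keeps the loop easy to test without real waiting.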
