Speech recognition using selected model (beta)
This BLOCK uses pre-made speech recognition models from the Google Cloud Speech-to-Text service to transcribe audio such as videos and phone conversations.
We recommend reading Google’s Best Practices guide for using the Cloud Speech-to-Text service effectively before using this BLOCK.
Self-Service Plan users:
You must enable the Google Cloud Speech-to-Text API to use this BLOCK. Refer to Basic Guide > Hints > Enabling Google APIs for details.
This BLOCK is currently in beta and may become unusable after the official version is released. Please switch to the official version at that time.
Because this is a beta version, some features may not work as intended. We appreciate your feedback regarding bugs and ways to improve MAGELLAN BLOCKS.
|BLOCK name||Configure the name displayed on this BLOCK.|
|GCP service account||Select the GCP service account to use with this BLOCK.|
|Audio file GCS URL||Designate the GCS URL of the audio file that will be analyzed.|
Select which of the pre-made models will be used to analyze the audio file.
Designate the variable that will store the text transcription of the audio file.
For details about this BLOCK’s output, refer to Output specifications > Speech recognition.
Designate the encoding of the audio file to be transcribed. The following encodings can be selected:
FLAC and LINEAR16 are recommended as the best encoding types for voice recognition. For further details on each encoding and how to convert types, refer to Audio encoding for Google Cloud Speech-to-Text API.
Designate the sampling rate of the audio file to be transcribed. Sampling rates can range from 8,000 to 48,000 hertz (Hz).
For best results, Google recommends using audio with a 16,000 Hz sampling rate.
Select the language of the audio to be transcribed.
See Language Support for a full list of supported languages.
|BLOCK memos||Make notes about this BLOCK.|
When the audio data is converted into text, multiple recognition alternatives can be returned. This property sets the maximum number of these alternative results within a range of 0 to 30.
Setting this to 0 or 1 will return a maximum of 1 result.
|Profanity filter||Activating this property turns on the profanity filter, which masks swear words in the resulting text. The Cloud Speech-to-Text API replaces all but the first character of each filtered word with asterisks.|
|Contextual word/phrase hints||Provide any words or phrases that might improve the Speech-to-Text API’s recognition accuracy, such as product names or technical terms likely to appear in the audio.|
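
As a rough illustration of how the properties above relate to one another, the following Python sketch assembles a request body using field names from the Cloud Speech-to-Text v1 REST API (`config`, `audio.uri`, `sampleRateHertz`, `maxAlternatives`, `profanityFilter`, `speechContexts`). The function name `build_recognition_request`, the bucket path, and the idea that the BLOCK maps its properties onto the API in exactly this way are assumptions made for illustration. The sketch only builds and validates a dictionary and does not call the service.

```python
from typing import Any, Dict, List, Optional


def build_recognition_request(
    gcs_url: str,
    model: str,
    encoding: str,
    sample_rate_hertz: int,
    language_code: str,
    max_alternatives: int = 1,
    profanity_filter: bool = False,
    phrase_hints: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: turn BLOCK-like settings into a request body."""
    # The audio file must be referenced by a GCS URL.
    if not gcs_url.startswith("gs://"):
        raise ValueError("Audio file GCS URL must start with gs://")
    # Sampling rates outside 8,000 to 48,000 Hz are not accepted.
    if not 8000 <= sample_rate_hertz <= 48000:
        raise ValueError("Sampling rate must be between 8,000 and 48,000 Hz")
    # The maximum number of alternatives is limited to the range 0 to 30.
    if not 0 <= max_alternatives <= 30:
        raise ValueError("Max alternatives must be between 0 and 30")

    config: Dict[str, Any] = {
        "model": model,
        "encoding": encoding,
        "sampleRateHertz": sample_rate_hertz,
        "languageCode": language_code,
        # 0 and 1 both return at most one result, so normalize to 1.
        "maxAlternatives": max(1, max_alternatives),
        "profanityFilter": profanity_filter,
    }
    # Phrase hints are optional; include them only when provided.
    if phrase_hints:
        config["speechContexts"] = [{"phrases": list(phrase_hints)}]

    return {"config": config, "audio": {"uri": gcs_url}}


# Example usage with a placeholder bucket and file name.
request = build_recognition_request(
    gcs_url="gs://example-bucket/call.flac",
    model="phone_call",
    encoding="FLAC",
    sample_rate_hertz=16000,
    language_code="en-US",
    max_alternatives=0,
    phrase_hints=["MAGELLAN BLOCKS"],
)
print(request["config"]["maxAlternatives"])  # prints 1
```

Rejecting out-of-range values up front mirrors the limits stated in the property descriptions, so a misconfigured sampling rate or alternatives count would surface before any audio is submitted.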