BLOCKS Reference

Output specifications

Speech recognition

This page describes the format of the results returned by the Speech recognition BLOCK.

Example results

The example below shows the results of using the Speech recognition BLOCK to analyze an audio file. The file contains a reading of the following sentence from the MAGELLAN BLOCKS website: "BLOCKS is a service that gives anyone the power to use amazing services like those Google provides."

{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "blocks is a service that gives anyone the power to use amazing services like those Google provides",
          "confidence": 0.92188913
        },
        {
          "transcript": "blocks as a service that gives anyone the power to use amazing services like those Google provides"
        },
        {
          "transcript": "lots of a service that gives anyone the power to use amazing services like those Google provides"
        }
      ]
    }
  ],
  "gcs_url": "gs://magellan-iot-sample/sample.flac",
  "timestamp": 1479717281.0
}

Results from the Speech recognition BLOCK are returned in JSON format.
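As an illustration of how this output might be consumed, the following Python sketch parses the example above and picks one transcript per result. The helper name best_transcripts is hypothetical, and the sketch assumes that the first entry in "alternatives" is the preferred one, as it is in the example. Because "confidence" is not present on every alternative, it is read with a fallback of None.

```python
import json

# Copy of the example output shown above.
raw = """
{
  "results": [
    {
      "alternatives": [
        {
          "transcript": "blocks is a service that gives anyone the power to use amazing services like those Google provides",
          "confidence": 0.92188913
        },
        {
          "transcript": "blocks as a service that gives anyone the power to use amazing services like those Google provides"
        },
        {
          "transcript": "lots of a service that gives anyone the power to use amazing services like those Google provides"
        }
      ]
    }
  ],
  "gcs_url": "gs://magellan-iot-sample/sample.flac",
  "timestamp": 1479717281.0
}
"""


def best_transcripts(output):
    """Hypothetical helper: return one (transcript, confidence) pair per result.

    Assumes the first alternative is the preferred one. The confidence is
    None when the "confidence" key is absent.
    """
    best = []
    for result in output.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue  # nothing was transcribed for this result
        top = alternatives[0]
        best.append((top["transcript"], top.get("confidence")))
    return best


output = json.loads(raw)
for transcript, confidence in best_transcripts(output):
    print(confidence, transcript)
```

Reading "confidence" with dict.get avoids a KeyError for alternatives such as the second and third ones in the example, which carry only a "transcript".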

Specifications

The JSON data returned by the Speech recognition BLOCK has the following structure:

{
  "results": [
    {
      "alternatives": [
        {
          "transcript": <string>,
          "confidence": <number>
        }
      ]
    }
  ],
  "gcs_url": <string>,
  "timestamp": <number>
}

Name            Description
"results"       The results of transcribing the audio data.
"alternatives"  A list of alternative transcription results. The number of alternatives returned can be set with the Speech recognition BLOCK's "Max alternatives" property, which accepts values from 0 to 30.
"transcript"    The text (string) transcribed from the audio data.
"confidence"    The level of confidence, from 0.0 to 1.0, that the audio was transcribed accurately. A higher value indicates greater confidence. In general, this is returned only for high-confidence transcriptions.
"gcs_url"       A string containing the GCS URL of the stored audio data.
"timestamp"     A number giving the date and time at which the audio data was transcribed, in UNIX time format. For example, 18:34:44 on October 24, 2016 (JST, UTC+9) is represented as 1477301684.0.
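The "timestamp" value can be converted back to a calendar date and time with a standard library call. The Python sketch below is one way to do it; the helper name timestamp_to_datetime is hypothetical. The value 1477301684.0 corresponds to 09:34:44 UTC, which matches the 18:34:44 in the example when read at UTC+9 (Japan Standard Time). That time zone is inferred from the arithmetic and is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+9 offset (Japan Standard Time). Assumed here because it makes
# the documented example time line up with the documented UNIX time.
JST = timezone(timedelta(hours=9))


def timestamp_to_datetime(timestamp, tz=timezone.utc):
    """Hypothetical helper: convert a UNIX "timestamp" value to an aware datetime."""
    return datetime.fromtimestamp(timestamp, tz=tz)


print(timestamp_to_datetime(1477301684.0).isoformat())       # 2016-10-24T09:34:44+00:00
print(timestamp_to_datetime(1477301684.0, JST).isoformat())  # 2016-10-24T18:34:44+09:00
```

Passing an explicit tz keeps the result independent of the local time zone of the machine running the code.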

These specifications are based on the October 2016 version of "Method: speech.syncrecognize" in the Google Cloud Platform documentation.
