Tutorial: Google Cloud Speech API with Service Account

This tutorial explains how to use the Google Cloud Speech API with Google Apps Script. We'll use a Service Account to authenticate the application to the Cloud Speech API, with the source audio file stored in a Google Cloud Storage bucket.

The application uses the asynchronous speech recognition mode since the input audio is longer than a minute.

Step 1: Enable Cloud Speech API

Create a new Google Apps Script project and go to Resources > Cloud Platform Project to open the associated project in the Google Developers Console. Open the API Library, then search for and enable the Cloud Speech API.

[Screenshot: service-account-key.png]

Step 2: Create Google Service Account

Go to the Credentials tab, click Create credentials, and choose Service Account from the drop-down. Set the service account's role to Project Owner (convenient for a demo; a narrower role is safer in production) and save the JSON private key file to your Google Drive.
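The downloaded key file is plain JSON. Of its many fields, the script in Step 3 reads only client_email (used as the OAuth issuer) and private_key (used to sign the token request). A trimmed sketch with placeholder values, not real credentials:

```javascript
// Trimmed service account key JSON (all values here are placeholders).
var sampleKey = JSON.stringify({
  type: "service_account",
  project_id: "my-speech-project",
  private_key: "-----BEGIN PRIVATE KEY-----\nMIIE...snipped...\n-----END PRIVATE KEY-----\n",
  client_email: "speech-demo@my-speech-project.iam.gserviceaccount.com"
});

// The script reads exactly these two fields after JSON.parse.
var keys = JSON.parse(sampleKey);
console.log(keys.client_email);
```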

Step 3: Run the Code

Paste this code into your Google Apps Script editor. Remember to change the location of the audio file in Google Cloud Storage and the location of the service account key file in Google Drive. The code also requires the OAuth2 library for Apps Script to be added to the project.
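The script locates the key file by extracting the file ID from its Drive sharing link with a regular expression. As a standalone illustration of that step (the link below is made up, not a real file):

```javascript
// Sketch: extract a Drive file ID from a sharing link (made-up example link).
// Drive file IDs are 25+ characters drawn from [A-Za-z0-9_-], so the regex
// below matches the ID and skips the shorter URL segments before it.
var fileLink = "https://drive.google.com/open?id=1aBcDeFgHiJkLmNoPqRsTuVwXyZ012345";
var match = fileLink.match(/[\w-]{25,}/);
var fileId = match ? match[0] : null;
console.log(fileId);
```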

/* 

Written by Amit Agarwal
email: amit@labnol.org
web: https://digitalinspiration.com
twitter: @labnol

*/

// Get the service account private keys from Google Drive
function getServiceAccountKeys() {
    var fileLink = "https://drive.google.com/open?id=ctrlq....";
    var fileId = fileLink.match(/[\w-]{25,}/)[0];
    var content = DriveApp.getFileById(fileId).getAs("application/json").getDataAsString();
    return JSON.parse(content);
}

// Create the Google service
function getGoogleCloudService() {
    var privateKeys = getServiceAccountKeys();
    return OAuth2.createService('GoogleCloud:' + Session.getActiveUser().getEmail())
        // Set the endpoint URL.
        .setTokenUrl('https://accounts.google.com/o/oauth2/token')
        // Set the private key and issuer.
        .setPrivateKey(privateKeys['private_key'])
        .setIssuer(privateKeys['client_email'])
        // Set the property store where authorized tokens should be persisted.
        .setPropertyStore(PropertiesService.getScriptProperties())
        // Set the scope. 
        .setScope('https://www.googleapis.com/auth/cloud-platform');
}

// Initialize an async speech recognition job
function createRecognitionJob() {
    var service = getGoogleCloudService();
    if (service.hasAccess()) {
        var accessToken = service.getAccessToken();
        var url = "https://speech.googleapis.com/v1/speech:longrunningrecognize";
        var payload = {
            config: {
                languageCode: "en-US"
            },
            audio: {
                uri: "gs://gcs-test-data/vr.flac"
            }
        };
        var response = UrlFetchApp.fetch(url, {
            method: 'POST',
            headers: {
                Authorization: 'Bearer ' + accessToken
            },
            contentType: "application/json",
            payload: JSON.stringify(payload)
        });
        var result = JSON.parse(response.getContentText());
        // Wait for the recognition job to finish; longer audio may need
        // more time (re-run getTranscript with the job name if so).
        Utilities.sleep(30 * 1000);
        getTranscript(result.name, accessToken);
    }
}

// Print the speech transcript to the console
function getTranscript(name, accessToken) {
    var url = "https://speech.googleapis.com/v1/operations/" + name;
    var response = UrlFetchApp.fetch(url, {
        method: 'GET',
        headers: {
            Authorization: 'Bearer ' + accessToken
        }
    });
    var result = JSON.parse(response.getContentText());
    Logger.log(JSON.stringify(result, null, 2));
}
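The getTranscript function above logs the raw operation JSON. Once the operation reports done, the transcript text itself is nested under response.results[].alternatives[]. A sketch of pulling it out (the sample object below is illustrative, not captured API output):

```javascript
// Illustrative shape of a completed longrunningrecognize operation.
var sampleResult = {
  name: "7612202767953098924", // made-up operation name
  done: true,
  response: {
    results: [
      { alternatives: [{ transcript: "how old is the Brooklyn Bridge", confidence: 0.98 }] }
    ]
  }
};

// Join the top alternative of each result into one transcript string.
function extractTranscript(result) {
  if (!result.done || !result.response) return "";
  return result.response.results
    .map(function (r) { return r.alternatives[0].transcript; })
    .join(" ");
}

console.log(extractTranscript(sampleResult));
```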

Authorize the code and, if all the permissions are correctly set up, you should see the audio transcript in your console window as shown below.

[Screenshot: cloud-speech-api.png]