Google – Aigur Client

Name

googleTextToSpeech

Description

Uses Google Cloud Text-to-Speech to convert text to speech.

API Key

Uses the googleapis API key:

index.ts

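import { createClient } from '@aigur/client';

const aigur = createClient({
  apiKeys: {
    // Read the key from the environment rather than hard-coding it
    googleapis: process.env.GOOGLE_API_KEY
  }
})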

Example

index.ts

import { googleTextToSpeech } from '@aigur/client';
//...
flow.node(googleTextToSpeech, () => ({
  text: 'Hello world'
})) // --> {audio: base64String}

Input

Property	Type	Required	Description	Default Value
text	string	Yes	The text you want to turn to speech.
speakingRate	number	No	Speaking rate, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice, 2.0 is twice as fast, and 0.5 is half as fast. If unset (0.0), defaults to the native 1.0 speed. Values below 0.25 or above 4.0 return an error.	1
pitch	number	No	Speaking pitch, in the range [-20.0, 20.0]. 20 raises the pitch by 20 semitones from the original; -20 lowers it by 20 semitones.	0
encoding	enum('MP3', 'FLAC', 'LINEAR16', 'MULAW', 'AMR', 'AMR_WB', 'OGG_OPUS', 'SPEEX_WITH_HEADER_BYTE', 'WEBM_OPUS')	No	The audio format of the returned audio.	MP3
voice.language	string	No	The language (and potentially also the region) of the voice expressed as a BCP-47 language tag, e.g. "en-US".	en-US
voice.name	enum('en-US-Standard-A', 'en-US-Standard-B', 'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Standard-E', 'en-US-Standard-F', 'en-US-Standard-G', 'en-US-Standard-H', 'en-US-Standard-I', 'en-US-Standard-J', 'en-US-Studio-M', 'en-US-Studio-O', 'en-US-Wavenet-A', 'en-US-Wavenet-B', 'en-US-Wavenet-C', 'en-US-Wavenet-D', 'en-US-Wavenet-E', 'en-US-Wavenet-F', 'en-US-Wavenet-G', 'en-US-Wavenet-H', 'en-US-Wavenet-I', 'en-US-Wavenet-J', 'en-US-News-K', 'en-US-News-L', 'en-US-News-M', 'en-US-News-N')	No	The name of the voice.	en-US-Neural2-C
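
Because out-of-range values return an error, it can help to check `speakingRate` and `pitch` before running the flow. The following is a minimal sketch of such a check. The `SpeechOptions` interface and the `checkSpeechOptions` function are hypothetical names made up for this illustration; they are not exported by `@aigur/client`.

```typescript
// Hypothetical shape mirroring part of the Input table above;
// this is NOT a type exported by @aigur/client.
interface SpeechOptions {
  text: string;
  speakingRate?: number; // allowed range [0.25, 4.0]; 0 is treated as unset
  pitch?: number; // allowed range [-20.0, 20.0], in semitones
}

// Returns a list of problems; an empty list means the ranges look valid.
function checkSpeechOptions(options: SpeechOptions): string[] {
  const problems: string[] = [];
  const { speakingRate, pitch } = options;

  // undefined or 0 means "unset", which falls back to the native 1.0 speed.
  if (speakingRate !== undefined && speakingRate !== 0) {
    if (speakingRate < 0.25 || speakingRate > 4.0) {
      problems.push(`speakingRate ${speakingRate} is outside [0.25, 4.0]`);
    }
  }

  if (pitch !== undefined && (pitch < -20 || pitch > 20)) {
    problems.push(`pitch ${pitch} is outside [-20.0, 20.0]`);
  }

  return problems;
}
```

For example, `checkSpeechOptions({ text: 'Hello world', speakingRate: 5 })` returns one problem, while `checkSpeechOptions({ text: 'Hello world' })` returns an empty list.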

Output

Property	Type
audio	string (base64)
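
Since `audio` is a base64 string, it has to be decoded before it can be saved or played. The sketch below assumes a Node.js environment and uses the built-in `Buffer` and `node:fs` APIs. The helper names `decodeAudio` and `saveAudio` are hypothetical, and the file extension is an assumption that should match the `encoding` input (for example `.mp3` for the default `MP3`).

```typescript
import { writeFileSync } from 'node:fs';

// Decode the base64 `audio` string from the node's output into raw bytes.
function decodeAudio(audio: string): Buffer {
  return Buffer.from(audio, 'base64');
}

// Hypothetical helper: write the decoded audio to disk and
// return the number of bytes written.
function saveAudio(audio: string, path: string): number {
  const bytes = decodeAudio(audio);
  writeFileSync(path, bytes);
  return bytes.length;
}
```

With an output object `{ audio }`, a call such as `saveAudio(audio, 'hello.mp3')` would write a playable file, assuming the default MP3 encoding was used.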
