
Name

googleTextToSpeech

Description

Uses Google Cloud Text-to-Speech to convert text to speech.

API Key

Uses the googleapis API key:

index.ts

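import { createClient } from '@aigur/client';

// Read the key from the environment instead of hard-coding it.
const aigur = createClient({
  apiKeys: {
    googleapis: process.env.GOOGLE_API_KEY
  }
})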

Example

index.ts

import { googleTextToSpeech } from '@aigur/client';
//...
flow.node(googleTextToSpeech, () => ({
  text: 'Hello world'
})) // --> {audio: base64String}

Input

| Property | Type | Required | Description | Default Value |
| --- | --- | --- | --- | --- |
| text | string | Yes | The text to convert to speech. | |
| speakingRate | number | No | Speaking rate (speed), in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice, 2.0 is twice as fast, and 0.5 is half as fast. If unset (0.0), defaults to the native 1.0 speed. Any other value below 0.25 or above 4.0 returns an error. | 1 |
| pitch | number | No | Speaking pitch, in the range [-20.0, 20.0]. 20 raises the pitch 20 semitones above the original; -20 lowers it 20 semitones below the original. | 0 |
| encoding | enum('MP3', 'FLAC', 'LINEAR16', 'MULAW', 'AMR', 'AMR_WB', 'OGG_OPUS', 'SPEEX_WITH_HEADER_BYTE', 'WEBM_OPUS') | No | The format of the output audio. | MP3 |
| voice.language | string | No | The language (and optionally the region) of the voice, expressed as a BCP-47 language tag, e.g. "en-US". | en-US |
| voice.name | enum('en-US-Standard-A', 'en-US-Standard-B', 'en-US-Standard-C', 'en-US-Standard-D', 'en-US-Standard-E', 'en-US-Standard-F', 'en-US-Standard-G', 'en-US-Standard-H', 'en-US-Standard-I', 'en-US-Standard-J', 'en-US-Studio-M', 'en-US-Studio-O', 'en-US-Wavenet-A', 'en-US-Wavenet-B', 'en-US-Wavenet-C', 'en-US-Wavenet-D', 'en-US-Wavenet-E', 'en-US-Wavenet-F', 'en-US-Wavenet-G', 'en-US-Wavenet-H', 'en-US-Wavenet-I', 'en-US-Wavenet-J', 'en-US-News-K', 'en-US-News-L', 'en-US-News-M', 'en-US-News-N') | No | The name of the voice. | en-US-Neural2-C |
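
The documented ranges can be checked before the values reach the node, so that an out-of-range setting fails locally rather than as an API error. The sketch below is only an illustration: `SpeechOptions`, `validateSpeechOptions`, and `exampleInput` are hypothetical names that are not part of `@aigur/client`, and the nested `voice` object is an assumption inferred from the dotted property names `voice.language` and `voice.name`.

```typescript
// Hypothetical types and helper, not part of @aigur/client.
// The nested `voice` shape is an assumption based on the dotted property names.
type Encoding =
  | "MP3" | "FLAC" | "LINEAR16" | "MULAW" | "AMR"
  | "AMR_WB" | "OGG_OPUS" | "SPEEX_WITH_HEADER_BYTE" | "WEBM_OPUS";

interface SpeechOptions {
  text: string;
  speakingRate?: number; // documented range [0.25, 4.0]; 0 is treated as unset
  pitch?: number;        // documented range [-20.0, 20.0]
  encoding?: Encoding;
  voice?: { language?: string; name?: string };
}

// Returns a list of problems; an empty list means the options look valid.
function validateSpeechOptions(options: SpeechOptions): string[] {
  const problems: string[] = [];
  if (options.text.length === 0) {
    problems.push("text is required");
  }
  const { speakingRate, pitch } = options;
  // The table describes 0.0 as "unset", so it is allowed through.
  if (
    speakingRate !== undefined &&
    speakingRate !== 0 &&
    !(speakingRate >= 0.25 && speakingRate <= 4.0)
  ) {
    problems.push("speakingRate must be in [0.25, 4.0]");
  }
  if (pitch !== undefined && !(pitch >= -20 && pitch <= 20)) {
    problems.push("pitch must be in [-20.0, 20.0]");
  }
  return problems;
}

// An input that sets every optional property from the table.
const exampleInput: SpeechOptions = {
  text: "Hello world",
  speakingRate: 1.25,
  pitch: -2,
  encoding: "MP3",
  voice: { language: "en-US", name: "en-US-Wavenet-D" },
};
```

If the returned list is empty, the same object could be returned from the callback passed to `flow.node(googleTextToSpeech, ...)`, assuming the node accepts the nested `voice` shape used here.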

Output

| Property | Type |
| --- | --- |
| audio | string (base64) |
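
Because `audio` is a base64 string, it has to be decoded before it can be played or written to disk. The following is a minimal sketch assuming a Node.js runtime; `decodeAudio` and `saveAudio` are hypothetical helpers, not part of `@aigur/client`.

```typescript
import { writeFileSync } from "node:fs";

// Hypothetical helpers, not part of @aigur/client.
// Assumes Node.js, where Buffer is available globally.

// Decode the base64 `audio` string into raw bytes.
function decodeAudio(audio: string): Buffer {
  return Buffer.from(audio, "base64");
}

// Write the decoded audio to `path` and return the number of bytes written.
function saveAudio(audio: string, path: string): number {
  const bytes = decodeAudio(audio);
  writeFileSync(path, bytes);
  return bytes.length;
}
```

A call might look like `saveAudio(result.audio, "hello.mp3")`, where `result` stands for whatever object holds the node's output. The file extension should match the `encoding` that was requested (MP3 by default).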