Speech Synthesis API

The Speech Synthesis API is a great tool for experimenting with new interfaces and letting the browser talk to you.

The Speech Synthesis API is a powerful tool provided by modern browsers.

Introduced in 2014, it is now widely adopted and available in Chrome, Firefox, Safari and Edge. IE is not supported.

(Image: browser support for the Speech Synthesis API)

It is part of the Web Speech API, along with the Speech Recognition API, although the latter is currently only supported, in experimental mode, on Chrome.

I recently used it to provide an alert on a page that monitored some parameters. When one of the numbers went up, I was alerted through the computer speakers.

Getting started

The simplest example of using the Speech Synthesis API fits on one line:

speechSynthesis.speak(new SpeechSynthesisUtterance('Hey'))

Copy and paste it into your browser console, and your computer will speak!
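
Since IE does not support the API, it can be worth checking for it before speaking. The following is a minimal sketch of that check, not an official pattern: sayIfSupported and its scope parameter are names I made up for illustration, and scope only exists so the function can be tried outside a browser.

```javascript
// Hypothetical helper: returns true if the text was queued for speech,
// false if the API is unavailable (for example in IE or outside a browser).
const sayIfSupported = (text, scope = globalThis) => {
  if (!('speechSynthesis' in scope) || !('SpeechSynthesisUtterance' in scope)) {
    return false
  }
  scope.speechSynthesis.speak(new scope.SpeechSynthesisUtterance(text))
  return true
}

sayIfSupported('Hey')
```

In a supporting browser this behaves like the one-liner above. Elsewhere it does nothing and returns false.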

API

The API exposes several objects on the window object.

SpeechSynthesisUtterance

SpeechSynthesisUtterance represents a speech request. In the example above, we passed it a string: the message the browser should read aloud.

Once you have the utterance object, you can adjust its speech properties:

const utterance = new SpeechSynthesisUtterance('Hey')
  • utterance.rate: sets the speed, accepts values in the range [0.1 - 10], defaults to 1
  • utterance.pitch: sets the pitch, accepts values in the range [0 - 2], defaults to 1
  • utterance.volume: sets the volume, accepts values in the range [0 - 1], defaults to 1
  • utterance.lang: sets the language (values use a BCP 47 language tag, such as en-US or it-IT)
  • utterance.text: the text to speak; you can set it as a property instead of passing it to the constructor. The text can be at most 32767 characters long
  • utterance.voice: sets the voice (more on this below)
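
The 32767 character limit on text means a very long message has to be split before it is spoken. Here is a rough sketch of one way to do that, assuming it is acceptable to break at spaces. splitForSpeech and MAX_UTTERANCE_LENGTH are hypothetical names of mine, not part of the API.

```javascript
// Limit quoted in the list above
const MAX_UTTERANCE_LENGTH = 32767

// Hypothetical helper: splits text into chunks of at most maxLength
// characters, breaking at a space when one is found inside the window.
const splitForSpeech = (text, maxLength = MAX_UTTERANCE_LENGTH) => {
  const chunks = []
  let rest = text.trim()
  while (rest.length > maxLength) {
    // Last space at or before the maximum allowed position
    let cut = rest.lastIndexOf(' ', maxLength)
    if (cut <= 0) cut = maxLength // no space found: cut mid-word
    chunks.push(rest.slice(0, cut).trim())
    rest = rest.slice(cut).trim()
  }
  if (rest.length) chunks.push(rest)
  return chunks
}

// speak() queues utterances, so the chunks are read one after another
if (typeof speechSynthesis !== 'undefined') {
  splitForSpeech('Hey there, this is a short message').forEach(chunk => {
    speechSynthesis.speak(new SpeechSynthesisUtterance(chunk))
  })
}
```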

Example:

const utterance = new SpeechSynthesisUtterance('Hey')
utterance.pitch = 1.5
utterance.volume = 0.5
utterance.rate = 8
speechSynthesis.speak(utterance)
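
If the values come from user input, you might want to keep them inside the ranges listed above before assigning them. This is a small sketch of that idea. applyVoiceSettings, clamp and RANGES are names I invented for the example, and how a browser reacts to out-of-range values is not something I rely on here.

```javascript
// Ranges as listed above: rate [0.1 - 10], pitch [0 - 2], volume [0 - 1]
const RANGES = { rate: [0.1, 10], pitch: [0, 2], volume: [0, 1] }

const clamp = (value, [min, max]) => Math.min(max, Math.max(min, value))

// Hypothetical helper: copies rate, pitch and volume onto the utterance,
// clamping each to its range. Other keys and non-numeric values are ignored.
const applyVoiceSettings = (utterance, settings = {}) => {
  for (const [name, range] of Object.entries(RANGES)) {
    if (Number.isFinite(settings[name])) {
      utterance[name] = clamp(settings[name], range)
    }
  }
  return utterance
}

if (typeof speechSynthesis !== 'undefined') {
  const utterance = new SpeechSynthesisUtterance('Hey')
  // rate 50 is outside the range and ends up as 10
  speechSynthesis.speak(applyVoiceSettings(utterance, { rate: 50, pitch: 1.5 }))
}
```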

Set a voice

Each browser has a different number of voices available.

To view the list, use the following code:

console.log(`Voices #: ${speechSynthesis.getVoices().length}`)

speechSynthesis.getVoices().forEach(voice => {
  console.log(voice.name, voice.lang)
})

(Image: voices in Firefox)

This is one of the cross-browser issues. The code above works in Firefox and Safari (and possibly Edge, but I did not test it), but it doesn't work in Chrome. Chrome handles voices differently and requires a callback that is called once the voices have been loaded:

const voiceschanged = () => {
  console.log(`Voices #: ${speechSynthesis.getVoices().length}`)
  speechSynthesis.getVoices().forEach(voice => {
    console.log(voice.name, voice.lang)
  })
}
speechSynthesis.onvoiceschanged = voiceschanged

Once the callback has been called, we can access the list using speechSynthesis.getVoices().

I believe this is because Chrome, when a network connection is available, checks for additional languages on Google's servers:

(Image: Chrome languages)

If there is no network connection, the number of languages available is the same as in Firefox and Safari. The additional languages are available when the network is enabled, but the API also works offline.
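
Each voice object has a localService flag that tells whether the voice comes from a local speech engine or from a remote service. As a quick sketch, the voices could be separated like this. groupByLocalService is a hypothetical name, and the sample array stands in for a real voice list.

```javascript
// Hypothetical helper: separates voices by their localService flag
const groupByLocalService = voices => ({
  local: voices.filter(voice => voice.localService),
  remote: voices.filter(voice => !voice.localService)
})

// Made-up sample data, shaped like entries returned by getVoices()
const sampleVoices = [
  { name: 'Local voice', lang: 'en-US', localService: true },
  { name: 'Remote voice', lang: 'it-IT', localService: false }
]

const groups = groupByLocalService(sampleVoices)
console.log(`Local: ${groups.local.length}, remote: ${groups.remote.length}`)
```

In a browser you would pass the real list instead of sampleVoices, once the voices have been loaded.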

Cross-browser implementation to get the voices

Because of this difference, we need a way to abstract it away when using the API. This example provides that abstraction:

const getVoices = () => {
  return new Promise(resolve => {
    let voices = speechSynthesis.getVoices()
    if (voices.length) {
      resolve(voices)
      return
    }
    speechSynthesis.onvoiceschanged = () => {
      voices = speechSynthesis.getVoices()
      resolve(voices)
    }
  })
}

const printVoicesList = async () => {
  ;(await getVoices()).forEach(voice => {
    console.log(voice.name, voice.lang)
  })
}

printVoicesList()
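
One case the promise above does not handle is a browser that has no voices and never calls the callback: the promise would then stay pending forever. A possible variation, shown here only as a sketch, resolves with whatever is available after a delay. getVoicesWithTimeout, the synth parameter and the 2000 ms default are assumptions of mine.

```javascript
// Hypothetical variation of getVoices: synth is the speechSynthesis object
// (passed in so the function can be tried with a stand-in object).
const getVoicesWithTimeout = (synth, timeoutMs = 2000) => {
  return new Promise(resolve => {
    const voices = synth.getVoices()
    if (voices.length) {
      resolve(voices)
      return
    }
    // Give up after timeoutMs and resolve with the current (maybe empty) list
    const timer = setTimeout(() => resolve(synth.getVoices()), timeoutMs)
    synth.onvoiceschanged = () => {
      clearTimeout(timer)
      resolve(synth.getVoices())
    }
  })
}

if (typeof speechSynthesis !== 'undefined') {
  getVoicesWithTimeout(speechSynthesis).then(voices => {
    console.log(`Voices #: ${voices.length}`)
  })
}
```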


Use a custom language

The default voice speaks English.

You can use any language you want by setting the lang property of the utterance:

let utterance = new SpeechSynthesisUtterance('Ciao')
utterance.lang = 'it-IT'
speechSynthesis.speak(utterance)
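
How good the result sounds depends on a voice for that language being available. As a rough sketch, you could look one up before speaking: first an exact match on the tag, then any voice with the same primary language. findVoiceForLang is a hypothetical helper. The underscore handling is an assumption on my part, because some platforms have been reported to return tags like it_IT.

```javascript
// Hypothetical helper: exact tag match first, then same primary language,
// otherwise null.
const findVoiceForLang = (voices, lang) => {
  // Assumption: treat it_IT and it-IT as the same tag, ignoring case
  const normalize = tag => tag.replace(/_/g, '-').toLowerCase()
  const wanted = normalize(lang)
  const primary = wanted.split('-')[0]
  return (
    voices.find(voice => normalize(voice.lang) === wanted) ||
    voices.find(voice => normalize(voice.lang).split('-')[0] === primary) ||
    null
  )
}

if (typeof speechSynthesis !== 'undefined') {
  const utterance = new SpeechSynthesisUtterance('Ciao')
  utterance.lang = 'it-IT'
  // In Chrome the list may still be empty here; see the callback above
  const voice = findVoiceForLang(speechSynthesis.getVoices(), 'it-IT')
  if (voice) utterance.voice = voice
  speechSynthesis.speak(utterance)
}
```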

Use another voice

If more than one voice is available, you may want to choose a different one. For example, the default Italian voice is female, but maybe I want a male voice. That is the second one in the list of voices.

const lang = 'it-IT'
const voiceIndex = 1

const speak = async text => {
  // Do nothing if the API is not available
  if (!window.speechSynthesis) {
    return
  }
  const message = new SpeechSynthesisUtterance(text)
  message.voice = await chooseVoice()
  speechSynthesis.speak(message)
}

// Resolves with the list of voices, waiting for them to load if needed
const getVoices = () => {
  return new Promise(resolve => {
    let voices = speechSynthesis.getVoices()
    if (voices.length) {
      resolve(voices)
      return
    }
    speechSynthesis.onvoiceschanged = () => {
      voices = speechSynthesis.getVoices()
      resolve(voices)
    }
  })
}

// Picks the voice at voiceIndex among those matching lang
const chooseVoice = async () => {
  const voices = (await getVoices()).filter(voice => voice.lang === lang)
  return voices[voiceIndex]
}

speak('Ciao')
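
Picking a voice by position assumes the list has at least two voices for the language, in the same order everywhere. A more defensive sketch could prefer a voice by name and fall back to the index, then to the first match. pickVoice is a hypothetical helper, and the names in the sample array are made up.

```javascript
// Hypothetical helper: prefers a voice by name, then by position, then the
// first voice for the language. Returns null when nothing matches.
const pickVoice = (voices, { lang, name, index = 0 } = {}) => {
  const candidates = voices.filter(voice => voice.lang === lang)
  return (
    candidates.find(voice => voice.name === name) ||
    candidates[index] ||
    candidates[0] ||
    null
  )
}

// Made-up sample data, shaped like entries returned by getVoices()
const sampleVoices = [
  { name: 'First Italian voice', lang: 'it-IT' },
  { name: 'Second Italian voice', lang: 'it-IT' },
  { name: 'English voice', lang: 'en-US' }
]

const chosen = pickVoice(sampleVoices, { lang: 'it-IT', index: 1 })
console.log(chosen.name) // Second Italian voice
```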


Values for the language

These are some examples of the languages you can use:

  • Arabic (Saudi Arabia)➡️ar-SA
  • Chinese (China)➡️zh-CN
  • Chinese (Hong Kong Special Administrative Region of China)➡️zh-HK
  • Chinese (Taiwan)➡️zh-TW
  • Czech (Czech Republic)➡️cs-CZ
  • Danish (Denmark)➡️da-DK
  • Dutch (Belgium)➡️nl-BE
  • Dutch (Netherlands)➡️nl-NL
  • English (Australia)➡️en-AU
  • English (Ireland)➡️en-IE
  • English (South Africa)➡️en-ZA
  • English (UK)➡️en-GB
  • English (United States)➡️en-US
  • Finnish (Finland)➡️fi-FI
  • French (Canada)➡️fr-CA
  • French (France)➡️fr-FR
  • German (Germany)➡️de-DE
  • Greek (Greece)➡️el-GR
  • Hindi (India)➡️hi-IN
  • Hungarian (Hungary)➡️hu-HU
  • Indonesian (Indonesia)➡️id-ID
  • Italian (Italy)➡️it-IT
  • Japanese (Japan)➡️ja-JP
  • Korean (South Korea)➡️ko-KR
  • Norwegian (Norway)➡️no-NO
  • Polish (Poland)➡️pl-PL
  • Portuguese (Brazil)➡️pt-BR
  • Portuguese (Portugal)➡️pt-PT
  • Romanian (Romania)➡️ro-RO
  • Russian (Russia)➡️ru-RU
  • Slovak (Slovakia)➡️sk-SK
  • Spanish (Mexico)➡️es-MX
  • Spanish (Spain)➡️es-ES
  • Swedish (Sweden)➡️sv-SE
  • Thai (Thailand)➡️th-TH
  • Turkish (Turkey)➡️tr-TR
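
Which of these tags actually have a voice depends on the browser and the operating system. As a small sketch, you can derive the list of tags from the voices themselves. listVoiceLanguages is a hypothetical helper, and the sample array stands in for a real voice list.

```javascript
// Hypothetical helper: unique language tags of the given voices, sorted
const listVoiceLanguages = voices => {
  return [...new Set(voices.map(voice => voice.lang))].sort()
}

// Made-up sample data, shaped like entries returned by getVoices()
const sampleVoices = [
  { name: 'Voice 1', lang: 'it-IT' },
  { name: 'Voice 2', lang: 'en-US' },
  { name: 'Voice 3', lang: 'it-IT' }
]

console.log(listVoiceLanguages(sampleVoices)) // [ 'en-US', 'it-IT' ]
```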

Mobile

On iOS the API works, but it must be triggered by a user action callback, such as a response to a tap event, to give users a better experience and avoid unexpected sounds coming from the phone.

You can't do what you can on the desktop, where a web page can start speaking out of the blue.
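
In practice this means calling speak() from inside an event handler. Here is a minimal sketch, assuming the page has a button to tap. speakOnTap, its scope parameter and the #speak-button selector are all made up for the example.

```javascript
// Hypothetical helper: speaks the text only when the element is tapped or
// clicked. scope defaults to the global object and is a parameter only so
// the helper can be tried outside a browser.
const speakOnTap = (element, text, scope = globalThis) => {
  element.addEventListener('click', () => {
    const utterance = new scope.SpeechSynthesisUtterance(text)
    scope.speechSynthesis.speak(utterance)
  })
}

// '#speak-button' is a made-up selector for this example
if (typeof document !== 'undefined') {
  const button = document.querySelector('#speak-button')
  if (button) speakOnTap(button, 'Ciao')
}
```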
