Slowing Down Flite

June 4, 2020

Festival is a “text to speech” (TTS) system. You give it some text, and it will read it back to you. Developed by the folks at the University of Edinburgh, the system has been refined over the years so that the voice you hear can be made remarkably lifelike.

It’s a good program, but it’s a substantial package and puts a good load on the system. A derivative called Flite (Festival-lite), developed at Carnegie Mellon University, provides almost the same level of performance, but is much, er, ‘liter’.

I’ve been using flite(1) for a while to prod me to get off the computer and exercise, and it’s good for that. Install the FreeBSD package on your system (sudo pkg install flite) and let’s give it a whirl.

Running flite is fairly easy - it takes text input on stdin, and reads it back. If you don’t like the default voices, there are some parameters you can use to change the timbre, pitch, and other audiogenic effects.
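
As a quick sketch of the usual ways to hand text to flite: the -t, -f, -o and -lv options below come from flite's own usage message, while the function names say and say_to_wav and the FLITE variable are just names made up for this example.

```sh
#!/bin/sh
# FLITE is a hypothetical override variable; the default is the FreeBSD package path.
FLITE="${FLITE:-/usr/local/bin/flite}"

# Speak a string by piping it to flite on stdin.
say() {
    printf '%s\n' "$1" | $FLITE
}

# Pass the string with -t and write the audio to a .wav file with -o
# instead of playing it.
say_to_wav() {
    $FLITE -t "$1" -o "$2"
}

# Usage:
#   say "Hello Jim."
#   say_to_wav "Hello Jim." hello.wav
#   $FLITE -f notes.txt        # read a text file aloud
#   $FLITE -lv                 # list the voices built into flite
```

Wrapping the call in a function keeps the path to flite in one place, which is handy once the command line starts to grow.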

Here’s a basic example of using flite in a script, with the default voice.

#!/bin/sh

export VCMD="/usr/local/bin/flite"

export SAYWHAT="Hello Jim.  Time to get moving.  I'll check with you again in 45 minutes."

echo $SAYWHAT | $VCMD

Sounds terrible, right? And it’s not the kind of voice that’s going to really motivate me to get out of my chair. So let’s customize the voice and give it a bit more gravitas.

I’ve selected the “awb” voice, named after one of the original Festival developers Dr. Alan W. Black. It’s got a nice Scottish burr. I’ve also adjusted a couple of parameters:

#!/bin/sh
export VCMD="/usr/local/bin/flite -voice /home/jpb/src/flite/voices/cmu_us_awb.flitevox  \
     --setf duration_stretch=.85 \
     --setf int_f0_target_mean=85 "  

export SAYWHAT="Hello Jim.  Time to get moving.  I'll check with you again in 45 minutes."


echo $SAYWHAT | $VCMD

Much better.
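
A side note on duration_stretch: it is a multiplier on how long each sound lasts, so values below 1 (like the .85 above) speed the voice up, and values above 1 slow the whole voice down. Here is a minimal sketch that builds the command line from a stretch value. The function name flite_cmd and the FLITE and VOICE variables are made up for the example; the paths are the ones used above.

```sh
#!/bin/sh
FLITE="/usr/local/bin/flite"
VOICE="/home/jpb/src/flite/voices/cmu_us_awb.flitevox"

# Print a flite command line for a given duration_stretch value.
# $1: stretch factor (defaults to 1 if omitted)
flite_cmd() {
    printf '%s -voice %s --setf duration_stretch=%s --setf int_f0_target_mean=85' \
        "$FLITE" "$VOICE" "${1:-1}"
}

# Usage ($VCMD is left unquoted on purpose so the shell splits it into words):
#   VCMD=$(flite_cmd 1.5)      # every sound lasts 1.5 times as long
#   echo "Time to get moving." | $VCMD
```

This slows everything evenly. It does not add a pause between particular words.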

One of the recurring questions around flite (and TTS systems in general) is “How do I slow down, or pause, the voice in Flite?” Well, after a bit of experimentation, I’ve discovered a way to do that.

Figure 1 shows our customized flite voice performing a simple count to ten.

#!/bin/sh

export VCMD="/usr/local/bin/flite -voice /home/jpb/src/flite/voices/cmu_us_awb.flitevox  \
     --setf duration_stretch=.85 \
     --setf int_f0_target_mean=85 "  

export SAYWHAT="One Two Three Four Five Six Seven Eight Nine Ten."

echo $SAYWHAT | $VCMD

(Note: flite can only output .wav files. You can convert them to .mp4 and .webm with FFmpeg.)
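
For the conversion step, something along these lines should do. It is only a sketch: target_name and convert_wav are names invented here, and ffmpeg is left to pick the codecs from the output extension.

```sh
#!/bin/sh
# Derive an output name by swapping the .wav suffix for another extension.
# $1: input .wav file, $2: target extension (for example webm or mp4)
target_name() {
    printf '%s.%s' "${1%.wav}" "$2"
}

# Convert a .wav file with ffmpeg. -y overwrites an existing output file.
convert_wav() {
    ffmpeg -y -i "$1" "$(target_name "$1" "$2")"
}

# Usage:
#   flite -t "One Two Three" -o count.wav
#   convert_wav count.wav webm
#   convert_wav count.wav mp4
```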

Figure 1: flite without any modification


Using Audacity, the fine audio recorder and editor, we can see the output along a timeline. It gallops right along: the clip is done in about 2.6 seconds.
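
If you only want the length of a clip and not the picture, it can be estimated from the file size. This is a rough sketch under loud assumptions: a plain 44-byte WAV header, 16-bit mono samples, and a 16000 Hz sample rate unless you pass another one. The function name wav_seconds is made up here, and none of this was checked against the clips in the figures.

```sh
#!/bin/sh
# Estimate the length of a PCM .wav file from its size.
# Assumes a 44-byte header and 16-bit (2-byte) mono samples.
# $1: file name, $2: sample rate in Hz (assumed 16000 if omitted)
wav_seconds() {
    bytes=$(wc -c < "$1")
    awk -v b="$bytes" -v r="${2:-16000}" \
        'BEGIN { printf "%.2f\n", (b - 44) / (r * 2) }'
}

# Usage:
#   flite -t "One Two Three" -o count.wav
#   wav_seconds count.wav
```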

Experimenting with punctuation in my script, I found that I could get flite to slow down by inserting an escaped double quote, \", between words.

export SAYWHAT="One \" Two \" Three \" Four \" Five \" Six \" Seven \" Eight \" Nine \" Ten."
Figure 2: flite with 1 escaped double quote


Adding more escaped double quotes slowed it down even further.

export SAYWHAT="One \" \" Two \" \" Three \" \" Four \" \" Five \" \" Six \" \" Seven \" \" Eight \" \" Nine \" \" Ten."
Figure 3: flite with 2 escaped double quotes


Adding up to thirty escaped double quotes, I was able to get a very long pause between two words (shown here with the shell line-continuation character, \):

export SAYWHAT="One \" \" \" \
 \" \" \" \
 \" \" \" \
 \" \" \" \
 \" \" \" \
 \" \" \" \
 \" \" \" \
 \" \" \" \
 \" \" \" \
 \" \" \" \
Two"
Figure 4: flite with 30 escaped double quotes
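
Typing thirty escaped quotes by hand gets old quickly. Here is a small sketch that generates them. The helper name pause_marks is invented for this example, and it assumes the quote-with-a-space pattern used above is what produces the pause.

```sh
#!/bin/sh
# Print N double-quote marks, each preceded by a space.
# $1: how many marks to print
pause_marks() {
    n=$1
    marks=""
    while [ "$n" -gt 0 ]; do
        marks="$marks \""
        n=$((n - 1))
    done
    printf '%s' "$marks"
}

# Builds the same text as the thirty-quote example: One " " ... " Two
SAYWHAT="One$(pause_marks 30) Two"

# Usage, with VCMD set as in the scripts above:
#   echo $SAYWHAT | $VCMD
```

With the count as a parameter, it is easy to try different pause lengths without editing a long string.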


But while adding extra escaped double quotes worked in a script, it doesn’t seem to have the same effect when inserted in a text file. The first one does have an effect, but the second, third, etc. don’t give any additional delay.

echo "One \" \" \" Two \" \" \" Three \" \" \" Four  \" \" \"Five." > f.f

flite < f.f
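
One thing that may be worth ruling out: flite < f.f runs the default voice with none of the --setf options, whereas the script pipes through the customized $VCMD. I have not confirmed that this explains the difference. The sketch below only checks that the text reaching flite is the same by either route, using a temporary file whose name is made up for the example.

```sh
#!/bin/sh
# The text as a script would build it (the shell removes the backslashes).
SAYWHAT="One \" \" \" Two \" \" \" Three."

# The same text written to a file (a temporary file, for illustration only).
TEXTFILE=$(mktemp "${TMPDIR:-/tmp}/flite_text.XXXXXX")
echo "One \" \" \" Two \" \" \" Three." > "$TEXTFILE"

# Compare what each route would hand to flite.
if [ "$(echo $SAYWHAT)" = "$(cat "$TEXTFILE")" ]; then
    RESULT="identical"
else
    RESULT="different"
fi
echo "script text and file text are $RESULT"

# If they are identical, the next experiment is to feed the file
# through the customized command instead of plain flite:
#   $VCMD < "$TEXTFILE"
rm -f "$TEXTFILE"
```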

I guess I’ll have to stay at the computer to figure this out and ignore this guy for now:

#!/bin/sh

# The customized awb voice from earlier.
export VCMD="/usr/local/bin/flite -voice /home/jpb/src/flite/voices/cmu_us_awb.flitevox  \
     --setf duration_stretch=.85 \
     --setf int_f0_target_mean=85 "

export SAYWHAT="Hello Jim.  Glad to see you again.  We both want to get moving.  I'll check with you in 45 minutes."

# $VCMD stays unquoted so the shell splits it into the command and its options.
echo "$SAYWHAT" | $VCMD

export NEXTDATE=""

while :
do

  # FreeBSD date(1): -j leaves the clock alone, -v+45M adds 45 minutes.
  NEXTDATE=$(date -j -v+45M "+%l:%M %p")
  echo "Next reminder at: ${NEXTDATE}"
  echo "See you again at ${NEXTDATE}" | $VCMD

  # Work for 45 minutes.
  sleep 2700

  SAYWHAT="Excuse me Jim.  You should take a break for a few minutes right now."

  echo "$SAYWHAT" | $VCMD

  # Break for 5 minutes.
  sleep 300

  SAYWHAT="Ok Jim.  That was five minutes. Good to go."

  echo "$SAYWHAT" | $VCMD

done
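
The 2700 and 300 are just 45 and 5 minutes in seconds. If you want to tune the intervals, a sketch like this keeps them in one place. WORK_MINUTES, BREAK_MINUTES and to_seconds are names made up for the example.

```sh
#!/bin/sh
WORK_MINUTES=45
BREAK_MINUTES=5

# Convert minutes to the seconds that sleep(1) expects.
to_seconds() {
    echo $(( $1 * 60 ))
}

# Usage inside the loop (FreeBSD date syntax, as in the script above):
#   NEXTDATE=$(date -j -v+"${WORK_MINUTES}"M "+%l:%M %p")
#   sleep "$(to_seconds "$WORK_MINUTES")"
#   sleep "$(to_seconds "$BREAK_MINUTES")"
```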