Install instructions for English US Matthew

Download or clone this repository, e.g. git clone https://github.com/avimar/freeswitch-sounds-polly.git
Ensure the new folder is available: mkdir -p /usr/share/freeswitch/sounds/en/us/matthew
Copy the contents of Matthew-neural (or Matthew-standard) to your sounds path, e.g. on Debian: cp -r freeswitch-sounds-polly/Matthew-neural/. /usr/share/freeswitch/sounds/en/us/matthew/
- For a mix of neural for phrases and standard for single words, use this instead:
- cd freeswitch-sounds-polly;
- rsync -av --exclude-from=better-as-standard.txt Matthew-neural/ /usr/share/freeswitch/sounds/en/us/matthew;
- rsync -av --exclude-from=better-as-neural.txt Matthew-standard/ /usr/share/freeswitch/sounds/en/us/matthew;
To make mod_say_en use the new voice as well, and not just phrases and sounds, the language file must let you specify the sound path dynamically. Edit /usr/share/freeswitch/lang/en/en.xml and remove sound-prefix="$${sound_prefix}". There doesn't seem to be any downside: the default is still set in vars.xml.
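The edit to en.xml can also be scripted. The following is a minimal sketch, not part of this repository: the function name strip_sound_prefix is hypothetical, and it assumes the attribute appears exactly as sound-prefix="$${sound_prefix}" preceded by a single space. It keeps a .bak copy of the file next to the original.

```shell
#!/bin/sh
# Hypothetical helper: remove the pinned sound-prefix attribute from a
# FreeSWITCH language file so the prefix can be set dynamically.
# $1: path to the language file (defaults to the Debian path from the text).
strip_sound_prefix() {
    file="${1:-/usr/share/freeswitch/lang/en/en.xml}"
    cp "$file" "$file.bak"    # keep a backup before rewriting
    # Delete the attribute together with its leading space.
    sed 's/ sound-prefix="\$\${sound_prefix}"//' "$file.bak" > "$file"
}

# Usage (run as a user allowed to write the file):
#   strip_sound_prefix /usr/share/freeswitch/lang/en/en.xml
```

If the attribute is formatted differently in your copy of en.xml, the sed expression will leave the file unchanged, so check the result before reloading.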
To switch the default voice:
- For your entire system -- edit /etc/freeswitch/vars.xml
  - replace <X-PRE-PROCESS cmd="set" data="sound_prefix=$${sounds_dir}/en/us/callie"/>
  - with <X-PRE-PROCESS cmd="set" data="sound_prefix=$${sounds_dir}/en/us/matthew"/>
- For just one channel, set:
  - <action application="set" data="sound_prefix=$${sounds_dir}/en/us/matthew" />
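The system-wide change in vars.xml can be sketched as a small script. This is an illustration under assumptions: the function name set_default_voice is hypothetical, and it assumes vars.xml contains a sound_prefix line of the form shown above, pointing somewhere under $${sounds_dir}/en/us/.

```shell
#!/bin/sh
# Hypothetical helper: point the default sound_prefix at another en/us voice.
# $1: voice folder name, e.g. matthew
# $2: path to vars.xml (defaults to the path from the text)
set_default_voice() {
    voice="$1"
    file="${2:-/etc/freeswitch/vars.xml}"
    cp "$file" "$file.bak"    # keep a backup before rewriting
    # Replace whatever voice folder follows .../en/us/ up to the closing quote.
    sed 's|\(sound_prefix=\$\${sounds_dir}/en/us/\)[^"]*|\1'"$voice"'|' "$file.bak" > "$file"
}

# Usage:
#   set_default_voice matthew /etc/freeswitch/vars.xml
```

The voice name must match the folder created under the sounds path during installation.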

Information

Goal: Generate FreeSWITCH sound files using Amazon Polly.

Motivation:

If you have a complete sound set from one voice, you can generate additional prompts whenever you need them, and they will sound consistent with the rest.
Why now?
- Polly licence allows you to re-use the files
  
  Q. Can I use the service for generating static voice prompts that will be replayed multiple times? Yes, you can. The service does not restrict this and there are no additional costs for doing so. https://aws.amazon.com/polly/faqs/
  
  You can cache and save Polly’s speech audio to replay offline or redistribute. https://docs.aws.amazon.com/whitepapers/latest/aws-overview/machine-learning.html
- Voices, especially Neural, are pretty high quality
- Multiple languages with the same API
- Pretty cheap

Limitations:

The highest sample rate Polly produces is only 24000 Hz, whereas the FreeSWITCH sounds come at up to 48000 Hz.

Contributing:

Audiophile/knowledge: How to get the best audio quality?
- 24000 Hz is only available as OGG/MP3, but we want WAV/PCM for raw audio, so we have to convert. Should we convert from MP3 or from OGG?
- Is downsampling 16000 Hz PCM to 8000 Hz with sox just as good as re-generating at 8000 Hz?
- If you're an audiophile or know how FreeSWITCH handles audio, check the various files in the conference folder.
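To explore the downsampling question, one option is to resample an existing file with sox and compare it by ear against the same prompt generated directly at 8000 Hz. A minimal sketch, assuming sox is installed and the input is a WAV file; the function name downsample_to_8k is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: resample a WAV file to 8000 Hz, mono, 16-bit.
# $1: input WAV (e.g. a 16000 Hz file), $2: output WAV
downsample_to_8k() {
    in="$1"
    out="$2"
    # Output options placed before the output file make sox add the
    # rate conversion automatically.
    sox "$in" -r 8000 -c 1 -b 16 "$out"
}

# Usage:
#   downsample_to_8k input-16000.wav output-8000.wav
#   soxi -r output-8000.wav    # prints the sample rate of the result
```

This only produces the downsampled candidate. Whether it matches a file generated at 8000 Hz is still the open question above.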
FreeSWITCH testing: Test mod_say for numbers and currencies.
- Do the files flow together?
- If not, how do we fix it? And automate it? Maybe the non-neural ones are better for numbers?

How to generate more audio:

TODO:
