Speech to text conversion

With a simple plugin and SurgeMail 7.4f-5 or later, you can automatically convert voice messages attached to incoming mail into text. The conversion is done by the Google speech to text API.

Installation instructions:

Install ffmpeg.

Install Python 3.7 or later if it is not already installed (see https://www.python.org/). To check the installed version:

python3 --version
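
Both prerequisites can also be checked in one go. The following is a minimal sketch, not part of the SurgeMail scripts: the function name `check_prerequisites` is hypothetical, and it only assumes that ffmpeg should be reachable through the PATH of the user running it.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in this article


def check_prerequisites(min_python=MIN_PYTHON):
    """Return a list of problems found; an empty list means all looks OK."""
    problems = []
    found = sys.version_info[:2]
    if found < min_python:
        problems.append(
            "Python %d.%d or later is required, found %d.%d"
            % (min_python[0], min_python[1], found[0], found[1])
        )
    # shutil.which returns None when the command is not on the PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on the PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("Python and ffmpeg look OK")
```

Run it as the same user that SurgeMail runs as, since the PATH can differ between users.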

Install the Google speech API.

Windows: pip install --upgrade google-cloud-speech
Linux: pip3 install --upgrade google-cloud-speech
Linux: pip3 install --upgrade google-api-python-client

Download the two scripts you need into your SurgeMail folder and unpack them:

cd /usr/local/surgemail
wget https://netwinsite.com/ftp/surgemail/speech.tar.gz
gunzip speech.tar.gz
tar -xvf speech.tar

Extract the files and place them in your SurgeMail folder. Then, on Windows, change this line in speech_run.cmd to point to your Python installation:

APATH=c:\anaconda

On Linux, make the scripts executable: chmod +x speech_run.sh speech_submit.py

Add settings to surgemail.ini

Windows: g_speech_cmd "\surgemail\speech_run.cmd"

Linux: g_speech_cmd "/usr/local/surgemail/speech_run.sh"

To limit the conversion to messages from a particular source address:

g_speech_from "*@xyz.com"
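
To get a feel for which senders a pattern such as "*@xyz.com" would cover, here is a rough illustration using Python's shell-style wildcards. This article does not describe SurgeMail's actual matching rules, so the case-insensitive comparison and the helper name `sender_matches` are assumptions made only for this sketch.

```python
from fnmatch import fnmatchcase


def sender_matches(pattern, address):
    """Shell-style wildcard match, compared case-insensitively.

    Assumption: this only approximates how a pattern like "*@xyz.com"
    selects senders; it is not SurgeMail's own matching code.
    """
    return fnmatchcase(address.lower(), pattern.lower())


if __name__ == "__main__":
    for sender in ("voicemail@xyz.com", "user@example.org"):
        print(sender, sender_matches("*@xyz.com", sender))
```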

Create a Google Cloud account

  1. Create a Google Cloud project and grant it access to the speech to text API at https://console.cloud.google.com
  2. At the top of the page, use the drop-down menu to create a new 'project'.
  3. At the top of the page again, select the new project.
  4. Click on 'APIs and Services', then 'Credentials', then '+ Create Credentials'.
  5. Choose 'Service Account'.
  6. Select role 'Service Owner'.
  7. Create the credentials and save the file in your SurgeMail folder as speech.json.
  8. Click on 'Google Cloud Platform' at the top left.
  9. Click on 'APIs and Services', then 'Library'.
  10. Search for 'speech' and select the Cloud speech to text API.
  11. Click 'enable'
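
Before testing, it can help to confirm that the downloaded credentials file is a service account key at all. The sketch below checks for fields that Google service account JSON keys normally contain; the function name `check_service_account_key` is hypothetical, and the file name speech.json is taken from the testing step of this article.

```python
import json

# Fields normally present in a Google service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path):
    """Return a list of problems with the key file; empty means it looks OK."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError) as exc:
        # ValueError also covers json.JSONDecodeError.
        return ["cannot read %s: %s" % (path, exc)]
    if not isinstance(data, dict):
        return ["%s does not contain a JSON object" % path]

    problems = ["missing field: " + f for f in REQUIRED_FIELDS if not data.get(f)]
    if data.get("type") and data["type"] != "service_account":
        problems.append("type is %r, expected 'service_account'" % data["type"])
    return problems


if __name__ == "__main__":
    # Assumption: run from the SurgeMail folder where speech.json was saved.
    for problem in check_service_account_key("speech.json") or ["speech.json looks OK"]:
        print(problem)
```

This checks only the shape of the file. It does not contact Google, so a key that passes here can still be rejected if the API is not enabled for the project.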

Testing it

First run the Python test script to check that your configuration is valid and that the credentials in your speech.json file are set up correctly.

Windows: speech_run.cmd speech_sample.wav
Linux: ./speech_run.sh speech_sample.wav

Now check whether speech_sample.wav.txt was created. If not, examine speech_sample_wav.err

Things you will need to fix!

  • Add the full path to the ffmpeg command in speech_submit.py if ffmpeg is not in the PATH.
  • Ensure the Google API is installed for user 'mail'.

Next, send a message with an attached .wav file, then grep the logs for 'speech' to see how it went. The attachment must be in the correct format (a mono wav file).
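
If a message is not converted, it is worth checking whether the attachment really is mono. The following sketch uses Python's standard `wave` module to inspect a file, and builds an ffmpeg command that downmixes to one channel with the `-ac 1` option. The helper names are hypothetical and this is not the command speech_submit.py itself runs.

```python
import wave


def wav_summary(path):
    """Read basic properties of a PCM wav file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "seconds": wav.getnframes() / float(rate) if rate else 0.0,
        }


def is_mono(path):
    """True when the wav file has exactly one audio channel."""
    return wav_summary(path)["channels"] == 1


def mono_convert_command(src, dst, ffmpeg="ffmpeg"):
    """Build (but do not run) an ffmpeg command that downmixes src to mono.

    -y overwrites dst if it exists, -ac 1 sets one audio channel.
    Pass the full path as `ffmpeg` if the command is not on the PATH.
    """
    return [ffmpeg, "-y", "-i", src, "-ac", "1", dst]
```

For example, `is_mono("speech_sample.wav")` reports whether the sample file qualifies, and the list returned by `mono_convert_command("in.wav", "out.wav")` can be passed to `subprocess.run`. The `wave` module reads uncompressed PCM files only and raises `wave.Error` for other encodings.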
