8) Speech to text

A simple curl request is enough to use Google STT (here the speech is assumed to be in French, fr-FR):

curl -H "Content-Type: audio/x-flac; rate=16000" "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=fr-FR" -F myfile="@msg0001.flac" -k -o text.txt

You might have to convert the sound to mono at 16 kHz first (this command also keeps only the first 5 seconds):
ffmpeg -i msg0001.wav -ac 1 -ar 16000 -ss 0:0:0 -t 0:0:5 msg0001-o.wav
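
If the conversion has to be driven from a script, the same ffmpeg invocation can be assembled as an argument list and run with subprocess. This is only a minimal sketch: the function names build_ffmpeg_args and convert, and their parameters, are made up for illustration and are not part of the original setup.

```python
import subprocess

def build_ffmpeg_args(src, dst, rate=16000, channels=1, seconds=5):
    """Return the ffmpeg argument list for the conversion shown above.

    Hypothetical helper: the name and the parameters are assumptions.
    """
    return [
        "ffmpeg",
        "-i", src,              # input file
        "-ac", str(channels),   # number of audio channels (1 = mono)
        "-ar", str(rate),       # sampling rate in Hz
        "-ss", "0",             # start at the beginning of the file
        "-t", str(seconds),     # keep only this many seconds
        dst,                    # output file, format guessed from the extension
    ]

def convert(src, dst):
    # check_call raises CalledProcessError if ffmpeg exits with a non-zero status
    subprocess.check_call(build_ffmpeg_args(src, dst))
```

For example, convert("msg0001.wav", "msg0001-o.wav") runs the same conversion as the command line above. Since ffmpeg picks the output format from the file extension, a destination name ending in .flac should yield a FLAC file, which is the format the request declares.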

The same request in Python:

#!/usr/bin/env python2
# -*- coding: utf-8 -*-

import httplib
import json
import sys

def speech_to_text(audio):
    # POST the raw FLAC bytes to the recognizer (16 kHz audio, French)
    conn = httplib.HTTPSConnection('www.google.com')
    conn.request("POST", '/speech-api/v1/recognize?xjerr=1&client=chromium&lang=fr', audio, {"Content-type": "audio/x-flac; rate=16000"})
    response = conn.getresponse()
    data = response.read()
    conn.close()
    # The JSON reply lists candidate transcriptions; keep the first one
    jsdata = json.loads(data)
    return jsdata["hypotheses"][0]["utterance"]

if __name__ == "__main__":
    if len(sys.argv) != 2 or "--help" in sys.argv:
        print "Usage: stt.py <flac-audio-file>"
        sys.exit(1)
    # FLAC is binary data, so read the file in binary mode
    with open(sys.argv[1], "rb") as f:
        speech = f.read()
    text = speech_to_text(speech)
    print text
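
The script above assumes the reply always contains at least one hypothesis. When nothing is recognized, the lookup ["hypotheses"][0] would fail, so a more defensive parser may be useful. The following is a sketch only: the helper name first_utterance is hypothetical, the sample replies in the usage note are invented for illustration, and the only fields relied on are the "hypotheses" and "utterance" keys already used by the script.

```python
import json

def first_utterance(raw):
    """Return the first recognized utterance, or None if there is none.

    Hypothetical helper: it relies only on the "hypotheses" and
    "utterance" keys used by the script above.
    """
    try:
        data = json.loads(raw)
    except ValueError:
        # The body was not valid JSON (for example an HTML error page)
        return None
    if not isinstance(data, dict):
        return None
    hypotheses = data.get("hypotheses") or []
    if not hypotheses:
        # Nothing was recognized
        return None
    return hypotheses[0].get("utterance")
```

With an invented reply such as '{"hypotheses": [{"utterance": "bonjour"}]}' this returns the utterance string, while an empty hypotheses list or a non-JSON body returns None instead of raising an exception.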