A simple curl request is enough to use Google STT (here the speech is assumed to be in French, fr-FR):

    curl -H "Content-Type: audio/x-flac; rate=16000" "https://www.google.com/speech-api/v1/recognize?xjerr=1&client=chromium&lang=fr-FR" -F myfile="@msg0001.flac" -k -o text.txt

You might have to convert the sound to mono/16 kHz first (the command below also keeps only the first 5 seconds):

    ffmpeg -i msg0001.wav -ac 1 -ar 16000 -ss 0:0:0 -t 0:0:5 msg0001-o.wav

The same in Python:

    #!/usr/bin/env python2
    # -*- coding: utf-8 -*-
    import httplib
    import json
    import sys


    def speech_to_text(audio):
        # POST the raw FLAC bytes; the sample rate goes in the Content-type header.
        conn = httplib.HTTPSConnection('www.google.com')
        conn.request("POST",
                     '/speech-api/v1/recognize?xjerr=1&client=chromium&lang=fr',
                     audio,
                     {"Content-type": "audio/x-flac; rate=16000"})
        response = conn.getresponse()
        data = response.read()
        jsdata = json.loads(data)
        # Return the text of the first hypothesis.
        return jsdata["hypotheses"][0]["utterance"]


    if __name__ == "__main__":
        if len(sys.argv) != 2 or "--help" in sys.argv:
            print "Usage: stt.py <flac-audio-file>"
            sys.exit(-1)
        else:
            # Audio is binary data, so open the file in binary mode.
            with open(sys.argv[1], "rb") as f:
                speech = f.read()
            text = speech_to_text(speech)
            print text
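The Python script indexes `hypotheses[0]` directly, so it would presumably fail with an `IndexError` whenever the service returns no hypothesis. The following is a minimal sketch of two small helpers that could sit around it: one builds the ffmpeg argument list shown earlier, the other reads the response defensively. The function names `ffmpeg_command` and `best_utterance` are hypothetical, and the only response fields assumed are the `hypotheses` and `utterance` keys already used above. It uses only the standard library and should run under both Python 2 and Python 3.

```python
import json


def ffmpeg_command(src, dst, seconds=5):
    """Build the ffmpeg arguments for mono, 16 kHz, first `seconds` seconds.

    Hypothetical helper: it mirrors the ffmpeg command line shown earlier.
    """
    return ["ffmpeg", "-i", src, "-ac", "1", "-ar", "16000",
            "-ss", "0", "-t", str(seconds), dst]


def best_utterance(raw):
    """Return the first hypothesis' utterance, or None if nothing usable came back.

    Assumes the JSON layout used by the script above:
    {"hypotheses": [{"utterance": "..."}]}
    """
    try:
        data = json.loads(raw)
    except ValueError:
        # Empty or non-JSON body.
        return None
    if not isinstance(data, dict):
        return None
    hypotheses = data.get("hypotheses") or []
    if not hypotheses:
        # The service answered but recognised nothing.
        return None
    return hypotheses[0].get("utterance")
```

To run the conversion, the list can be passed to `subprocess.call(ffmpeg_command("msg0001.wav", "msg0001-o.wav"))`. In `speech_to_text`, the last line could then become `return best_utterance(data)`, leaving the caller to decide what to do with `None`.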