About
Rhasspy was created and is currently maintained by Michael Hansen.
Special thanks to:
Motivation
A typical voice assistant (Alexa, Google Home, etc.) solves a number of important problems:
1. Deciding when to record audio (wake word)
2. Listening for voice commands (command listener)
3. Transcribing command/question (speech to text)
4. Interpreting the speaker's intent from the text (intent recognition)
5. Fulfilling the speaker's intent (intent handling)
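Rhasspy provides offline, private solutions to problems 1-4 using off-the-shelf tools. These tools are:
- Wake word: Mycroft Precise, Pocketsphinx, porcupine, snowboy
- Command listener: webrtcvad
- Speech to text: Kaldi, Pocketsphinx
- Intent recognition: flair, Mycroft Adapt, OpenFST, Rasa NLU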
For problem 5 (fulfilling the speaker's intent), Rhasspy works with external home automation software, such as Home Assistant's built-in automation capability or a Node-RED flow.
For each intent you define, Rhasspy emits a JSON event that can trigger anything Home Assistant can do (toggle switches, call REST services, etc.). This means that Rhasspy will do very little out of the box compared to other voice assistants, but there are also no limits to what can be done.
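As a rough illustration of intent handling on the receiving side, the sketch below parses one JSON event and dispatches it to a handler by intent name. The payload shape (`text`, `intent.name`, `slots`), the `ChangeLightState` intent, and the handler functions are all assumptions made for this example, not a documented Rhasspy schema; in a real setup this role would be played by a Home Assistant automation or a Node-RED flow.

```python
import json

# Hypothetical event payload. The field names here are illustrative
# assumptions, not a guaranteed Rhasspy schema.
EXAMPLE_EVENT = """
{
  "text": "turn on the living room lamp",
  "intent": {"name": "ChangeLightState", "confidence": 1.0},
  "slots": {"name": "living room lamp", "state": "on"}
}
"""


def change_light_state(slots):
    """Hypothetical handler: describe the action instead of performing it."""
    return f"light '{slots['name']}' -> {slots['state']}"


# Map each intent name to the function that fulfils it.
HANDLERS = {"ChangeLightState": change_light_state}


def handle_event(raw_json):
    """Parse a JSON intent event and dispatch it by intent name."""
    event = json.loads(raw_json)
    intent_name = event["intent"]["name"]
    handler = HANDLERS.get(intent_name)
    if handler is None:
        return f"no handler for intent '{intent_name}'"
    return handler(event.get("slots", {}))


if __name__ == "__main__":
    print(handle_event(EXAMPLE_EVENT))
```

Nothing happens for an intent unless a handler is registered for it, which mirrors the trade-off described above: little works out of the box, but any action can be attached to any intent.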
Supporting Tools
The following tools/libraries help to support Rhasspy:
- eSpeak (text to speech)
- flair (intent recognition)
- Flask (web server)
- flite (text to speech)
- fuzzywuzzy (fuzzy string matching)
- Kaldi (speech to text)
- MaryTTS (text to speech)
- Montreal Forced Aligner (acoustic models)
- Mycroft Adapt (intent recognition)
- Mycroft Precise (wake word)
- Phonetisaurus (word pronunciations)
- PicoTTS (text to speech)
- Pocketsphinx (speech to text, wake word)
- porcupine (wake word)
- PyAudio (microphone)
- pyjsgf (JSGF grammar parsing)
- Python 3
- OpenFST (intent recognition)
- Opengrm (language modeling)
- Rasa NLU (intent recognition)
- sphinxtrain (acoustic model tuning)
- snowboy (wake word)
- Sox (WAV conversion)
- Vue.js (web UI)
- webrtcvad (voice activity detection)