Intent Recognition
After your voice command has been transcribed by the speech to text system, the next step is to recognize your intent. The end result is a JSON event with information about the intent.
Available intent recognition systems are:
The following table summarizes the trade-offs of using each intent recognizer:
System | Ideal Sentence count | Training Speed | Recognition Speed | Flexibility |
---|---|---|---|---|
Fsticuffs | 1M+ | very fast | very fast | ignores unknown words |
Fuzzywuzzy | 100-1K | fast | very fast | fuzzy string matching |
Snips NLU | 1K-100K | moderate | very fast | handles unseen words/entities |
RasaNLU | 1K-100K | very slow | moderate | handles unseen words |
Mycroft Adapt | 100-1K | moderate | fast | ignores unknown words |
Flair | 1K-100K | very slow | moderate | handles unseen words |
MQTT/Hermes
Rhasspy receives intent recognition requests on the hermes/nlu/query
topic. Successful recognitions are published to hermes/intent/<intentName>
, and unsuccessful recognitions to hermes/nlu/intentNotRecognized
The format of these messages adheres to the Hermes protocol.
You can react to these intent recognitions in your own programs, for example using the rhasspy-hermes-app library.
Fsticuffs
Uses the rhasspy-nlu library to recognize only those sentences that Rhasspy was trained on. While less flexible than the other intent recognizers, fsticuffs
can be trained and perform recognition over millions of sentences in milliseconds. If you only plan to recognize voice commands from your training set (and not unseen ones via text chat), fsticuffs
is the best choice.
Add to your profile:
"intent": {
"system": "fsticuffs",
"fsticuffs": {
"intent_graph": "intent.json",
"ignore_unknown_words": true,
"fuzzy": true
}
}
By default, fuzzy mathing is enabled (fuzzy
is true). This allows fsticuffs
to be less strict when matching text, skipping over any words in the profile's stop_words.txt
, and handling repeated words gracefully. Words must still appear in the correct order according to sentences.ini
, but additional words will not cause a recognition failure.
When ignore_unknown_words
is true, any word outside of sentences.ini
is silently ignored. This allows a lot more sentences to be accepted, but may cause unexpected results when used with arbitrary input from text chat.
Implemented by rhasspy-nlu-hermes
Fuzzywuzzy
Finds the closest matching intent by using the rapidfuzz library between the text and all of the training sentences you provided. Works best when you have a small number of sentences (dozens to hundreds) and need some resiliency to spelling errors (i.e., from text chat).
Add to your profile:
"intent": {
"system": "fuzzywuzzy",
"fuzzywuzzy": {
"examples_json": "intent_examples.json"
}
}
Implemented by rhasspy-fuzzywuzzy-hermes
Snips NLU
Uses Snips NLU to flexibly recognize sentences in the following languages: de, en, es, fr, it, ja, ko, pt_br, pt_pt, zh.
Add to your profile:
"intent": {
"system": "snips",
"snips": {
"language": "",
"engine_dir": "snips/engine",
"dataset_file": "snips/dataset.yaml"
}
}
If intent.snips.language
is not specified, the profile's language
is used. The engine_dir
and dataset_file
properties control where in your profile directory the generated engine and YAML dataset files are stored during training.
Number ranges are automatically converted into snips/number
entities. All tags are considered Snips slots. If a Rhasspy slots list is contained within the tag, the name of the Rhasspy $slot
will become the Snips entity name. Otherwise, the tag name is used and shared across intents.
Implemented by rhasspy-snips-nlu-hermes
RasaNLU
Recognizes intents remotely using a Rasa NLU server. You must install a Rasa NLU server somewhere that Rhasspy can access. Works well when you have a large number of sentences (thousands to hundreds of thousands) and need to handle sentences and words not seen during training. This needs Rasa 1.0 or higher.
Add to your profile:
"intent": {
"system": "rasa",
"rasa": {
"examples_markdown": "intent_examples.md",
"project_name": "rhasspy",
"url": "http://localhost:5005/"
}
}
Set intent.rasa.config_yaml
to the name of a file in your profile directory if you want to use a custom configuration during training. If unset, the default configuration is:
language: "en"
pipeline: "pretrained_embeddings_spacy"
where "en" is replaced with your profile's language or the value of intent.rasa.language
.
Installing Rasa NLU
If you have Docker, Rasa NLU can be run with (only on the Linux/amd64 architecture):
docker run -it -v "$(pwd):/app" -p 5005:5005 rasa/rasa:latest-spacy-en run --enable-api
Your Rasa NLU server should now be accessible at http://localhost:5005. Models will be saved in the models
directory (relative to your current directory).
Implemented by rhasspy-rasa-nlu-hermes
Mycroft Adapt
Not supported yet in 2.5!
Recognizes intents using Mycroft Adapt. Works best when you have a medium number of sentences (hundreds to thousands) and need to be able to recognize sentences not seen during training (no new words, though).
Add to your profile:
"intent": {
"system": "adapt",
"adapt": {
"stop_words": "stop_words.txt"
}
}
The intent.adapt.stop_words
text file contains words that should be ignored (i.e., cannot be "required" or "optional").
Flair
Not supported yet in 2.5!
Recognizes intents using the flair NLP framework. Works best when you have a large number of sentences (thousands to hundreds of thousands) and need to handle sentences and words not seen during training.
Add to your profile:
"intent": {
"system": "flair",
"flair": {
"data_dir": "flair_data",
"max_epochs": 25,
"do_sampling": true,
"num_samples": 10000
}
}
By default, the flair recognizer will generate 10,000 random sentences (num_samples
) from each intent in your sentences.ini file. If you set do_sampling
to false
, Rhasspy will generate all possible sentences and use them as training data. This will produce the most accurate models, but may take a long time depending on the complexity of your grammars.
A flair TextClassifier
will be trained to classify unseen sentences by intent, and a SequenceTagger
will be trained for each intent that has at least one tag. During recognition, sentences are first classified by intent and then run through the appropriate SequenceTagger
model to determine slots/entities.
Remote HTTP Server
Uses a remote Rhasppy server to do intent recognition. POSTs the text to an HTTP endpoint and receives an intent as JSON. An empty intent.name
property of the returned JSON object indicates a recognition failure.
Add to your profile:
"intent": {
"system": "remote",
"remote": {
"url": "http://my-server:12101/api/text-to-intent"
}
}
If you want to also POST to an endpoint during training, add to your profile:
"training": {
"system": "auto",
"intent": {
"remote": {
"url": "http://my-server/intent-training-endpoint"
}
}
}
If training.intent.remote.url
is set, Rhasspy will POST the intent graph generated by rhasspy-nlu to your endpoint as JSON. No response is expected, though an HTTP error code indicates that training has failed.
Implemented by rhasspy-remote-http-hermes
Home Assistant Conversation
Not supported yet in 2.5!
Sends transcriptions from speech to text to Home Assistant's conversation API. If the response contains speech, Rhasspy can optionally speak it.
Add to your profile:
"intent": {
"system": "conversation",
"conversation": {
"handle_speech": true
}
}
When handle_speech
is true
, Rhasspy will forward the returned speech to your text to speech system.
The settings from your profile's home_assistant
section are automatically used (URL, access token, etc.).
Because Home Assistant will already handle your intent (probably using an intent script), Rhasspy will always generate an empty intent with this recognizer.
Command
Recognizes intents from text using a custom external program. Your program should return a JSON object that describes the recognized intent; something like:
{
"intent": {
"name": "ChangeLightColor",
"confidence": 1.0
},
"entities": [
{ "entity": "name",
"value": "bedroom light" },
{ "entity": "color",
"value": "red" }
],
"text": "set the bedroom light to red"
}
An empty intent.name
property indicates a recognition failure.
Add to your profile:
"intent": {
"system": "command",
"command": {
"program": "/path/to/program",
"arguments": []
}
}
If you want to also call an external program during training, add to your profile:
"training": {
"system": "auto",
"intent": {
"command": {
"program": "/path/to/training/program",
"arguments": []
}
}
}
If training.intent.command.program
is set, Rhasspy will call your program with the intent graph generated by rhasspy-nlu provided as JSON on standard input. No response is expected, though a non-zero exit code indicates a training failure.
The following environment variables are available to your program:
$RHASSPY_BASE_DIR
- path to the directory where Rhasspy is running from$RHASSPY_PROFILE
- name of the current profile (e.g., "en")$RHASSPY_PROFILE_DIR
- directory of the current profile (whereprofile.json
is)
See text2intent.sh for an example program.
Implemented by rhasspy-remote-http-hermes
Dummy
Disables intent recognition.
Add to your profile:
"intent": {
"system": "dummy"
}