You can use the RAI Human-Robot Interaction (HRI) package to converse with your robots. It lets you chat with a robot, or give it tasks and receive feedback and reports. The following options are available:
- Voice communication using ASR (OpenAI Whisper) and TTS (OpenTTS or ElevenLabs) models
- Text communication using Streamlit
Voice communication can be unreliable in noisy environments; in that case, use the text channel instead.
The general architecture follows the diagram above. Text is captured from the input source, transported to the HMI, processed according to the given tools and the robot's rules, and then sent to the output source.
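The flow described above can be sketched in a few lines of Python. This is a minimal illustration only, not the RAI implementation: the `HMI` class, `run_once`, the rule and tool callables, and the `status` tool are all hypothetical names introduced here, and the "chat" fallback stands in for whatever language model the real system would call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# NOTE: every name in this sketch is hypothetical; none of it is RAI's API.

# A rule inspects the incoming text and returns a refusal message,
# or None when the request is allowed.
Rule = Callable[[str], Optional[str]]
# A tool receives the text after the command word and returns a reply.
Tool = Callable[[str], str]


@dataclass
class HMI:
    rules: List[Rule] = field(default_factory=list)
    tools: Dict[str, Tool] = field(default_factory=dict)

    def process(self, text: str) -> str:
        """Apply the rules first, then try a tool, then fall back to chat."""
        text = text.strip()
        for rule in self.rules:
            refusal = rule(text)
            if refusal is not None:
                return refusal
        command, _, argument = text.partition(" ")
        tool = self.tools.get(command.lower())
        if tool is not None:
            return tool(argument)
        # Placeholder for a real conversational reply.
        return f"(chat) You said: {text}"


def run_once(read_input: Callable[[], str],
             hmi: HMI,
             write_output: Callable[[str], None]) -> str:
    """Input source -> HMI -> output source, for a single message."""
    reply = hmi.process(read_input())
    write_output(reply)
    return reply


def no_stairs(text: str) -> Optional[str]:
    # Example rule: refuse any request that mentions stairs.
    return "I am not allowed to use stairs." if "stairs" in text.lower() else None


hmi = HMI(rules=[no_stairs], tools={"status": lambda _arg: "Robot is idle."})
outputs: List[str] = []
run_once(lambda: "status", hmi, outputs.append)
```

Because the input and output sources are plain callables, the same loop could in principle be fed by a speech recognizer and a speech synthesizer, or by a text box and a chat window, without changing the processing step.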
In the voice interface, the input source is a microphone and the output source is a speaker. Speech input is transcribed using either the OpenAI Whisper model (cloud-based, paid) or a local model. Speech output is produced using either OpenTTS (Apache-2.0, though licensing depends on the model used) or ElevenLabs (cloud-based, paid).
The text interface is implemented directly in RAI_HMI using Streamlit. The GUI closely follows standard chat-like conversations, with built-in support for tool integration.