
Add possibility to continue a spoken conversation when Assist needs more info. #1

Open
sanderkooger opened this issue Dec 29, 2023 · 3 comments
Comments

@sanderkooger

Hey, big fan of the new integrations in Home Assistant.

TL;DR: When a user chooses an AI agent (OpenAI, Mixtral, etc.) to do things in the house, the agent often has questions. With voice commands, it's currently not possible to continue the conversation. It would be a great improvement to add a function that an AI could trigger when it needs more information.

However, from what I have been reading, this would require a change to the protocol's workflow. Am I correct?

What would it entail to allow the assistant itself to continue the conversation, without the user having to shout the wake word again and restate everything that has been said before?

@jekalmin

@synesthesiam synesthesiam self-assigned this Jan 9, 2024
@synesthesiam synesthesiam added the enhancement New feature or request label Jan 9, 2024
@synesthesiam
Contributor

I've got a start on this, but there is more work to do. I've extended the intent/handle-related events with a context dictionary that will be used to hold conversational context.

Another piece that's missing is something in the response events (e.g., Intent, Handled) indicating that a follow-up response from the other end is required or possible. This could be as simple as a boolean, but I'd like to consider more options before committing.
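
To make this concrete, here is a rough sketch of what an extended `handled` event might look like on the wire. This is purely illustrative: the `context` and `continue_conversation` fields are the proposed additions, not part of the released protocol, and the dataclass below only mimics the style of the Wyoming Python library rather than importing it.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class Handled:
    """Hypothetical extended Wyoming 'handled' event (sketch only)."""

    text: Optional[str] = None
    # Opaque conversational context, round-tripped between peers (proposed).
    context: Dict[str, Any] = field(default_factory=dict)
    # True if the handler expects a follow-up utterance from the user (proposed).
    continue_conversation: bool = False

    def event(self) -> str:
        """Serialize in the JSON-lines style Wyoming uses for event headers."""
        return json.dumps({
            "type": "handled",
            "data": {
                "text": self.text,
                "context": self.context,
                "continue_conversation": self.continue_conversation,
            },
        })

# Example: the handler asks a clarifying question and requests a follow-up.
print(Handled(
    text="Which lights, kitchen or living room?",
    context={"conversation_id": "abc123"},
    continue_conversation=True,
).event())
```

A plain boolean like `continue_conversation` covers the simple case; richer options (e.g., a timeout, or an expected-answer hint) could live alongside it in the same data dictionary without breaking older peers.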

@Shulyaka

The response events would need context as well, because they will need to pass the conversation_id somehow.
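
For example (hypothetical event shapes, with field names that are illustrative rather than part of the released protocol), the satellite would echo the context it received back on the next request, so the server can tie both turns to the same conversation_id:

```python
# Context taken from the previous 'handled' response event:
last_context = {"conversation_id": "abc123"}

# Attached to the follow-up recognition request, so the server can
# resume the same conversation instead of starting a new one:
followup_request = {
    "type": "recognize",
    "data": {
        "text": "the kitchen lights",
        "context": last_context,
    },
}
```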

@sdetweil

The new response event also triggers text-to-speech to inform the user of the new input request. This changes the flow from before, where TTS was the end: the TTS event now needs information so the state manager can return to the ASR stage and the mic can turn audio forwarding on again.
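
As a sketch of that flow change (the state names are made up for illustration, this is not the actual Wyoming satellite code): the state manager previously always returned to wake-word detection after TTS, and a follow-up flag would let it loop straight back to ASR instead:

```python
from enum import Enum, auto

class State(Enum):
    WAKE = auto()    # waiting for the wake word
    ASR = auto()     # forwarding mic audio to speech-to-text
    HANDLE = auto()  # intent recognition / handling
    TTS = auto()     # speaking the response

def next_state(state: State, response_needs_followup: bool = False) -> State:
    """Hypothetical state manager transition.

    Previously TTS always returned to WAKE; with follow-ups, TTS can
    return straight to ASR, skipping the wake word.
    """
    if state is State.WAKE:
        return State.ASR
    if state is State.ASR:
        return State.HANDLE
    if state is State.HANDLE:
        return State.TTS
    # state is TTS: re-open the mic if the response requested more input.
    return State.ASR if response_needs_followup else State.WAKE
```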

Thanks for bringing this up. My intent was to use Wyoming under a smart mirror to replace the on-platform Snowboy with the Docker container (and that dragged in hotword detection, VAD, and ASR...).

I had built in conversational support for Alexa and Google Assistant plugins (or anything), but now that Amazon has killed software-only Alexas, I haven't used it much anymore.
