
Home Assistant Voice Preview Setup

This repo guides you through a local, Docker-based Home Assistant (HA) setup with Voice Assistant support.

Demo video: ha-vp-llm_v2.mp4

Initial setup (Ubuntu)

git clone https://github.com/sskorol/home-assistant-voice.git && cd home-assistant-voice
docker compose pull
  • You can comment out or remove the mosquitto service if you don’t use MQTT.
  • If you plan to use Ollama locally, you may want to explicitly install it and map the .ollama folder to the container.
  • Check supported Whisper/Piper environment vars to pull the correct models based on your language preference (EN is the default).
  • Piper voices can be tested here.
  • GPU is enabled by default for Whisper/Piper/Ollama images. You can remove deploy blocks if you plan to run on the CPU. Note that this guide doesn’t cover the CUDA setup. Follow the official NVIDIA tutorials for that.
docker compose up -d
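For orientation, the Whisper/Piper services in a setup like this follow the usual Wyoming pattern in Compose. The fragment below is a hedged sketch, not the repo’s actual file — the image tags, model/voice names, and GPU reservation are assumptions; check the repo’s docker-compose.yml for the authoritative values:

```yaml
# Illustrative sketch only — verify image names, models and ports
# against the docker-compose.yml shipped in this repo.
services:
  whisper:
    image: rhasspy/wyoming-whisper   # assumed image
    command: --model small-int8 --language en
    ports:
      - "10300:10300"
    deploy:                          # remove this block to run on the CPU
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
  piper:
    image: rhasspy/wyoming-piper     # assumed image
    command: --voice en_US-lessac-medium
    ports:
      - "10200:10200"
```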

Voice Assistant PE

Follow this guide to connect your hardware. Note that if you are as unlucky as me and don’t have a stable BLE adapter on your PC, you’ll likely get endless Bluetooth errors in your HA instance, preventing you from establishing the initial connection with Voice Assistant. Rather than waste time on that, use your cell phone: connect to the HA instance running on your PC and set up Voice Assistant via mobile BLE.

Whisper / Piper / Ollama

  • Whisper and Piper can be added to HA via the Wyoming Protocol. Since HA shares the network with your host OS, the ASR / TTS services are accessible on localhost ports 10300 and 10200, respectively. Screenshot

  • Ollama will also be accessible at http://localhost:11434. If you’ve already downloaded models locally, you’ll see them immediately, since the model folder is mapped into the container. Screenshot
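Under the hood, the Wyoming protocol is newline-delimited JSON over TCP, which is why adding Whisper or Piper to HA only requires a host and port. The sketch below illustrates the describe/info handshake HA performs when you add a Wyoming service; the stub server and its payload are illustrative stand-ins, not the real Whisper service:

```python
import json
import socket
import threading

# Stub standing in for a Wyoming service (e.g. Whisper on port 10300).
# Wyoming events are newline-delimited JSON headers; a "describe" request
# is answered with an "info" event. The payload below is made up.
def stub_wyoming_server(server_sock):
    conn, _ = server_sock.accept()
    with conn, conn.makefile("rwb") as stream:
        request = json.loads(stream.readline())
        if request["type"] == "describe":
            info = {"type": "info", "data": {"asr": [{"name": "stub-whisper"}]}}
            stream.write((json.dumps(info) + "\n").encode())
            stream.flush()

server = socket.create_server(("127.0.0.1", 0))  # ephemeral port for the demo
port = server.getsockname()[1]
threading.Thread(target=stub_wyoming_server, args=(server,), daemon=True).start()

# Client side: the same handshake HA performs against a Wyoming endpoint.
with socket.create_connection(("127.0.0.1", port)) as sock:
    with sock.makefile("rwb") as stream:
        stream.write(b'{"type": "describe"}\n')
        stream.flush()
        reply = json.loads(stream.readline())

print(reply["type"])  # -> info
```

If the real services are up, pointing the same client at localhost:10300 or localhost:10200 should yield a genuine info event describing the loaded models.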

I recommend starting with Llama 3.2. Don’t forget to tune the prompt and enable assistance:

Screenshot

Note that if you want to add a new model, you need to add it as a separate service.

  • Now you can see newly added services in the Voice Assistant config:

Screenshot

  • Expose your smart home devices to the voice assistant (an alias serves as a short voice command for accessing a device):

Screenshot

  • Ensure your ESPHome is linked with Voice Assist PE and connected to the newly configured virtual Voice Assistant in HA:

Screenshot

  • Test ASR / TTS and chatting capabilities via web UI:

Screenshot

Now you are ready to interact with your devices via Voice PE hardware and have casual conversations with Llama 3.2. Well, not really…

Known Issues

You’ll likely face TTS failures on long LLM responses almost immediately. Several similar bugs have been raised across HA repos about this.

One workaround is patching the AudioReader timeout in the VA PE firmware code. Note that building the firmware requires your Wi-Fi credentials in secrets.yaml. You should also update the repo reference so that your changes are actually applied; otherwise, the code will be pulled from the original dev branch.

Fork https://github.com/esphome/home-assistant-voice-pe.git.

git clone https://github.com/YOUR_USERNAME/home-assistant-voice-pe.git && cd home-assistant-voice-pe

Update secrets.yaml:

nano secrets.yaml
# Paste your WiFi creds:
wifi_ssid: "..."
wifi_password: "..."
  • Apply the above patch.
  • Push changes to your fork.
  • Update external_components URL to point to your fork.
  • Build and flash the updated firmware:
docker run --rm -v "$(pwd)":/config -it ghcr.io/esphome/esphome run --device IP_OF_YOUR_VA_PE home-assistant-voice.yaml
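The external_components update from the steps above might look like the fragment below in home-assistant-voice.yaml. The source layout is standard ESPHome; the ref value is an assumption — point it at whichever branch of your fork holds the patch:

```yaml
external_components:
  - source:
      type: git
      url: https://github.com/YOUR_USERNAME/home-assistant-voice-pe
      ref: dev          # assumed branch; use the branch with your patch
    refresh: 0s         # always re-fetch so freshly pushed changes are picked up
```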

Also note that changes to containers’ environment variables (e.g., new models or voices) require a complete stop and recreate: run docker compose down followed by docker compose up -d. New values won’t be picked up by a plain docker compose restart.

Original repository: https://github.com/sskorol/home-assistant-voice
