warning

This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for your specific use case. Want to contribute? Check out the contributing tutorial.

Integrating `openai-edge-tts` 🗣️ with Open WebUI

What is `openai-edge-tts`?

OpenAI Edge TTS is a text-to-speech API that mimics the OpenAI API endpoint, allowing for a direct substitute in scenarios where you can define the endpoint URL, like with Open WebUI.

It uses the edge-tts package, which emulates the requests that the Edge browser's free "Read Aloud" feature sends to Microsoft / Azure, and so returns high-quality text-to-speech at no cost.

Sample the voices


Requirements

Docker installed on your system
Open WebUI running

⚡️ Quick start

The simplest way to get started, with nothing to configure, is to run the command below:

docker run -d -p 5050:5050 travisvn/openai-edge-tts:latest

This runs the service on port 5050 with all the default settings.

Setting up Open WebUI to use `openai-edge-tts`

Open the Admin Panel and go to Settings -> Audio
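Set your TTS Settings to match the screenshot below: the engine is OpenAI, the API Base URL points at this service and ends with /v1 (for example http://localhost:5050/v1), and the API key matches the key configured for the service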
Note: you can specify the TTS Voice here

Screenshot of Open WebUI Admin Settings for Audio adding the correct endpoints for this project

info

The default API key is the string your_api_key_here. You do not have to change that value if you do not need the added security.

And that's it! You can stop here.

Please ⭐️ star the repo on GitHub if you find OpenAI Edge TTS useful.

Running with Python

Usage details

Endpoint: `/v1/audio/speech` (aliased with `/audio/speech`)

Generates audio from the input text. Available parameters:

Required Parameter:

input (string): The text to be converted to audio (up to 4096 characters).

Optional Parameters:

model (string): Set to "tts-1" or "tts-1-hd" (default: "tts-1").
voice (string): One of the OpenAI-compatible voices (alloy, echo, fable, onyx, nova, shimmer) or any valid edge-tts voice (default: "en-US-AvaNeural").
response_format (string): Audio format. Options: mp3, opus, aac, flac, wav, pcm (default: mp3).
speed (number): Playback speed (0.25 to 4.0). Default is 1.0.
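The parameter list above can be sketched as a small Python helper that builds the JSON body and rejects values outside the documented ranges. The function name `build_speech_payload` is hypothetical and not part of openai-edge-tts; the checks mirror only the limits listed above, and the server may validate differently.

```python
# Hypothetical client-side helper; names are illustrative, not part of openai-edge-tts.
ALLOWED_MODELS = {"tts-1", "tts-1-hd"}
ALLOWED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096


def build_speech_payload(text, voice="en-US-AvaNeural", model="tts-1",
                         response_format="mp3", speed=1.0):
    """Return a JSON-serializable body for POST /v1/audio/speech."""
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input is limited to {MAX_INPUT_CHARS} characters")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {sorted(ALLOWED_MODELS)}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    # The voice is passed through unchecked: it may be an OpenAI-style name
    # or any valid edge-tts voice, which only the server can verify.
    return {
        "input": text,
        "model": model,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```

Checking on the client side is optional; it simply surfaces an out-of-range value before a request is sent.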

tip

You can browse available voices and listen to sample previews at tts.travisvn.com

Example request with curl and saving the output to an mp3 file:

curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key_here" \
  -d '{
    "input": "Hello, I am your AI assistant! Just let me know how I can help bring your ideas to life.",
    "voice": "echo",
    "response_format": "mp3",
    "speed": 1.0
  }' \
  --output speech.mp3

Or, to be in line with the OpenAI API endpoint parameters:

curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key_here" \
  -d '{
    "model": "tts-1",
    "input": "Hello, I am your AI assistant! Just let me know how I can help bring your ideas to life.",
    "voice": "alloy"
  }' \
  --output speech.mp3

And an example of a language other than English:

curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key_here" \
  -d '{
    "model": "tts-1",
    "input": "じゃあ、行く。電車の時間、調べておくよ。",
    "voice": "ja-JP-KeitaNeural"
  }' \
  --output speech.mp3
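For callers who prefer Python over curl, the same requests can be sketched with the standard library alone. The helper names `make_speech_request` and `save_speech` are hypothetical; the URL, headers, and body fields are taken from the curl examples above, and the service is assumed to be listening on localhost:5050 with the default API key.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5050/v1"  # same host and port as the curl examples
API_KEY = "your_api_key_here"          # the default key


def make_speech_request(text, voice="alloy", model="tts-1",
                        base_url=BASE_URL, api_key=API_KEY):
    """Build (but do not send) the POST request for /audio/speech."""
    body = json.dumps({"model": model, "input": text, "voice": voice})
    return urllib.request.Request(
        f"{base_url}/audio/speech",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def save_speech(text, path="speech.mp3", **kwargs):
    """Send the request and write the returned audio bytes to `path`."""
    request = make_speech_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(path, "wb") as out:
        out.write(audio)
    return path


# Example call (requires the service to be running):
# save_speech("じゃあ、行く。電車の時間、調べておくよ。", voice="ja-JP-KeitaNeural")
```

Building the request separately from sending it makes it possible to inspect the URL and headers without a running server.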

Additional Endpoints

POST/GET /v1/models: Lists available TTS models.
POST/GET /v1/voices: Lists edge-tts voices for a given language / locale.
POST/GET /v1/voices/all: Lists all edge-tts voices, with language support information.

info

The /v1 is now optional.

Additionally, there are endpoints for Azure AI Speech and ElevenLabs for potential future support if custom API endpoints are allowed for these options in Open WebUI.

These can be disabled by setting the environment variable EXPAND_API=False.
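As a rough sketch, the listing endpoints and the optional /v1 prefix described above could be exercised from Python as follows. The helper names are hypothetical, the Authorization header is included on the assumption that these endpoints accept the same key as the speech endpoint, and the shape of the JSON response is not documented here, so it is returned as-is.

```python
import json
import urllib.request


def listing_request(path, base_url="http://localhost:5050",
                    api_key="your_api_key_here", use_v1_prefix=True):
    """Build a GET request for a listing endpoint such as 'models' or 'voices/all'."""
    prefix = "/v1" if use_v1_prefix else ""  # the /v1 prefix is optional
    url = f"{base_url.rstrip('/')}{prefix}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_json(request, timeout=10):
    """Send the request and return the decoded JSON body unchanged."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (requires the service to be running):
# models = fetch_json(listing_request("models"))
```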

🐳 Quick Config for Docker

You can set the environment variables in the command used to run the project:

docker run -d -p 5050:5050 \
  -e API_KEY=your_api_key_here \
  -e PORT=5050 \
  -e DEFAULT_VOICE=en-US-AvaNeural \
  -e DEFAULT_RESPONSE_FORMAT=mp3 \
  -e DEFAULT_SPEED=1.0 \
  -e DEFAULT_LANGUAGE=en-US \
  -e REQUIRE_API_KEY=True \
  -e REMOVE_FILTER=False \
  -e EXPAND_API=True \
  travisvn/openai-edge-tts:latest
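When the same command has to be generated for several setups, it can be assembled from a dictionary of overrides. This is only a sketch: `docker_run_command` is a hypothetical helper, the defaults are copied from the command above, and publishing the host port equal to PORT assumes the container listens on whatever PORT is set to.

```python
import shlex

# Values copied from the docker run command above.
DEFAULT_ENV = {
    "API_KEY": "your_api_key_here",
    "PORT": "5050",
    "DEFAULT_VOICE": "en-US-AvaNeural",
    "DEFAULT_RESPONSE_FORMAT": "mp3",
    "DEFAULT_SPEED": "1.0",
    "DEFAULT_LANGUAGE": "en-US",
    "REQUIRE_API_KEY": "True",
    "REMOVE_FILTER": "False",
    "EXPAND_API": "True",
}


def docker_run_command(overrides=None, image="travisvn/openai-edge-tts:latest"):
    """Return the docker run argument list with the given overrides applied."""
    env = {**DEFAULT_ENV, **(overrides or {})}
    port = env["PORT"]
    # Assumption: the container listens on PORT, so publish the same port.
    args = ["docker", "run", "-d", "-p", f"{port}:{port}"]
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


# Print a copy-pasteable command with one default changed.
print(shlex.join(docker_run_command({"DEFAULT_VOICE": "ja-JP-KeitaNeural"})))
```

Returning a list rather than a string keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.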

note

Markdown in the input text is now passed through a filter before synthesis, for enhanced readability and support.

You can disable the filter by setting the environment variable REMOVE_FILTER=True.

Additional Resources

For more information on openai-edge-tts, visit the GitHub repo.

For direct support, visit the Voice AI & TTS Discord.

🎙️ Voice Samples

Play voice samples and see all available Edge TTS voices

Troubleshooting

Connection Issues

"localhost" Not Working from Docker

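If Open WebUI runs in Docker, it may be unable to reach the TTS service at localhost:5050, because inside a container localhost refers to the container itself rather than to the host machine.

Solutions: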

Use host.docker.internal:5050 instead of localhost:5050 (Docker Desktop on Windows/Mac)
On Linux, use the host's IP address, or add --network host to your Docker run command
If both services are in the same Docker Compose file, use the service name as the hostname: http://openai-edge-tts:5050/v1

Example Docker Compose for both services on the same network:

services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - AUDIO_TTS_ENGINE=openai
      - AUDIO_TTS_OPENAI_API_BASE_URL=http://openai-edge-tts:5050/v1
      - AUDIO_TTS_OPENAI_API_KEY=your_api_key_here
    networks:
      - webui-network

  openai-edge-tts:
    image: travisvn/openai-edge-tts:latest
    ports:
      - "5050:5050"
    environment:
      - API_KEY=your_api_key_here
    networks:
      - webui-network

networks:
  webui-network:
    driver: bridge

Testing the TTS Service

Verify the TTS service is working independently:

curl -X POST http://localhost:5050/v1/audio/speech \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer your_api_key_here" \
  -d '{"input": "Test message", "voice": "alloy"}' \
  --output test.mp3

If this works but Open WebUI still can't connect, the issue is network-related between containers.
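To narrow down a network problem, a plain TCP check run from inside the Open WebUI container shows which hostname actually reaches the service. The sketch below uses only the standard library; the function names are hypothetical, the candidate hostnames are the ones suggested above, and it assumes a Python interpreter is available wherever it is run.

```python
import socket

# Hostnames suggested in the solutions above; adjust to your setup.
CANDIDATE_HOSTS = ["localhost", "host.docker.internal", "openai-edge-tts"]


def can_connect(host, port=5050, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hostnames.
        return False


def reachable_hosts(hosts=CANDIDATE_HOSTS, port=5050, timeout=2.0):
    """Return the subset of hosts that accept a connection on the given port."""
    return [host for host in hosts if can_connect(host, port, timeout)]
```

A successful connection only shows that the port is open; the curl test above is still needed to confirm that the service returns audio.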

No Audio Output in Open WebUI

Check that the API Base URL ends with /v1
Verify the API key matches between both services (or disable the requirement by setting REQUIRE_API_KEY=False)
Check Open WebUI container logs: docker logs open-webui
Check openai-edge-tts logs: docker logs openai-edge-tts (or your container name)
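The first two checks in the list above can also be written down as a small function, which is handy when comparing settings across environments. This is an illustrative sketch: `find_tts_config_problems` and its parameters are hypothetical, and it inspects only values you pass in, not the running services.

```python
from urllib.parse import urlparse


def find_tts_config_problems(base_url, webui_api_key, service_api_key,
                             require_api_key=True, webui_in_docker=False):
    """Return a list of human-readable problems found in the TTS settings."""
    problems = []
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        problems.append("API Base URL is not a valid http(s) URL")
    if not base_url.rstrip("/").endswith("/v1"):
        problems.append("API Base URL does not end with /v1")
    if webui_in_docker and parsed.hostname in ("localhost", "127.0.0.1"):
        problems.append("localhost points at the Open WebUI container itself")
    if require_api_key and webui_api_key != service_api_key:
        problems.append("API keys differ between Open WebUI and openai-edge-tts")
    return problems
```

An empty list means none of these particular checks failed; the container logs remain the place to look for anything else.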

For more troubleshooting tips, see the Audio Troubleshooting Guide.
