Qwen3 TTS

A voice studio for presets, cloning, and design.

Qwen3 TTS supports preset voices, multilingual generation, reference voice cloning, and prompt-based voice design.

Three ways to create speech

Multilingual control

Use auto language detection or choose a supported language explicitly.

Preset voices

Start quickly with built-in Qwen3 voices for different use cases.

Voice design

Create a voice direction from descriptive text when a preset is not enough.

Preset voices

Generate speech with a selected Qwen3 voice.

Pick a language and voice, then add optional style direction for tone and delivery.

Text *

Credits estimated: 100 characters = 1 credit. Text limit: 3000 characters.

Language

Voice *

Style instruction
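
The credit rule shown under the text field can be checked before submitting. The following is a minimal sketch, not the page's actual billing logic: the function name is hypothetical, and both rounding up to the next whole credit and counting characters with Python's `len` are assumptions.

```python
import math

CHARS_PER_CREDIT = 100  # from the page: 100 characters = 1 credit
MAX_CHARS = 3000        # from the page's character counter


def estimate_credits(text: str) -> int:
    """Estimate credits for a script.

    Assumption: partial blocks of 100 characters round up to one credit.
    The page does not state its rounding rule.
    """
    length = len(text)
    if length > MAX_CHARS:
        raise ValueError(f"Text is {length} characters; the limit is {MAX_CHARS}.")
    return math.ceil(length / CHARS_PER_CREDIT)
```

Under these assumptions, a 250-character script would be estimated at 3 credits. Compare the result with the estimate shown on the page before relying on it.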

Use cases

Built for production voice workflows.

Use these pages as a focused audio workspace for scripts, product media, education content, and repeatable brand narration.

Multilingual content

Generate speech for videos, apps, courses, and support content across supported languages.

Preset voice production

Use built-in voices for repeatable narration when teams need a stable voice choice and quick turnaround.

Custom voice direction

Design a new voice with natural language when preset voices do not match the brand or content format.

Reference voice cloning

Upload a voice sample to create speech that follows the reference speaker for campaigns or character work.

Workflow

A simple path from script to usable audio.

The generation panel handles task creation, status polling, preview, and download. The supporting content helps you prepare better inputs before submitting.
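
For readers scripting a similar flow, status polling can be sketched as below. This is an illustration only: the status names, the `get_status` callable, and the default interval and timeout are all hypothetical, since the page does not document the backend API.

```python
import time
from typing import Callable

# Hypothetical final states; the real backend may use different names.
DONE_STATES = {"succeeded", "failed"}


def wait_for_task(
    get_status: Callable[[], str],
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status until it returns a final state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in DONE_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Task still '{status}' after {timeout_s} seconds.")
        sleep(interval_s)
```

Passing `get_status` and `sleep` as arguments keeps the loop independent of any particular HTTP client and makes it easy to test without waiting.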

Pick a creation path

Start with a preset voice, clone from reference audio, or describe a new voice in Voice Design.

Set language behavior

Use auto detection for mixed scripts, or choose a language explicitly when pronunciation needs more control.

Tune the delivery

Add style instructions or voice descriptions to guide tone, emotion, role, and production context.

Mode comparison

Mode	Best for	Inputs	Output
Text to Speech	Fast production with preset voices	Text, language, voice, style instruction	Preset voice speech audio
Voice Design	Creating a voice direction from a prompt	Text, voice description, language	Designed voice speech audio
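
The inputs in the table can be modeled as two request shapes. This is a sketch under stated assumptions: the class names, field names, and the "auto" default are hypothetical and are not taken from the backend request model.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TextToSpeechRequest:
    text: str                                # required on the page
    voice: str                               # required preset voice
    language: str = "auto"                   # "auto" or an explicit language
    style_instruction: Optional[str] = None  # optional tone and delivery hint


@dataclass
class VoiceDesignRequest:
    text: str
    voice_description: str                   # prompt describing the voice
    language: str = "auto"


def to_payload(request) -> dict:
    """Drop unset optional fields and reject blank string values."""
    payload = {k: v for k, v in asdict(request).items() if v is not None}
    for key, value in payload.items():
        if isinstance(value, str) and not value.strip():
            raise ValueError(f"'{key}' must not be blank.")
    return payload
```

Keeping one shape per mode makes it obvious which inputs each mode expects: a preset voice plus an optional style instruction for Text to Speech, and a voice description for Voice Design.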

Input guide

Language selection

Choose auto for mixed-language scripts. Pick a specific language for predictable pronunciation.

Preset voices

Try several built-in voices before designing a custom voice; preset voices are the fastest path to stable output.

Voice design prompt

Describe speaker age, tone, accent, pacing, scene, and intended audience in one compact prompt.
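
One way to keep voice design prompts consistent across a team is to assemble them from the attributes listed above. The helper below is a hypothetical sketch; the sentence template is an assumption, not a format that Qwen3 TTS requires.

```python
def build_voice_prompt(
    age: str, tone: str, accent: str, pacing: str, scene: str, audience: str
) -> str:
    """Combine the six attributes into one compact prompt (template is assumed)."""
    return (
        f"{age} speaker with a {tone} tone, {accent} accent, "
        f"and {pacing} pacing; scene: {scene}; audience: {audience}."
    )


prompt = build_voice_prompt(
    age="Middle-aged",
    tone="calm",
    accent="neutral",
    pacing="steady",
    scene="a weekly podcast",
    audience="working professionals",
)
```

Treat the generated text as a starting point and adjust the wording by ear after the first preview.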

Examples

Copy-ready prompts for stronger first results.

Use these examples as starting points, then adapt them to your project, audience, and delivery style.

Style instruction

Warm, confident, and conversational, suitable for a SaaS onboarding video.

Voice design

A calm middle-aged female host with a neutral accent, bright tone, and professional podcast pacing.

Multilingual script

Keep each sentence clean and avoid unnecessary symbols when mixing English, Chinese, and product names.
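
A pre-processing pass along these lines can be sketched as follows. Which symbols count as unnecessary is an assumption here: the hypothetical `clean_script` helper keeps letters, digits, combining marks, whitespace, and a small set of English and Chinese punctuation, and replaces everything else with a space.

```python
import unicodedata

# Punctuation to keep; this set is an assumption, adjust it for your scripts.
KEEP_PUNCTUATION = set(".,!?;:'\"-()" + "。，！？；：、")


def clean_script(text: str) -> str:
    """Replace stray symbols with spaces, then collapse all whitespace runs."""
    kept = []
    for ch in text:
        category = unicodedata.category(ch)  # e.g. "Lu", "Lo", "Nd", "So"
        if category[0] in ("L", "N", "M") or ch.isspace() or ch in KEEP_PUNCTUATION:
            kept.append(ch)
        else:
            kept.append(" ")
    # split() with no arguments also removes line breaks and leading/trailing space
    return " ".join("".join(kept).split())
```

Note that this sketch also flattens line breaks, so apply it per sentence or paragraph if the script's layout matters.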

FAQ

Common questions before generating.

A short reference for choosing the right mode and preparing inputs before sending a task to the backend.

Which Qwen3 mode should I start with?

Start with Text to Speech if a preset voice fits. Use Voice Clone for a reference speaker, and Voice Design when you need a custom voice direction.

What should I write in a voice design prompt?

Include speaker age, tone, accent, pace, emotion, and usage context. A precise prompt usually gives more controllable results.

Can Qwen3 handle multiple languages?

Yes. You can use auto language detection or choose explicitly from the languages that the backend request model supports.