7 tips for creating a professional-grade voice clone in ElevenLabs
Learn how to create professional-grade voice clones with ElevenLabs using these 7 essential tips.
Introducing Eleven v3 (alpha)
Eleven v3 is the most expressive Text to Speech model
We're pleased to reveal Eleven v3 (alpha) — the most expressive Text to Speech model.
This research preview brings unprecedented control and realism to speech generation.
Eleven v3 (alpha) requires more prompt engineering than previous models — but the generations are breathtaking.
If you’re working on videos, audiobooks, or media tools — this unlocks a new level of expressiveness. For real-time and conversational use cases, we recommend staying with v2.5 Turbo or Flash for now. A real-time version of v3 is in development.
Eleven v3 is available today on our website. Public API access is coming soon. For early access, please contact sales.
Use of the new model in the ElevenLabs app is 80% off until the end of June. Sign up here.
Since launching Multilingual v2, we’ve seen voice AI adopted in professional film, game development, education, and accessibility. But the consistent limitation wasn’t sound quality — it was expressiveness. More exaggerated emotions, conversational interruptions, and believable back-and-forth were difficult to achieve.
Eleven v3 addresses this gap. It was built from the ground up to deliver voices that sigh, whisper, laugh, and react — producing speech that feels genuinely responsive and alive.
Feature | What it unlocks |
---|---|
Audio tags | Inline control of tone, emotion, and non-verbal reactions |
Dialogue mode | Multi-speaker conversations with natural pacing and interruptions |
70+ languages | Full coverage of high-demand global languages |
Deeper text understanding | Better stress, cadence, and expressivity from text input |
Audio tags live inline with your script and are formatted with lowercase square brackets. You can see more about audio tags in our prompting guide for v3 in the docs.
Professional Voice Clones (PVCs) are not yet fully optimized for Eleven v3, so clone quality may be lower than with earlier models. During the research preview, use an Instant Voice Clone (IVC) or a designed voice if your project needs v3 features. PVC optimization for v3 is coming in the near future.
For example, you could prompt: “[whispers] Something’s coming… [sighs] I can feel it.” Or for more expressive control, you can combine multiple tags:
    “[happily][shouts] We did it! [laughs].”
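The tag format described above (lowercase names in square brackets, placed inline) can be sketched in a few lines of Python. This is only an illustration: `tag`, `find_tags`, and `TAG_PATTERN` are hypothetical helper names, and the rule that a tag holds only lowercase letters and spaces is an assumption, not something the v3 docs are quoted as saying here.

```python
import re

# Assumption: a tag is lowercase letters (spaces allowed) inside square brackets.
TAG_PATTERN = re.compile(r"\[([a-z][a-z ]*)\]")


def tag(name: str) -> str:
    """Format a tag name as a lowercase, square-bracketed audio tag."""
    cleaned = name.strip().lower()
    if not cleaned or not all(ch.isalpha() or ch == " " for ch in cleaned):
        raise ValueError(f"unsupported tag name: {name!r}")
    return f"[{cleaned}]"


def find_tags(script: str) -> list[str]:
    """Return the tag names found in a script, in order of appearance."""
    return TAG_PATTERN.findall(script)


# Combine several tags in one line, as in the example above.
line = f"{tag('happily')}{tag('Shouts')} We did it! {tag('laughs')}"
print(line)             # [happily][shouts] We did it! [laughs]
print(find_tags(line))  # ['happily', 'shouts', 'laughs']
```

A helper like this mainly guards against stray capitals or punctuation in tags before a script is sent for generation. Which tag names the model actually responds to is a question for the prompting guide.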
Eleven v3 is supported in our existing Text to Speech endpoint. Additionally, we introduce a new Text to Dialogue API endpoint. Provide a structured array of JSON objects — each representing a speaker turn — and the model generates a cohesive, overlapping audio file:
    [
      {"speaker_id": "scarlett", "text": "[cheerfully] Perfect! And if that pop-up is bothering you, there’s a setting to turn it off under Notifications → Preferences."},
      {"speaker_id": "lex", "text": "You are a hero. An actual digital wizard. I was two seconds from sending a very passive-aggressive support email."},
      {"speaker_id": "scarlett", "text": "[laughs] Glad we could stop that in time. Anything else I can help with today?"}
    ]
The endpoint automatically manages speaker transitions, emotional changes, and interruptions.
Learn more here.
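As a rough sketch of how such an array of speaker turns might be assembled in code, the Python below builds and validates the JSON body locally. It makes no network call, because the endpoint URL, authentication, and any other request fields are not given in this post. `Turn` and `build_dialogue` are hypothetical names; only the `speaker_id` and `text` keys come from the example above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Turn:
    # Field names mirror the keys in the JSON example above.
    speaker_id: str
    text: str


def build_dialogue(turns: list[Turn]) -> str:
    """Serialize speaker turns to a JSON array, rejecting empty fields."""
    for i, turn in enumerate(turns):
        if not turn.speaker_id.strip() or not turn.text.strip():
            raise ValueError(f"turn {i} needs a speaker_id and text")
    # ensure_ascii=False keeps characters such as curly quotes readable.
    return json.dumps([asdict(t) for t in turns], ensure_ascii=False, indent=2)


payload = build_dialogue([
    Turn("scarlett", "Anything else I can help with today?"),
    Turn("lex", "You are a hero. An actual digital wizard."),
])
print(payload)
```

Keeping the turns as typed objects until the last moment makes it easy to catch an empty speaker or line before the request is sent.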
Plan | Launch promo | After the end of June |
---|---|---|
UI (self-serve) | 80% off (~5× cheaper) | Same as Multilingual V2 |
UI (enterprise) | 80% off business plan pricing | Business plan pricing |
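The "~5× cheaper" figure follows directly from the 80% discount: paying 20% of the list price is the same as paying one fifth. A minimal Python check of that arithmetic is below. The function names and the list price of 100.0 are placeholders for illustration, not actual ElevenLabs pricing.

```python
def promo_price(list_price: float, discount: float = 0.80) -> float:
    """Price paid after applying a fractional discount (0.80 means 80% off)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return list_price * (1 - discount)


def times_cheaper(discount: float) -> float:
    """How many times cheaper the discounted price is than the list price."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 / (1 - discount)


print(round(times_cheaper(0.80), 2))  # 5.0
print(round(promo_price(100.0), 2))   # 20.0 (placeholder list price)
```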
To enable v3:
API access and support in Studio are coming soon. For early access, please contact sales.
Eleven v3 (alpha) requires more prompt engineering than our previous models. When it works, the output is breathtaking, but its lower reliability and higher latency mean it is not suitable for real-time and conversational use cases. For these, we recommend Eleven v2.5 Turbo/Flash.
For more, refer to the full v3 documentation and FAQ.
We’re excited to see how you bring v3 to life across new use cases — from immersive storytelling to cinematic production pipelines.