Today we’re adding a new voice cloning model to Vozo’s Translate & Dub: VoiceNATIVE.
It joins VoiceREAL, our existing model, giving you two distinct options depending on what your content needs. Here’s why that distinction matters.
The Trade-Off at the Heart of Voice Cloning
Every voice cloning model has to balance two competing goals, and how it strikes that balance shapes the final result.
A model that stays close to the original speaker captures their emotional delivery and speaking rhythm well. But it also carries over their phonetic characteristics. A Chinese creator’s video dubbed into English, for example, may retain speech patterns from the source language and end up sounding like accented English. For personal vlogs or entertainment content, that kind of authenticity is often exactly what you want. For a professional ad or an e-learning course targeting a local market, it can feel out of place.
A model that prioritizes native-sounding delivery in the target language solves that, but at a cost. The further it moves toward a local accent, the less it resembles the original speaker, and the emotional closeness starts to fade.
Neither approach is wrong. They simply suit different content.
Meet VoiceREAL and VoiceNATIVE
VoiceREAL is built for content where the speaker’s presence matters. It preserves emotional delivery, speaking rhythm, and voice characteristics as closely as possible. It works best for dramas, vlogs, and entertainment media.
VoiceNATIVE is built for content where local credibility matters. It delivers a natural target-language accent without carrying over the phonetic traits of the source language. It works best for ads, e-learning courses, and explainer videos.
Both models are now available in Translate & Dub under Voice Cloning Model. If you’re not sure which fits your content, the Auto option will select the most appropriate model for you.
What This Means for Your Workflow
Until now, every dubbed video went through the same voice cloning process regardless of content type. VoiceREAL and VoiceNATIVE let you match the cloning approach to the actual goal of your video, whether that’s preserving the speaker’s identity or reaching a local audience with a natural-sounding voice.
We’re excited to see what you make with them.