Voice-over or subtitles: which should I use for video translation?

Foreign-language voice-over recording involves a number of steps, from script translation and locating voice talent to the actual recording and file delivery. Subtitling adds foreign-language captions at the bottom of the screen that mirror or paraphrase what is being said in the video. Text-to-speech (TTS) technologies, a third option, are rapidly improving. Should you use voice-over or subtitles?

Recording by a voice talent will usually be significantly more expensive than TTS or subtitling. If human voice-overs are more expensive, does that mean they are better? The short answer is “Not always.” Depending on a variety of factors, voice-overs, subtitling, or a mixture of both may give you the results you need.

Purpose and content of video

Whether you choose voice-overs or subtitles depends on the purpose and content of the video.

Marketing and promotional videos

For a marketing or promotional video, a human voice-over is usually preferable because it is more personal. Just as you would choose a persuasive voice for your original video, you can choose voices that will resonate with the foreign-language target audience. This works best if your video has an off-screen narrator. If your video contains individuals speaking on-screen, lip-syncing or dubbing the voice-over is expensive. In that situation, subtitles will convey the meaning, and audiences will still hear each speaker’s vocal inflections and emotions.

Training or explanatory videos

For e-learning modules or how-to videos, a personal voice may not be as important, since the content of the video matters more than its mood. This can be an argument for subtitles. However, there are clear exceptions.

If a training video simply demonstrates something on screen while the audio describes it, subtitles might work. But what if there is explanatory text or titles on the screen as well? Viewers may find it hard to read the subtitles along with the rest of the text. In addition, for technical training, the procedure being taught might require close attention. If reading subtitles will divert the viewer’s attention, choose voice-overs. If you aren’t worried about emotional engagement, TTS is an option. More on that below.

Audience preferences

Take into account the cultural preferences of your audience. In some countries there are strong preferences for either voice-over or subtitles. For training and e-learning videos, consider the learning methods prevalent in the target country. Also think about how the audience will watch the video. If the audience primarily uses mobile devices, subtitles may be too small to read, or they may obscure too much of the visual content. If the audience is not highly literate, voice-overs are the best choice. A cultural consultant can guide you in these matters.


Ideally, you would base your choice on which mode works best, but cost matters too. You might already be facing costs in localizing your video that don’t relate to the audio, such as translating on-screen text or graphics with embedded text. Even if you cannot afford the gold standard of video localization, you and your localization partner can still come up with a plan that will meet your needs. The best way to minimize your costs and maintain the greatest latitude for choice is to provide your localization partner with these important components:

  • A timed script for the video
  • The source file for the video
  • Associated source files
    • Music and sound effects
    • On-screen text
    • Images
    • Animations

If any of these files are missing, re-creating them will likely incur additional costs. Make sure your video production company delivers all of these files along with the finished video file.

Text-to-speech (TTS)

Text-to-speech technologies can produce some impressive audio results, but they can still sound robotic. If the purpose of your content is primarily informative, as with training and demonstrations, TTS is a great option. If you need to elicit an emotional response in your audience, you’d be taking a risk with TTS.

TTS will inevitably mispronounce some names, product names, and technical terms. To ensure good-quality audio, a linguist will experiment with phonetic spellings of unusual words until the audio comes out correctly.

Voice-over or subtitles: in summary

If you still aren’t sure whether to use voice-overs or subtitles, give us a call. The team at Scriptis can help you create multilingual content that connects.