How to Add Translated Captions to OBS (And Why Voice Dubbing Goes Further)

Matt McElligott

February 27, 2026 · 5 min read · 159 views

You set up captions. You spent an hour configuring the plugin. You went live.

Then you watched your international viewers sit in chat for two minutes and leave.

Caption-only translation has a real problem. Viewers can read your words. But reading a fast-moving stream while watching gameplay is exhausting. Most people do not bother. They change the channel.

This guide covers your real options for OBS translation: what each one actually does, how to set them up, and how to decide which one fits your goals.

What Are Your Options for OBS Translation

There are two fundamentally different things you can do with OBS translation.

Option 1: Captions. A text overlay appears on screen showing a translated version of what you said. Viewers read it. Tools: LocalVocal, Polyglot OBS.

Option 2: Voice dubbing. A second audio track is generated in another language, in real time. You route that audio to a separate stream destination. Viewers on that channel hear you speaking their language. Tool: StreamFluent.

These are not the same thing. One gives your viewers text. The other gives them a stream they can actually watch.

How to Set Up Real-Time Captions in OBS

If captions are what you want, two free plugins work well.

LocalVocal is the more popular of the two. It runs OpenAI Whisper locally on your machine, transcribes your voice in real time, and displays translated captions as a text source in OBS. No subscription. No cloud. Everything is processed on your own hardware, on the CPU or, if you have one, the GPU.

How to set it up:

  1. Download LocalVocal from the OBS plugin repository.
  2. Drag the plugin files into your OBS plugins folder.
  3. Restart OBS.
  4. Add a new source: LocalVocal Caption Source.
  5. Select your microphone input and target language.
  6. Position the caption text source on your scene.

It works. If your machine has a capable GPU, transcription is fast and reasonably accurate.
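Step 2 depends on where OBS looks for plugins on your system. As a rough sketch, the Python snippet below lists a few commonly used default locations and reports which ones exist. The paths are assumptions about typical installs, not something taken from the LocalVocal documentation, and portable, Flatpak, or custom installs will differ. Treat the output as a hint, and check the plugin's own install notes before copying files.

```python
import os
import sys
from pathlib import Path


def candidate_plugin_dirs(platform: str = sys.platform) -> list[Path]:
    """Return commonly used OBS plugin directories for a platform.

    ASSUMPTION: these are typical default locations, not guaranteed paths.
    """
    home = Path.home()
    if platform.startswith("win"):
        program_data = Path(os.environ.get("ProgramData", r"C:\ProgramData"))
        program_files = Path(os.environ.get("ProgramFiles", r"C:\Program Files"))
        return [
            program_data / "obs-studio" / "plugins",
            program_files / "obs-studio" / "obs-plugins" / "64bit",
        ]
    if platform == "darwin":
        return [home / "Library" / "Application Support" / "obs-studio" / "plugins"]
    # Linux and other Unix-like systems (native package installs).
    return [home / ".config" / "obs-studio" / "plugins"]


def existing_plugin_dirs(platform: str = sys.platform) -> list[Path]:
    """Keep only the candidate directories that exist on this machine."""
    return [path for path in candidate_plugin_dirs(platform) if path.is_dir()]


if __name__ == "__main__":
    for path in candidate_plugin_dirs():
        status = "found" if path.is_dir() else "not found"
        print(f"{path}: {status}")
```

If none of the candidates are found, OBS is probably installed somewhere non-standard, and the installer or release notes for the plugin are the better guide.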

Polyglot OBS works similarly. It also runs local speech recognition and outputs translated captions. Free, open source, good community support.

Both tools are legitimate. If you want captions and nothing else, either one gets the job done.

Why Voice Dubbing Is Different

Here is the thing nobody talks about honestly.

Native English speakers do not follow Twitch streams by reading subtitles. Neither do native Spanish speakers, French speakers, or Korean speakers. That is not how people watch live content.

Think about how you watch streams in your own language. You are playing a game, half-watching, listening. You do not read every sentence of text that flashes on the bottom of the screen.

Now imagine you are a Spanish speaker who found a French streamer with captions. You read the first few lines. The game is fast. The streamer is talking quickly. The captions lag a little. You stop reading. You open a different tab.

That is what happens to most viewers. Captions tell them translation exists. They do not make them stay.

Audio is different. If the streamer sounds like they are speaking your language, you stay. You do not have to work to follow the content. It sounds like a normal stream.

That is what voice dubbing does.

How StreamFluent Works in OBS

StreamFluent is an OBS plugin for live voice dubbing. It installs the same way as any other OBS plugin.

Once installed and configured, new audio sources appear in your OBS mixer automatically. They have labels like "SF Dub. English" or "SF Dub. Spanish." Each one is a real-time dubbed version of your voice in that language.

You take that audio source and route it to a separate stream output. That output goes to a second Twitch channel, YouTube channel, or anywhere else that accepts an RTMP stream.

Your main stream runs as normal. Your international channel gets live dubbed audio. Same stream. Two outputs. Two audiences.
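To make the "same stream, two outputs" idea concrete, here is a small Python model of the routing. It is only an illustration: the class names, the example RTMP URLs, and the choice to put game audio on both tracks are assumptions made for this sketch, not StreamFluent settings. The one OBS fact it leans on is that OBS offers up to six audio tracks and that an output can be pointed at a specific track. How the second output itself is created is not covered here.

```python
from dataclasses import dataclass, field


@dataclass
class AudioSource:
    name: str                                      # label as shown in the OBS mixer
    tracks: set[int] = field(default_factory=set)  # OBS audio tracks (1-6) it feeds


@dataclass
class StreamOutput:
    name: str
    rtmp_url: str      # hypothetical placeholder URL
    audio_track: int   # the single track this output carries


def sources_heard_on(output: StreamOutput, sources: list[AudioSource]) -> list[str]:
    """Return the mixer sources a viewer of this output would hear."""
    return sorted(s.name for s in sources if output.audio_track in s.tracks)


# ASSUMPTION: original mic on track 1, Spanish dub on track 2, game audio on both.
sources = [
    AudioSource("Mic/Aux", {1}),
    AudioSource("SF Dub. Spanish", {2}),
    AudioSource("Desktop Audio", {1, 2}),
]

main = StreamOutput("Main channel", "rtmp://example.invalid/main", audio_track=1)
spanish = StreamOutput("Spanish channel", "rtmp://example.invalid/es", audio_track=2)

for output in (main, spanish):
    print(output.name, "->", sources_heard_on(output, sources))
```

The point of the model is that the video is shared while each output picks exactly one audio track, so the dubbed voice never leaks into the main channel and the original voice never leaks into the dubbed one.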

The specifics:

  • Setup time: about 10 minutes.
  • Latency: under 1 second end-to-end.
  • CPU overhead: under 2%. Processing happens in the cloud.
  • Languages supported: 32.
  • Free tier: 3 hours included, no credit card required.

Your main audience hears you in your language. Your English, Spanish, French, or Portuguese audience hears you in theirs.

Which Option Is Right for You

This comes down to what you are actually trying to do.

Use LocalVocal if:

  • You want captions on your main stream.
  • You are not trying to build a second-language audience.
  • You have a powerful enough machine to run Whisper locally.
  • Budget is the main constraint.

LocalVocal is free, it works well, and it does exactly what it says. If captions are the goal, it is a great choice.

Use StreamFluent if:

  • You want international viewers to actually watch your stream, not read it.
  • You want to grow an audience in a second language.
  • You want a separate English, Spanish, or other language channel that runs alongside your main stream.
  • You want lower CPU overhead since the heavy lifting happens in the cloud.

The core difference: LocalVocal gives your viewers text. StreamFluent gives them a stream.
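The two checklists above can be read as a small decision function. The Python sketch below is one way to write it down. The field names are made up for illustration, and the rule that audience-growth goals take priority over hardware is an assumption of this sketch rather than something either tool prescribes. Budget is left out because LocalVocal is free and StreamFluent has a free tier to try.

```python
from dataclasses import dataclass


@dataclass
class StreamGoals:
    wants_second_language_audience: bool   # grow viewers in another language
    wants_separate_language_channel: bool  # run a dubbed channel alongside the main one
    can_run_whisper_locally: bool          # machine is strong enough for local Whisper


def recommend(goals: StreamGoals) -> str:
    """Map the checklist criteria to a tool name.

    ASSUMPTION: audience goals are checked first, hardware second.
    """
    if goals.wants_second_language_audience or goals.wants_separate_language_channel:
        return "StreamFluent"
    if goals.can_run_whisper_locally:
        return "LocalVocal"
    # Captions-only goal on a weak machine: the checklists do not settle this case.
    return "undecided"


if __name__ == "__main__":
    captions_only = StreamGoals(False, False, True)
    growth = StreamGoals(True, False, False)
    print(recommend(captions_only))
    print(recommend(growth))
```

The "undecided" branch is deliberate: if you only want captions but cannot run Whisper locally, neither checklist gives a clear answer, and it is worth testing both tools yourself.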

If you want to test it, there are 3 free hours to start. No credit card needed. Setup takes about 10 minutes.
