
Voice Cloning and Multi-Character Dubbing

Part 1: Video-Based Voice Cloning

In the Custom Video Translation feature, you can select F5-TTS, GPT-SoVITS, CosyVoice, Chatterbox, clone-voice, or a similar option as the dubbing channel. When you select the 'clone' character, the original audio from the video is used as the reference for dubbing, so the voice-over matches the original speaker's timbre.

On the main interface, select the 'clone' character to proceed with voice cloning dubbing.

  • F5-TTS: Supports Chinese and English dubbing.
  • CosyVoice: Supports Chinese, English, Japanese, Korean, German, Spanish, French, Italian, and Russian dubbing.
  • GPT-SoVITS: Supports Chinese, Japanese, English, Korean, and Cantonese dubbing.
  • Chatterbox: Supports Arabic, German, English, Spanish, French, Hebrew, Hindi, Italian, Japanese, Korean, Malay, Polish, Portuguese, Russian, Swedish, Turkish, and Chinese dubbing.
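The language lists above can be captured in a small lookup table, which makes it easy to see which channels can dub into a given target language. This is only an illustrative sketch: the dictionary and the function name `channels_for` are hypothetical and not part of the application, and clone-voice is left out because no language list is given for it here.

```python
# Language support per dubbing channel, copied from the list above.
# The names SUPPORTED_LANGUAGES and channels_for are hypothetical helpers.
SUPPORTED_LANGUAGES = {
    "F5-TTS": {"Chinese", "English"},
    "CosyVoice": {
        "Chinese", "English", "Japanese", "Korean", "German",
        "Spanish", "French", "Italian", "Russian",
    },
    "GPT-SoVITS": {"Chinese", "Japanese", "English", "Korean", "Cantonese"},
    "Chatterbox": {
        "Arabic", "German", "English", "Spanish", "French", "Hebrew",
        "Hindi", "Italian", "Japanese", "Korean", "Malay", "Polish",
        "Portuguese", "Russian", "Swedish", "Turkish", "Chinese",
    },
}


def channels_for(language: str) -> list[str]:
    """Return the channels (sorted by name) that list the given language."""
    return sorted(
        name for name, langs in SUPPORTED_LANGUAGES.items() if language in langs
    )


if __name__ == "__main__":
    print(channels_for("Cantonese"))
    print(channels_for("Russian"))
```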


Voice cloning uses the original audio segment that corresponds to each subtitle as the reference audio, so each subtitle should last at least 3 seconds; otherwise, cloning is highly likely to fail. To enforce this, go to Menu -> Tools -> Advanced Options -> Whisper Speech Recognition Settings, set 'Minimum Speech Duration' to 3000-4000 milliseconds (no lower than 3000), and set 'Maximum Speech Duration' to a value greater than or equal to 8.
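If you already have an SRT file, you can check the 3-second minimum before dubbing. The sketch below is a standalone helper, not part of the application: the function name `short_cues` is hypothetical, and it assumes standard SRT timing lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*"
    r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})"
)


def short_cues(srt_text: str, min_ms: int = 3000) -> list[tuple[int, int]]:
    """Return (cue position, duration in ms) for each cue shorter than min_ms.

    The cue position is counted from 1 in order of appearance, which may
    differ from the numbers written in the file if they are not sequential.
    """
    too_short = []
    for position, match in enumerate(TIMING.finditer(srt_text), start=1):
        h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
        start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
        end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
        duration = end - start
        if duration < min_ms:
            too_short.append((position, duration))
    return too_short


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:03,500\nToo short for cloning.\n\n"
        "2\n00:00:04,000 --> 00:00:08,000\nLong enough.\n"
    )
    for position, duration in short_cues(sample):
        print(f"Cue {position} lasts only {duration} ms")
```

Cues reported by this check are the ones most likely to fail cloning, since their reference audio is shorter than 3 seconds.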

Part 2: Subtitle-Based Multi-Character Dubbing

The "Multi-Character Subtitle Dubbing" feature is available starting from v3.74. Click the Multi-Character Subtitle Dubbing button in the left toolbar, import the SRT subtitle file to be dubbed in the pop-up window, then assign a character to each subtitle line to produce a multi-character voice-over.
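Conceptually, the assignment step pairs each subtitle line with a character. The sketch below illustrates that idea outside the application and does not describe how the feature is implemented: the function `assign_characters`, the character names, and the `default` fallback are all hypothetical, and the SRT text is assumed to use blank lines between cues.

```python
import re


def assign_characters(
    srt_text: str, roles: dict[int, str], default: str = "narrator"
) -> list[tuple[int, str, str]]:
    """Return (cue number, character, text) for each cue in the SRT text.

    `roles` maps cue numbers to character names; cues without an entry
    fall back to `default`. All names here are illustrative only.
    """
    assigned = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # Skip blocks that lack a number, timing line, or text.
        number = int(lines[0].strip())
        text = " ".join(line.strip() for line in lines[2:])
        assigned.append((number, roles.get(number, default), text))
    return assigned


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:04,000\nHello there.\n\n"
        "2\n00:00:05,000 --> 00:00:09,000\nHi!\n"
    )
    for number, character, text in assign_characters(sample, {2: "Alice"}):
        print(number, character, text)
```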