This article surveys the available channels for translation, dubbing, and speech recognition, grouped into two main types: free and paid.
It also recommends the best combinations for each usage environment (for example, whether a VPN is available), so that you can find the right tools in different situations.
Purely Free Solutions

Translation Channels

No VPN, No Proxy
- First choice: DeepSeek or Zhipu AI as the translation channel. Register an account with "DeepSeek" or "Zhipu AI", obtain an API key (SK), and enter it in the "DeepSeek or Zhipu AI" section of the translation settings.
- Second choice: Microsoft Translator.
With VPN, With Proxy
- First choice: GeminiAI Translation, followed by Google Translate.
Dubbing Channels

- Default choice: "edge-TTS", which is free, requires no setup, and supports a wide range of languages.
- When the target language is Chinese, consider dubbing channels such as "GPT-SoVITS", "F5-TTS", or "CosyVoice" first.
- When the target language is anything else, use edge-TTS first.
Speech Recognition Channels

When the video language is Chinese
- First choice: "Aliyun FunASR", Alibaba's FunASR family of Chinese models, which generally recognizes Chinese better than whisper.
- Second choice: faster-whisper or openai-whisper (local). Select the "large-v2" model and set the speech segmentation mode to "Overall Recognition".
- For Chinese, Japanese, and Korean, subtitle text is split into lines of 20 characters by default; this limit can be changed as needed.
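The default 20-character split for Chinese, Japanese, and Korean subtitles can be pictured with the short sketch below. It is only an illustration and not the tool's actual implementation: the function name `split_cjk_line` and the `max_chars` parameter are hypothetical, and real segmentation may also take punctuation and timing into account.

```python
from typing import List


def split_cjk_line(text: str, max_chars: int = 20) -> List[str]:
    """Split subtitle text into chunks of at most max_chars characters.

    Hypothetical helper: it cuts purely by character count and ignores
    punctuation, which a real subtitle splitter would likely respect.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    text = text.strip()
    # Step through the string in fixed-size windows.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


if __name__ == "__main__":
    sample = "今天天气很好我们一起去公园散步然后回家吃晚饭再看一部电影"
    for line in split_cjk_line(sample):
        print(line)
```

Passing a smaller or larger `max_chars` mirrors changing the per-line character limit in the settings.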
When the video language is English or other languages
- First choice: faster-whisper or openai-whisper (local) with the "large-v2" or "large-v3-turbo" model and the speech segmentation mode set to "Overall Recognition".
When the video language is a less common language
- First choice: Gemini Large Model Recognition, with the speech segmentation mode set to "Overall Recognition".
Note: Gemini is not available in all countries. If you are told that your current country is not supported, switch VPN nodes; Singapore or Japan nodes are recommended. Google Translate is also an option.
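The free speech recognition recommendations above amount to a small decision rule keyed on the video language. The sketch below is one way to write that rule down, under stated assumptions: `pick_free_asr`, `AsrChoice`, and the `COMMON_LANGUAGES` set are hypothetical names, the article does not define which languages count as "less common", and where it offers several equivalent options (faster-whisper or openai-whisper, "large-v2" or "large-v3-turbo") the sketch simply picks one.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Assumption: the article does not say which languages are "common";
# this set is purely illustrative.
COMMON_LANGUAGES: FrozenSet[str] = frozenset(
    {"en", "ja", "ko", "fr", "de", "es", "ru"}
)


@dataclass(frozen=True)
class AsrChoice:
    channel: str
    model: Optional[str]         # None where the article names no model
    segmentation: Optional[str]  # None where the article names no mode


def pick_free_asr(language: str) -> AsrChoice:
    """Return the first-choice free recognition channel for a language code."""
    # Normalize codes such as "zh-CN" or "zh_cn" to their base form "zh".
    code = language.strip().lower().replace("_", "-").split("-")[0]
    if code == "zh":
        return AsrChoice("Aliyun FunASR", None, None)
    if code in COMMON_LANGUAGES:
        # One of the equivalent local options listed in the article.
        return AsrChoice("faster-whisper", "large-v2", "Overall Recognition")
    return AsrChoice("Gemini Large Model Recognition", None, "Overall Recognition")


if __name__ == "__main__":
    for lang in ("zh-CN", "en", "sw"):
        print(lang, pick_free_asr(lang))
```

Keeping the rule in one function makes it easy to adjust which languages fall back to Gemini without touching the rest of a workflow.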
Purely Paid Solutions
If you want higher-quality results, you can use third-party paid APIs.
Translation Channels
- OpenAI ChatGPT (latest models), Gemini, 302.AI, or domestic (Chinese) AI services such as DeepSeek and Zhipu AI.
Dubbing Channels
- AzureTTS, ByteDance Volcano Voice Synthesis, Elevenlabs.io, OpenAI-TTS.
Speech Recognition Channels
- For Chinese videos, first choice: ByteDance Volcano Subtitle Generation.
- For videos in other languages, the recommended options are faster-whisper or openai-whisper (local), or Deepgram.com.
Best Combination Without Using a VPN
- Translation Channels: domestic (Chinese) AI services such as DeepSeek and Zhipu AI, or Microsoft Translator.
- Dubbing Channels: AzureTTS, edge-TTS, GPT-SoVITS, F5-TTS, CosyVoice, QwenTTS.
- Speech Recognition: faster-whisper or openai-whisper (local) with the "large-v2" or "large-v3-turbo" model; set the speech segmentation mode to "Overall Recognition" and check "Chinese Re-segmentation".
Best Combination When Neither Payment Nor VPN Access Is a Constraint
- Translation Channels: OpenAI ChatGPT latest series models, GeminiAI, DeepSeek, Google Translate, Microsoft Translator.
- Dubbing Channels: AzureTTS/edge-TTS, ByteDance Volcano Voice Synthesis, Elevenlabs.io, OpenAI-TTS, GPT-SoVITS, F5-TTS, CosyVoice, QwenTTS.
- Speech Recognition: faster-whisper or openai-whisper (local) / ByteDance Volcano Subtitle Generation / Aliyun FunASR.
Simplest Combination (No Proxy, No Configuration Required)

- Translation Channels: Microsoft Translator (if you have a VPN and know how to use it, you can choose Google Translate).
- Dubbing Channels: edge-TTS.
- Speech Recognition: faster-whisper (local).
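The three recommended combinations (without a VPN, without restrictions, and simplest) can also be kept as a lookup table, which is handy when scripting a workflow. The sketch below copies the channel names from the lists above; the `PRESETS` dictionary, its keys, and the `first_choice` helper are hypothetical, and treating the first entry of each list as the preferred one is an assumption, since the article does not rank the entries within a combination.

```python
from typing import Dict, List

# Channel names are taken from the combinations listed in the article;
# the preset keys themselves are illustrative.
PRESETS: Dict[str, Dict[str, List[str]]] = {
    "no_vpn": {
        "translation": ["DeepSeek", "Zhipu AI", "Microsoft Translator"],
        "dubbing": ["AzureTTS", "edge-TTS", "GPT-SoVITS", "F5-TTS",
                    "CosyVoice", "QwenTTS"],
        "recognition": ["faster-whisper", "openai-whisper"],
    },
    "unrestricted": {
        "translation": ["OpenAI ChatGPT", "GeminiAI", "DeepSeek",
                        "Google Translate", "Microsoft Translator"],
        "dubbing": ["AzureTTS", "edge-TTS",
                    "ByteDance Volcano Voice Synthesis", "Elevenlabs.io",
                    "OpenAI-TTS", "GPT-SoVITS", "F5-TTS", "CosyVoice",
                    "QwenTTS"],
        "recognition": ["faster-whisper", "openai-whisper",
                        "ByteDance Volcano Subtitle Generation",
                        "Aliyun FunASR"],
    },
    "simplest": {
        "translation": ["Microsoft Translator"],
        "dubbing": ["edge-TTS"],
        "recognition": ["faster-whisper"],
    },
}


def first_choice(preset: str) -> Dict[str, str]:
    """Return the first listed channel for each stage of a preset.

    Assumption: list order is treated as preference order.
    """
    combo = PRESETS[preset]  # raises KeyError for an unknown preset name
    return {stage: options[0] for stage, options in combo.items()}


if __name__ == "__main__":
    print(first_choice("simplest"))
```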
Best Speech Recognition Channels for Chinese-Language Videos

- ByteDance Volcano Subtitle Generation
- Aliyun FunASR
- faster-whisper (local, large-v2/large-v3-turbo model)
- openai-whisper (local, large-v2/large-v3-turbo model)
Best Speech Recognition Channels for Videos in Other Languages
- Gemini Large Model Recognition
- faster-whisper (local, large-v2/large-v3-turbo model)
- openai-whisper (local, large-v2/large-v3-turbo model)
Translation Channels with the Best Performance
- OpenAI ChatGPT latest series models / Gemini
- Domestic (Chinese) AI translation, such as DeepSeek or Zhipu AI
- Google / DeepL
- Microsoft Translator / Tencent Translator / Baidu Translator
