
CosyVoice open-source repository: https://github.com/FunAudioLLM/CosyVoice

Using CosyVoice in the Video Translation Software

The official webui.py that ships with CosyVoice3 cannot be used for integration: its audio component is stream-based, so API calls return m3u8 files instead of wav audio.
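To tell the two cases apart when debugging, the first bytes of a returned file are enough. The following is a minimal Python sketch, not part of CosyVoice or pyVideoTrans: `sniff_audio_response` is a hypothetical helper name, and it relies only on the standard file signatures (a wav file starts with `RIFF`....`WAVE`, an m3u8 playlist starts with `#EXTM3U`).

```python
def sniff_audio_response(data: bytes) -> str:
    """Classify raw bytes as 'wav', 'm3u8', or 'unknown' by file signature.

    Hypothetical helper for debugging; it does not call any CosyVoice API.
    """
    # A RIFF/WAVE file has "RIFF" at offset 0 and "WAVE" at offset 8.
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # An HLS playlist (m3u8) is plain text whose first line is "#EXTM3U".
    if data.lstrip().startswith(b"#EXTM3U"):
        return "m3u8"
    return "unknown"
```

If a saved result is classified as `m3u8`, the service is most likely still running the unmodified webui.py.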

Please follow the steps below:

  1. Deploy the official project, confirm that webui.py starts, and successfully complete one dubbing operation in the web UI. Then download the modified webui.py file, overwrite the official one with it, and restart the service. Download link: https://github.com/jianchang512/stt/releases/download/0.0/cosyvoice3-webui-py.zip
  2. On Windows, you can use the integrated package directly. Download link: https://pan.baidu.com/s/1g1dSIfyX0wLhtPtQOMX-tA?pwd=1234 or https://github.com/jianchang512/stt/releases/download/0.0/cosyvoice3-0.5B_20251216.7z
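After restarting the service, it can help to confirm that the web UI answers before configuring pyVideoTrans. The following is a minimal Python sketch, assuming the default address http://127.0.0.1:8000 used in this guide. `is_reachable` is a hypothetical helper name, and it only checks that something answers HTTP at that address, not that it is CosyVoice.

```python
import urllib.request
from urllib.error import HTTPError, URLError


def is_reachable(url: str = "http://127.0.0.1:8000", timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url (assumed default address)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except HTTPError:
        # The server answered, even if with an error status code.
        return True
    except (URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    if is_reachable():
        print("webui.py appears to be running.")
    else:
        print("No response; check that webui.py was started.")
```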

Starting and Using in pyVideoTrans

  1. First, update your pyVideoTrans software to the latest version.
  2. Ensure the CosyVoice project is deployed and webui.py is running. You should be able to open the interface at http://127.0.0.1:8000 in your browser.
  3. Open the video translation software, go to Settings (top left) -> CosyVoice, and enter the webui.py address. The default is http://127.0.0.1:8000.
  4. Fill in the reference audio and corresponding text.
Reference Audio Format:

Each line is split into two parts by the `#` symbol. The first part is the path to the wav audio file, and the second part is the corresponding text content for that audio. Multiple lines can be added.

Each audio file must be in wav format and shorter than 10 seconds. Place the files in the `f5-tts` directory of the pyVideoTrans project; you then only need to enter the filename here.

Example:

1.wav#Hello dear friend
2.wav#Hello friends
  5. After filling in the information, select CosyVoice as the Dubbing Channel on the main interface and choose the corresponding role. The clone role replicates the voice of the original video.
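The reference audio rules described above (one `file.wav#text` entry per line, wav format, under 10 seconds, files in the `f5-tts` directory) can be checked before starting a job. The following is a minimal Python sketch under those assumptions. The function names are hypothetical, it is not how pyVideoTrans itself parses the field, and the standard `wave` module reads only PCM wav files.

```python
import wave
from pathlib import Path

MAX_SECONDS = 10.0  # limit stated in this guide


def parse_reference_lines(text):
    """Split 'file.wav#text' lines into (filename, transcript) pairs."""
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first '#': left is the file, right is its text.
        filename, sep, transcript = line.partition("#")
        if not sep or not filename.strip() or not transcript.strip():
            raise ValueError(f"expected 'file.wav#text', got: {line!r}")
        pairs.append((filename.strip(), transcript.strip()))
    return pairs


def check_reference_audio(text, audio_dir="f5-tts"):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    for filename, _ in parse_reference_lines(text):
        path = Path(audio_dir) / filename
        if path.suffix.lower() != ".wav":
            problems.append(f"{filename}: not a .wav file")
            continue
        if not path.is_file():
            problems.append(f"{filename}: not found in {audio_dir}")
            continue
        try:
            with wave.open(str(path), "rb") as wav:
                seconds = wav.getnframes() / wav.getframerate()
        except (wave.Error, EOFError):
            problems.append(f"{filename}: could not be read as PCM wav")
            continue
        if seconds >= MAX_SECONDS:
            problems.append(f"{filename}: {seconds:.1f}s, must be under 10s")
    return problems


if __name__ == "__main__":
    example = "1.wav#Hello dear friend\n2.wav#Hello friends"
    for problem in check_reference_audio(example):
        print(problem)
```

Run from the pyVideoTrans project directory, this prints one line per problem and nothing when every entry passes.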

Notes

  • The first time you use it, the model will be automatically downloaded from modelscope.cn, which may take a while. Please be patient.