
Parakeet-API: High-Performance Local Speech Transcription Service

The parakeet-api project is a local speech transcription service based on the NVIDIA Parakeet-tdt-0.6b model. It provides an OpenAI-compatible API and a clean Web UI, allowing you to quickly convert any audio or video file into high-accuracy SRT subtitles. It is also compatible with pyVideoTrans v3.72+.

Project repository: https://github.com/jianchang512/parakeet-api

Windows All-in-One Package Download:

Download Link: https://pan.baidu.com/s/1cCUyLw93hrn40eFP_qV6Dw?pwd=1234

Usage: After extracting, double-click 启动.bat (Start.bat). When the interface shown below appears and your browser opens automatically, the service has launched successfully.

[Screenshot: successful launch interface]

Usage with pyVideoTrans

Parakeet-API can be seamlessly integrated with the video translation tool pyVideoTrans (version v3.72 and above).

  1. Ensure your parakeet-api service is running locally.
  2. Open the pyVideoTrans software.
  3. In the menu bar, select Speech Recognition(R) -> Nvidia parakeet-tdt.
  4. In the pop-up configuration window, set the "http address" to: http://127.0.0.1:5092/v1
  5. Click "Save", and you can start using it.
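Before wiring up pyVideoTrans, you can confirm that the local service is actually reachable at the address configured above. A minimal sketch using only the Python standard library (`service_reachable` is an illustrative helper, not part of the project's code):

```python
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:5092/v1"  # the "http address" entered in pyVideoTrans

def service_reachable(base_url: str = BASE_URL, timeout: float = 3.0) -> bool:
    """Return True if something is listening at the given base URL."""
    try:
        urllib.request.urlopen(base_url, timeout=timeout)
        return True
    except urllib.error.HTTPError:
        # Any HTTP response (even 404) still means the server is up.
        return True
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("parakeet-api reachable:", service_reachable())
```

If this prints `False`, start the service first (see "Step 3: Start the Service" below under Source Code Deployment, or run the all-in-one package).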


Source Code Deployment

🛠️ Installation and Configuration Guide

This project supports Windows, macOS, and Linux. Please follow the steps below for installation and configuration.

Step 0: Set up Python 3.10 Environment

If you don't have Python 3 installed locally, please install it following this tutorial: https://pvt9.com/_posts/pythoninstall

Step 1: Prepare FFmpeg

This project uses ffmpeg for audio/video format preprocessing.

  • Windows (Recommended):

    1. Download from FFmpeg GitHub Repository. After extracting, you will get ffmpeg.exe.
    2. Place the downloaded ffmpeg.exe file directly in the root directory of this project (at the same level as the app.py file). The program will automatically detect and use it; no environment variable configuration is needed.
  • macOS (using Homebrew):

    ```bash
    brew install ffmpeg
    ```
  • Linux (Debian/Ubuntu):

    ```bash
    sudo apt update && sudo apt install ffmpeg
    ```

Step 2: Create Python Virtual Environment and Install Dependencies

  1. Download or clone this project's code to your local computer (recommended to place it in a folder with an English or numeric name on a non-system drive).

  2. Open a terminal or command-line tool and navigate to the project root directory (on Windows, simply type cmd in the folder address bar and press Enter).

  3. Create a virtual environment: python -m venv venv

  4. Activate the virtual environment:

    • Windows (CMD/PowerShell): .\venv\Scripts\activate
    • macOS / Linux (Bash/Zsh): source venv/bin/activate
  5. Install dependencies:

    • If you do NOT have an NVIDIA GPU (CPU only):

      ```bash
      pip install -r requirements.txt
      ```
    • If you have an NVIDIA GPU (using GPU acceleration):

      a. Ensure you have the latest NVIDIA driver and the corresponding CUDA Toolkit installed.
      b. Uninstall any existing PyTorch version: pip uninstall -y torch
      c. Install PyTorch matching your CUDA version (using CUDA 12.6 as an example):

      ```bash
      pip install torch --index-url https://download.pytorch.org/whl/cu126
      ```
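After installing, you can check whether the GPU build of PyTorch is actually active; a minimal check (the fallback when torch is missing is this sketch's own behavior, not the project's):

```python
def gpu_available() -> bool:
    """Return True if PyTorch is installed with working CUDA support."""
    try:
        import torch  # deferred so the check also runs before installation
    except ImportError:
        return False
    return torch.cuda.is_available()

if __name__ == "__main__":
    if gpu_available():
        import torch
        print("CUDA build active, CUDA version:", torch.version.cuda)
    else:
        print("Running on CPU (no usable CUDA build of PyTorch found)")
```

If this reports CPU despite installing the cu126 wheel, double-check that your NVIDIA driver version supports the CUDA version you picked.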

Step 3: Start the Service

In the terminal with the virtual environment activated, run the following command:

```bash
python app.py
```

You will see messages indicating the service is starting. The first run downloads the model (approx. 1.2 GB), so please be patient.

Any warnings printed during startup can generally be ignored.

[Screenshot: successful launch interface]

🚀 How to Use

Method 1: Using the Web Interface

  1. Open in your browser: http://127.0.0.1:5092
  2. Drag and drop or click to upload your audio/video file.
  3. Click "Start Transcription"; when processing completes, you can view and download the SRT subtitles below.

Method 2: API Call (Python Example)

You can easily call this service using the openai library.

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:5092/v1",
    api_key="any-key",  # the local service accepts any key
)

with open("your_audio.mp3", "rb") as audio_file:
    srt_result = client.audio.transcriptions.create(
        model="parakeet",
        file=audio_file,
        response_format="srt",
    )
print(srt_result)
```
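If you prefer not to install the openai package, the same endpoint can be called with a plain multipart POST. A hedged sketch using only the standard library; the /v1/audio/transcriptions path and the field names follow the OpenAI API convention and are assumed to be what this service expects:

```python
import io
import urllib.request
import uuid

def build_multipart(fields: dict, file_name: str, file_bytes: bytes):
    """Assemble a multipart/form-data body by hand; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode()
    )
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"

if __name__ == "__main__":
    with open("your_audio.mp3", "rb") as f:
        body, ctype = build_multipart(
            {"model": "parakeet", "response_format": "srt"}, "your_audio.mp3", f.read()
        )
    req = urllib.request.Request(
        "http://127.0.0.1:5092/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": ctype, "Authorization": "Bearer any-key"},
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.read().decode("utf-8"))  # the SRT text
```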