STT Studio

Transcription with diarization and timestamps.

Speech Workspace

Convert video to audio, get a transcription, and ask questions in one workspace

The interface is tuned for fast results: minimal extra actions, clear progress, and convenient export and copy for every block.

Browser FFmpeg fallback · Diarization + timestamps · Q&A on results

How it works

  1. Upload an audio/video file
  2. Check parameters and start processing
  3. Review results, export, and ask questions

Upload

Audio / Video

Drag a file here

Or click to choose from your device

MP3, WAV, FLAC, MP4, MOV, AVI, WEBM

For video: we first try to extract the audio in the browser (16 kHz mono WAV); if extraction fails, the original file is uploaded instead.

No file selected

Parameters

Speaker diarization
Disable if you only need words and SRT

WhisperX-like profile: speaker is assigned by maximum interval overlap.
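As a rough illustration of assignment by maximum interval overlap, the sketch below gives a transcript segment to the speaker whose diarization turns share the most time with it. The `Interval` type, the function names, and the choice to sum overlap per speaker are assumptions made for this example, not this tool's actual code.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    start: float  # seconds
    end: float    # seconds
    label: str    # speaker label for a diarization turn, text for a transcript segment


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speaker(segment, turns):
    """Return the speaker whose turns overlap the segment the most, or None."""
    totals = {}  # speaker label -> total overlapping seconds (assumed: summed per speaker)
    for turn in turns:
        shared = overlap(segment.start, segment.end, turn.start, turn.end)
        if shared > 0:
            totals[turn.label] = totals.get(turn.label, 0.0) + shared
    if not totals:
        return None  # no diarization turn touches this segment
    return max(totals, key=totals.get)


turns = [Interval(0.0, 4.0, "SPEAKER_00"), Interval(4.0, 9.0, "SPEAKER_01")]
segment = Interval(3.0, 7.0, "so what do you think")
print(assign_speaker(segment, turns))  # SPEAKER_01 (3 s of overlap vs 1 s)
```

The same function can be applied to individual words instead of whole segments when word timestamps are available.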

Advanced speaker settings

Leave fields empty if you need auto-estimation. Default num_speakers = 4.

Actions

Download JSON

Progress

Preparing... 0%
Prepare
Upload
Process
Done

Results

Done
Duration
Processing time
Language
Speakers

Raw transcription

Speaker intervals

Speaker Start End
No intervals to display.

Merged speaker speech

Speaker Start End Text
No segments to display.
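One plausible way to build the merged view is to join consecutive segments that share a speaker into a single row. The sketch below assumes segments are dictionaries with `speaker`, `start`, `end`, and `text` keys sorted by start time; the `max_gap` threshold is a hypothetical parameter, not a documented setting.

```python
def merge_by_speaker(segments, max_gap=1.0):
    """Merge adjacent segments spoken by the same speaker.

    segments: list of dicts with "speaker", "start", "end", "text", sorted by start.
    max_gap: hypothetical threshold in seconds; a longer pause starts a new row.
    """
    merged = []
    for seg in segments:
        last = merged[-1] if merged else None
        same_speaker = last is not None and last["speaker"] == seg["speaker"]
        if same_speaker and seg["start"] - last["end"] <= max_gap:
            # Extend the previous row instead of starting a new one.
            last["end"] = max(last["end"], seg["end"])
            last["text"] = f'{last["text"]} {seg["text"]}'.strip()
        else:
            merged.append(dict(seg))  # copy so the input list is not mutated
    return merged


segments = [
    {"speaker": "A", "start": 0.0, "end": 1.0, "text": "Hello"},
    {"speaker": "A", "start": 1.2, "end": 2.0, "text": "there."},
    {"speaker": "B", "start": 2.1, "end": 3.0, "text": "Hi."},
]
for row in merge_by_speaker(segments):
    print(row["speaker"], row["start"], row["end"], row["text"])
```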

Speaker SRT
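For readers unfamiliar with the format, here is a minimal sketch of how speaker-labelled segments could be rendered as SRT cues. The `[SPEAKER] text` prefix and the function names are assumptions for illustration; the exported file may label speakers differently.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_speaker_srt(segments):
    """Render segments (dicts with speaker/start/end/text) as SRT text."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"[{seg['speaker']}] {seg['text']}\n"
        )
    return "\n".join(cues)  # cues are separated by a blank line


print(to_speaker_srt([
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 2.5, "text": "Hello."},
    {"speaker": "SPEAKER_01", "start": 2.5, "end": 4.0, "text": "Hi there."},
]))
```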

Word timestamps

Word Start End
No word timestamps.

Q&A on transcription