STT Studio

Realtime ASR for voice control in online products.

Voice Commands

Dictate a command and get the recognized text immediately, with no queue and no diarization.

This page uses a dedicated synchronous endpoint `/asr/recognize`: low latency, fast feedback, and a response format that is easy to integrate into frontend voice interfaces.

No queue | Up to 5 minutes | Word timestamps

Typical integration flow

  1. Browser speech recording (`MediaRecorder`)
  2. `POST /asr/recognize` with an audio file
  3. Parse `text/words` and execute the command in your product
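
The flow above can be sketched roughly as follows. This is a minimal illustration, not the service's documented client: the multipart field name `file`, the `baseUrl` parameter, the `word`/`start`/`end` field names, and the `commands` table are all assumptions made for the example. Only the `/asr/recognize` path and the `text`/`words` keys come from this page.

```typescript
// Assumed response shape; only `text` and `words` are named on this page.
interface RecognizedWord {
  word: string;
  start: number; // assumed to be seconds
  end: number;   // assumed to be seconds
}

interface RecognizeResponse {
  text: string;
  words: RecognizedWord[];
}

type CommandHandler = () => string;

// Hypothetical command table, for illustration only.
const commands = new Map<string, CommandHandler>([
  ["open settings", () => "settings-opened"],
  ["go back", () => "navigated-back"],
]);

// Lowercase, drop common punctuation, and collapse whitespace.
function normalize(text: string): string {
  return text
    .toLowerCase()
    .replace(/[.,!?;:]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the handler for the recognized text, or null when nothing matches.
function matchCommand(text: string): CommandHandler | null {
  return commands.get(normalize(text)) ?? null;
}

// Step 2: send the audio. The field name "file" is an assumption.
async function recognize(audio: Blob, baseUrl: string): Promise<RecognizeResponse> {
  const form = new FormData();
  form.append("file", audio, "command.webm");
  const res = await fetch(`${baseUrl}/asr/recognize`, { method: "POST", body: form });
  if (!res.ok) {
    throw new Error(`Recognition failed: HTTP ${res.status}`);
  }
  return (await res.json()) as RecognizeResponse;
}

// Step 3: parse the result and run the matching command, if any.
async function handleAudio(audio: Blob, baseUrl: string): Promise<string | null> {
  const result = await recognize(audio, baseUrl);
  const handler = matchCommand(result.text);
  return handler ? handler() : null;
}
```

Exact-match lookup keeps the sketch short; a real product would likely want fuzzier matching or a small grammar, since recognized text rarely matches a command string character for character.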

Session

Realtime

If `API_KEY` is not configured on the server, you can leave the API key field empty.

Dictation

Microphone
Ready 00:00 / 05:00

After stopping, the recording is automatically sent for recognition.
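
One way to get this behavior in the browser is sketched below: record with `MediaRecorder`, stop automatically at the 5-minute limit, and hand the resulting blob to a callback as soon as recording stops. The names `recordCommand`, `formatTimer`, and `onDone` are hypothetical, and the page's own implementation may differ.

```typescript
// Recording limit shown in the timer on this page.
const MAX_RECORDING_MS = 5 * 60 * 1000;

// Zero-pad a number to two digits.
function pad2(n: number): string {
  return (n < 10 ? "0" : "") + String(n);
}

// Format milliseconds as MM:SS.
function formatClock(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  return `${pad2(Math.floor(totalSeconds / 60))}:${pad2(totalSeconds % 60)}`;
}

// Build the "elapsed / limit" label, clamping elapsed time to the limit.
function formatTimer(elapsedMs: number): string {
  const clamped = Math.min(Math.max(elapsedMs, 0), MAX_RECORDING_MS);
  return `${formatClock(clamped)} / ${formatClock(MAX_RECORDING_MS)}`;
}

// Start recording; resolves to a function that stops the recording manually.
// `onDone` receives the audio once recording stops, either manually or at the limit.
async function recordCommand(onDone: (audio: Blob) => void): Promise<() => void> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];

  recorder.ondataavailable = (event) => {
    if (event.data.size > 0) chunks.push(event.data);
  };

  // Enforce the 5-minute limit.
  const limitTimer = setTimeout(() => {
    if (recorder.state === "recording") recorder.stop();
  }, MAX_RECORDING_MS);

  recorder.onstop = () => {
    clearTimeout(limitTimer);
    stream.getTracks().forEach((track) => track.stop()); // release the microphone
    onDone(new Blob(chunks, { type: recorder.mimeType }));
  };

  recorder.start();
  return () => {
    if (recorder.state === "recording") recorder.stop();
  };
}
```

Passing the upload step in as `onDone` keeps recording separate from networking, so the same callback can also serve the file-upload path.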

Waiting for recording or audio file upload.

File upload

Audio Upload
No file selected

Recognized text

Audio duration -
Recognition time -
Language -
Words 0
Recognition result will appear here...

Word-level timestamps

Word | Start | End
Word timestamps will appear here after recognition.
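
A table like this could be filled from the `words` array along the following lines. This is a sketch only: the field names `word`, `start`, and `end`, and the assumption that times are in seconds, are inferred from the column headers rather than confirmed by this page.

```typescript
// Assumed shape of one entry in `words`.
interface TimedWord {
  word: string;
  start: number; // assumed seconds
  end: number;   // assumed seconds
}

// Convert words into [word, start, end] rows with two-decimal timestamps.
function toTimestampRows(words: TimedWord[]): string[][] {
  return words.map((w) => [w.word, w.start.toFixed(2), w.end.toFixed(2)]);
}

// Render a plain-text, tab-separated table with a header row.
function renderTimestampTable(words: TimedWord[]): string {
  const header: string[][] = [["Word", "Start", "End"]];
  return header
    .concat(toTimestampRows(words))
    .map((row) => row.join("\t"))
    .join("\n");
}
```

For a voice-command UI, the same rows can also drive highlighting of individual words during audio playback.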