STT Studio - Realtime ASR Demo

Voice Commands

Dictate a command and get the recognized text instantly, with no queue and no diarization.

This page uses a dedicated synchronous endpoint `/asr/recognize`: low latency, fast feedback, and a response format that is easy to integrate into frontend voice interfaces.

No queue · Up to 5 minutes · Word timestamps

Typical integration flow

1. Browser speech recording (`MediaRecorder`)
2. `POST /asr/recognize` with an audio file
3. Parse `text/words` and execute the command in your product
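
The flow above could be sketched in TypeScript roughly as follows. Only the `/asr/recognize` path, the `language` parameter, and the `text` and `words` names come from this page. The multipart field name `file`, the `X-API-Key` header, and the per-word shape `{ word, start, end }` are assumptions made for illustration and should be checked against the real server.

```typescript
// Assumed response shape; only `text` and `words` are named on the page.
interface Word { word: string; start: number; end: number }
interface Recognition { text: string; words: Word[] }

// Minimal fetch-like signature so the network call can be stubbed.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: FormData },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Validate the JSON body instead of trusting it blindly.
function parseRecognition(body: unknown): Recognition {
  const obj = (body ?? {}) as { text?: unknown; words?: unknown };
  if (typeof obj.text !== "string") {
    throw new Error("response has no string `text` field");
  }
  const words = Array.isArray(obj.words) ? (obj.words as Word[]) : [];
  return { text: obj.text, words };
}

async function recognize(
  audio: Blob,
  opts: { baseUrl?: string; apiKey?: string; language?: string; fetchFn?: FetchLike } = {},
): Promise<Recognition> {
  const form = new FormData();
  form.append("file", audio, "command.webm"); // field name is an assumption
  if (opts.language) form.append("language", opts.language);

  const headers: Record<string, string> = {};
  if (opts.apiKey) headers["X-API-Key"] = opts.apiKey; // header name is an assumption

  const doFetch = opts.fetchFn ?? (fetch as unknown as FetchLike);
  const res = await doFetch(`${opts.baseUrl ?? ""}/asr/recognize`, {
    method: "POST",
    headers,
    body: form,
  });
  if (!res.ok) throw new Error(`recognition failed with HTTP ${res.status}`);
  return parseRecognition(await res.json());
}
```

In a browser, the `Blob` would typically be assembled from the chunks delivered by `MediaRecorder`'s `dataavailable` events, and the returned `text` would then be matched against your product's command list.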

Session

Realtime

API key

language (optional)

If `API_KEY` is not configured on the server, you can leave this field empty.

Dictation

Microphone

Ready 00:00 / 05:00

After stopping, the recording is automatically sent for recognition.
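
The `00:00 / 05:00` counter and the five-minute cap can be modeled with a few pure helpers. This is a minimal sketch, not the page's actual code: the names `formatClock`, `timerLabel`, and `shouldAutoStop` are hypothetical, and the limit is taken from the "Up to 5 minutes" badge.

```typescript
const MAX_SECONDS = 5 * 60; // the page advertises "Up to 5 minutes"

// Format a number of seconds as zero-padded MM:SS.
function formatClock(totalSeconds: number): string {
  const s = Math.max(0, Math.floor(totalSeconds));
  const mm = String(Math.floor(s / 60)).padStart(2, "0");
  const ss = String(s % 60).padStart(2, "0");
  return `${mm}:${ss}`;
}

// Build the "elapsed / limit" label, never showing more than the limit.
function timerLabel(elapsedSeconds: number): string {
  const clamped = Math.min(elapsedSeconds, MAX_SECONDS);
  return `${formatClock(clamped)} / ${formatClock(MAX_SECONDS)}`;
}

// True once the recording has reached the maximum duration.
function shouldAutoStop(elapsedSeconds: number): boolean {
  return elapsedSeconds >= MAX_SECONDS;
}
```

In the browser, a periodic timer could call `recorder.stop()` on the `MediaRecorder` instance once `shouldAutoStop` returns true, and the handler for the recorder's `stop` event would then upload the audio.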

Waiting for recording or audio file upload.

File upload

Audio Upload

No file selected

Recognized text

Audio duration -

Recognition time -

Language -

Words 0

Recognition result will appear here...

Word-level timestamps

Word	Start	End
Word timestamps will appear here after recognition.
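
One way to fill the Word / Start / End table from a recognition result is sketched below. It assumes each entry in `words` has the shape `{ word, start, end }` with times in seconds. That shape is a guess based on the column names, not a documented contract.

```typescript
// Assumed per-word shape; times are taken to be seconds.
interface TimedWord { word: string; start: number; end: number }

// One row per word, with times rendered to two decimal places.
function toTimestampRows(words: TimedWord[]): string[][] {
  return words.map((w) => [w.word, w.start.toFixed(2), w.end.toFixed(2)]);
}

// Tab-separated rendering with the same header as the table on the page.
function toTsv(words: TimedWord[]): string {
  const header = ["Word", "Start", "End"];
  return [header, ...toTimestampRows(words)]
    .map((row) => row.join("\t"))
    .join("\n");
}
```

The rows from `toTimestampRows` can be rendered as table cells directly, and `toTsv` is convenient for copying the result into a spreadsheet.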