The Speech API provides automatic speech segmentation and prosodic annotation for multiple Indian languages. It accepts a WAV file and its transcription text, and returns a word-level segmentation with ToBI-style tonal labels.
POST /speech_api
Include your authorization key in the Authorization request header. The request body is a multipart form with the following fields:
| Parameter | Type | Description |
|---|---|---|
| languages | Form field | Language of the speech input (e.g., tamil, english, hindi) |
| wave_file | File (.wav) | Input speech waveform (16 kHz preferred) |
| text_file | File (.txt) | Corresponding transcription text |
The languages parameter must be set to one of the supported values, such as tamil, english, or hindi.
curl -X POST https://speech.snuchennai.edu.in/speech_api \
  -H "Authorization: W1VfgQ9GeVxfpCj79mFiCX0tsKFWSpqM" \
  -F "text_file=@text_file.txt" \
  -F "wave_file=@audio.wav" \
  -F "languages=hindi"
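The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_multipart` and `segment` are hypothetical, and the code assumes the response body is UTF-8 plain text, which the documentation above does not state explicitly.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://speech.snuchennai.edu.in/speech_api"


def build_multipart(fields, files):
    """Encode form fields and files as a multipart/form-data body.

    fields: dict mapping field name -> string value
    files:  dict mapping field name -> (filename, file bytes)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for name, (filename, data) in files.items():
        mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {mime}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def segment(api_key, wave_path, text_path, language):
    """POST the audio and transcript; return the response body as text."""
    wave_path, text_path = Path(wave_path), Path(text_path)
    body, content_type = build_multipart(
        {"languages": language},
        {
            "wave_file": (wave_path.name, wave_path.read_bytes()),
            "text_file": (text_path.name, text_path.read_bytes()),
        },
    )
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": api_key, "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the service replies with UTF-8 text.
        return response.read().decode("utf-8")
```

A call such as `segment(api_key, "audio.wav", "text_file.txt", "hindi")` would mirror the curl command above. Third-party HTTP libraries can build the multipart body for you; the manual encoding here just keeps the sketch dependency-free.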
The API returns a single line in which each word is followed by its tonal label in parentheses (wrapped here for readability):
ustaad (M_LHH) ko (S_LHH) txribyuutx (B_LHL) kii (S_L)
tarah (S_LLH) likhaa (S_LLH) gayaa (S_HLL) wyomeesh (B_H)
shukl (S_LHL) kaa (S_hat) gady (S_HLH) unkii (S_HHL)
kalaa (S_LHL) ko (S_LHL) saaqskrqtik (M_H) raajniiti (B_HLH)
kee (S_L) pratirodh (M_LHH) kee (S_LLH) saqkeetoq (M_LHH)
kii (S_L) tarah (S_LHH) deekhtaa (S_LHL) hei (S_HLL)
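To work with this output programmatically, the line can be split into word and label pairs. The sketch below assumes every token has the form `word (PREFIX_TONES)`, as in the sample above; the names `LabeledWord` and `parse_segmentation` are hypothetical, and the prefix (S, M, B) is treated as an opaque string because its meaning is not defined here.

```python
import re
from typing import List, NamedTuple


class LabeledWord(NamedTuple):
    word: str
    prefix: str  # text before the underscore, e.g. "S", "M", "B"
    tones: str   # text after the underscore, e.g. "LHH"


# One token looks like: word (PREFIX_TONES)
TOKEN = re.compile(r"(\S+)\s+\(([^_\s()]+)_([^\s()]+)\)")


def parse_segmentation(text: str) -> List[LabeledWord]:
    """Extract (word, prefix, tones) triples from an API response string."""
    return [LabeledWord(*match.groups()) for match in TOKEN.finditer(text)]


sample = "ustaad (M_LHH) ko (S_LHH) txribyuutx (B_LHL) kii (S_L)"
words = parse_segmentation(sample)
print(words[0])  # LabeledWord(word='ustaad', prefix='M', tones='LHH')
```

Tokens that do not match the assumed pattern are skipped silently, so comparing the number of parsed words against the number of words in the transcription is a reasonable sanity check.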