The Speech API provides automatic speech segmentation and prosodic annotation for multiple Indian languages. It accepts a WAV file and the corresponding transcription text, and returns a word-level segmentation with ToBI-style tonal labels.

Endpoint

POST /speech_api

Authorization

Include the following key as the value of the Authorization request header:

W1VfgQ9GeVxfpCj79mFiCX0tsKFWSpqM

Request Parameters

Parameter    Type          Description
languages    Form field    Language of the speech input (e.g., tamil, english, hindi)
wave_file    File (.wav)   Input speech waveform (16 kHz preferred)
text_file    File (.txt)   Corresponding transcription text

Supported Languages

The languages parameter must be set to one of the following supported values (the examples in this document pass the value in lowercase, e.g., hindi):

Assamese, Bengali, Bodo, Dogri, English, Gujarati, Hindi, Kannada, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Odia, Punjabi, Rajasthani, Sanskrit, Tamil, Telugu
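
A client may want to check the languages value before uploading any audio. The following is a minimal sketch, not part of the API itself: the names SUPPORTED_LANGUAGES and normalize_language are hypothetical, and the values are lowercased on the assumption that the API expects the lowercase form used in the example request. Whether capitalized names are also accepted is not stated here.

```python
# Values copied from the Supported Languages list, lowercased to match the
# example request. Assumption: the API expects the lowercase form.
SUPPORTED_LANGUAGES = frozenset({
    "assamese", "bengali", "bodo", "dogri", "english",
    "gujarati", "hindi", "kannada", "konkani", "maithili",
    "malayalam", "manipuri", "marathi", "nepali", "odia",
    "punjabi", "rajasthani", "sanskrit", "tamil", "telugu",
})


def normalize_language(name: str) -> str:
    """Return the lowercase language value, or raise ValueError if unsupported."""
    value = name.strip().lower()
    if value not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {name!r}")
    return value
```

For example, normalize_language("Hindi") returns "hindi", while normalize_language("french") raises ValueError.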

Example cURL Request

curl -X POST https://speech.snuchennai.edu.in/speech_api \
-H "Authorization: W1VfgQ9GeVxfpCj79mFiCX0tsKFWSpqM" \
-F "text_file=@text_file.txt" \
-F "wave_file=@audio.wav" \
-F "languages=hindi"
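
The same request can be sketched in Python using only the standard library. This is an illustrative equivalent of the cURL command, not an official client: the function names build_request and segment are hypothetical, the per-part content types (text/plain, audio/wav) and the 60-second timeout are assumptions, and the response is assumed to be UTF-8 text.

```python
import uuid
import urllib.request
from pathlib import Path

API_URL = "https://speech.snuchennai.edu.in/speech_api"


def build_request(api_key, language, wave_path, text_path):
    """Build (but do not send) a multipart/form-data POST for the Speech API."""
    boundary = uuid.uuid4().hex
    parts = []

    def add_field(name, value):
        # Plain form field, equivalent to curl -F "name=value"
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )

    def add_file(name, path, content_type):
        # File upload, equivalent to curl -F "name=@path"
        path = Path(path)
        header = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; filename="{path.name}"\r\n'
            f'Content-Type: {content_type}\r\n\r\n'
        ).encode("utf-8")
        parts.append(header + path.read_bytes() + b"\r\n")

    add_file("text_file", text_path, "text/plain")   # content type is an assumption
    add_file("wave_file", wave_path, "audio/wav")    # content type is an assumption
    add_field("languages", language)
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def segment(api_key, language, wave_path, text_path, timeout=60):
    """Send the request and return the response body as text (UTF-8 assumed)."""
    request = build_request(api_key, language, wave_path, text_path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

A call such as segment(key, "hindi", "audio.wav", "text_file.txt"), where key holds the authorization key, would then correspond to the cURL request above. Splitting request construction from sending keeps the multipart body easy to inspect without touching the network.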

Response Output

The API returns a single line of word-level segmentation, with each word followed by its tonal label in parentheses (the example below is wrapped for readability):

ustaad (M_LHH) ko (S_LHH) txribyuutx (B_LHL) kii (S_L)
tarah (S_LLH) likhaa (S_LLH) gayaa (S_HLL) wyomeesh (B_H)
shukl (S_LHL) kaa (S_hat) gady (S_HLH) unkii (S_HHL)
kalaa (S_LHL) ko (S_LHL) saaqskrqtik (M_H) raajniiti (B_HLH)
kee (S_L) pratirodh (M_LHH) kee (S_LLH) saqkeetoq (M_LHH)
kii (S_L) tarah (S_LHH) deekhtaa (S_LHL) hei (S_HLL)
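
Since the output shown above is a flat sequence of word (label) tokens, it can be split into pairs with a regular expression. The sketch below is one possible approach, assuming the response always follows the pattern in the sample (a word without spaces, whitespace, then a label in parentheses); the names TOKEN_PATTERN and parse_segmentation are hypothetical.

```python
import re

# One token looks like "ustaad (M_LHH)": a word, whitespace, then a
# parenthesized label. Assumption: words and labels contain no spaces.
TOKEN_PATTERN = re.compile(r"(\S+)\s+\(([^()\s]+)\)")


def parse_segmentation(response_text):
    """Return a list of (word, label) tuples in the order they appear."""
    return TOKEN_PATTERN.findall(response_text)
```

Applied to the first four tokens of the sample, parse_segmentation("ustaad (M_LHH) ko (S_LHH) txribyuutx (B_LHL) kii (S_L)") yields the pairs ("ustaad", "M_LHH"), ("ko", "S_LHH"), ("txribyuutx", "B_LHL"), and ("kii", "S_L"). Labels are kept as opaque strings because their internal structure is not described here.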