Speech API Documentation

The Speech API performs automatic speech segmentation and prosodic annotation for multiple Indian languages. It accepts a WAV file and the corresponding transcription text, and returns a word-level segmentation with ToBI-style tonal labels.

Endpoint

POST /speech_api

Authorization

Send the following key as the value of the Authorization request header:

W1VfgQ9GeVxfpCj79mFiCX0tsKFWSpqM

Request Parameters

Parameter	Type	Description
languages	Form field	Language of the speech input (e.g., tamil, english, hindi)
wave_file	File (.wav)	Input speech waveform (16 kHz sampling rate preferred)
text_file	File (.txt)	Transcription text corresponding to the audio in wave_file

Supported Languages

The languages parameter must be set to one of the following supported values:

Assamese, Bengali, Bodo, Dogri, English, Gujarati, Hindi, Kannada, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Odia, Punjabi, Rajasthani, Sanskrit, Tamil, Telugu
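
A client may want to check the languages value before uploading audio. The following is a minimal Python sketch, not part of the API itself. It assumes the server expects the lowercase form that appears in the parameter description and the cURL example (for example, hindi); this document does not state whether matching is case-sensitive. The helper name normalize_language is hypothetical.

```python
# Lowercase forms of the 20 supported language names listed above.
# Assumption: the API expects lowercase values, as in the "hindi" example.
SUPPORTED_LANGUAGES = frozenset({
    "assamese", "bengali", "bodo", "dogri", "english",
    "gujarati", "hindi", "kannada", "konkani", "maithili",
    "malayalam", "manipuri", "marathi", "nepali", "odia",
    "punjabi", "rajasthani", "sanskrit", "tamil", "telugu",
})


def normalize_language(name: str) -> str:
    """Return the lowercase language value, or raise ValueError if unsupported."""
    value = name.strip().lower()
    if value not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {name!r}")
    return value
```

Calling normalize_language("Hindi") returns "hindi", while an unlisted name such as "french" raises ValueError before any request is made.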

Example cURL Request

curl -X POST https://speech.snuchennai.edu.in/speech_api \
-H "Authorization: W1VfgQ9GeVxfpCj79mFiCX0tsKFWSpqM" \
-F "text_file=@text_file.txt" \
-F "wave_file=@audio.wav" \
-F "languages=hindi"
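
The same request can be sketched in Python using only the standard library. This is an illustrative equivalent of the cURL command above, not an official client: the function name build_request is hypothetical, and the per-part Content-Type values (text/plain, audio/wav) are assumptions, since the documentation does not specify them. The function only builds the request; it does not send it.

```python
import urllib.request
import uuid

API_URL = "https://speech.snuchennai.edu.in/speech_api"


def build_request(api_key, language, wav_bytes, text_bytes):
    """Build (but do not send) a multipart/form-data POST for the Speech API."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    crlf = b"\r\n"
    body = bytearray()

    def add_part(headers, payload):
        # Each part: boundary line, part headers, blank line, payload.
        body.extend(delimiter + crlf)
        for header in headers:
            body.extend(header.encode("utf-8") + crlf)
        body.extend(crlf + payload + crlf)

    # Field names mirror the -F options of the cURL example.
    add_part(
        ['Content-Disposition: form-data; name="text_file"; filename="text_file.txt"',
         "Content-Type: text/plain"],  # assumed content type
        text_bytes,
    )
    add_part(
        ['Content-Disposition: form-data; name="wave_file"; filename="audio.wav"',
         "Content-Type: audio/wav"],  # assumed content type
        wav_bytes,
    )
    add_part(['Content-Disposition: form-data; name="languages"'],
             language.encode("utf-8"))
    body.extend(delimiter + b"--" + crlf)  # closing boundary

    return urllib.request.Request(
        API_URL,
        data=bytes(body),
        method="POST",
        headers={
            "Authorization": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send the request, read audio.wav and text_file.txt in binary mode, pass their contents to build_request together with the authorization key and a language value, and hand the result to urllib.request.urlopen. The response body can then be decoded as text.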

Response Output

The API returns a single line of word-level segmentation with tonal labels: