Please use this identifier to cite or link to this item: http://10.1.7.192:80/jspui/handle/123456789/4867
Full metadata record
DC Field | Value | Language
dc.contributor.author | Soni, Sapna | -
dc.date.accessioned | 2014-08-21T10:14:19Z | -
dc.date.available | 2014-08-21T10:14:19Z | -
dc.date.issued | 2014-06-01 | -
dc.identifier.uri | http://hdl.handle.net/123456789/4867 | -
dc.description.abstract | A speech corpus (or spoken corpus) is a database of audio files and text transcriptions. In speech technology, speech corpora are used, among other things, to create acoustic models, which can then be used with a speech recognition engine; the speech corpus is the central element for training such a model. In linguistics, spoken corpora are used for research into phonetics, conversation analysis, dialectology and other fields. Creating a speech corpus is a laborious, expensive and time-consuming task. Speech files are recorded manually by many speakers and then transcribed, that is, the speech is converted to its corresponding text. There are typically two types of speech corpora: read speech, in which speakers are asked to read, for example, book excerpts, broadcast news, lists of words or sequences of numbers; and spontaneous speech, for example dialogs (between two or more people), narratives (a person telling a story), map tasks (one person explains a route on a map to another) and appointment tasks (two people try to find a common meeting time based on individual schedules). Building a speech recognition application for resource-deficient languages is a challenge because speech corpora are unavailable. This work proposes a mechanism for developing an inexpensive speech corpus for low-resource Indian languages by exploiting existing collections of online speech data to build a frugal speech corpus. | en_US
dc.publisher | Institute of Technology | en_US
dc.relation.ispartofseries | 12MICT41; | -
dc.subject | Computer 2012 | en_US
dc.subject | Project Report 2012 | en_US
dc.subject | Computer Project Report | en_US
dc.subject | Project Report | en_US
dc.subject | 12MICT | en_US
dc.subject | 12MICT41 | en_US
dc.subject | ICT | en_US
dc.subject | ICT 2012 | en_US
dc.subject | CE (ICT) | en_US
dc.title | Development of Frugal Speech Corpus for Low Resource Indian Languages | en_US
dc.type | Dissertation | en_US
Appears in Collections: Dissertation, CE (ICT)

Files in This Item:
File | Description | Size | Format
12MICT41.pdf | 12MICT41 | 3.46 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.