Please use this identifier to cite or link to this item: http://10.1.7.192:80/jspui/handle/123456789/11344
Title: Voice Activity Detection using Statistical and Machine Learning Approaches
Authors: Bhadiyadra, Dhwani
Keywords: Computer 2020
Project Report
Computer Project Report
Project Report 2020
20MCE
20MCED
20MCED03
CE (DS)
DS 2020
Issue Date: 1-Jun-2022
Publisher: Institute of Technology
Series/Report no.: 20MCED03
Abstract: Voice activity detection (VAD) is the problem of reliably identifying the regions of an audio signal that contain voice. As voice-driven applications become more widespread, the need for voice/speech activity detection keeps growing. This work proposes a VAD method for audio recordings in which voice appears either in the foreground or as background noise. The goal is a single model that detects voiced regions in both cases. The proposed model is tested on two datasets: a public dataset, in which voice is in the foreground, and a private dataset, in which voice is in the background. A detailed comparison of statistical and machine learning approaches to VAD is carried out, and the machine learning approach is found to perform better. Using the random forest algorithm, it achieves 86.38% testing accuracy and 94% testing sensitivity on the TIMIT corpus (public dataset), and 73.27% testing accuracy and 79.67% testing sensitivity on Philips Lumea recordings (private dataset).
URI: http://10.1.7.192:80/jspui/handle/123456789/11344
Appears in Collections: Dissertation, CE (DS)

Files in This Item:
File | Description | Size | Format
20MCED03.pdf | 20MCED03 | 1.63 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.