Audio Classification using Deep Neural Network

Shah, Viral

Please use this identifier to cite or link to this item: http://10.1.7.192:80/jspui/handle/123456789/8832

Title:	Audio Classification using Deep Neural Network
Authors:	Shah, Viral
Keywords:	EC 2016; Project Report; Project Report 2016; EC Project Report; EC (ES); Embedded Systems; Embedded Systems 2016; 16MEC; 16MECE; 16MECE21
Issue Date:	1-Jun-2018
Publisher:	Institute of Technology
Series/Report no.:	16MECE21;
Abstract:	Current surveillance systems can be designed to detect unusual activity in surveillance video through object detection, movement tracking, and activity monitoring. However, they are limited in recognizing unusual audio associated with the video, for example a gunshot in the background of a scene. In such cases, identifying the abnormal audio is important for surveillance and safety. This thesis focuses on the recognition and classification of audio signals through the implementation of an audio classification model. The model is based on convolutional neural networks and was trained on the UrbanSound8K and Speech Commands data sets. It extracts a spectrogram from the input audio signal and passes it as input to the convolutional neural network; the output of the network is the predicted label for the corresponding audio. The model can classify abnormal environmental sounds such as a gunshot, and it can also recognize short keywords such as "yes", "no", "up", and similar common spoken words. Classification was carried out on two different networks. The first was VGG16, an image classification model, which achieved an audio classification accuracy of 100% on the cleaned data and around 80% on the noisy data. From the VGG16 results, it was concluded that neural networks designed for image classification can also be employed for audio classification. The second was a 2-layer convolutional neural network, which achieved an audio classification accuracy of 93.00%. The contribution of this thesis is an exploration of a deep learning approach to the task of audio classification.
URI:	http://10.1.7.192:80/jspui/handle/123456789/8832
Appears in Collections:	Dissertation, EC (ES)
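
The abstract describes a pipeline in which a spectrogram is extracted from the audio signal and passed to a convolutional neural network. The following is a minimal sketch of the spectrogram step only, not the thesis implementation. The frame length, hop size, Hann window, log scaling, and test tone are all assumptions chosen for illustration, since this record does not state the parameters actually used. It relies only on the Python standard library and a naive DFT for clarity; a practical pipeline would more likely use an FFT from a numerical library.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a log-power spectrogram as a list of frames.

    Each frame holds frame_len // 2 + 1 values, one per frequency bin.
    frame_len and hop are illustrative defaults, not values from the thesis.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of bin k; slow, but easy to follow.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            # Small offset avoids log(0) for silent bins.
            row.append(math.log(abs(acc) ** 2 + 1e-10))
        frames.append(row)
    return frames


# Hypothetical input: a 1 kHz tone sampled at 8 kHz for 0.1 s.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]
spec = spectrogram(tone)

# The strongest bin in the first frame should sit at the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
print(len(spec), len(spec[0]), peak_hz)
```

The resulting two-dimensional array (time frames by frequency bins) can be treated like a single-channel image, which is what allows an image classification network such as VGG16 to be applied to audio.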

Files in This Item:

File	Description	Size	Format
16MECE21.pdf	16MECE21	3.16 MB	Adobe PDF

IR @ Nirma University