Direction of Arrival with Neural Networks project demonstration
Demonstration video of my B.Eng. project
The scope of this project is to implement a real-time, deep-neural-network-based direction of arrival (DOA) estimation algorithm that can localize multiple sound sources on an embedded computer. The symmetry of a 6-microphone uniform circular array (UCA) is exploited as follows: beamforming finds the reference microphone closest to a source, and the generalized cross-correlation with phase transform (GCC-PHAT) matrix, computed for all microphone pair combinations, is fed into a multilayer perceptron (MLP) and a convolutional neural network (CNN) that were trained to predict the DOA within a 60-degree sector. This way all 360 degrees are covered, but the classification task of the MLP is reduced to 60 classes, which relaxes the training data requirement.
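To make the feature extraction concrete, here is a minimal NumPy sketch of GCC-PHAT computed for every microphone pair, plus one possible way to turn a reference microphone and a 60-class prediction into a full 360-degree angle. It is not the project's actual code. The function names, the `max_lag` parameter, the assumption that microphone k sits at k × 60 degrees, and the assumption that each sector is centered on its reference microphone are all hypothetical choices made for illustration.

```python
from itertools import combinations

import numpy as np


def gcc_phat(x, y, max_lag):
    """GCC-PHAT of two equal-length signals for lags -max_lag..+max_lag.

    A positive peak lag means x is delayed relative to y. max_lag must be >= 1.
    """
    n = len(x) + len(y)  # zero-pad so the correlation is linear, not circular
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)
    cross /= np.abs(cross) + 1e-12  # phase transform: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    # Negative lags live at the end of the array; move them to the front.
    return np.concatenate((cc[-max_lag:], cc[: max_lag + 1]))


def gcc_phat_matrix(frames, max_lag):
    """frames has shape (n_mics, n_samples); returns (n_pairs, 2 * max_lag + 1)."""
    pairs = combinations(range(frames.shape[0]), 2)
    return np.stack([gcc_phat(frames[i], frames[j], max_lag) for i, j in pairs])


def global_doa(ref_mic, local_class, n_mics=6):
    """Map a sector-local class to a 0..359 degree angle.

    Assumption (not from the source): mic k sits at k * 60 degrees and the
    60-degree sector is centered on the reference microphone.
    """
    sector = 360 // n_mics
    return (ref_mic * sector + local_class - sector // 2) % 360
```

With 6 microphones there are 15 pairs, so `gcc_phat_matrix` returns a 15-row matrix for each audio frame. A network trained on such matrices only needs to distinguish the 60 angles inside one sector, and a mapping like `global_doa` restores the absolute direction.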
A GUI was made that includes a main menu, a rotation controller page, a data collection page and the DOA estimation page. The main menu lets you navigate to the other pages. The rotation controller page lets you enter the desired angle in degrees and sends a command to the Arduino to rotate the automatic platform to that angle. The data collection page lets you enter the starting angle, the stopping angle, the angle resolution and the number of audio samples to record at each angle; it then automatically records a .wav dataset and the .csv files with the GCC data used in neural network training. The DOA estimation page is a demonstration of the DOA algorithm, with a graph that indicates the DOA, a scale that adjusts the energy threshold of the voice activity detection, a choice between the 360-degree algorithm and the 60-degree implementation with beamforming, and a checkbox that includes the MUSIC algorithm.
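The energy threshold mentioned above can be illustrated with a small energy-based voice activity detector. This is only a sketch under assumptions: the decibel scale, the function names and the example threshold are hypothetical, and the project's actual detector may work differently.

```python
import numpy as np


def frame_energy_db(frame):
    """RMS level of one audio frame in dB (relative to full scale 1.0)."""
    samples = np.asarray(frame, dtype=np.float64)
    rms = np.sqrt(np.mean(samples ** 2))
    return 20.0 * np.log10(rms + 1e-12)  # small offset avoids log10(0)


def is_active(frame, threshold_db):
    """True when the frame's energy exceeds the user-chosen threshold."""
    return bool(frame_energy_db(frame) > threshold_db)
```

In a setup like this, the GUI scale would set `threshold_db`, and DOA estimation would run only on frames for which `is_active` returns true, so that silence does not produce spurious direction estimates.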
Song used for testing: "Hey Jude" (Remastered 2015) by The Beatles