Underwater Creatures

Current Research: 
Classification of Underwater Mammals

Uncovering the Latest Findings


Implementing the Classifier



Classifier Overview

April 12, 2022

To create our mammal classifier, we used the Classification Learner app within MATLAB. The app is quite convenient: we were able to upload our data, choose a classification algorithm, and adjust the algorithm's settings directly within the app. Its main benefit is that it makes testing many different classification algorithms very easy. We plan to use the Decision Tree and Support Vector Machine families of algorithms.


Classification Algorithm Selection

April 12, 2022

The Decision Tree family of algorithms could work well for our data set, as we believe it may be feasible to describe each species with a well-defined set of rules. Decision Trees generally excel at classifying data that conforms to binary rules (if rule one, else rule two). We hypothesize that data such as single-sided amplitude spectrum peaks could work well in this model, since rules can be created based on the number of peaks at certain frequencies. One downside of Decision Trees is their tendency to overfit the training data. Overfitting decreases the accuracy of the model because the classifier becomes biased toward the noise fluctuations within the data. Since our data set tends to have noisy samples, the model will be influenced by that noise.
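The overfitting risk described above is easy to demonstrate. Below is a minimal Python/scikit-learn sketch (our project itself used MATLAB's Classification Learner, so names and data here are purely illustrative): an unconstrained decision tree trained on noisy, uninformative features memorizes the training set perfectly while doing no better than chance on new data.

```python
# Illustrative sketch only (the project used MATLAB, not scikit-learn):
# a depth-unlimited decision tree memorizes noisy training data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 3))       # 3 hypothetical audio features
y_train = rng.integers(0, 10, size=100)   # 10 species labels, pure noise here
X_test = rng.normal(size=(40, 3))
y_test = rng.integers(0, 10, size=40)

tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
train_acc = tree.score(X_train, y_train)  # memorizes the noise: 1.0
test_acc = tree.score(X_test, y_test)     # near chance on unseen noise
```

Limiting tree depth or requiring a minimum number of samples per leaf are the usual guards against this behavior.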


The Support Vector Machine (SVM) family of algorithms could also work well with our data. We expect the SVM to excel because our data occupies a high-dimensional space: it contains 723 data points across three features. The SVM family contains multiple subcategories such as linear, quadratic, cubic, and Gaussian kernels; with each of these, the shape of the decision boundary can be adjusted to fit the data in an optimal way. Another advantage of the SVM is that we can guard against overfitting by changing the kernel or two inner settings of the algorithm: the kernel scale and the box constraint level. The kernel scale can be lowered for very spread-out data or raised for concentrated data. The box constraint level increases the penalty for an incorrect training categorization, which results in more strictly separated classes. As we have 10 species, we predict that a higher box constraint level will likely lead to higher training accuracy because it will more strictly separate species that could be very similar.
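The kernel and penalty settings described above translate directly to other SVM implementations. As a hedged illustration (we used MATLAB's Classification Learner, not scikit-learn; the synthetic data below is hypothetical), MATLAB's "box constraint level" corresponds to scikit-learn's `C` parameter, and "kernel scale" corresponds inversely to the RBF `gamma`:

```python
# Sketch of the SVM knobs discussed above, in scikit-learn terms
# (illustrative stand-in for the MATLAB Classification Learner settings).
from sklearn.svm import SVC
from sklearn.datasets import make_classification

# Synthetic 4-class data with 3 features, loosely mirroring our setup.
X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=4, n_clusters_per_class=1,
                           random_state=0)

scores = {}
for kernel in ["linear", "poly", "rbf"]:   # linear / cubic / Gaussian analogs
    # C plays the role of the box constraint level; gamma relates to kernel scale.
    clf = SVC(kernel=kernel, degree=3, C=10.0, gamma="scale").fit(X, y)
    scores[kernel] = clf.score(X, y)
```

Raising `C` tightens the margin around the training data (higher training accuracy, more overfitting risk), which matches the trade-off we expect from the box constraint level.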


Testing Layout

April 12, 2022

Initially, our plan is to test the classifier on the more general categories of whale and dolphin audio signals. This stage will use all three of our audio features to train the classifier: spectral centroid, Mel-frequency cepstral coefficients, and single-sided amplitude spectrum peaks. Narrowing down to three features increases the probability that the classifier will find a pattern in the data from which to build general classification rules. Next, the general mammal categories will be split into the 10 sub-species of whales and dolphins. The samples will be tested in four conditions: unfiltered, high-pass filtered, filtered with our targeted moving-average filter, and filtered with a Wiener filter. Our goal with this more targeted filter test is to raise subspecies classification accuracy to the level of whale-vs-dolphin accuracy by filtering the input audio signals.
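Of the three features named above, the spectral centroid is the simplest to compute: it is the magnitude-weighted mean frequency of a signal's spectrum. A minimal NumPy sketch (our project computed features in MATLAB; the tone below is a hypothetical stand-in for a mammal call) looks like:

```python
# Sketch of one of our three features: the spectral centroid via a plain FFT.
# NumPy stand-in for our MATLAB feature extraction; the signal is synthetic.
import numpy as np

def spectral_centroid(x, fs):
    """Magnitude-weighted mean frequency of the signal's spectrum (Hz)."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    return float(np.sum(freqs * mag) / np.sum(mag))

fs = 8000                            # hypothetical sampling rate
t = np.arange(fs) / fs               # 1 s of signal
tone = np.sin(2 * np.pi * 440 * t)   # a pure 440 Hz tone
centroid = spectral_centroid(tone, fs)   # close to 440 for a pure tone
```

For a pure tone the centroid sits at the tone's frequency; for real whale and dolphin calls it summarizes where the call's energy is concentrated.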


Testing Steps

Categorize data in terms of whale and dolphin:

  • Run 3 trials (Unfiltered, High-Pass, Mega-Filter)

  • For each trial, use all features (Centroid, Mel, Peaks)

  • For classification, use Decision Tree and SVM

  • Obtain training accuracy

  • Test the classifier and obtain test accuracy

Categorize data in terms of 10 species:

  • Run 3 trials (Unfiltered, High-Pass, Mega-Filter)

  • For each trial, test combined features

  • For classification, use Decision Tree and SVM

  • Obtain training accuracy

  • Test the classifier and obtain test accuracy
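The two test matrices above amount to a grid of trials: each filter condition crossed with each classifier family, recording training and test accuracy. A hypothetical Python harness for that grid (our trials actually ran in MATLAB; the data and filter names below are placeholders) might look like:

```python
# Hypothetical harness for the trial grid above. Filter names are labels
# only; in the real trials, audio was filtered before feature extraction.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=0)
# 80-20 train/test split, as in our data plan.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

results = {}
for trial in ["unfiltered", "high-pass", "mega-filter"]:
    for name, clf in [("tree", DecisionTreeClassifier(random_state=0)),
                      ("svm", SVC())]:
        clf.fit(X_tr, y_tr)
        # (training accuracy, test accuracy) per filter/classifier pair
        results[(trial, name)] = (clf.score(X_tr, y_tr), clf.score(X_te, y_te))
```

Comparing the two accuracies per cell is what reveals overfitting: a large train-test gap flags a classifier that learned the noise rather than the species.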


Classifier Data

Data Collection and Data Augmentation

The data used in our project came from an online audio database: the Watkins Marine Mammal Sound Database (Woods Hole Oceanographic Institution and the New Bedford Whaling Museum). This database covers over 60 marine mammal species with approximately 15,000 labeled samples. The samples vary in length and sampling rate, and the level of noise within them also varies, which encouraged us to design a general filter that achieves the most noise reduction across the majority of samples. For the data used in the algorithm, we chose to follow the 80-20 rule commonly used in machine learning applications: 80% of our data is used to train and 20% to test. For each of the 10 species, 50 samples are used for training and 10 samples for testing.


When collecting training and testing data, we used data augmentation to increase the number of data samples without pulling more samples from the online database. Data augmentation is the process of transforming existing samples, for example by shifting, rotating, or reflecting them, to create new ones. It is useful because it increases the richness and depth of the training set, which improves accuracy and robustness. For our project, we linearly shifted our data samples 0.25 s forward in time to introduce a new set of 0.5 s samples. By doing this, we nearly doubled our data set size, which aims to increase our classifier's accuracy.
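The time-shift augmentation described above can be sketched in a few lines of NumPy (our augmentation was done in MATLAB; the recording length and sampling rate below are hypothetical): cut 0.5 s windows from a recording at the original offsets, then again at offsets shifted forward by 0.25 s.

```python
# Sketch of the 0.25 s shift augmentation: two passes of non-overlapping
# 0.5 s windows, the second offset by half a window, nearly doubling the
# sample count. Illustrative NumPy stand-in for our MATLAB pipeline.
import numpy as np

def windows(x, fs, win_s=0.5, offset_s=0.0):
    win, off = int(win_s * fs), int(offset_s * fs)
    return [x[i:i + win] for i in range(off, len(x) - win + 1, win)]

fs = 4000                                             # hypothetical rate
audio = np.random.default_rng(0).normal(size=5 * fs)  # 5 s stand-in recording

original = windows(audio, fs)                 # 0.5 s windows from offset 0
shifted = windows(audio, fs, offset_s=0.25)   # same windows shifted 0.25 s
```

The shifted windows overlap the originals by half, so they are correlated with them, but they still expose the classifier to calls at different positions within the analysis window.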


In addition to data augmentation, we used 5-fold cross-validation during classifier training to make better use of our limited data. Cross-validation is another resampling technique for training on a limited data set. With our data, 5-fold means the 50 training samples are broken into 5 groups of 10; the model is repeatedly trained on four groups and validated on the held-out fifth, and the results are combined to find the model with the highest accuracy. After research, we concluded that choosing a K value between 5 and 10 is the best practice for finding a reasonable model. Additionally, choosing an ideal K value can help reduce the bias of the model, which increases accuracy when testing with new samples. The combination of cross-validation and data augmentation provides ample data to train an accurate, low-bias classifier.
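As a concrete sketch of the split arithmetic (scikit-learn stand-in; our cross-validation ran inside MATLAB's Classification Learner), 5-fold cross-validation over 50 samples trains on 40 and validates on 10 in each fold:

```python
# Sketch of 5-fold cross-validation on 50 training samples per species:
# each fold holds out 10 samples and trains on the remaining 40.
# Illustrative only; the project's CV was handled by MATLAB.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(50).reshape(50, 1)   # stand-in for 50 training samples
kf = KFold(n_splits=5, shuffle=True, random_state=0)

folds = list(kf.split(X))
for train_idx, val_idx in folds:
    assert len(train_idx) == 40 and len(val_idx) == 10
```

Averaging the five validation accuracies gives a less optimistic estimate of model quality than a single train/validation split would.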


Classifier Results

April 25, 2022


Table 1: Final Species Classifier Results

After running both the training and testing data, our classifier returned some promising results. We compared three different filters, selected based on their performance in our initial audio comparison. These filters were designed and applied by listening to the raw audio sample and comparing it with the filtered audio sample by ear; it is important to note that listening by ear does not account for all of the frequencies contained in a sample.

Beginning with the unfiltered trials, their statistics show higher accuracy, but that is because prominent excess noise is audible throughout the entire signal, which allows a higher level of detection from the classifier. This makes the unfiltered method unsuitable for our overall goal: the classifier is not detecting different mammal species so much as whether the device was taking in underwater signals. The high-pass filter yielded similar results, as the algorithm was biased toward the audible high frequencies; by eliminating lower frequencies, this filter greatly dropped accuracy on the lower-frequency whales. Next, the targeted moving-average filter had high success in eliminating the noise in the lower frequencies, but not in the higher frequencies. It yielded acceptable results, though still not the most accurate, since averaging the signal's lower frequencies distorts the mammal's voice to an extent. Finally, we had the most success with the Wiener filter. Even though it produced some of our lowest prediction scores, we consider them the most accurate, given the amount of noise the Wiener filter was able to eliminate. We expect the Wiener filter's performance to be the most robust among all of our filters because its noise reduction was the greatest at all frequencies. This results in a classifier trained heavily on the audio features of the mammal itself rather than the noise in the surrounding environment.
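The idea behind the Wiener filter's broadband noise reduction can be shown with an oracle sketch in NumPy. This is an illustration only: it assumes the clean signal and the noise spectra are known exactly, which a real filter (including ours) never has; in practice these spectra must be estimated. Per frequency bin, the Wiener gain passes bins dominated by signal and suppresses bins dominated by noise.

```python
# Oracle Wiener-gain sketch (illustration only: assumes the clean signal
# and noise are known, unlike any real filter). The per-bin gain
# |S|^2 / (|S|^2 + |N|^2) suppresses noise-dominated frequencies.
import numpy as np

rng = np.random.default_rng(0)
fs = 2000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 50 * t)        # stand-in "mammal call"
noise = rng.normal(scale=0.5, size=fs)    # broadband underwater noise stand-in
noisy = clean + noise

S, N = np.fft.rfft(clean), np.fft.rfft(noise)
gain = np.abs(S) ** 2 / (np.abs(S) ** 2 + np.abs(N) ** 2 + 1e-12)
denoised = np.fft.irfft(gain * np.fft.rfft(noisy), n=fs)

mse_noisy = np.mean((noisy - clean) ** 2)       # error before filtering
mse_denoised = np.mean((denoised - clean) ** 2)  # error after filtering
```

Because the gain acts at every frequency, the suppression is broadband, which is why we expect the Wiener filter to be more robust than a high-pass or moving-average filter that only treats part of the spectrum.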


Table 2: Dolphin vs. Whale Results

Table 2 illustrates classification into the two main mammal groups: dolphin and whale. This data supports our hypothesis that the SVM remains the most accurate of the classifiers while yielding realistic values.


Classification Algorithm Performance


From our trials, we conclude that the SVM algorithm consistently outperformed the Decision Tree algorithm. The likely reason is that our data does not fit well into binary decisions: because the data is complex and highly variable, it is hard to define binary rules that classify consistently. The SVM's ability to work effectively in high-dimensional spaces, together with its flexibility in kernel selection, let it classify with higher training and testing accuracy across all filter trials.


Confusion Matrix

SVM Species Classification

The figure to the right shows a confusion matrix for the 10 species after applying the Wiener filter. The values in the matrix give the training accuracy over the 500 samples. The matrix shows that the Humpback Whale was identified most accurately of all the species. Our hypothesis is that since the Humpback Whale has such a low-frequency voice, the classifier makes its decision based on the lack of high frequencies in the signal. Given more time, the confusion matrix could be used to determine which features to add or remove to identify each species more accurately, allowing further modifications to the training algorithm and an increase in overall accuracy.

[Figure: SVM species confusion matrix, Wiener filter]
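Reading a confusion matrix follows one convention worth stating: rows are true classes, columns are predictions, so the diagonal counts correct classifications and each row's diagonal-over-sum is that class's accuracy. A tiny scikit-learn sketch (our matrices came from MATLAB; the labels below are toy data):

```python
# Sketch of confusion-matrix bookkeeping (toy labels, not our real results):
# rows = true class, columns = predicted class, diagonal = correct counts.
from sklearn.metrics import confusion_matrix

y_true = ["humpback", "humpback", "dolphin", "dolphin", "dolphin"]
y_pred = ["humpback", "humpback", "dolphin", "humpback", "dolphin"]

cm = confusion_matrix(y_true, y_pred, labels=["humpback", "dolphin"])
# per-class accuracy = diagonal counts / row totals
per_class = cm.diagonal() / cm.sum(axis=1)
```

Off-diagonal cells show which species get confused with which, which is exactly the information we would use to choose extra features for the hardest pairs.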

SVM Mammal Classification

The confusion matrix to the right shows the algorithm's ability to classify between the two mammal groups: dolphin and whale. The data shows that the accuracy for the two groups is quite similar, with dolphins detected at a slightly higher accuracy. The matrix is consistent with the accuracy of the SVM classification: since it compares only two groups, there is less variability, yielding more even results.

[Figure: SVM dolphin-vs-whale confusion matrix, Wiener filter]

Conclusion Summary

After testing and collecting our results, we have formed the following conclusions.


  • Whales and dolphins have unique characteristics in the frequency domain.

  • These unique characteristics can be extracted using DSP tools such as the FFT and spectrogram analysis.

  • High-pass filters sacrifice classification accuracy for noise elimination, since lower frequencies contain important audio feature characteristics.

  • When filtering, a Wiener filter works best for limiting noise across all frequencies.

  • Unfiltered data produces high accuracy, but is likely biased by noise.

  • In general, the SVM classifier outperforms a Decision Tree for our input data.

  • Generalizing classification labels can increase accuracy results.

  • Classification accuracy can be sacrificed for an expected increase in noise robustness.

April 26, 2022

In conclusion, our project successfully built a prototype underwater mammal identification device. This device takes in underwater signals and filters the audio to reduce noise and echoes. The results returned from the classifier were a bit counterintuitive: the lower the accuracy rate, the less biased our algorithm proved to be. Since the Wiener filter detected and eliminated the most noise from our audio signals, it provides the most accurate of the unbiased decisions. The other filters could not fully remove the underwater noise and echoes, so those features remained prominent through most if not all of the signal, skewing the classifier's results.


The process for selecting our most accurate classification algorithm used course tools such as the FFT, spectrogram analysis, and filtering. We expanded on filters learned in class and carried out research to formulate our own and to find an additional one that yielded the most accurate results, the Wiener filter. In addition, we used non-course tools such as data augmentation, audio feature extraction, and machine learning classification; more specifically, we explored and compared the SVM and Decision Tree algorithms.


Looking back on the project, if we were to do it again, we would look for a larger data set to give more variability to the classifier. Finding a successful filter sooner would have allowed more time in the classifier stage, which could have increased our accuracy by exposing the algorithm to additional features. Experimenting more with the features would have helped determine which ones yielded the most promising results and could have further improved our research.


All in all, this project allowed us to apply our in-course knowledge to different areas of signal processing while learning additional out-of-course tools which helped to support our hypothesis. The additional research carried out helped to supplement the material covered in class to further deepen our understanding of signal processing.
