lxrswdd/Speech-Emotion-Recognition

Speech Emotion Recognition Project

General

  • A model that classifies the emotion expressed in speech.
  • Features are extracted with a modified version of the pyAudioAnalysis library.
  • The extracted features are then preprocessed.

Feature Extraction

The pyAudioAnalysis library is modified by adding functions that extract the original features from .wav files and return them as 3D arrays.

  • To use the modified code, overwrite the corresponding files of the installed package with MidTermFeatures.py and audioTrainTest.py.
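
The overwrite step can be scripted. The sketch below is one possible way to do it and is not part of the repository: the helper names `find_package_dir` and `overwrite_package_files` are hypothetical, and it assumes the two modified files sit in a local directory and that pyAudioAnalysis is installed as a regular package. It keeps a `.bak` copy of each original file so the change can be undone.

```python
import importlib.util
import shutil
from pathlib import Path

# The two files named in this README.
MODIFIED_FILES = ("MidTermFeatures.py", "audioTrainTest.py")


def find_package_dir(package_name="pyAudioAnalysis"):
    """Return the directory of an installed package, or None if it is not found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def overwrite_package_files(source_dir, package_dir, filenames=MODIFIED_FILES):
    """Copy the modified files over the installed ones, backing up each original."""
    copied = []
    for name in filenames:
        src = Path(source_dir) / name
        dst = Path(package_dir) / name
        if dst.exists():
            # Keep the original next to the new file, e.g. MidTermFeatures.py.bak
            shutil.copy2(dst, dst.with_name(dst.name + ".bak"))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

With the modified files in the current directory, a call such as `overwrite_package_files(".", find_package_dir())` would perform the replacement, provided `find_package_dir()` did not return `None`.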

Modifications to MidTermFeatures.py

  • directory_feature_extraction_no_avg

    Extracts features from the files in a directory without averaging the features over each file.

  • multiple_directory_feature_extraction_no_avg

    Extracts features from the files in multiple directories without averaging the features over each file.

  • directory_feature_extraction_no_avg_3D

    Extracts audio features from a directory and returns them as a 3D array of shape (batch, step, features).

  • multiple_directory_feature_extraction_no_avg_3D
    Multi-directory version of the 3D array extraction.
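
To illustrate the (batch, step, features) layout, here is a minimal NumPy sketch. It does not reproduce the repository's functions: `stack_to_3d` is a hypothetical helper, the feature matrices are random stand-ins, and truncating long files and zero-padding short ones to a fixed number of steps is an assumption of this example.

```python
import numpy as np


def stack_to_3d(per_file_features, n_steps):
    """Stack 2D (steps, features) matrices into one (batch, step, features) array.

    Files with more than n_steps windows are truncated and shorter ones are
    zero-padded at the end. Both choices are assumptions of this sketch.
    """
    n_features = per_file_features[0].shape[1]
    batch = np.zeros((len(per_file_features), n_steps, n_features), dtype=np.float32)
    for i, matrix in enumerate(per_file_features):
        steps = min(matrix.shape[0], n_steps)
        batch[i, :steps, :] = matrix[:steps, :]
    return batch


# Three hypothetical files with 5, 8 and 12 windows of 4 features each.
files = [np.random.rand(n, 4) for n in (5, 8, 12)]
batch = stack_to_3d(files, n_steps=8)
print(batch.shape)  # (3, 8, 4)
```

An array in this layout can be fed directly to sequence models that expect one feature vector per time step.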

Window selection

To determine the window size, window step and number of windows, the `read_audio_length` file is run. It reads the lengths of the audio files in the directories and visualizes them as a histogram.
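
As a rough illustration of this step, the sketch below reads .wav durations with Python's standard `wave` module, bins them, and counts how many windows fit into a given duration. It is not the repository's `read_audio_length` code: all function names here are hypothetical, the files are assumed to be PCM .wav, and the window-count formula is the usual sliding-window count, which may differ from what pyAudioAnalysis does at the edges of a file.

```python
import wave
from collections import Counter
from pathlib import Path


def wav_duration_seconds(path):
    """Duration of a PCM .wav file in seconds, read from its header."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def directory_durations(directory):
    """Durations of all .wav files found directly inside a directory."""
    return [wav_duration_seconds(p) for p in sorted(Path(directory).glob("*.wav"))]


def duration_histogram(durations, bin_width=1.0):
    """Count files per bin; key k covers [k * bin_width, (k + 1) * bin_width)."""
    return Counter(int(d // bin_width) for d in durations)


def number_of_windows(duration, window_size, window_step):
    """Number of full windows of window_size that fit when advancing by window_step."""
    if duration < window_size:
        return 0
    return int((duration - window_size) // window_step) + 1
```

The list returned by `directory_durations` can be passed to `matplotlib.pyplot.hist` to draw the histogram. Once a target duration is chosen from the plot, `number_of_windows` gives the step count for a candidate window size and step.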

Other functions

Other functions defined and used in the project are listed in ultil.py.

Models

All the models used in the project are listed in model_training.py. There are three attention-based models: residual attention, multiplicative attention and multi-head attention.
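
For readers unfamiliar with the idea, the following NumPy sketch shows multiplicative attention used to pool a sequence of per-step vectors into a single vector. It is a generic illustration and not the model defined in model_training.py: the function names, the sizes, and the random query vector and weight matrix (which would be learned in a real model) are all assumptions.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1D array."""
    shifted = np.exp(x - np.max(x))
    return shifted / shifted.sum()


def multiplicative_attention(hidden_states, query, weight):
    """Pool a (steps, features) sequence into one context vector.

    score_t = h_t^T W q, weights = softmax(scores), context = sum_t weights_t * h_t
    """
    scores = hidden_states @ weight @ query   # shape (steps,)
    weights = softmax(scores)                 # one weight per time step, sums to 1
    context = weights @ hidden_states         # shape (features,)
    return context, weights


rng = np.random.default_rng(0)
steps, features = 8, 4                      # illustrative sizes
h = rng.normal(size=(steps, features))      # stand-in for per-step model outputs
q = rng.normal(size=features)               # stand-in for a learned query vector
W = rng.normal(size=(features, features))   # stand-in for a learned weight matrix
context, weights = multiplicative_attention(h, q, W)
print(context.shape, weights.shape)  # (4,) (8,)
```

The weights show which time steps contribute most to the pooled vector, which is the main appeal of attention for variable-emphasis signals such as emotional speech.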
