People suffering from hearing impairment often have difficulties participating in conversations in so-called cocktail party scenarios where multiple individuals are talking simultaneously. Although advanced algorithms exist to suppress background noise in these situations, a hearing device also needs information about which speaker the user actually aims to attend to. Using this information, the voice of the attended speaker can then be enhanced, while all other speakers are treated as background noise. Recent neuroscientific advances have shown that it is possible to determine the focus of auditory attention through noninvasive neurorecording techniques, such as electroencephalography (EEG). Based on these insights, a multitude of auditory attention decoding (AAD) algorithms has been proposed, which could, combined with appropriate speaker separation algorithms and miniaturized EEG sensors, lead to so-called neurosteered hearing devices. In this article, we provide a broad review and a statistically grounded comparative study of EEG-based AAD algorithms and address the main signal processing challenges in this field.
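To make the AAD principle more concrete, the following is a minimal sketch of one common approach, assuming a linear stimulus-reconstruction decoder trained on time-lagged EEG and the attended speech envelope; the data shapes, lag count, and regularization value are illustrative assumptions and do not represent any specific algorithm covered in the review.

```python
import numpy as np

def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of the EEG channels (samples x channels*n_lags)."""
    n, c = eeg.shape
    X = np.zeros((n, c * n_lags))
    for k in range(n_lags):
        shifted = np.roll(eeg, k, axis=0)
        shifted[:k] = 0                      # zero out wrapped-around samples
        X[:, k * c:(k + 1) * c] = shifted
    return X

def train_decoder(eeg, attended_envelope, n_lags=16, reg=1e3):
    """Ridge-regularized least-squares decoder mapping EEG lags to the envelope."""
    X = lag_matrix(eeg, n_lags)
    return np.linalg.solve(X.T @ X + reg * np.eye(X.shape[1]),
                           X.T @ attended_envelope)

def decode_attention(eeg, candidate_envelopes, decoder, n_lags=16):
    """Reconstruct the envelope from EEG and select the best-correlating speaker."""
    recon = lag_matrix(eeg, n_lags) @ decoder
    corrs = [np.corrcoef(recon, env)[0, 1] for env in candidate_envelopes]
    return int(np.argmax(corrs)), corrs
```

In a neurosteered hearing device, the speaker index returned by such a decoder would steer which of the separated speakers the noise-suppression stage enhances.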
Individuals with hearing loss must allocate cognitive resources to comprehend noisy speech in everyday life, for example when they are exposed to ongoing speech and need to sustain their attention over a rather long period of time, which requires listening effort. Two well-established physiological measures that have been found to be sensitive to changes in listening effort are pupillometry and electroencephalography (EEG). However, these measures have mainly been used for momentary, evoked, or episodic effort. The aim of this study was to investigate how sustained effort manifests in pupillometry and EEG, using continuous speech with varying signal-to-noise ratio (SNR). Eight hearing-aid users participated in this exploratory study and performed a continuous speech-in-noise task. The speech material consisted of 30-second continuous streams presented from loudspeakers to the right and left of the listener (±30° azimuth) in the presence of four-talker background noise (+180° azimuth). The participants were instructed to attend to either the right or the left speaker and to ignore the other, in randomized order, under two SNR conditions: 0 dB and -5 dB (the level difference between the target and the competing talker). The effects of SNR on listening effort were explored objectively using pupillometry and EEG. The results showed larger mean pupil dilation and decreased EEG alpha power in the parietal region during the more effortful condition. This study demonstrates that both measures are sensitive to changes in SNR during continuous speech.
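As an illustration of how the two physiological measures could be quantified, here is a minimal sketch, assuming preprocessed EEG (samples x channels) and a pupil trace sampled at known rates; the channel selection, alpha band limits, window length, and baseline duration are illustrative assumptions and do not reproduce the study's analysis pipeline.

```python
import numpy as np
from scipy.signal import welch

def alpha_power(eeg, fs, channels, band=(8.0, 12.0)):
    """Mean spectral power in the alpha band for a set of (e.g. parietal) channels.

    eeg: samples x channels array; fs: sampling rate in Hz.
    The channel indices are placeholders for the parietal sensors of interest.
    """
    f, psd = welch(eeg[:, channels], fs=fs, nperseg=int(2 * fs), axis=0)
    mask = (f >= band[0]) & (f <= band[1])
    return psd[mask].mean()

def mean_pupil_dilation(pupil_trace, fs, baseline_s=1.0):
    """Mean pupil size during the trial relative to a pre-trial baseline window."""
    n_base = int(baseline_s * fs)
    baseline = pupil_trace[:n_base].mean()
    return pupil_trace[n_base:].mean() - baseline
```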
Auditory attention identification methods attempt to identify the sound source of a listener's interest by analyzing measurements of electrophysiological data. We present a tutorial on the numerous techniques that have been developed in recent decades, together with an overview of current trends in multivariate correlation-based and model-based learning frameworks. The focus is on the use of linear relations between electrophysiological and audio data; the way in which these relations are computed differs between methods. For example, canonical correlation analysis (CCA) finds a projection of the electrophysiological data that best correlates with the audio data and a similar projection of the audio data that best correlates with the electrophysiological data. Model-based (encoding and decoding) approaches focus on only one of these two sides. We investigate the similarities and differences between these linear modeling philosophies, focusing on (1) correlation-based approaches (CCA), (2) encoding/decoding models based on dense estimation, and (3) (adaptive) encoding/decoding models based on sparse estimation, with particular emphasis on sparsity-driven adaptive encoding models and on comparing the methodology of state-of-the-art models in the auditory literature. Furthermore, we outline the main signal processing pipeline for identifying the attended sound source in a cocktail party environment from raw electrophysiological data, with all the necessary steps, complemented by the corresponding MATLAB code and the relevant references for each step. Our main aim is to compare the methodology of the available methods and to provide numerical illustrations for some of them to give a feeling for their potential; a thorough performance comparison is outside the scope of this tutorial.
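The tutorial's code is in MATLAB; purely as a language-agnostic illustration of the CCA idea described above, the following Python sketch uses scikit-learn with random placeholder data, and the sampling rate, channel count, lag count, and number of components are arbitrary assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def lagged(x, n_lags):
    """Stack n_lags delayed copies of a (samples x features) signal."""
    cols = []
    for k in range(n_lags):
        shifted = np.roll(x, k, axis=0)
        shifted[:k] = 0                      # zero out wrapped-around samples
        cols.append(shifted)
    return np.hstack(cols)

fs = 64                                      # assumed common sampling rate (Hz)
eeg = np.random.randn(60 * fs, 16)           # placeholder 16-channel EEG
envelope = np.random.randn(60 * fs, 1)       # placeholder speech envelope

X = lagged(eeg, n_lags=8)                    # ~125 ms of EEG context at 64 Hz
Y = lagged(envelope, n_lags=8)               # lagged audio representation

# CCA finds projections of the EEG and of the audio that are maximally correlated.
cca = CCA(n_components=2)
cca.fit(X, Y)
Xc, Yc = cca.transform(X, Y)
canonical_corrs = [np.corrcoef(Xc[:, i], Yc[:, i])[0, 1] for i in range(Xc.shape[1])]
```

With real data, the size of these canonical correlations for each candidate speaker is what an attention-identification pipeline would compare.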
Sleep scoring is a key technique in the diagnosis and treatment of sleep disorders. Automated sleep scoring is crucial, since the large volume of data would otherwise have to be analyzed visually by sleep specialists, which is burdensome, time-consuming, tedious, subjective, and error-prone. Automated sleep stage classification is therefore an important step in sleep research and sleep disorder diagnosis. In this paper, a robust system consisting of three modules is proposed for automated classification of sleep stages from the single-channel electroencephalogram (EEG). In the first module, signals taken from the Pz-Oz electrode are denoised using multiscale principal component analysis. In the second module, the most informative features are extracted using the discrete wavelet transform (DWT), and statistical values of the DWT sub-bands are calculated. In the third module, the extracted features are fed into an ensemble classifier referred to as the rotational support vector machine (RotSVM). The proposed classifier combines the advantages of principal component analysis and the SVM to improve the classification performance of the traditional SVM. The sensitivity and accuracy across all subjects were 84.46% and 91.1%, respectively, for five-stage sleep classification, with a Cohen's kappa coefficient of 0.88. The obtained classification results indicate that an efficient sleep monitoring system based on a single-channel EEG is feasible and can be used effectively in medical and home-care applications.
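To illustrate the feature-extraction and classification steps, here is a minimal sketch, assuming single-channel EEG epochs as input; the db4 wavelet, decomposition level, choice of statistical descriptors, and a plain PCA + SVM pipeline (a simplification standing in for the paper's MSPCA denoising and RotSVM ensemble) are illustrative assumptions.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def dwt_features(epoch, wavelet="db4", level=5):
    """Statistical descriptors of the DWT sub-bands of one single-channel EEG epoch."""
    coeffs = pywt.wavedec(epoch, wavelet, level=level)
    feats = []
    for c in coeffs:
        feats += [c.mean(), c.std(), np.abs(c).mean(), (c ** 2).sum()]
    return np.array(feats)

def train_classifier(epochs, labels):
    """epochs: (n_epochs, n_samples) EEG; labels: sleep stages.

    A plain PCA + RBF-SVM pipeline stands in here for the RotSVM ensemble.
    """
    X = np.vstack([dwt_features(e) for e in epochs])
    clf = make_pipeline(StandardScaler(), PCA(n_components=0.95), SVC(kernel="rbf"))
    clf.fit(X, labels)
    return clf
```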
We still know very little about how our brains separate different sound sources, a task known as solving the cocktail party problem. Several approaches, including ERP analysis, time-frequency analysis and, more recently, regression and stimulus-reconstruction approaches, have been suggested for addressing this problem. In this work, we study the problem of correlating EEG signals with different sets of sound sources, with the goal of identifying the single source to which the listener is attending. We propose a method for finding the number of parameters needed in a regression model to avoid overlearning, which is necessary for determining the attended sound source with high confidence and thus for solving the cocktail party problem.
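A minimal sketch of the model-order question raised above, assuming a ridge-regularized linear decoder and using held-out correlation to choose the number of time lags (and hence the number of parameters); the candidate lag counts, fold count, and regularization value are illustrative assumptions rather than the paper's method.

```python
import numpy as np

def lagged(eeg, n_lags):
    """Time-lagged EEG design matrix (samples x channels*n_lags)."""
    cols = []
    for k in range(n_lags):
        shifted = np.roll(eeg, k, axis=0)
        shifted[:k] = 0                      # zero out wrapped-around samples
        cols.append(shifted)
    return np.hstack(cols)

def cv_correlation(eeg, envelope, n_lags, n_folds=5, reg=1e2):
    """Held-out correlation of a ridge decoder using n_lags lags per channel."""
    X = lagged(eeg, n_lags)
    n = len(envelope)
    scores = []
    for test in np.array_split(np.arange(n), n_folds):
        train = np.setdiff1d(np.arange(n), test)
        w = np.linalg.solve(X[train].T @ X[train] + reg * np.eye(X.shape[1]),
                            X[train].T @ envelope[train])
        scores.append(np.corrcoef(X[test] @ w, envelope[test])[0, 1])
    return float(np.mean(scores))

# Choose the lag count (model size) that generalizes best, rather than the one
# that fits the training data best, which is where overlearning would appear.
def select_model_order(eeg, envelope, candidate_lags=(4, 8, 16, 32)):
    return max(candidate_lags, key=lambda L: cv_correlation(eeg, envelope, L))
```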