Emina Alickovic

Associate Professor, Linköping University

Institution

Linköping University
Associate Professor

Heidi B. Borges, Johannes Zaar, E. Alickovic, C. B. Christensen, P. Kidmose

Objective. This study aims to investigate the potential of speech-reception-threshold (SRT) estimation through electroencephalography (EEG)-based envelope reconstruction techniques with continuous speech stimuli. Additionally, we aim to investigate the influence of the stimulus signal-to-noise ratio (SNR) on the temporal response function (TRF). Methods. The analysis included data from twenty young normal-hearing participants, who listened to audiobook excerpts with varying background noise levels while data were recorded from 66 scalp EEG electrodes. A linear decoder was trained to reconstruct the speech envelope from the EEG data. The reconstruction accuracy was calculated as the Pearson’s correlation between the reconstructed envelope and the actual speech envelope. An SRT estimate (SRTneuro) was obtained as the midpoint of a sigmoid function fitted to the reconstruction accuracy-vs-SNR data. Additionally, the TRF was estimated at each SNR level, and a statistical analysis was conducted to reveal significant effects of SNR levels on the latencies and amplitudes of the most prominent components. Results. The behaviorally determined SRT (SRTbeh) ranged from -6.0 to -3.28 dB with a mean of -5.35 dB. The SRTneuro was within 2 dB of the SRTbeh for 15 out of 20 participants (75%) and within 3 dB of the SRTbeh for all participants (100%). The investigation of the TRFs showed that the latency of P2 decreased significantly and the magnitude of the N1 amplitude increased significantly with increasing SNR. Conclusion. The behavioral speech-reception-threshold (SRTbeh) of young normal-hearing listeners can be accurately estimated based on EEG responses to continuous speech stimuli. In this study, the electrophysiological speech-reception-threshold (SRTneuro) measure was derived from a sigmoid fit to the envelope reconstruction accuracy vs SNR.
The reconstruction accuracy of the envelope and the TRF features were all affected by changes in SNR, suggesting that the reconstruction accuracy of the envelope and the TRF features are related to the same underlying neural process. Highlights: Speech reception threshold (SRT) can be estimated from EEG using continuous speech; SRT estimated using sigmoid fit on envelope reconstruction accuracy; Latency of P2 in the temporal response function decreases with increasing SNR; Magnitude of N1 in the temporal response function increases with increasing SNR.
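The sigmoid-midpoint idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' fitting procedure: the logistic form, the grid search, the choice to fix the floor and ceiling to the observed extremes, and all data values are assumptions made for illustration.

```python
import math

def sigmoid(snr, midpoint, slope, floor, ceiling):
    """Logistic curve rising from `floor` to `ceiling`, centred on `midpoint`."""
    return floor + (ceiling - floor) / (1.0 + math.exp(-slope * (snr - midpoint)))

def fit_midpoint(snrs, accuracies, midpoints, slopes):
    """Grid-search least-squares fit; returns (midpoint, slope) of the best curve.

    Assumption: floor and ceiling are fixed to the lowest and highest observed
    accuracy, so only the midpoint and slope are searched.
    """
    floor, ceiling = min(accuracies), max(accuracies)
    best = None
    for m in midpoints:
        for k in slopes:
            err = sum((sigmoid(x, m, k, floor, ceiling) - y) ** 2
                      for x, y in zip(snrs, accuracies))
            if best is None or err < best[0]:
                best = (err, m, k)
    return best[1], best[2]

# Synthetic accuracy-vs-SNR data (illustrative values, not from the study).
snrs = [-12, -9, -6, -3, 0, 3, 6]
true_mid, true_slope = -5.0, 0.8
accs = [sigmoid(x, true_mid, true_slope, 0.02, 0.15) for x in snrs]

mid_grid = [m / 10 for m in range(-120, 61)]   # -12.0 ... 6.0 dB in 0.1 dB steps
slope_grid = [k / 10 for k in range(1, 31)]    # 0.1 ... 3.0 per dB
srt_neuro, slope = fit_midpoint(snrs, accs, mid_grid, slope_grid)
```

Here `srt_neuro` plays the role of the SRTneuro estimate: the SNR at which the fitted curve is halfway between its floor and ceiling. A real analysis would use measured reconstruction accuracies and would more likely use a proper nonlinear least-squares routine.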

M. A. Tanveer, Martin A Skoglund, Bo Bernhardsson, E. Alickovic

Objective. This study develops a deep learning (DL) method for fast auditory attention decoding (AAD) using electroencephalography (EEG) from listeners with hearing impairment (HI). It addresses three classification tasks: differentiating noise from speech-in-noise, classifying the direction of attended speech (left vs. right) and identifying the activation status of hearing aid noise reduction algorithms (OFF vs. ON). These tasks contribute to our understanding of how hearing technology influences auditory processing in the hearing-impaired population. Approach. Deep convolutional neural network (DCNN) models were designed for each task. Two training strategies were employed to clarify the impact of data splitting on AAD tasks: inter-trial, where the testing set used classification windows from trials that the training set had not seen, and intra-trial, where the testing set used unseen classification windows from trials where other segments were seen during training. The models were evaluated on EEG data from 31 participants with HI, listening to competing talkers amidst background noise. Main results. Using 1 s classification windows, the DCNN models achieved accuracy (ACC) of 69.8%, 73.3% and 82.9% and area-under-curve (AUC) of 77.2%, 80.6% and 92.1% for the three tasks, respectively, with the inter-trial strategy. With the intra-trial strategy, they achieved ACC of 87.9%, 80.1% and 97.5%, along with AUC of 94.6%, 89.1% and 99.8%. Our DCNN models show good performance on short 1 s EEG samples, making them suitable for real-world applications. Conclusion. Our DCNN models successfully addressed three tasks with short 1 s EEG windows from participants with HI, showcasing their potential. While the inter-trial strategy demonstrated promise for assessing AAD, the intra-trial approach yielded inflated results, underscoring the important role of proper data splitting in EEG-based AAD tasks. Significance.
Our findings showcase the promising potential of EEG-based tools for assessing auditory attention in clinical contexts and advancing hearing technology, while also promoting further exploration of alternative DL architectures and their potential constraints.
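The difference between the two data-splitting strategies in this abstract can be sketched in a few lines. This is an illustrative sketch only: the function names, the `(trial_id, window_id)` representation and the 20% test fraction are assumptions, not details from the paper.

```python
import random

def make_windows(n_trials, windows_per_trial):
    """Return (trial_id, window_id) pairs standing in for 1 s EEG windows."""
    return [(t, w) for t in range(n_trials) for w in range(windows_per_trial)]

def inter_trial_split(windows, test_fraction, seed=0):
    """Hold out whole trials: no test window shares a trial with a training window."""
    trials = sorted({t for t, _ in windows})
    rng = random.Random(seed)
    rng.shuffle(trials)
    n_test = max(1, round(test_fraction * len(trials)))
    test_trials = set(trials[:n_test])
    train = [x for x in windows if x[0] not in test_trials]
    test = [x for x in windows if x[0] in test_trials]
    return train, test

def intra_trial_split(windows, test_fraction, seed=0):
    """Hold out random windows: test windows are unseen, but their trials are not."""
    shuffled = list(windows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]

def shared_trials(train, test):
    """Trials that contribute windows to both sets (a possible leakage path)."""
    return {t for t, _ in train} & {t for t, _ in test}

windows = make_windows(n_trials=10, windows_per_trial=30)
inter_train, inter_test = inter_trial_split(windows, test_fraction=0.2)
intra_train, intra_test = intra_trial_split(windows, test_fraction=0.2)
```

With the inter-trial split, `shared_trials` is empty by construction. With the intra-trial split, test windows come from trials the model has already partly seen, which is one plausible reason for the inflated intra-trial scores reported above.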

Johanna Wilroth, E. Alickovic, Martin A Skoglund, Martin Enqvist

Clusters of neurons generate electrical signals which propagate in all directions through brain tissue, skull, and scalp, each with a different conductivity. Measuring these signals with electroencephalography (EEG) sensors placed on the scalp results in noisy data. This can severely affect estimation tasks such as source localization and the estimation of temporal response functions (TRFs). We hypothesize that some of the noise is due to a Wiener-structured signal propagation with both linear and nonlinear components. We have developed a simple nonlinearity detection and compensation method for EEG data analysis and utilize a model for estimating source-level (SL) TRFs for evaluation. Our results indicate that the nonlinearity compensation method produces more precise and synchronized SL TRFs than the original EEG data.
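For readers unfamiliar with the term, a Wiener structure is a linear dynamic system followed by a static nonlinearity. The sketch below only illustrates that structure and the general idea of compensating a known, invertible nonlinearity. The filter taps, the `tanh` nonlinearity and the test signal are hypothetical, and the abstract does not describe the authors' actual detection and compensation method.

```python
import math

def fir_filter(x, h):
    """Causal FIR filter: y[n] = sum_k h[k] * x[n - k] (zero initial conditions)."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def wiener_model(x, h, nonlinearity):
    """Wiener structure: linear dynamics followed by a static nonlinearity."""
    return [nonlinearity(v) for v in fir_filter(x, h)]

# Hypothetical source signal, filter taps and saturating nonlinearity.
source = [math.sin(0.3 * n) for n in range(50)]
h = [0.5, 0.3, 0.2]
measured = wiener_model(source, h, math.tanh)

# If the nonlinearity is known and invertible, applying its inverse to the
# measurement recovers the output of the linear part.
compensated = [math.atanh(v) for v in measured]
linear_only = fir_filter(source, h)
```

In practice the nonlinearity is unknown and must be detected and estimated from the data, which is the harder problem the paper addresses.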

Joshua P. Kulasingham, H. Innes-Brown, Martin Enqvist, E. Alickovic

The auditory brainstem response (ABR) is a measure of subcortical activity in response to auditory stimuli. The wave V peak of the ABR depends on stimulus intensity level, and has been widely used for clinical hearing assessment. Conventional methods to estimate the ABR average electroencephalography (EEG) responses to short unnatural stimuli such as clicks. Recent work has moved towards more ecologically relevant continuous speech stimuli using linear deconvolution models called Temporal Response Functions (TRFs). Investigating whether the TRF waveform changes with stimulus intensity is a crucial step towards the use of natural speech stimuli for hearing assessments involving subcortical responses. Here, we develop methods to estimate level-dependent subcortical TRFs using EEG data collected from 21 participants listening to continuous speech presented at 4 different intensity levels. We find that level-dependent changes can be detected in the wave V peak of the subcortical TRF for almost all participants, and are consistent with level-dependent changes in click-ABR wave V. We also investigate the most suitable peripheral auditory model to generate predictors for level-dependent subcortical TRFs and find that simple gammatone filterbanks perform the best. Additionally, around 6 minutes of data may be sufficient for detecting level-dependent effects and wave V peaks above the noise floor for speech segments with higher intensity. Finally, we show a proof-of-concept that level-dependent subcortical TRFs can be detected even for the inherent intensity fluctuations in natural continuous speech. Significance statement. Subcortical EEG responses to sound depend on the stimulus intensity level and provide a window into the early human auditory pathway. However, current methods detect responses using unnatural transient stimuli such as clicks or chirps.
We develop methods for detecting level-dependent responses to continuous speech stimuli, which are more ecologically relevant and may provide several advantages over transient stimuli. Critically, we find consistent patterns of level-dependent subcortical responses to continuous speech at an individual level that are directly comparable to those seen for conventional responses to click stimuli. Our work lays the foundation for the use of subcortical responses to natural speech stimuli in future applications such as clinical hearing assessment and hearing assistive technology.

Oskar Keding, E. Alickovic, Martin A Skoglund, Maria Sandsten

In the literature, auditory attention is explored through neural speech tracking, which primarily entails modeling and analyzing electroencephalography (EEG) responses to natural speech via linear filtering. Our study takes a novel approach, introducing an enhanced coherence estimation technique that employs multitapers to assess the strength of neural speech tracking. This enables effective discrimination between attended and ignored speech. To mitigate the impact of colored noise in EEG, we address two biases: overall coherence-level bias and spectral peak-shifting bias. In a listening study involving 32 participants with hearing impairment, tasked with attending to competing talkers in background noise, our coherence-based method effectively discerns EEG representations of attended and ignored speech. We comprehensively analyze frequency bands, individual frequencies, and EEG channels. The delta, theta and alpha frequency bands and the central EEG channels are shown to be the most important. Lastly, we showcase coherence differences across different noise reduction settings implemented in hearing aids, underscoring our method’s potential to objectively assess auditory attention and enhance hearing aid efficacy.
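As a rough illustration of multitaper coherence between a speech envelope and an EEG channel, the sketch below averages cross- and auto-spectra over several orthogonal tapers. It is a baseline estimator only and does not include the bias corrections proposed in the paper. Sine tapers are used here as a simple stand-in for the more usual DPSS (Slepian) tapers, and all signals are synthetic.

```python
import cmath
import math
import random

def sine_tapers(n_samples, n_tapers):
    """Orthonormal sine tapers (a simple stand-in for DPSS/Slepian tapers)."""
    norm = math.sqrt(2.0 / (n_samples + 1))
    return [[norm * math.sin(math.pi * (k + 1) * (n + 1) / (n_samples + 1))
             for n in range(n_samples)] for k in range(n_tapers)]

def dft_bin(x, freq_bin):
    """Single DFT coefficient of x at the given frequency bin (naive sum)."""
    n_samples = len(x)
    return sum(v * cmath.exp(-2j * math.pi * freq_bin * n / n_samples)
               for n, v in enumerate(x))

def multitaper_coherence(x, y, freq_bin, n_tapers=4):
    """Magnitude-squared coherence between x and y at one frequency bin."""
    cross, pxx, pyy = 0j, 0.0, 0.0
    for taper in sine_tapers(len(x), n_tapers):
        xk = dft_bin([t * v for t, v in zip(taper, x)], freq_bin)
        yk = dft_bin([t * v for t, v in zip(taper, y)], freq_bin)
        cross += xk * yk.conjugate()   # cross-spectrum, summed over tapers
        pxx += abs(xk) ** 2            # auto-spectrum of x
        pyy += abs(yk) ** 2            # auto-spectrum of y
    return abs(cross) ** 2 / (pxx * pyy)

# Synthetic stand-ins: an "envelope", an EEG channel that partly follows it,
# and an EEG channel that is unrelated to it.
rng = random.Random(1)
envelope = [rng.gauss(0, 1) for _ in range(128)]
eeg_attended = [e + rng.gauss(0, 0.5) for e in envelope]
eeg_ignored = [rng.gauss(0, 1) for _ in range(128)]

coh_attended = multitaper_coherence(envelope, eeg_attended, freq_bin=5)
coh_ignored = multitaper_coherence(envelope, eeg_ignored, freq_bin=5)
```

The estimate always lies between 0 and 1 and equals 1 when one signal is a scaled copy of the other. With few tapers, unrelated signals still produce a non-zero coherence, which is one reason bias handling matters in practice.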

Johanna Wilroth, Bo Bernhardsson, Frida Heskebeck, Martin A Skoglund, Carolina Bergeling, E. Alickovic

Objective. This paper presents a novel domain adaptation (DA) framework to enhance the accuracy of electroencephalography (EEG)-based auditory attention classification, specifically for classifying the direction (left or right) of attended speech. The framework aims to improve the performance for subjects with initially low classification accuracy, overcoming challenges posed by instrumental and human factors. Limited dataset size, variations in EEG data quality due to factors such as noise, electrode misplacement or subjects, and the need for generalization across different trials, conditions and subjects necessitate the use of DA methods. By leveraging DA methods, the framework can learn from one EEG dataset and adapt to another, potentially resulting in more reliable and robust classification models. Approach. This paper focuses on investigating a DA method, based on parallel transport, for addressing the auditory attention classification problem. The EEG data utilized in this study originate from an experiment where subjects were instructed to selectively attend to one of two spatially separated voices presented simultaneously. Main results. Significant improvement in classification accuracy was observed when poor data from one subject was transported to the domain of good data from different subjects, as compared to the baseline. The mean classification accuracy for subjects with poor data increased from 45.84% to 67.92%. Specifically, the highest achieved classification accuracy from one subject reached 83.33%, a substantial increase from the baseline accuracy of 43.33%. Significance. The findings of our study demonstrate the improved classification performance achieved through the implementation of DA methods. This brings us a step closer to leveraging EEG in neuro-steered hearing devices.

F. Bachmann, Joshua P. Kulasingham, Kasper Eskelund, Martin Enqvist, E. Alickovic, H. Innes-Brown

The auditory brainstem response (ABR) is a valuable clinical tool for objective hearing assessment, which is conventionally detected by averaging neural responses to thousands of short stimuli. Progressing beyond these unnatural stimuli, brainstem responses to continuous speech presented via earphones have been recently detected using linear temporal response functions (TRFs). Here, we extend earlier studies by measuring subcortical responses to continuous speech presented in the sound-field, and assess the amount of data needed to estimate brainstem TRFs. Electroencephalography (EEG) was recorded from 24 normal-hearing participants while they listened to clicks and stories presented via earphones and loudspeakers. Subcortical TRFs were computed after accounting for non-linear processing in the auditory periphery by either stimulus rectification or an auditory nerve model. Our results demonstrated that subcortical responses to continuous speech could be reliably measured in the sound-field. TRFs estimated using auditory nerve models outperformed simple rectification, and 16 minutes of data was sufficient for the TRFs of all participants to show clear wave V peaks for both earphone and sound-field stimuli. Subcortical TRFs to continuous speech were highly consistent in both earphone and sound-field conditions, and with click ABRs. However, sound-field TRFs required slightly more data (16 minutes) to achieve clear wave V peaks compared to earphone TRFs (12 minutes), possibly due to effects of room acoustics. By investigating subcortical responses to sound-field speech stimuli, this study lays the groundwork for bringing objective hearing assessment closer to real-life conditions, which may lead to improved hearing evaluations and smart hearing technologies.

Joshua P. Kulasingham, F. Bachmann, Kasper Eskelund, M. Enqvist, H. Innes-Brown, E. Alickovic

Perception of sounds and speech involves structures in the auditory brainstem that rapidly process ongoing auditory stimuli. The role of these structures in speech processing can be investigated by measuring their electrical activity using scalp-mounted electrodes. However, typical analysis methods involve averaging neural responses to many short repetitive stimuli that bear little relevance to daily listening environments. Recently, subcortical responses to more ecologically relevant continuous speech were detected using linear encoding models. These methods estimate the temporal response function (TRF), which is a regression model that minimises the error between the measured neural signal and a predictor derived from the stimulus. Using predictors that model the highly non-linear peripheral auditory system may improve linear TRF estimation accuracy and peak detection. Here, we compare predictors from both simple and complex peripheral auditory models for estimating brainstem TRFs on electroencephalography (EEG) data from 24 participants listening to continuous speech. We also discuss the data length required for estimating subcortical TRFs with clear peaks. Interestingly, predictors from simple models resulted in TRFs that were similar to those estimated using complex models, and were much faster to compute. This work paves the way for efficient modelling and detection of subcortical processing of continuous speech, which may lead to improved diagnostic metrics for hearing impairment and assistive hearing technology.
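The regression view of the TRF described in this abstract can be written out explicitly. The notation is assumed for illustration: $y(t)$ is the measured neural signal, $s(t)$ the stimulus-derived predictor, $w(\tau)$ the TRF over lags $\tau_{\min}$ to $\tau_{\max}$, and $\varepsilon(t)$ the residual. The ridge penalty $\lambda$ is one common regularisation choice and is an assumption here; the abstract does not state which estimator the authors use.

```latex
% Forward (encoding) model: the neural signal is the predictor convolved with the TRF
y(t) = \sum_{\tau=\tau_{\min}}^{\tau_{\max}} w(\tau)\, s(t-\tau) + \varepsilon(t)

% TRF estimate: minimise the squared prediction error (with an assumed ridge penalty)
\hat{w} = \arg\min_{w} \sum_{t} \Bigl( y(t) - \sum_{\tau} w(\tau)\, s(t-\tau) \Bigr)^{2}
          + \lambda \sum_{\tau} w(\tau)^{2}

% Closed form, with S the matrix whose columns are time-lagged copies of s
\hat{w} = \left( S^{\top} S + \lambda I \right)^{-1} S^{\top} y
```

Setting $\lambda = 0$ gives ordinary least squares. The choice of predictor $s(t)$, for example a rectified waveform or the output of a peripheral auditory model, is what the study compares.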

Oskar Keding, E. Alickovic, Martin A. Skoglund, Maria Sandsten

Coherence estimation between speech envelope and electroencephalography (EEG) is a proven method in neural speech tracking. This paper proposes an improved coherence estimation algorithm which utilises phase-sensitive multitaper cross-spectral estimation. Estimated EEG coherence differences between attended and ignored speech envelopes for a hearing-impaired (HI) population are evaluated and compared. Testing was performed on 31 HI subjects and showed significant coherence differences for grand averages over the delta, theta, and alpha EEG bands. The significance of increased coherence for attended speech was stronger for the new method than for the traditional method. The new method of estimating EEG coherence improves statistical detection performance and enables more rigorous data-based hypothesis-testing results.
