Associate Professor, Linköping University
Hearing impairment alters the sound input received by the human auditory system, reducing speech comprehension in noisy multi-talker auditory scenes. Despite such difficulties, neural signals have been shown to encode the attended speech envelope more reliably than the envelope of ignored sounds, reflecting the intention of listeners with hearing impairment (HI). This result raises an important question: What speech-processing stage could reflect the difficulty in attentional selection, if not envelope tracking? Here, we use scalp electroencephalography (EEG) to test the hypothesis that the neural encoding of phonological information (i.e., phonetic boundaries and phonological categories) is affected by HI. In a cocktail-party scenario, such phonological difficulty might be reflected in an overrepresentation of phonological information for both attended and ignored speech sounds, with detrimental effects on the ability to focus effectively on the speaker of interest. To investigate this question, we re-analysed an existing dataset in which EEG signals were recorded as participants with HI, fitted with hearing aids, attended to one speaker (target) while ignoring a competing speaker (masker) and spatialised multi-talker background noise. Multivariate temporal response function (TRF) analyses indicated stronger phonological information encoding for target than for masker speech streams. Follow-up analyses aimed at disentangling the encoding of phonological categories and phonetic boundaries (phoneme onsets) revealed that neural signals encoded the phoneme onsets of both target and masker streams. This contrasts with previously published findings in participants with normal hearing (NH) and is in line with our hypothesis that speech comprehension difficulties emerge from a robust phonological encoding of both target and masker.
Finally, the neural encoding of phoneme-onsets was stronger for the masker speech, pointing to a possible neural basis for the higher distractibility experienced by individuals with HI.
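As a rough illustration of the forward-modeling approach behind such analyses, a TRF can be estimated with time-lagged ridge regression. The sketch below is a minimal, self-contained example on synthetic data: the sampling rate, lag range, regularization strength, and the sparse impulse-train predictor (standing in for phoneme onsets) are all illustrative assumptions, not the study's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: one EEG channel responding to a sparse impulse train
# (standing in for phoneme onsets) through a known kernel.
fs = 64                                         # Hz, illustrative
n = fs * 60                                     # one minute of data
stim = (rng.random(n) < 0.05).astype(float)     # onset impulses
lags = np.arange(32)                            # 0-500 ms of lags
true_trf = np.exp(-lags / 8.0) * np.sin(2 * np.pi * lags / 16.0)
eeg = np.convolve(stim, true_trf)[:n] + 0.5 * rng.standard_normal(n)

# Time-lagged design matrix: column k holds stim delayed by lags[k].
X = np.zeros((n, lags.size))
for k, lag in enumerate(lags):
    X[lag:, k] = stim[:n - lag]

# Ridge solution: w = (X'X + lambda * I)^-1 X'y
lam = 1e2
trf_hat = np.linalg.solve(X.T @ X + lam * np.eye(lags.size), X.T @ eeg)

# The estimate should closely match the simulated kernel.
r = np.corrcoef(trf_hat, true_trf)[0, 1]
```

Multivariate TRF analyses extend this idea by stacking lagged columns for several predictors (e.g. envelope, phoneme onsets, phonological categories) into one design matrix.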
This study investigates the potential of speech-reception-threshold (SRT) estimation through electroencephalography (EEG)-based envelope reconstruction techniques with continuous speech. Additionally, we investigate the influence of the stimuli's signal-to-noise ratio (SNR) on the temporal response function (TRF). Twenty young normal-hearing participants listened to audiobook excerpts with varying background noise levels while EEG was recorded. A linear decoder was trained to reconstruct the speech envelope from the EEG data. The reconstruction accuracy was calculated as Pearson's correlation between the reconstructed and actual speech envelopes. An EEG-based SRT estimate (SRTneuro) was obtained as the midpoint of a sigmoid function fitted to the reconstruction accuracy versus SNR data points. Additionally, the TRF was estimated at each SNR level, followed by a statistical analysis to reveal significant effects of SNR level on the latencies and amplitudes of the most prominent components. The SRTneuro was within 3 dB of the behavioral SRT for all participants. The TRF analysis showed a significant decrease in latency and a significant increase in amplitude magnitude for both N1 and P2 with increasing SNR. The results suggest that both envelope reconstruction accuracy and the TRF components are influenced by changes in SNR, indicating that they may be linked to the same underlying neural process.
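The midpoint extraction described above can be sketched with a standard sigmoid fit. The data points, parameter values, and starting guesses below are hypothetical; the study's actual fitting procedure may differ in model and constraints.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(snr, midpoint, slope, floor, ceiling):
    """Sigmoid model of reconstruction accuracy as a function of SNR."""
    return floor + (ceiling - floor) / (1.0 + np.exp(-slope * (snr - midpoint)))

# Hypothetical accuracy-vs-SNR points with a true midpoint at -5 dB.
snrs = np.array([-15.0, -10.0, -7.5, -5.0, -2.5, 0.0, 5.0])
acc = sigmoid(snrs, -5.0, 0.8, 0.02, 0.18)

params, _ = curve_fit(sigmoid, snrs, acc, p0=[0.0, 0.5, 0.0, 0.2])
srt_neuro = params[0]    # midpoint of the fitted sigmoid = SRTneuro
```

With noiseless synthetic points the fit recovers the midpoint; on real per-participant data, bounds on the parameters and goodness-of-fit checks would be advisable.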
In the literature, auditory attention is explored through neural speech tracking, primarily entailing modeling and analyzing electroencephalography (EEG) responses to natural speech via linear filtering. Our study takes a novel approach, introducing an enhanced coherence estimation technique that employs multitapers to assess the strength of neural speech tracking. This enables effective discrimination between attended and ignored speech. To mitigate the impact of colored noise in EEG, we address two biases: overall coherence-level bias and spectral peak-shifting bias. In a listening study involving 32 participants with hearing impairment, tasked with attending to competing talkers in background noise, our coherence-based method effectively discerns EEG representations of attended and ignored speech. We comprehensively analyze frequency bands, individual frequencies, and EEG channels. The delta, theta, and alpha bands are shown to be the most informative frequency bands, and the central EEG channels the most informative channels. Lastly, we showcase coherence differences across different noise reduction settings implemented in hearing aids (HAs), underscoring our method's potential to objectively assess auditory attention and enhance HA efficacy.
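The idea of discriminating attended from ignored speech via stimulus-response coherence can be illustrated with scipy's Welch-based coherence estimator. This is only a conceptual stand-in: the study's multitaper estimator and its two bias corrections are not reproduced here, and the signals, sampling rate, and segment length are toy assumptions.

```python
import numpy as np
from scipy.signal import coherence

rng = np.random.default_rng(1)
fs = 128
t = np.arange(fs * 60) / fs                     # one minute at 128 Hz

# Toy scenario: the EEG contains a noisy copy of a 4 Hz "attended"
# envelope rhythm, while the "ignored" envelope is unrelated noise.
attended_env = np.sin(2 * np.pi * 4 * t)
ignored_env = rng.standard_normal(t.size)
eeg = attended_env + rng.standard_normal(t.size)

f, coh_att = coherence(attended_env, eeg, fs=fs, nperseg=256)
_, coh_ign = coherence(ignored_env, eeg, fs=fs, nperseg=256)

idx = np.argmin(np.abs(f - 4.0))                # bin closest to 4 Hz
coh_att_4hz, coh_ign_4hz = coh_att[idx], coh_ign[idx]
```

The attended stream shows high coherence with the EEG at its driving frequency, while the unrelated stream's coherence stays near the estimator's noise floor.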
Effective preprocessing of electroencephalography (EEG) data is fundamental for deriving meaningful insights. Independent component analysis (ICA) serves as an important step in this process by aiming to eliminate undesirable artifacts from EEG data. However, the decision of which and how many components to remove remains somewhat arbitrary, despite the availability of both automatic and manual ICA-based artifact rejection methods. This study investigates the influence of different ICA-based artifact rejection strategies on EEG-based auditory attention decoding (AAD) analysis. We employ multiple ICA-based artifact rejection approaches, ranging from manual to automatic versions, and assess their effects on conventional AAD methods. The comparison aims to uncover potential variations in analysis results due to different artifact rejection choices within pipelines, and whether such variations differ across AAD methods. Although our study finds no large difference in the performance of linear AAD models between artifact rejection methods, two exceptions were found: when predicting EEG responses, the manual artifact rejection method appeared to perform better in frontal channel groups, whereas when reconstructing speech envelopes from EEG, using no artifact rejection outperformed the other approaches.
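The core ICA-based rejection step, unmixing the data, zeroing artifact components, and projecting back, can be illustrated with sklearn's FastICA on synthetic mixtures. Real EEG pipelines (e.g. MNE-style ICA with automatic component labeling) differ in scale and tooling; the sources, mixing matrix, and selection rule here are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(2)
n = 5000
t = np.linspace(0.0, 10.0, n)

# Two latent sources: a "neural" oscillation and a blink-like artifact.
neural = np.sin(2 * np.pi * 10 * t)
artifact = 5.0 * np.abs(np.sin(2 * np.pi * 0.5 * t)) ** 20   # spiky blinks
S = np.c_[neural, artifact]
A = np.array([[1.0, 0.8],
              [0.6, 1.0]])                       # mixing into two channels
X = S @ A.T + 0.01 * rng.standard_normal((n, 2))

ica = FastICA(n_components=2, random_state=0, max_iter=1000)
components = ica.fit_transform(X)                # estimated sources

# Reject the component that best matches the artifact, then reconstruct.
art_idx = np.argmax([abs(np.corrcoef(components[:, k], artifact)[0, 1])
                     for k in range(2)])
components[:, art_idx] = 0.0
X_clean = ica.inverse_transform(components)

# The cleaned first channel should track the neural source closely.
r_clean = abs(np.corrcoef(X_clean[:, 0], neural)[0, 1])
```

The arbitrariness the abstract points to lives in the selection step: here a correlation with a known artifact template picks the component, but in practice that choice is made by hand or by heuristic classifiers.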
Objective. This study develops a deep learning (DL) method for fast auditory attention decoding (AAD) using electroencephalography (EEG) from listeners with hearing impairment (HI). It addresses three classification tasks: differentiating noise from speech-in-noise, classifying the direction of attended speech (left vs. right), and identifying the activation status of hearing aid noise reduction algorithms (OFF vs. ON). These tasks contribute to our understanding of how hearing technology influences auditory processing in the hearing-impaired population. Approach. Deep convolutional neural network (DCNN) models were designed for each task. Two training strategies were employed to clarify the impact of data splitting on AAD tasks: inter-trial, where the testing set used classification windows from trials that the training set had not seen, and intra-trial, where the testing set used unseen classification windows from trials whose other segments were seen during training. The models were evaluated on EEG data from 31 participants with HI, listening to competing talkers amidst background noise. Main results. Using 1 s classification windows, the DCNN models achieved accuracy (ACC) of 69.8%, 73.3% and 82.9% and area-under-curve (AUC) of 77.2%, 80.6% and 92.1% for the three tasks, respectively, with the inter-trial strategy. With the intra-trial strategy, they achieved ACC of 87.9%, 80.1% and 97.5%, along with AUC of 94.6%, 89.1% and 99.8%. Our DCNN models show good performance on short 1 s EEG samples, making them suitable for real-world applications. Conclusion. Our DCNN models successfully addressed three tasks with short 1 s EEG windows from participants with HI, showcasing their potential. While the inter-trial strategy demonstrated promise for assessing AAD, the intra-trial approach yielded inflated results, underscoring the important role of proper data splitting in EEG-based AAD tasks. Significance.
Our findings showcase the promising potential of EEG-based tools for assessing auditory attention in clinical contexts and advancing hearing technology, while also promoting further exploration of alternative DL architectures and their potential constraints.
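The inter- versus intra-trial distinction above is purely a matter of how window indices are assigned to splits. The sketch below shows the bookkeeping on hypothetical trial counts (20 trials of 30 one-second windows; the numbers are illustrative, not the study's).

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical bookkeeping: 20 trials, each cut into 30 windows of 1 s.
n_trials, wins_per_trial = 20, 30
trial_id = np.repeat(np.arange(n_trials), wins_per_trial)
window_idx = np.arange(trial_id.size)

# Inter-trial split: whole trials are held out, so no trial contributes
# windows to both sets (prevents leakage across the split).
test_trials = rng.choice(n_trials, size=4, replace=False)
mask = np.isin(trial_id, test_trials)
inter_train, inter_test = window_idx[~mask], window_idx[mask]

# Intra-trial split: windows are assigned at random, so most trials end
# up in both sets, which is what inflates accuracy estimates.
perm = rng.permutation(window_idx)
intra_test, intra_train = perm[:120], perm[120:]

shared_inter = np.intersect1d(trial_id[inter_train], trial_id[inter_test])
shared_intra = np.intersect1d(trial_id[intra_train], trial_id[intra_test])
```

Checking that no trial appears on both sides of the split is a cheap safeguard against the inflated intra-trial results the abstract warns about.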
Clusters of neurons generate electrical signals that propagate in all directions through brain tissue, skull, and scalp of different conductivities. Measuring these signals with electroencephalography (EEG) sensors placed on the scalp results in noisy data. This can have a severe impact on estimation tasks such as source localization and temporal response functions (TRFs). We hypothesize that some of the noise is due to a Wiener-structured signal propagation with both linear and nonlinear components. We have developed a simple nonlinearity detection and compensation method for EEG data analysis and utilize a model for estimating source-level (SL) TRFs for evaluation. Our results indicate that the nonlinearity compensation method produces more precise and synchronized SL TRFs compared to the original EEG data.
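A Wiener system is a linear filter followed by a static nonlinearity, and "compensation" amounts to undoing that nonlinearity before linear analysis. The toy below assumes a known tanh saturation with an exact inverse; the study's actual detection and compensation method is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy Wiener system: a linear propagation filter followed by a static
# saturating nonlinearity. Both the tanh and its exact inverse are
# illustrative assumptions, not the estimator developed in the study.
x = rng.standard_normal(4000)
h = np.array([0.5, 1.0, 0.3])                   # hypothetical filter
linear = np.convolve(x, h, mode="same")
measured = np.tanh(1.5 * linear)                # saturated observation

# Compensation: invert the (assumed known) nonlinearity before any
# linear analysis such as TRF estimation.
clipped = np.clip(measured, -0.999999, 0.999999)
compensated = np.arctanh(clipped) / 1.5

r_raw = np.corrcoef(measured, linear)[0, 1]
r_comp = np.corrcoef(compensated, linear)[0, 1]
```

After inverting the saturation, the compensated signal is linearly related to the filtered source again, which is what makes subsequent TRF estimation well posed.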
The auditory brainstem response (ABR) is a measure of subcortical activity in response to auditory stimuli. The wave V peak of the ABR depends on stimulus intensity level and has been widely used for clinical hearing assessment. Conventional methods estimate the ABR by averaging electroencephalography (EEG) responses to short, unnatural stimuli such as clicks. Recent work has moved towards more ecologically relevant continuous speech stimuli using linear deconvolution models called temporal response functions (TRFs). Investigating whether the TRF waveform changes with stimulus intensity is a crucial step towards the use of natural speech stimuli for hearing assessments involving subcortical responses. Here, we develop methods to estimate level-dependent subcortical TRFs using EEG data collected from 21 participants listening to continuous speech presented at 4 different intensity levels. We find that level-dependent changes can be detected in the wave V peak of the subcortical TRF for almost all participants, and are consistent with level-dependent changes in click-ABR wave V. We also investigate the most suitable peripheral auditory model to generate predictors for level-dependent subcortical TRFs and find that simple gammatone filterbanks perform the best. Additionally, around 6 minutes of data may be sufficient for detecting level-dependent effects and wave V peaks above the noise floor for speech segments with higher intensity. Finally, we show a proof of concept that level-dependent subcortical TRFs can be detected even from the inherent intensity fluctuations in natural continuous speech. Significance statement. Subcortical EEG responses to sound depend on the stimulus intensity level and provide a window into the early human auditory pathway. However, current methods detect responses using unnatural transient stimuli such as clicks or chirps. We develop methods for detecting level-dependent responses to continuous speech stimuli, which are more ecologically relevant and may provide several advantages over transient stimuli. Critically, we find consistent patterns of level-dependent subcortical responses to continuous speech at the individual level that are directly comparable to those seen for conventional responses to click stimuli. Our work lays the foundation for the use of subcortical responses to natural speech stimuli in future applications such as clinical hearing assessment and hearing assistive technology.
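A minimal gammatone-filterbank predictor of the kind compared above can be sketched from the standard gammatone impulse response with ERB-based bandwidths. This is a simplified, unnormalized version for illustration; the centre frequencies, filter order, and the rectify-and-average output stage are assumptions, and the study's predictor pipeline may differ.

```python
import numpy as np

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """Impulse response of a gammatone filter centred at fc (Hz),
    with bandwidth set from the Glasberg & Moore ERB formula."""
    t = np.arange(int(duration * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
    b = 1.019 * erb
    return t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)

def gammatone_predictor(audio, fs, centre_freqs):
    """Half-wave-rectified filterbank output averaged across bands:
    a simple auditory-nerve-style predictor for subcortical TRFs."""
    out = np.zeros_like(audio)
    for fc in centre_freqs:
        band = np.convolve(audio, gammatone_ir(fc, fs), mode="same")
        out += np.maximum(band, 0.0)        # half-wave rectification
    return out / len(centre_freqs)

fs = 16000
t = np.arange(fs) / fs
audio = np.sin(2 * np.pi * 1000 * t)        # 1 s toy "speech" stimulus
pred = gammatone_predictor(audio, fs, centre_freqs=[500.0, 1000.0, 2000.0])
```

The resulting non-negative predictor, rather than the raw waveform, is what gets regressed against the EEG when estimating subcortical TRFs.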
Objective. This paper presents a novel domain adaptation (DA) framework to enhance the accuracy of electroencephalography (EEG)-based auditory attention classification, specifically for classifying the direction (left or right) of attended speech. The framework aims to improve performance for subjects with initially low classification accuracy, overcoming challenges posed by instrumental and human factors. The limited dataset size, variations in EEG data quality due to factors such as noise, electrode misplacement, or differences between subjects, and the need for generalization across trials, conditions, and subjects necessitate the use of DA methods. By leveraging DA methods, the framework can learn from one EEG dataset and adapt to another, potentially resulting in more reliable and robust classification models. Approach. This paper focuses on investigating a DA method based on parallel transport for addressing the auditory attention classification problem. The EEG data utilized in this study originate from an experiment in which subjects were instructed to selectively attend to one of two spatially separated voices presented simultaneously. Main results. A significant improvement in classification accuracy was observed when poor data from one subject were transported to the domain of good data from different subjects, as compared to the baseline. The mean classification accuracy for subjects with poor data increased from 45.84% to 67.92%. Specifically, the highest classification accuracy achieved for one subject reached 83.33%, a substantial increase from the baseline accuracy of 43.33%. Significance. The findings of our study demonstrate the improved classification performance achieved through the implementation of DA methods. This brings us a step closer to leveraging EEG in neuro-steered hearing devices.
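A common simplified form of transporting EEG data between domains operates on trial covariance matrices: each domain is recentred so that its mean covariance maps to the identity, after which matrices from different subjects become comparable. The sketch below uses the arithmetic mean and synthetic SPD matrices as assumptions; the paper's parallel-transport method on the SPD manifold (geometric mean, geodesic transport) is more involved.

```python
import numpy as np

def recenter(covs):
    """Map SPD covariance matrices so their mean becomes the identity:
    C' = M^(-1/2) C M^(-1/2). Using the arithmetic mean M is a
    simplification; Riemannian pipelines typically use the geometric
    mean and transport along geodesics."""
    M = covs.mean(axis=0)
    w, V = np.linalg.eigh(M)
    M_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
    return np.stack([M_inv_sqrt @ C @ M_inv_sqrt for C in covs])

rng = np.random.default_rng(5)

def random_spd(n_mats, dim, scale):
    """Random symmetric positive-definite matrices at a given scale."""
    mats = []
    for _ in range(n_mats):
        A = rng.standard_normal((dim, dim))
        mats.append(scale * (A @ A.T + dim * np.eye(dim)))
    return np.stack(mats)

# Two "subjects" whose trial covariances live at very different scales.
source = recenter(random_spd(20, 4, scale=1.0))
target = recenter(random_spd(20, 4, scale=10.0))

# After recentering, both domains share the same mean (the identity),
# so a classifier trained on one domain can be applied to the other.
err_src = np.abs(source.mean(axis=0) - np.eye(4)).max()
err_tgt = np.abs(target.mean(axis=0) - np.eye(4)).max()
```

Recentering both domains to a common reference point is what lets a model trained on "good" subjects be reused on a poorly performing subject's transported data.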