
Publications (61)

Johanna Wilroth, E. Alickovic, Martin A. Skoglund, Martin Enqvist

Clusters of neurons generate electrical signals that propagate in all directions through brain tissue, skull, and scalp, each with different conductivity. Measuring these signals with electroencephalography (EEG) sensors placed on the scalp results in noisy data, which can severely affect estimation tasks such as source localization and temporal response functions (TRFs). We hypothesize that some of the noise is due to a Wiener-structured signal propagation with both linear and nonlinear components. We have developed a simple nonlinearity detection and compensation method for EEG data analysis and use a model for estimating source-level (SL) TRFs for evaluation. Our results indicate that the nonlinearity compensation method produces more precise and synchronized SL TRFs than the original EEG data.
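To make the Wiener assumption concrete, the sketch below simulates a linear propagation filter followed by a static nonlinearity and then compensates by inverting that nonlinearity. The filter taps, the tanh nonlinearity, and the noise level are illustrative assumptions, not the method described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical source activity and linear propagation filter (illustrative values)
source = rng.standard_normal(5000)
h = np.array([0.5, 0.3, 0.15, 0.05])
linear_part = np.convolve(source, h, mode="same")

# Wiener structure: linear dynamics followed by a static nonlinearity, plus sensor noise
measured = np.tanh(linear_part) + 0.01 * rng.standard_normal(source.size)

# Compensation: invert the assumed static nonlinearity
clipped = np.clip(measured, -0.999, 0.999)   # keep arctanh finite
compensated = np.arctanh(clipped)

# The compensated signal should track the linear part more closely than the raw measurement
print(np.corrcoef(measured, linear_part)[0, 1],
      np.corrcoef(compensated, linear_part)[0, 1])
```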

Joshua P. Kulasingham, H. Innes-Brown, Martin Enqvist, E. Alickovic

The auditory brainstem response (ABR) is a measure of subcortical activity in response to auditory stimuli. The wave V peak of the ABR depends on stimulus intensity level and has been widely used for clinical hearing assessment. Conventional methods to estimate the ABR average electroencephalography (EEG) responses to short unnatural stimuli such as clicks. Recent work has moved towards more ecologically relevant continuous speech stimuli using linear deconvolution models called Temporal Response Functions (TRFs). Investigating whether the TRF waveform changes with stimulus intensity is a crucial step towards the use of natural speech stimuli for hearing assessments involving subcortical responses. Here, we develop methods to estimate level-dependent subcortical TRFs using EEG data collected from 21 participants listening to continuous speech presented at 4 different intensity levels. We find that level-dependent changes can be detected in the wave V peak of the subcortical TRF for almost all participants, and are consistent with level-dependent changes in click-ABR wave V. We also investigate the most suitable peripheral auditory model to generate predictors for level-dependent subcortical TRFs and find that simple gammatone filterbanks perform best. Additionally, around 6 minutes of data may be sufficient for detecting level-dependent effects and wave V peaks above the noise floor for speech segments with higher intensity. Finally, we show a proof of concept that level-dependent subcortical TRFs can be detected even for the inherent intensity fluctuations in natural continuous speech. Significance statement: Subcortical EEG responses to sound depend on the stimulus intensity level and provide a window into the early human auditory pathway. However, current methods detect responses using unnatural transient stimuli such as clicks or chirps. We develop methods for detecting level-dependent responses to continuous speech stimuli, which are more ecologically relevant and may provide several advantages over transient stimuli. Critically, we find consistent patterns of level-dependent subcortical responses to continuous speech at the individual level that are directly comparable to those seen for conventional responses to click stimuli. Our work lays the foundation for the use of subcortical responses to natural speech stimuli in future applications such as clinical hearing assessment and hearing assistive technology.
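The TRFs referred to throughout these abstracts are regression models mapping a lagged stimulus predictor (for example, a speech envelope or a gammatone-filterbank output) to the EEG. Below is a minimal ridge-regression sketch on synthetic data; the lag range, regularization value, and variable names are placeholders rather than the study's actual settings.

```python
import numpy as np

def estimate_trf(stimulus, eeg, lags, alpha=1.0):
    """Estimate a temporal response function by ridge regression.

    stimulus : 1-D predictor (e.g., a speech envelope), shape (n_samples,)
    eeg      : 1-D neural response, shape (n_samples,)
    lags     : iterable of integer sample lags (stimulus leading the EEG)
    """
    # Build the lagged design matrix: one column per lag
    X = np.column_stack([np.roll(stimulus, lag) for lag in lags])
    # Ridge solution: (X'X + alpha*I)^-1 X'y
    XtX = X.T @ X + alpha * np.eye(X.shape[1])
    Xty = X.T @ eeg
    return np.linalg.solve(XtX, Xty)

# Synthetic example: the EEG is a delayed, scaled copy of the envelope plus noise
rng = np.random.default_rng(1)
envelope = rng.standard_normal(10000)
eeg = 0.8 * np.roll(envelope, 7) + rng.standard_normal(10000)

trf = estimate_trf(envelope, eeg, range(0, 20))
print(int(np.argmax(trf)))   # peak lag should be near 7 samples
```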

Oskar Keding, E. Alickovic, Martin A. Skoglund, Maria Sandsten

In the literature, auditory attention is explored through neural speech tracking, primarily by modeling and analyzing electroencephalography (EEG) responses to natural speech via linear filtering. Our study takes a novel approach, introducing an enhanced coherence estimation technique that employs multitapers to assess the strength of neural speech tracking. This enables effective discrimination between attended and ignored speech. To mitigate the impact of colored noise in EEG, we address two biases: the overall coherence-level bias and the spectral peak-shifting bias. In a listening study involving 32 participants with hearing impairment, tasked with attending to competing talkers in background noise, our coherence-based method effectively discerns EEG representations of attended and ignored speech. We comprehensively analyze frequency bands, individual frequencies, and EEG channels. The delta, theta, and alpha bands, together with the central EEG channels, prove most informative. Lastly, we showcase coherence differences across different noise reduction settings implemented in hearing aids, underscoring our method's potential to objectively assess auditory attention and enhance hearing aid efficacy.
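For reference, a plain Thomson-style multitaper coherence estimate between a speech envelope and one EEG channel can be sketched as follows. This is only the standard estimator; the bias corrections and the phase-sensitive cross-spectral estimation described in the abstract are not implemented here, and the sampling rate and taper parameters are assumptions.

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_coherence(x, y, nw=4, n_tapers=7):
    """Standard multitaper coherence between two equal-length signals."""
    n = len(x)
    tapers = dpss(n, nw, n_tapers)               # (n_tapers, n) DPSS tapers
    X = np.fft.rfft(tapers * x, axis=1)
    Y = np.fft.rfft(tapers * y, axis=1)
    sxy = np.mean(X * np.conj(Y), axis=0)        # cross-spectrum
    sxx = np.mean(np.abs(X) ** 2, axis=0)        # auto-spectra
    syy = np.mean(np.abs(Y) ** 2, axis=0)
    return np.abs(sxy) ** 2 / (sxx * syy)

# Toy example: a shared 4 Hz component buried in noise (fs = 100 Hz assumed)
fs = 100
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(2)
common = np.sin(2 * np.pi * 4 * t)
envelope = common + rng.standard_normal(t.size)
eeg = 0.5 * common + rng.standard_normal(t.size)

coh = multitaper_coherence(envelope, eeg)
freqs = np.fft.rfftfreq(t.size, 1 / fs)
print(freqs[np.argmax(coh)])   # should be near 4 Hz
```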

Johanna Wilroth, Bo Bernhardsson, Frida Heskebeck, Martin A. Skoglund, Carolina Bergeling, E. Alickovic

Objective. This paper presents a novel domain adaptation (DA) framework to enhance the accuracy of electroencephalography (EEG)-based auditory attention classification, specifically for classifying the direction (left or right) of attended speech. The framework aims to improve performance for subjects with initially low classification accuracy, overcoming challenges posed by instrumental and human factors. Limited dataset size, variations in EEG data quality due to factors such as noise, electrode misplacement, or subject-related factors, and the need for generalization across different trials, conditions, and subjects necessitate the use of DA methods. By leveraging DA methods, the framework can learn from one EEG dataset and adapt to another, potentially resulting in more reliable and robust classification models. Approach. This paper focuses on investigating a DA method, based on parallel transport, for addressing the auditory attention classification problem. The EEG data utilized in this study originate from an experiment where subjects were instructed to selectively attend to one of two spatially separated voices presented simultaneously. Main results. Significant improvement in classification accuracy was observed when poor data from one subject were transported to the domain of good data from different subjects, as compared to the baseline. The mean classification accuracy for subjects with poor data increased from 45.84% to 67.92%. Specifically, the highest classification accuracy achieved for one subject reached 83.33%, a substantial increase from the baseline accuracy of 43.33%. Significance. The findings of our study demonstrate the improved classification performance achieved through the implementation of DA methods. This brings us a step closer to leveraging EEG in neuro-steered hearing devices.
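The parallel-transport step operates on the geometry of EEG trial covariance matrices. The sketch below implements only a simplified stand-in: re-centering each subject's trial covariances at the identity by whitening with the arithmetic mean covariance. The actual framework transports data between specific source and target domains on the manifold of covariance matrices; all shapes and names here are illustrative.

```python
import numpy as np

def recenter(trials):
    """Whiten a subject's EEG trials by their mean trial covariance.

    trials : array of shape (n_trials, n_channels, n_samples).
    Re-centers the subject's covariance geometry at the identity; a simplified
    stand-in for parallel transport between domains, not the paper's method.
    """
    covs = np.array([t @ t.T / t.shape[1] for t in trials])
    mean_cov = covs.mean(axis=0)                           # arithmetic mean as a proxy
    evals, evecs = np.linalg.eigh(mean_cov)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T    # inverse square root
    return np.array([whitener @ t for t in trials])

# Toy data: two "subjects" whose channel covariances differ in scale
rng = np.random.default_rng(3)
subject_a = rng.standard_normal((20, 8, 256))
subject_b = 2.0 * rng.standard_normal((20, 8, 256))

aligned_a, aligned_b = recenter(subject_a), recenter(subject_b)
# After re-centering, both subjects' mean covariances are close to the identity,
# so a classifier trained on one domain transfers more readily to the other.
```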

F. Bachmann, Joshua P. Kulasingham, Kasper Eskelund, Martin Enqvist, E. Alickovic, H. Innes-Brown

The auditory brainstem response (ABR) is a valuable clinical tool for objective hearing assessment; it is conventionally detected by averaging neural responses to thousands of short stimuli. Progressing beyond these unnatural stimuli, brainstem responses to continuous speech presented via earphones have recently been detected using linear temporal response functions (TRFs). Here, we extend earlier studies by measuring subcortical responses to continuous speech presented in the sound-field, and assess the amount of data needed to estimate brainstem TRFs. Electroencephalography (EEG) was recorded from 24 normal hearing participants while they listened to clicks and stories presented via earphones and loudspeakers. Subcortical TRFs were computed after accounting for non-linear processing in the auditory periphery by either stimulus rectification or an auditory nerve model. Our results demonstrated that subcortical responses to continuous speech could be reliably measured in the sound-field. TRFs estimated using auditory nerve models outperformed simple rectification, and 16 minutes of data was sufficient for the TRFs of all participants to show clear wave V peaks for both earphone and sound-field stimuli. Subcortical TRFs to continuous speech were highly consistent across earphone and sound-field conditions, and with click ABRs. However, sound-field TRFs required slightly more data (16 minutes) to achieve clear wave V peaks compared to earphone TRFs (12 minutes), possibly due to effects of room acoustics. By investigating subcortical responses to sound-field speech stimuli, this study lays the groundwork for bringing objective hearing assessment closer to real-life conditions, which may lead to improved hearing evaluations and smart hearing technologies.
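Of the two predictor types mentioned here, stimulus rectification is the simpler one: the speech waveform is half-wave rectified and resampled to the EEG rate before TRF estimation. A minimal sketch under assumed sampling rates is given below; the auditory nerve model alternative is far more involved and is not shown.

```python
import numpy as np
from scipy.signal import resample_poly

def rectified_predictor(speech, fs_audio=48000, fs_eeg=4800):
    """Half-wave rectify the speech waveform and resample it to the EEG rate.

    A crude stand-in for peripheral nonlinearity; the sampling rates are
    assumptions, not the values used in the study.
    """
    rectified = np.maximum(speech, 0.0)
    return resample_poly(rectified, up=fs_eeg, down=fs_audio)

# Example with one second of noise standing in for a speech waveform
predictor = rectified_predictor(np.random.default_rng(5).standard_normal(48000))
print(predictor.shape)   # (4800,)
```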

Joshua P. Kulasingham, F. Bachmann, Kasper Eskelund, M. Enqvist, H. Innes-Brown, E. Alickovic

Perception of sounds and speech involves structures in the auditory brainstem that rapidly process ongoing auditory stimuli. The role of these structures in speech processing can be investigated by measuring their electrical activity using scalp-mounted electrodes. However, typical analysis methods involve averaging neural responses to many short repetitive stimuli that bear little relevance to daily listening environments. Recently, subcortical responses to more ecologically relevant continuous speech were detected using linear encoding models. These methods estimate the temporal response function (TRF), which is a regression model that minimises the error between the measured neural signal and a predictor derived from the stimulus. Using predictors that model the highly non-linear peripheral auditory system may improve linear TRF estimation accuracy and peak detection. Here, we compare predictors from both simple and complex peripheral auditory models for estimating brainstem TRFs on electroencephalography (EEG) data from 24 participants listening to continuous speech. We also discuss the data length required for estimating subcortical TRFs with clear peaks. Interestingly, predictors from simple models resulted in TRFs that were similar to those estimated using complex models, and were much faster to compute. This work paves the way for efficient modelling and detection of subcortical processing of continuous speech, which may lead to improved diagnosis metrics for hearing impairment and assistive hearing technology.

Oskar Keding, E. Alickovic, Martin A. Skoglund, Maria Sandsten

Coherence estimation between the speech envelope and electroencephalography (EEG) is a proven method in neural speech tracking. This paper proposes an improved coherence estimation algorithm which utilises phase-sensitive multitaper cross-spectral estimation. Estimated EEG coherence differences between attended and ignored speech envelopes for a hearing-impaired (HI) population are evaluated and compared. Testing was performed on 31 HI subjects and showed significant coherence differences for grand averages over the delta, theta, and alpha EEG bands. The increased coherence for attended speech was more strongly significant with the new method than with the traditional method. The new method of estimating EEG coherence improves statistical detection performance and enables more rigorous data-based hypothesis-testing results.

Sara Carta, E. Alickovic, Johannes Zaar, Alejandro López Valdés, Giovanni M. Di Liberto

Hearing impairment alters the sound input received by the human auditory system, reducing speech comprehension in noisy multi-talker auditory scenes. Despite such challenges, attentional modulation of envelope tracking in multi-talker scenarios is comparable between normal hearing (NH) and hearing impaired (HI) participants, with previous research suggesting an over-representation of the speech envelopes in HI individuals (see, e.g., Fuglsang et al. 2020 and Presacco et al. 2019), even though HI participants reported difficulties in performing the task. This result raises an important question: What speech-processing stage could reflect the difficulty in attentional selection, if not envelope tracking? Here, we use scalp electroencephalography (EEG) to test the hypothesis that such difficulties are underpinned by an over-representation of phonological-level information of the ignored speech sounds. To do so, we carried out a re-analysis of an EEG dataset where EEG signals were recorded as HI participants fitted with hearing aids attended to one speaker (target) while ignoring a competing speaker (masker) and spatialised multi-talker background noise. Multivariate temporal response function analyses revealed that EEG signals reflect stronger phonetic-feature encoding for target than masker speech streams. Interestingly, robust EEG encoding of phoneme onsets emerged for both target and masker streams, in contrast with previous work on NH participants and in line with our hypothesis of an over-representation of the masker. Stronger phoneme-onset encoding emerged for the masker, pointing to a possible neural basis for the higher distractibility experienced by HI individuals. Significance statement: This study investigated the neural underpinnings of attentional selection in multi-talker scenarios in hearing-impaired participants. The impact of attentional selection on phonological encoding was assessed with electroencephalography (EEG) in an immersive multi-talker scenario. EEG signals encoded the phonetic features of the target (attended) speech more strongly than those of the masker (ignored) speech; interestingly, however, they encoded the phoneme onsets of both target and masker speech. This suggests that the cortex of hearing-impaired individuals may over-represent higher-level features of ignored speech sounds, which could contribute to their higher distractibility in noisy environments. These findings provide insight into the neural mechanisms underlying speech comprehension in hearing-impaired individuals and could inform the development of novel approaches to improve speech perception in noisy environments.

E. Alickovic, C. F. Mendoza, Andrew Segar, Maria Sandsten, Martin A. Skoglund

Recent studies of selective auditory attention have demonstrated that neural responses recorded with electroencephalogram (EEG) can be decoded to classify the attended talker in everyday multitalker cocktail-party environments. This is generally referred to as auditory attention decoding (AAD) and could pave the way for the next generation of hearing aids (HAs) to be cognitively controlled. The aim of this paper is to investigate whether cepstral analysis can be used as a more robust mapping between speech and EEG. Our preliminary analysis revealed an average AAD accuracy of 96%. Moreover, we observed a significant increase in auditory attention classification accuracy with our approach over the use of traditional AAD methods (a 7% absolute increase). Overall, our exploratory study could open a new avenue for developing new AAD methods to further advance hearing technology. We recognize that additional research is needed to elucidate the full potential of cepstral analysis for AAD.
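Cepstral analysis here means working with the (real) cepstrum of short frames rather than with raw envelopes. A minimal sketch of the cepstrum computation is given below; the frame length is a placeholder, and how the cepstra of speech and EEG frames are then mapped onto each other follows the paper, not this sketch.

```python
import numpy as np

def real_cepstrum(frame):
    """Real cepstrum of a signal frame: IFFT of the log magnitude spectrum."""
    spectrum = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)   # small floor avoids log(0)
    return np.fft.irfft(log_mag)

# Illustrative use on a random 512-sample frame standing in for speech or EEG
rng = np.random.default_rng(4)
ceps = real_cepstrum(rng.standard_normal(512))
print(ceps[:5])
```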

E. Alickovic, Tobias Dorszewski, T. Christiansen, Kasper Eskelund, L. Gizzi, Martin A. Skoglund, D. Wendt

Attending to the speech stream of interest in multi-talker environments can be a challenging task, particularly for listeners with hearing impairment. Research suggests that neural responses assessed with electroencephalography (EEG) are modulated by the listener's auditory attention, revealing selective neural tracking (NT) of the attended speech. NT methods mostly rely on hand-engineered acoustic and linguistic speech features to predict the neural response. Only recently have deep neural network (DNN) models without specific linguistic information been used to extract speech features for NT, demonstrating that speech features in hierarchical DNN layers can predict neural responses throughout the auditory pathway. In this study, we go one step further and investigate whether similar DNN speech models can predict neural responses to competing speech observed in EEG. We recorded EEG data using a 64-channel acquisition system from 17 listeners with normal hearing instructed to attend to one of two competing talkers. Our data revealed that EEG responses are significantly better predicted by DNN-extracted speech features than by hand-engineered acoustic features. Furthermore, analysis of hierarchical DNN layers showed that early layers yielded the highest prediction accuracies. Moreover, we found a significant increase in auditory attention classification accuracy with the use of DNN-extracted speech features over the use of hand-engineered acoustic features. These findings open a new avenue for the development of new NT measures to evaluate and further advance hearing technology.

Tirdad Seifi Ala, E. Alickovic, A. F. Cabrera, W. Whitmer, Lauren V. Hadley, M. Rank, T. Lunner, C. Graversen

Objective: The purpose of this study was to investigate alpha power as an objective measure of effortful listening to continuous speech with scalp and ear-EEG. Methods: Scalp and ear-EEG were recorded simultaneously during presentation of a 33-s news clip in the presence of 16-talker babble noise. Four different signal-to-noise ratios (SNRs) were used to manipulate task demand. The effects of changes in SNR on alpha event-related synchronization (ERS) and desynchronization (ERD) were investigated. Alpha activity was extracted from scalp EEG using different referencing methods (common average and symmetrical bipolar) in different regions of the brain (parietal and temporal) and from ear-EEG. Results: Alpha ERS decreased with decreasing SNR (i.e., increasing task demand) in both scalp and ear-EEG. Alpha ERS was also positively correlated with behavioural performance, which was assessed with questions regarding the content of the speech. Conclusion: Alpha ERS/ERD is better suited to tracking performance in a continuous speech task than to tracking listening effort. Significance: EEG alpha power during continuous speech may indicate how well the speech was perceived, and it can be measured with both scalp and ear-EEG.
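Alpha ERS/ERD is typically quantified as the percentage change in alpha-band (8-12 Hz) power during a task interval relative to a baseline interval. The sketch below follows that classic definition; the filter order, band edges, and interval indices are assumptions, not the study's exact pipeline.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def alpha_ers_erd(eeg, fs, baseline, window):
    """Alpha-band ERS/ERD as percentage power change relative to a baseline.

    eeg      : 1-D EEG signal from one (referenced) channel
    baseline : (start, stop) sample indices of the reference interval
    window   : (start, stop) sample indices of the task interval
    """
    b, a = butter(4, [8, 12], btype="bandpass", fs=fs)   # 8-12 Hz alpha band
    alpha = filtfilt(b, a, eeg)
    power = np.abs(hilbert(alpha)) ** 2                  # instantaneous alpha power
    p_ref = power[baseline[0]:baseline[1]].mean()
    p_task = power[window[0]:window[1]].mean()
    return 100.0 * (p_task - p_ref) / p_ref              # positive: ERS, negative: ERD
```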

Zlatan Ajanović, E. Alickovic, Aida Brankovic, Sead Delalic, Eldar Kurtic, S. Malikić, Adnan Mehonic, Hamza Merzic et al.

Artificial Intelligence (AI) is one of the most promising technologies of the 21st century, with an already noticeable impact on society and the economy. With this work, we provide a short overview of global trends, applications in industry, and selected use cases drawn from our international experience in industry and academia. The goal is to present global and regional positive practices and provide an informed opinion on realistic goals and opportunities for positioning B&H on the global AI scene.

Payam Shahsavari Baboukani, C. Graversen, E. Alickovic, Jan Østergaard

Objectives: Comprehension of speech in adverse listening conditions is challenging for hearing-impaired (HI) individuals. Noise reduction (NR) schemes in hearing aids (HAs) have demonstrated the capability to help HI individuals overcome these challenges. The objective of this study was to investigate the effect of NR processing (inactive, where the NR feature was switched off, vs. active, where the NR feature was switched on) on correlates of listening effort across two different background noise levels [+3 dB signal-to-noise ratio (SNR) and +8 dB SNR] by using a phase synchrony analysis of electroencephalogram (EEG) signals. Design: The EEG was recorded while 22 HI participants fitted with HAs performed a continuous speech-in-noise (SiN) task in the presence of background noise and a competing talker. The phase synchrony within eight regions of interest (ROIs) and four conventional EEG bands was computed by using a multivariate phase synchrony measure. Results: The results demonstrated that the activation of NR in HAs affects the EEG phase synchrony in the parietal ROI at low SNR differently than at high SNR. The relationship between the conditions of the listening task and phase synchrony in the parietal ROI was nonlinear. Conclusion: We showed that the activation of NR schemes in HAs can non-linearly reduce correlates of listening effort as estimated by EEG-based phase synchrony. We contend that investigation of the phase synchrony within ROIs can reflect the effects of HAs in HI individuals in ecological listening conditions.
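Phase synchrony within an ROI can be illustrated with a simple pairwise phase-locking value (PLV) averaged over channel pairs, computed from Hilbert-transform phases of band-pass filtered EEG. This is only a pairwise proxy; the study uses a dedicated multivariate phase synchrony measure, and the input shapes below are assumptions.

```python
import numpy as np
from scipy.signal import hilbert

def mean_pairwise_plv(roi_signals):
    """Average pairwise phase-locking value across channels in one ROI.

    roi_signals : array (n_channels, n_samples) of band-pass filtered EEG.
    """
    phases = np.angle(hilbert(roi_signals, axis=1))   # instantaneous phase per channel
    n = phases.shape[0]
    plvs = []
    for i in range(n):
        for j in range(i + 1, n):
            diff = phases[i] - phases[j]
            plvs.append(np.abs(np.mean(np.exp(1j * diff))))
    return float(np.mean(plvs))

# Toy example: 4 channels of noise standing in for one band-passed ROI
rng = np.random.default_rng(6)
print(mean_pairwise_plv(rng.standard_normal((4, 1000))))
```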
