
Publications (47)

G. Valenzise, Andrei I. Purica, Vedad Hulusic, Marco Cagnazzo

Image compression standards rely on predictive coding, transform coding, quantization and entropy coding, in order to achieve high compression performance. Very recently, deep generative models have been used to optimize or replace some of these operations, with very promising results. However, so far no systematic and independent study of the coding performance of these algorithms has been carried out. In this paper, for the first time, we conduct a subjective evaluation of two recent deep-learning-based image compression algorithms, comparing them to JPEG 2000 and to the recent BPG image codec based on HEVC Intra. We found that compression approaches based on deep auto-encoders can achieve coding performance higher than JPEG 2000, and sometimes as good as BPG. We also show experimentally that the PSNR metric is to be avoided when evaluating the visual quality of deep-learning-based methods, as their artifacts have different characteristics from those of DCT or wavelet-based codecs. In particular, images compressed at low bitrate appear more natural than JPEG 2000 coded pictures, according to a no-reference naturalness measure. Our study indicates that deep generative models are likely to bring huge innovation into the video coding arena in the coming years.
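For readers unfamiliar with the PSNR metric mentioned in the abstract above, the following is a minimal sketch of its standard definition, 10·log10(peak² / MSE). The function name `psnr`, the flat-list pixel representation, and the default 8-bit peak of 255 are illustrative assumptions and are not taken from the paper.

```python
import math


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences.

    Assumption: pixels are given as flat sequences of numbers and `peak` is the
    maximum possible pixel value (255 for 8-bit images).
    """
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR depends only on per-pixel differences, it takes no account of how natural an image looks, which fits the abstract's warning about using it for deep-learning-based codecs.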

Hiba Yousef, J. L. Feuvre, G. Valenzise, Vedad Hulusic

The demand for very high-resolution video content in entertainment services (4K, 8K, panoramic, 360 VR) puts an increasing load on the distribution network. In order to reduce network usage in the existing delivery infrastructure for such services while keeping a good quality of experience, dynamic spatial video adaptation at the client side is seen as a key feature, and is actively investigated by academia and industry. However, the impact of spatial adaptation on quality perception is not clear. In this paper, we propose a methodology for the evaluation of such adapted content, conduct a series of perceived quality measurements and discuss results showing potential benefits and drawbacks of the technique. Based on our results, we also propose a signaling mechanism in MPEG-DASH to assist the client in its spatial adaptation logic.

Kurt Debattista, Keith Bugeja, Sandro Spina, Thomas Bashford-Rogers, Vedad Hulusic

Maximizing performance for rendered content requires making compromises on quality parameters depending on the computational resources available. Yet, it is currently unclear which parameters best maximize perceived quality. This work investigates perceived quality across computational budgets for the primary spatiotemporal parameters of resolution and frame rate. Three experiments are conducted. Experiment 1 (n = 26) shows that participants prefer fixed frame rates of 60 frames per second (fps) at lower resolutions over 30 fps at higher resolutions. Experiment 2 (n = 24) explores the relationship further with more budgets and quality settings and again finds 60 fps is generally preferred even when more resources are available. Experiment 3 (n = 25) permits the use of adaptive frame rates, and analyses the resource allocation across seven budgets. Results show that while participants allocate more resources to frame rate at lower budgets, the situation reverses once higher budgets are available and a frame rate of around 40 fps is achieved. Overall, the results demonstrate a complex relationship between the effects of frame rate and resolution on perceived quality. This relationship can be harnessed, via the results and models presented, to obtain more cost-effective virtual experiences.

Emin Zerman, Vedad Hulusic, G. Valenzise, Rafał K. Mantiuk, F. Dufaux

Subjective quality assessment is considered a reliable method for assessing the quality of distorted stimuli in several multimedia applications. The experimental methods can be broadly categorized into those that rate and those that rank stimuli. Although ranking directly provides an order of stimuli rather than a continuous measure of quality, the experimental data can be converted using scaling methods into an interval scale, similar to that provided by rating methods. In this paper, we compare the results collected in a rating (mean opinion scores) experiment to the scaled results of a pairwise comparison experiment, the most common ranking method. We find a strong linear relationship between the results of both methods, which, however, differs across contents. To improve the relationship and unify the scale, we extend the experiment to include cross-content comparisons. We find that the cross-content comparisons not only reduce the confidence intervals for pairwise comparison results, but also improve the relationship with mean opinion scores.

Vedad Hulusic, G. Valenzise, F. Dufaux

Computing the dynamic range of high dynamic range (HDR) content is an important procedure when selecting test material, designing and validating algorithms, or analyzing aesthetic attributes of HDR content. It can be computed on a pixel-based level, measured through subjective tests or predicted using a mathematical model. However, all these methods have certain limitations. This paper investigates whether the dynamic range of modeled images with no semantic information, but with the same first-order statistics as the original, natural content, is perceived the same as for the corresponding natural images. If so, it would be possible to improve the perceived dynamic range (PDR) predictor model by using additional objective metrics, more suitable for such synthetic content. Within the subjective study, three experiments were conducted with 43 participants. The results show significant correlation between the mean opinion scores for the two image groups. Nevertheless, natural images still seem to provide better cues for evaluation of PDR.

D. Kane, Antoine Grimaldi, Emin Zerman, M. Bertalmío, Vedad Hulusic, G. Valenzise

The dynamic range of real-world scenes may vary from around 10^2 to greater than 10^7, whilst the dynamic range of monitors may vary from 10^2 to 10^5. In this paper, we investigate the impact of the dynamic range ratio (DRratio) between the captured scene and the displayed image upon the value of system gamma (a simple global power law transformation applied to the image) preferred by subjects. To do so, we present an image dataset with a broad distribution of dynamic ranges upon various subranges of a SIM2 monitor. The full dynamic range of the monitor is 10^5 and we present images using either the full range, 75% or 50% of this, while maintaining a fixed mid-luminance level. We find that the preferred system gamma is inversely correlated with the DRratio and, importantly, is one (linear) when the DRratio is one. This strongly suggests that the visual system is optimized for processing images only when the dynamic range is presented correctly. The DRratio is not the only factor. By using 50% of the monitor dynamic range and using either the lower, middle or upper portion of the monitor, we show that increasing the overall luminance level also increases the preferred system gamma, although to a lesser extent than the DRratio.
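The two quantities discussed in the abstract above can be illustrated with a minimal sketch. It assumes that dynamic range is expressed as a max/min luminance ratio, that the DRratio is the scene dynamic range divided by the displayed dynamic range, and that system gamma is applied to luminance normalized to [0, 1]. The function names and these conventions are illustrative assumptions, not the authors' exact definitions.

```python
def dynamic_range_ratio(scene_min, scene_max, display_min, display_max):
    """Scene dynamic range divided by displayed dynamic range.

    Assumption: each dynamic range is the ratio of maximum to minimum luminance.
    """
    scene_dr = scene_max / scene_min
    display_dr = display_max / display_min
    return scene_dr / display_dr


def apply_system_gamma(luminance, gamma):
    """Apply a global power law to luminance values normalized to [0, 1]."""
    return [value ** gamma for value in luminance]


# A scene whose dynamic range matches the display gives a ratio of one,
# and a gamma of one leaves the image unchanged (the linear case).
ratio = dynamic_range_ratio(1.0, 1e5, 1.0, 1e5)
unchanged = apply_system_gamma([0.0, 0.25, 1.0], 1.0)
```

Under these assumptions, a gamma below one lifts mid-tones (for example, 0.25 becomes 0.5 at gamma 0.5), which compresses a scene whose dynamic range exceeds that of the display.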

Vedad Hulusic, G. Valenzise, Jean C. Gicquel, Jérôme Fournier, F. Dufaux

A key factor in determining the quality of experience (QoE) of a video is its capability to convey the large spectrum of perceptual phenomena that our eyes can sense in real life. In order to meet this demand, the recent DVB UHD-1 Phase 2 specification employs new video features, such as higher spatial resolutions (4K/8K) and High Frame Rate (HFR). The first enables a larger field of view and level of detail, while the second offers sharper images of moving objects, going well beyond the current frame rates. While the contribution of each of these technologies to QoE has been investigated individually, in this paper we are interested in studying their interaction, and in quantifying the benefits to users from their combination. To this end, we conduct a subjective test on compressed UHD+HFR content on a recent display capable of reproducing 100 pictures per second at 2160p resolution, with the goal of assessing the increase in QoE of UHD and HFR with respect to conventional video, both individually and in combination. The results indicate that for content with fast motion, at higher bitrates the combination of UHD and HFR significantly improves the QoE compared to that obtained when these features are used individually.

Nirvana Pistoljevic, Vedad Hulusic

Children diagnosed with autism spectrum disorder (ASD), one of the most complex neurodevelopmental disabilities, are characterized by different brain and functional development, distinct interaction with the environment and different learning patterns, language and social skills impairments, and repetitive, auto-stimulating, restricting behaviors. It has been shown that computer-assisted intervention is much more attention-captivating and interesting to children than a classic approach to teaching, allowing for faster acquisition of skills. This makes these tools and the technology highly suitable for teaching children with autism basic developmental skills. In addition, interactive electronic books have shown positive outcomes for comprehension and information acquisition in children with ASD, while decreasing inappropriate behavior in the classroom. In this paper, a pilot user study on an e-book with an embedded educational game for children with developmental disorders is presented. The results show that the e-book can be efficiently used for teaching children with ASD basic developmental skills and that the learned skills can be efficiently transferred to new media and environments. The framework will provide preschool children with and without disabilities with appropriate educational software, to build up their early cognitive abilities and school readiness skills, and promote incorporating technology as part of the educational and pedagogical process in schools.

Vedad Hulusic, Nirvana Pistoljevic

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder, detectable early in development and characterized by the lack of socialization, development of language and patterns of rigid, repetitive, auto-stimulating behaviors that interfere with the overall functioning of a person. Due to a reduced level of attention and a different style of learning, teaching children with ASD requires a particular set of tools and methods. Studies have shown that computer-based intervention, typically in the form of serious games, can be effectively utilized for developing various skills, allowing children with disabilities both to learn with teachers and to practice on their own time, when the taught concepts are presented in a fun, informal, and engaging way. Nonetheless, there is a limited number of appropriately designed serious games for children with ASD, especially in the less widely spoken languages native to the children. In this paper we present a complete curriculum for final-year Computer Science (CS) undergraduate students, aimed at developing web-based serious games for teaching children with and without autism basic concepts. In addition, we present multiple outcomes of such a course taught by the authors, a computer scientist and a psychologist and special educator. We believe that inclusion of such a curriculum in CS undergraduate programs could benefit the students, children with ASD, teachers of both groups and the community in general.

Emin Zerman, Vedad Hulusic, G. Valenzise, Rafał K. Mantiuk, F. Dufaux

High dynamic range (HDR) technology allows for capturing and delivering a greater range of luminance levels compared to traditional video using standard dynamic range (SDR). At the same time, it has brought multiple challenges in content distribution, one of them being video compression. While there has been a significant amount of work conducted on this topic, there are some aspects that could still benefit this area. One such aspect is the choice of color space used for coding. In this paper, we evaluate through a subjective study how the performance of HDR video compression is affected by three color spaces: the commonly used Y'CbCr, and the recently introduced ITP (ICtCp) and Ypu'v'. Five video sequences are compressed at four bit rates, selected in a preliminary study, and their quality is assessed using pairwise comparisons. The results of pairwise comparisons are further analyzed and scaled to obtain quality scores. We found no evidence of ITP improving compression performance over Y'CbCr. We also found that Ypu'v' results in a moderately lower performance for some sequences.

Vedad Hulusic, G. Valenzise, Kurt Debattista, F. Dufaux

High dynamic range (HDR) imaging has become an important topic in both academic and industrial domains. Nevertheless, the concept of dynamic range (DR), which underpins HDR, and the way it is measured are still not clearly understood. The current approach to measuring DR results in a poor correlation with perceptual scores (r ≈ 0.6). In this paper, we analyze the limitations of the existing DR measure, and propose several options to predict subjective DR judgments more accurately. Compared to the traditional DR estimates, the proposed measures show significant improvements in Spearman's and Pearson's correlations with subjective data (up to r ≈ 0.9). Despite their straightforward nature, these improvements are particularly evident in specific cases, where the scores obtained by using the classical measure have the highest error compared to the perceptual mean opinion score.

Vedad Hulusic, G. Valenzise, E. Provenzi, Kurt Debattista, F. Dufaux

Although high dynamic range (HDR) imaging has gained great popularity and acceptance in both the scientific and commercial domains, the relationship between perceptually accurate, content-independent dynamic range and objective measures has not been fully explored. In this paper, a new methodology for perceived dynamic range evaluation of complex stimuli in HDR conditions is proposed. A subjective study with 20 participants was conducted and correlations between mean opinion scores (MOS) and three image features were analyzed. Strong Spearman correlations between MOS and objective DR measure and between MOS and image key were found. An exploratory analysis reveals that additional image characteristics should be considered when modeling perceptually-based dynamic range metrics. Finally, one of the outcomes of the study is the perceptually annotated HDR image dataset with MOS values, that can be used for HDR imaging algorithms and metric validation, content selection and analysis of aesthetic image attributes.

Virtual museums enable Internet users to explore museum collections online. The question is how to enhance the viewer's experience and learning in such environments. In the Sarajevo Survival Tools virtual museum we introduced a new concept of interactive digital storytelling that enables visitors to explore the virtual exhibits - objects from the siege of Sarajevo - guided by a digital story. This way the virtual museum visitors learn about the context of the displayed objects and are motivated to explore all of them. In this paper we present the virtual environment we developed and our experience with it. The results from three empirical studies we conducted indicate the positive influence of digital storytelling and sound effects on visitors' perceptual response, resulting in increased motivation and enjoyment, and more effective information conveyance.
