Sead Delalić

University of Sarajevo

Research field: Applied mathematics

Sead Delalic, Michael Kaca, Pratomo Alimsijah, Noah Weber, Elmedin Selmanovic, Mikailynn Galindez, Glen Marquez, Francisco Balmaceda, Eldina Delalić et al.

The reproducibility crisis and translational gap in preclinical research underscore the need for more accurate and reliable methods of health monitoring in animal models. Manual testing is labor-intensive, low-throughput, prone to human bias, and often stressful for animals. Although many smart cages have been introduced, they have seen limited adoption due to low throughput (being limited to single animals), low data density (only a few metrics), high costs, a need for new space or infrastructure in the vivarium, high complexity of use, or a combination of these. Although technologies for video-based single-animal tracking have matured, no existing technology enables robust and accurate multi-animal tracking in standard home cages. To solve these problems, we built a new type of assay device: the Smart Lid. Smart Lids mount to existing racks above standard home cages and stream video and audio data, turning regular racks into high-throughput monitoring platforms. To solve the multi-animal tracking problem, we developed a new computer vision pipeline (MOT - Multi-Organism Tracker) along with a new ear tag purpose-designed for computer vision tracking. MOT achieves over 97% accuracy in multi-animal tracking while maintaining an affordable runtime cost (less than $100 per month). The pipeline returns 21 health-related metrics, covering activity, feeding, drinking, rearing, climbing, fighting, cage positioning, social interactions and sleeping, with additional metrics under development.

Conformers have shown great results in speech processing due to their ability to capture both local and global interactions. In this work, we utilize a self-supervised contrastive learning framework to train conformer-based encoders that are capable of generating unique embeddings for small segments of audio, generalizing well to previously unseen data. We achieve state-of-the-art results for audio retrieval tasks while using only 3 seconds of audio to generate embeddings. Our models are almost completely immune to temporal misalignments and achieve state-of-the-art results in cases of other audio distortions such as noise, reverb or extreme temporal stretching. Code and models are made publicly available and the results are easy to reproduce as we train and test using popular and freely available datasets of different sizes.
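The abstract above does not name the contrastive objective it uses. As a minimal sketch, assuming an InfoNCE-style loss (one common choice, not confirmed by the source), the idea of pulling an audio segment towards its distorted copy and away from the other segments in a batch could look like the following. The function names and the temperature value are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (assumed objective): anchors[i] should match positives[i].

    Every other positive in the batch acts as a negative for anchor i.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)
```

In this sketch the loss is close to zero when each embedding is most similar to its own augmented copy, and grows when a segment is closer to a different segment's copy.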

Sead Delalic, Rijad Mutapčić, Irhad Fejzić

The Vehicle Routing Problem (VRP) is among the most complex optimization problems. Practical solutions require addressing real-world constraints such as time windows, vehicle capacities, delivery restrictions, driver working hours, and heterogeneous vehicle fleets. Solutions are often implemented in two stages: the first involves clustering customers, while the second focuses on incremental routing of these clusters to reduce complexity and improve solution control and explainability. However, the second stage heavily depends on the quality of the first, and clustering methods vary depending on client requirements. This paper explores various clustering methods and their impact on the final routing results, with a focus on real-world examples. The study includes diverse client scenarios, ranging from small-scale distribution systems with a limited number of customers to large-scale operations managing more than a thousand deliveries daily, covering both small and large orders. From fixed clustering and geographic partitioning to dynamic clustering algorithms and hybrid approaches, the advantages and limitations of each method are analyzed. The findings aim to provide actionable insights into selecting clustering methods that align with specific use cases, ensuring enhanced efficiency and adaptability in practical applications.
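The two-stage scheme described above can be illustrated with a small sketch. It is not the paper's method: it assumes a sweep-style geographic clustering (customers ordered by polar angle around the depot, with a cluster closed when vehicle capacity would be exceeded) followed by a nearest-neighbour route per cluster. The customer fields `id`, `x`, `y` and `demand` are hypothetical.

```python
import math


def sweep_clusters(depot, customers, capacity):
    """Stage 1 (assumed sweep heuristic): group customers by polar angle around the depot."""
    ordered = sorted(
        customers,
        key=lambda c: math.atan2(c["y"] - depot[1], c["x"] - depot[0]),
    )
    clusters, current, load = [], [], 0
    for c in ordered:
        # Close the current cluster when adding this customer would exceed capacity.
        if current and load + c["demand"] > capacity:
            clusters.append(current)
            current, load = [], 0
        current.append(c)
        load += c["demand"]
    if current:
        clusters.append(current)
    return clusters


def nearest_neighbour_route(depot, cluster):
    """Stage 2 (assumed heuristic): visit the closest unvisited customer next."""
    route, pos, remaining = [], depot, list(cluster)
    while remaining:
        nxt = min(remaining, key=lambda c: math.dist(pos, (c["x"], c["y"])))
        remaining.remove(nxt)
        route.append(nxt["id"])
        pos = (nxt["x"], nxt["y"])
    return route
```

Because each route is built only from the customers in its cluster, a poor split in the first stage cannot be repaired in the second, which is the dependency the abstract points out.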

Sead Delalic, Zinedin Kadric, Jana Jerkić, Faris Mehmedović

This paper addresses the challenge of analyzing CVs to parse their content into structured formats suitable for further processing and analysis. The proposed solution processes CVs provided as images or PDFs, handling diverse input formats, including free-form, multi-language, non-standardized layouts, and highly structured documents. Various heuristic approaches are employed for layout analysis, complemented by lightweight language models for extracting information. While multimodal models demonstrate strong performance, their cost and deployment complexity remain significant barriers. This study explores alternative methods optimized for computational efficiency, processing accuracy, and easier deployment. A comparative analysis of approaches is conducted on a standard dataset containing CVs from diverse clients and job roles, ranging from entry-level to specialized positions in various domains. The findings highlight the potential of these tailored, efficient solutions for scalable and secure CV parsing.

Sead Delalic, Samra Behić, Harun Goralija, Zenan Sabanac

Warehouse Management Systems (WMS) employ advanced optimization techniques to enhance efficiency and streamline processes, from inventory positioning to order picking and packing. Among these, order picking represents the most time-consuming and resource-intensive operation. This paper presents a novel approach for monitoring worker efficiency in warehouses, focusing on estimating the complexity and time required for order picking. A variety of factors influence these estimates, including item location, quantity, dimensions and weight of items, picking sequence, and whether the location is in the stock or picking zone. Accurate estimation enables effective daily work planning and real-time monitoring of worker productivity, and improves overall warehouse efficiency. The proposed approach has been tested in real-world warehouse environments, demonstrating its practical applicability and potential to significantly improve worker performance, resource allocation, and operational management.

Zlatan Ajanović, Hamza Merzić, Suad Krilasević, Eldar Kurtic, Bakir Kudić, Rialda Spahić, E. Alickovic, Aida Brankovic, Kenan Sehic et al.

In this paper, we analyze examples of research institutes that stand out in scientific excellence and social impact. We define key practices for evaluating research results, economic conditions, and the selection of specific research topics. Special focus is placed on small countries and the field of artificial intelligence. The aim is to identify components that enable institutes to achieve a high level of innovation, self-sustainability, and social benefits.

Elmedin Selmanovic, Emin Mulaimović, Sead Delalic, Zinedin Kadric, Zenan Sabanac

Many deep-learning computer vision systems analyse objects not previously observed by the system. However, such tasks can be simplified if the objects are marked beforehand. A straightforward method for marking is printing 2D symbols and attaching them to the objects. Selecting these symbols can affect the performance of the CV system, as similar symbols may require extended training time and a larger training dataset. It is possible to find good symbols that a given neural network differentiates easily. Still, there have been no efforts in the literature to generalise such findings, and it is not known whether the symbols optimal for one network would work just as well in another. We explored how transferable symbol selection is between networks. To this end, 30 sets of randomly selected and augmented symbols were classified by five neural networks. Each network was given the same training dataset and the same training time. Results were ranked and compared, which allowed the identification of networks that performed similarly, so that the symbol selection could be generalised between them.
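The abstract does not state how the rankings were compared. One plausible way to quantify whether two networks rank the symbol sets similarly is Spearman's rank correlation, sketched below as an assumption rather than the authors' procedure. The function names and the example scores are hypothetical, and the simple formula used here assumes no tied scores.

```python
def ranks(scores):
    """Rank scores from best (1) to worst (n); assumes no ties."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(scores_a, scores_b):
    """Spearman rank correlation between two networks' scores on the same symbol sets."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical accuracies of two networks on the same four symbol sets.
net_a = [0.90, 0.80, 0.70, 0.60]
net_b = [0.95, 0.85, 0.75, 0.65]
similarity = spearman(net_a, net_b)
```

A value near 1 would suggest that symbol sets which work well for one network also work well for the other, so a selection could plausibly be carried over between them.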

In telecommunications and cloud communications, detecting accurately and in real time whether a human or an answering machine has answered an outbound call is of paramount importance. The problem is particularly significant during campaigns, where precise caller identification improves service quality and efficiency and reduces costs. Despite its significance, the problem remains inadequately explored in the existing literature. This paper presents an innovative approach to answering machine detection that leverages transfer learning through the YAMNet model for feature extraction. The YAMNet architecture facilitates the training of a recurrent-based classifier, enabling real-time processing of audio streams, as opposed to fixed-length recordings. The results demonstrate an accuracy of over 96% on the test set. Furthermore, we conduct an in-depth analysis of misclassified samples and reveal that an accuracy exceeding 98% can be achieved with the integration of a silence detection algorithm, such as the one provided by FFmpeg.
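To illustrate what a silence detection step does, the following is a minimal threshold-based sketch in pure Python. It is a stand-in for illustration only, not FFmpeg's implementation and not the pipeline from the paper: it assumes samples are floats in the range [-1, 1], and the parameter names and default values are hypothetical.

```python
import math


def detect_silence(samples, sample_rate, noise_db=-30.0, min_duration=0.5, frame_ms=20):
    """Return (start, end) times in seconds of runs quieter than noise_db.

    A run is reported only if it lasts at least min_duration seconds.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    n_frames = len(samples) // frame_len
    intervals, start = [], None
    for k in range(n_frames):
        frame = samples[k * frame_len:(k + 1) * frame_len]
        # Root-mean-square level of the frame, converted to dB relative to full scale.
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        level_db = 20 * math.log10(rms) if rms > 0 else float("-inf")
        t = k * frame_len / sample_rate
        if level_db < noise_db:
            if start is None:
                start = t  # a quiet run begins here
        else:
            if start is not None and t - start >= min_duration:
                intervals.append((start, t))
            start = None
    # Close a quiet run that extends to the end of the signal.
    end = n_frames * frame_len / sample_rate
    if start is not None and end - start >= min_duration:
        intervals.append((start, end))
    return intervals
```

The detected intervals could then be passed to a classifier as an extra signal, for example to flag calls where a long pause follows the greeting.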

Audio fingerprinting techniques have seen great advances in recent years, enabling accurate and fast audio retrieval even in conditions when the queried audio sample has been highly deteriorated or recorded in noisy conditions. Expectedly, most of the existing work is centered around music, with popular music identification services such as Apple’s Shazam or Google’s Now Playing designed for individual audio recognition on mobile devices. However, the spectral content of speech differs from that of music, necessitating modifications to current audio fingerprinting approaches. This paper offers fresh insights into adapting existing techniques to address the specialized challenge of speech retrieval in telecommunications and cloud communications platforms. The focus is on achieving rapid and accurate audio retrieval in batch processing instead of facilitating single requests, typically on a centralized server. Moreover, the paper demonstrates how this approach can be utilized to support audio clustering based on speech transcripts without undergoing actual speech-to-text conversion. This optimization enables significantly faster processing without the need for GPU computing, a requirement for real-time operation that is typically associated with state-of-the-art speech-to-text tools.
