
Elmedin Selmanović

University of Sarajevo


Sead Delalic, Zinedin Kadrić, Elmedin Selmanovic, Emin Mulaimović, E. Kadusic

Deep learning techniques in computer vision (CV) tasks such as object detection, classification, and tracking can be facilitated using predefined markers on those objects. The choice of markers can affect the performance of the tracking algorithms: an algorithm may confuse similar markers more often and, therefore, require more training data and training time. Still, the issue of marker selection has not been explored in the literature and seems to be glossed over throughout the process of designing CV solutions. This research considered the effects of symbol selection for 2D-printed markers on the neural network’s performance. The study assessed over 250 ALT code symbols readily available on most consumer PCs and provided a go-to selection for effectively tracking n objects. To this end, a neural network was trained to classify all the symbols and their augmentations, after which the confusion matrix was analysed to extract the symbols that the network distinguished best. The results showed that symbols selected in this way performed better than randomly selected symbols and a selection of common symbols. Furthermore, the methodology presented in this paper can easily be applied to a different set of symbols and different neural network architectures.
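
The confusion-matrix step described above can be illustrated with a minimal sketch. It assumes a greedy heuristic that repeatedly adds the class least confused with the classes already chosen; the abstract does not state the paper's actual extraction procedure, and the function names and the small matrix below are hypothetical.

```python
def pairwise_confusion(cm, i, j):
    """Mistakes between classes i and j, counted in both directions."""
    return cm[i][j] + cm[j][i]


def select_symbols(cm, n):
    """Greedily pick n class indices with little mutual confusion.

    cm[i][j] is the number of samples of true class i predicted as class j.
    This greedy rule is an assumption for illustration, not the paper's method.
    """
    k = len(cm)
    if not 0 < n <= k:
        raise ValueError("n must be between 1 and the number of classes")

    def total_errors(i):
        return sum(pairwise_confusion(cm, i, j) for j in range(k) if j != i)

    # Start from the class involved in the fewest mistakes overall.
    chosen = [min(range(k), key=total_errors)]
    while len(chosen) < n:
        remaining = [i for i in range(k) if i not in chosen]
        # Add the class that is least confused with those already chosen.
        best = min(
            remaining,
            key=lambda i: sum(pairwise_confusion(cm, i, c) for c in chosen),
        )
        chosen.append(best)
    return sorted(chosen)


# Hypothetical 4-class confusion matrix: classes 0 and 1 are often swapped,
# classes 2 and 3 are occasionally swapped.
cm = [
    [9, 1, 0, 0],
    [2, 8, 0, 0],
    [0, 0, 10, 0],
    [0, 0, 1, 9],
]
selected = select_symbols(cm, 2)
```

With this toy matrix the sketch avoids pairing classes that are confused with each other; a real run would use the full confusion matrix over all symbols.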

Audio fingerprinting techniques have seen great advances in recent years, enabling accurate and fast audio retrieval even when the queried audio sample has been highly deteriorated or recorded in noisy conditions. Expectedly, most of the existing work is centered around music, with popular music identification services such as Apple’s Shazam or Google’s Now Playing designed for individual audio recognition on mobile devices. However, the spectral content of speech differs from that of music, necessitating modifications to current audio fingerprinting approaches. This paper offers fresh insights into adapting existing techniques to address the specialized challenge of speech retrieval in telecommunications and cloud communications platforms. The focus is on achieving rapid and accurate audio retrieval in batch processing, typically on a centralized server, rather than on serving single requests. Moreover, the paper demonstrates how this approach can be utilized to support audio clustering based on speech transcripts without undergoing actual speech-to-text conversion. This optimization enables significantly faster processing without the need for GPU computing, a requirement for real-time operation that is typically associated with state-of-the-art speech-to-text tools.

Elmedin Selmanovic, Emin Mulaimović, Sead Delalic, Zinedin Kadrić, Zenan Sabanac

Many deep-learning computer vision systems analyse objects not previously observed by the system. However, such tasks can be simplified if the objects are marked beforehand. A straightforward method for marking is printing 2D symbols and attaching them to the objects. Selecting these symbols can affect the performance of the CV system, as similar symbols may require extended training time and a larger training dataset. It is possible to find symbols that a given neural network differentiates easily. Still, there have been no efforts in the literature to generalise such findings, and it is not known whether the symbols optimal for one network would work just as well in another. We explored how transferable symbol selection is between networks. To this end, 30 sets of randomly selected and augmented symbols were classified by five neural networks. Each network was given the same training dataset and the same training time. Results were ranked and compared, which allowed the identification of networks that performed similarly, so that the symbol selection could be generalised between them.
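
One way to compare ranked results between two networks is a rank correlation. The sketch below assumes Spearman's coefficient as the comparison measure; the abstract only says that results were ranked and compared, so the measure, the function names, and the accuracy values are all illustrative assumptions.

```python
def ranks(scores):
    """Rank scores from best (1) to worst; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            result[idx] = average_rank
        pos = end + 1
    return result


def spearman(a, b):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined when all scores are tied")
    return cov / (var_a * var_b) ** 0.5


# Hypothetical accuracies of three networks on the same four symbol sets.
net_a = [0.91, 0.85, 0.78, 0.95]
net_b = [0.88, 0.80, 0.70, 0.93]  # orders the sets exactly like net_a
net_c = [0.70, 0.80, 0.88, 0.60]  # orders the sets in reverse

agreement_ab = spearman(net_a, net_b)
agreement_ac = spearman(net_a, net_c)
```

A coefficient near 1 would suggest that a symbol set that works well for one network also works well for the other; a value near -1 would suggest the opposite.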

In telecommunications and cloud communications, detecting accurately and in real time whether a human or an answering machine has answered an outbound call is of paramount importance. The problem is particularly significant during campaigns, where precise caller identification improves service quality and efficiency and reduces costs. Despite its significance, the problem remains inadequately explored in the existing literature. This paper presents an innovative approach to answering machine detection that leverages transfer learning through the YAMNet model for feature extraction. The YAMNet architecture facilitates the training of a recurrent-based classifier, enabling real-time processing of audio streams, as opposed to fixed-length recordings. The results demonstrate an accuracy of over 96% on the test set. Furthermore, we conduct an in-depth analysis of misclassified samples and reveal that an accuracy exceeding 98% can be achieved with the integration of a silence detection algorithm, such as the one provided by FFmpeg.

The number of loan requests is growing rapidly worldwide, representing a multi-billion-dollar business in the credit approval industry. Large volumes of data extracted from banking transactions, which represent customers’ behavior, are available, but processing loan applications is a complex and time-consuming task for banking institutions. In 2022, over 20 million Americans had open loans, totaling USD 178 billion in debt, although over 20% of loan applications were rejected. Numerous statistical methods have been deployed to estimate loan risks, raising the question of whether machine learning techniques can better predict the potential risks. To study the machine learning paradigm in this sector, the mental health dataset and loan approval dataset presenting survey results from 1991 individuals are used as inputs to experiment with the credit risk prediction ability of the chosen machine learning algorithms. Through a comprehensive comparative analysis, this paper shows how the chosen machine learning algorithms can distinguish between normal and risky loan customers who might never pay their debts back. The results from the tested algorithms show that XGBoost achieves the highest accuracy of 84% in the first dataset, surpassing gradient boost (83%) and KNN (83%). In the second dataset, random forest achieved the highest accuracy of 85%, followed by decision tree and KNN with 83%. Alongside accuracy, the precision, recall, and overall performance of the algorithms were tested, and a confusion matrix analysis was performed, producing numerical results that emphasized the superior performance of XGBoost and random forest in the classification tasks in the first dataset, and XGBoost and decision tree in the second dataset. Researchers and practitioners can rely on these findings to inform their model selection process and enhance the accuracy and precision of their classification models.
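
As a reminder of how the reported metrics relate to a confusion matrix, the following minimal sketch computes accuracy, precision, and recall for a binary "risky customer" classifier. The counts are made up for illustration and are not taken from the paper; the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall from binary confusion-matrix counts.

    tp: risky customers correctly flagged, fp: normal customers wrongly flagged,
    fn: risky customers missed, tn: normal customers correctly passed.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return {
        "accuracy": (tp + tn) / total,
        # Share of flagged customers who really are risky.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Share of risky customers the model managed to flag.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Illustrative counts only (not results from the study).
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
```

Two models with the same accuracy can differ in precision and recall, which is why the comparison above looks at all three.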

The vehicle routing problem is one of the most complex problems in the field of combinatorial optimization. Creating optimal routes leads to timely delivery of orders to end customers, which increases the efficiency of the company and enables maximum earnings. The problem of vehicle routing with a series of real-world constraints is called the rich vehicle routing problem (RVRP). The paper presents an approach to solving the RVRP that considers the asymmetric routing problem with a heterogeneous vehicle fleet, time windows, customer-vehicle constraints, and a number of other constraints. The approach solves the problem in two phases: it first divides customers into clusters using a discrete metaheuristic Bat algorithm and then solves the routing problem for each obtained cluster. The proposed approach was tested on 26 days of deliveries from large warehouses in Bosnia and Herzegovina. Significant savings were achieved compared to previously implemented approaches. All created routes were feasible. The approach creates routes automatically and gives results in a shorter time than previously used approaches. Running time does not increase significantly as the number of customers grows, which is a great advantage of the proposed approach.
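
The two-phase structure can be sketched as follows. This is only an illustration under strong assumptions: the clusters are passed in as given (standing in for the output of the discrete Bat algorithm, which is not reproduced here), each cluster is routed with a simple nearest-neighbour heuristic, and time windows, fleet heterogeneity, and customer-vehicle constraints are ignored. All names and the cost matrix are hypothetical.

```python
def nearest_neighbour_route(cost, depot, customers):
    """Depot -> customers -> depot, always driving to the cheapest unvisited customer.

    cost[a][b] is the travel cost from a to b and may differ from cost[b][a].
    """
    route, current, todo = [depot], depot, set(customers)
    while todo:
        # sorted() makes tie-breaking deterministic (lowest index wins).
        nxt = min(sorted(todo), key=lambda c: cost[current][c])
        route.append(nxt)
        todo.remove(nxt)
        current = nxt
    route.append(depot)
    return route


def route_cost(cost, route):
    """Total cost of driving the route in the given direction."""
    return sum(cost[a][b] for a, b in zip(route, route[1:]))


def cluster_then_route(cost, depot, clusters):
    """Phase 2 only: build one route per cluster produced by phase 1."""
    return [nearest_neighbour_route(cost, depot, cluster) for cluster in clusters]


# Hypothetical asymmetric cost matrix; node 0 is the depot.
cost = [
    [0, 2, 9, 4, 7],
    [3, 0, 8, 5, 6],
    [9, 7, 0, 6, 2],
    [4, 6, 5, 0, 8],
    [8, 6, 3, 7, 0],
]
clusters = [[1, 3], [2, 4]]  # assumed output of the clustering phase

routes = cluster_then_route(cost, 0, clusters)
costs = [route_cost(cost, r) for r in routes]
```

Splitting the customers first keeps each routing subproblem small, which is consistent with the abstract's remark that running time grows slowly with the number of customers.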

Many public figures, companies and associations plan events in different cities and, at the same time, have active profiles on social media. Choosing the best event venue requires processing a large amount of data and many different parameters. Social media captures a large number of fan actions per day. This paper describes the process of selecting the most appropriate cities in which to organize events, aided by data collected from social media. The problem is defined as a combinatorial optimization problem. A modified metaheuristic Bat algorithm was proposed, implemented, and described in detail to solve the problem. Although the original Bat algorithm is designed to solve continuous optimization problems, the implemented Bat algorithm is adapted to solve the defined problem. The algorithm is compared to the exhaustive search method for smaller instances, and to the greedy and genetic algorithms for larger instances. The algorithm was tested on benchmark data on cities in 20 European countries, as well as on real data collected from pages on the social network Facebook. The Bat algorithm has shown superior results compared to other techniques, both in time and in the quality of the solutions generated.

Distribution companies often store goods in large warehouses. Orders are collected and prepared for transport. Large-scale warehouses are often divided into sectors. Each worker collects a part of the order from the assigned sector. In that case, workers often pick small orders and the process is not optimal. Therefore, order batching is done, where one worker collects multiple orders at a time. In this paper, an innovative concept of order batching in a warehouse with 48-hour delivery, based on a metaheuristic approach, is described. The algorithm divides each order by sectors. Each part of the order is analysed, and the possibility of batching based on the order content is checked. The order batching is based on the discrete Bat algorithm. The transport scheme and the order of loading goods into the truck are taken into account. In the order picking process, a number of standard constraints such as weight and item priorities are considered. The concept has been implemented and tested for 50 days of warehouse operation in one of the largest warehouses in Bosnia and Herzegovina. The algorithm is compared with the earlier approach of collecting orders in the warehouse, and a significant improvement has been observed in the number of kilometers traveled on a daily basis.

Presentations of virtual cultural heritage artifacts are often communicated via the medium of interactive digital storytelling. The synergy of a storied narrative embedded within a 3D virtual reconstruction context has high consumer appeal and edutainment value. We investigate whether 360° videos presented through virtual reality further contribute to user immersion for the application of preserving intangible cultural heritage. A case study then analyzes whether conventional desktop media is significantly different from virtual reality as a medium for immersion in intangible heritage contexts. The case study describes bridge diving at Stari Most, the old bridge in Mostar, Bosnia. This application aims to present and preserve the bridge diving tradition at this site. The project describes the site and history along with cultural connections, and a series of quiz questions is presented after viewing all of the materials. Successful completion of the quiz allows a user to participate in a virtual bridge dive. The subjective evaluation provided evidence to suggest that our method is successful in preserving intangible heritage and communicating ideas in key areas of concern for this heritage that can be used to develop a preservation framework in the future. It was also possible to conclude that experience within the virtual reality framework did not affect effort expectancy for the web application, but the same experience significantly influenced the performance expectancy construct.

Many users rely on social media platforms to improve their business. The usage of those platforms is usually focused on marketing and customer targeting. Platforms like Facebook, Instagram or YouTube give their users a large number of reports and analytic tools. Public figures and organizations have a large number of followers who generate a significant number of activities. This paper focuses on the use of Facebook's geographic analytics in the process of event planning. The problem is formulated as a combinatorial optimization problem. Data from social media platforms are used as input to a nature-inspired optimization algorithm. A public dataset has been created with cities from 20 European countries. An adjusted genetic algorithm (AGA) is proposed. The greedy approach and AGA are compared on real data from several Facebook pages and on the created public dataset. The genetic algorithm shows better results, and it gives the same solution as an exhaustive search for smaller instances.

...
...
...
