In telecommunications and cloud communications, detecting accurately and in real time whether a human or an answering machine has answered an outbound call is of paramount importance. This problem is particularly significant during campaigns, as precise caller identification improves service quality and efficiency and reduces cost. Despite its significance, the problem remains inadequately explored in the existing literature. This paper presents an innovative approach to answering machine detection that leverages transfer learning through the YAMNet model for feature extraction. The YAMNet architecture facilitates the training of a recurrent-based classifier, enabling real-time processing of audio streams, as opposed to fixed-length recordings. The results demonstrate an accuracy of over 96% on the test set. Furthermore, we conduct an in-depth analysis of misclassified samples and reveal that an accuracy exceeding 98% can be achieved with the integration of a silence detection algorithm, such as the one provided by FFmpeg.
Many deep-learning computer vision (CV) systems analyse objects not previously observed by the system. However, such tasks can be simplified if the objects are marked beforehand. A straightforward method for marking is printing 2D symbols and attaching them to the objects. The choice of symbols can affect the performance of the CV system, as similar symbols may require extended training time and a larger training dataset. It is possible to find good symbols that a given neural network differentiates easily. Still, there have been no efforts to generalise such findings in the literature, and it is not known whether the symbols optimal for one network would work just as well in another. We explored how transferable symbol selection is between networks. To this end, 30 sets of randomly selected and augmented symbols were classified by five neural networks. Each network was given the same training dataset and the same training time. Results were ranked and compared, which allowed the identification of networks that performed similarly, so that the symbol selection could be generalised between them.
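As an illustration of the "ranked and compared" step, the minimal sketch below ranks symbol sets by accuracy for each network and compares the rankings with a Spearman rank correlation. The abstract does not state which comparison measure was used, so Spearman correlation is an assumption here, and the network names and accuracy values are hypothetical.

```python
from itertools import combinations


def ranks(values):
    """Rank values from best (1) to worst, averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation, computed as the Pearson correlation of ranks.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / (va * vb) ** 0.5


# Hypothetical accuracies of three networks on the same five symbol sets.
accuracy = {
    "net_a": [0.91, 0.85, 0.78, 0.95, 0.88],
    "net_b": [0.89, 0.84, 0.80, 0.93, 0.86],
    "net_c": [0.70, 0.92, 0.88, 0.75, 0.81],
}

# Pairwise rank agreement between networks.
similarity = {
    (p, q): spearman(accuracy[p], accuracy[q])
    for p, q in combinations(accuracy, 2)
}
```

Under this reading, a pair of networks with a correlation close to 1 orders the symbol sets in nearly the same way, which is the situation in which a symbol selection found for one network might plausibly carry over to the other.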
Audio fingerprinting techniques have seen great advances in recent years, enabling accurate and fast audio retrieval even in conditions when the queried audio sample has been highly deteriorated or recorded in noisy conditions. Expectedly, most of the existing work is centered around music, with popular music identification services such as Apple’s Shazam or Google’s Now Playing designed for individual audio recognition on mobile devices. However, the spectral content of speech differs from that of music, necessitating modifications to current audio fingerprinting approaches. This paper offers fresh insights into adapting existing techniques to address the specialized challenge of speech retrieval in telecommunications and cloud communications platforms. The focus is on achieving rapid and accurate audio retrieval in batch processing instead of facilitating single requests, typically on a centralized server. Moreover, the paper demonstrates how this approach can be utilized to support audio clustering based on speech transcripts without undergoing actual speech-to-text conversion. This optimization enables significantly faster processing without the need for GPU computing, a requirement for real-time operation that is typically associated with state-of-the-art speech-to-text tools.
Deep learning techniques in computer vision (CV) tasks such as object detection, classification, and tracking can be facilitated using predefined markers on those objects. Marker selection can potentially affect the performance of the algorithms used for tracking, as the algorithm might swap similar markers more frequently and, therefore, require more training data and training time. Still, the issue of marker selection has not been explored in the literature and seems to be glossed over throughout the process of designing CV solutions. This research considered the effects of symbol selection for 2D-printed markers on the neural network’s performance. The study assessed over 250 ALT code symbols readily available on most consumer PCs and provided a go-to selection for effectively tracking n objects. To this end, a neural network was trained to classify all the symbols and their augmentations, after which the confusion matrix was analysed to extract the symbols that the network distinguished best. The results showed that selecting symbols in this way performed better than random selection and the selection of common symbols. Furthermore, the methodology presented in this paper can easily be applied to a different set of symbols and different neural network architectures.
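The confusion-matrix analysis described above can be sketched as a greedy selection that repeatedly adds the symbol least confused with those already chosen. This is only one plausible reading: the abstract does not give the exact extraction procedure, and the function name, labels, and matrix values below are hypothetical.

```python
def select_symbols(confusion, labels, n):
    """Greedily pick n labels with low mutual confusion.

    confusion[i][j] = number of samples of class i predicted as class j.
    """
    size = len(labels)
    n = min(n, size)
    # Symmetric pairwise confusion: errors in either direction.
    pair = [
        [confusion[i][j] + confusion[j][i] if i != j else 0 for j in range(size)]
        for i in range(size)
    ]
    # Start from the symbol with the fewest total misclassifications.
    chosen = [min(range(size), key=lambda i: sum(pair[i]))]
    while len(chosen) < n:
        remaining = [i for i in range(size) if i not in chosen]
        # Add the symbol least confused with those already chosen.
        best = min(remaining, key=lambda i: sum(pair[i][c] for c in chosen))
        chosen.append(best)
    return [labels[i] for i in chosen]


# Hypothetical confusion matrix: "O" and "0" are often mistaken for each other.
labels = ["@", "#", "O", "0", "+"]
confusion = [
    [98, 1, 0, 0, 1],
    [2, 97, 0, 0, 1],
    [0, 0, 80, 20, 0],
    [0, 0, 25, 75, 0],
    [1, 0, 0, 0, 99],
]
picked = select_symbols(confusion, labels, 3)
```

With this toy matrix the selection never contains both "O" and "0", since choosing one makes the other the costliest candidate.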
Artificial Intelligence (AI) is one of the most promising technologies of the 21st century, with an already noticeable impact on society and the economy. With this work, we provide a short overview of global trends, applications in industry and selected use-cases from our international experience and work in industry and academia. The goal is to present global and regional positive practices and provide an informed opinion on the realistic goals and opportunities for positioning B&H on the global AI scene.
The vehicle routing problem is one of the most complex problems in the field of combinatorial optimization. Creating optimal routes leads to timely delivery of orders to end customers, which increases the efficiency of the company and enables maximum earnings. The problem of vehicle routing with a series of real-world constraints is called the rich vehicle routing problem (RVRP). The paper presents an approach to solving RVRP, where the asymmetric routing problem with a heterogeneous vehicle fleet, time windows, customer-vehicle constraints and a number of others is observed. The approach solves the problem in two phases, by dividing customers into clusters using a discrete metaheuristic Bat algorithm, and by solving the routing problem for each obtained cluster. The proposed approach has been tested for 26 days of delivery from large warehouses in Bosnia and Herzegovina. Significant savings were achieved compared to previously implemented approaches. All created routes were feasible. The approach automatically creates routes, and gives results in a shorter time than previously used approaches. Time does not increase significantly with the increase in the number of customers, which is a great advantage of the proposed approach.
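The two-phase structure (cluster first, route second) can be sketched as follows. This is not the paper's method: the discrete Bat algorithm is replaced by a simple cheapest-seed assignment, the per-cluster routing by a nearest-neighbour heuristic, and time windows, fleet heterogeneity and customer-vehicle constraints are ignored. The cost matrix, seeds and function names are hypothetical; the matrix is deliberately asymmetric, so `cost[a][b]` may differ from `cost[b][a]`.

```python
def cluster_customers(customers, seeds, cost):
    """Phase 1 stand-in: assign each customer to its cheapest seed customer."""
    clusters = {s: [] for s in seeds}
    for c in customers:
        nearest = min(seeds, key=lambda s: cost[s][c])
        clusters[nearest].append(c)
    return clusters


def route_cluster(depot, members, cost):
    """Phase 2 stand-in: nearest-neighbour tour from the depot and back."""
    route, current, todo = [depot], depot, set(members)
    while todo:
        # sorted() makes tie-breaking deterministic.
        nxt = min(sorted(todo), key=lambda c: cost[current][c])
        route.append(nxt)
        todo.remove(nxt)
        current = nxt
    route.append(depot)
    return route


def route_cost(route, cost):
    """Total directed travel cost along a route."""
    return sum(cost[a][b] for a, b in zip(route, route[1:]))


# Hypothetical asymmetric travel costs; node 0 is the depot.
cost = [
    [0, 4, 6, 9, 8],
    [5, 0, 2, 9, 9],
    [7, 3, 0, 8, 9],
    [9, 9, 8, 0, 2],
    [8, 9, 9, 4, 0],
]
clusters = cluster_customers([1, 2, 3, 4], seeds=[1, 3], cost=cost)
routes = {seed: route_cluster(0, members, cost) for seed, members in clusters.items()}
costs = {seed: route_cost(r, cost) for seed, r in routes.items()}
```

Splitting the customers first keeps each routing subproblem small, which is consistent with the abstract's observation that solution time does not grow significantly with the number of customers.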
Successfully solving the forecasting problem allows companies to optimize their processes and achieve savings. In this process, the analysis of time series data is of particular importance. Since the creation of Facebook’s Prophet and Amazon’s DeepAR+ and CNN-QR forecasting models, these algorithms have attracted a great deal of attention. The paper presents the application and comparison of the above algorithms for sales forecasting in distribution companies. A detailed comparison of the performance of the algorithms on real data with different lengths of sales history was made. The results show that Prophet gives better results for items with a longer history and frequent sales, while Amazon’s algorithms perform better for items without a long history and items that are rarely sold.