21. 11. 2023.
On Entropy and Source Encoding of Written Language: A South Slavic Example
This article presents a simple model, implemented in software, for calculating the relative frequency of individual symbols and the entropy of the Latin alphabet of a standardised language used by four ethnic groups of South Slavic origin in four countries of the Western Balkans. It also presents a method of applying the Shannon-Fano and Huffman source coding algorithms that takes into account how the observed alphabet differs from the English one. The model is developed in the MATLAB programming language and is tested on an arbitrarily selected text.
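To make the quantities named above concrete, the following is a minimal sketch of how relative symbol frequencies, the entropy H = -Σ p_i log2 p_i, and Huffman codeword lengths might be computed. It is written in Python for illustration and is not the MATLAB model described in the article. It assumes that the difference from the English alphabet lies in the diacritic letters and in the digraphs dž, lj and nj, which are counted here as single symbols. The function names and the sample string are hypothetical.

```python
import heapq
import math
from collections import Counter

# Assumption: the digraphs below are treated as single alphabet symbols.
DIGRAPHS = ("dž", "lj", "nj")


def tokenize(text):
    """Split text into alphabet symbols, greedily merging the assumed digraphs."""
    text = text.lower()
    symbols = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            symbols.append(pair)
            i += 2
        elif text[i].isalpha():
            symbols.append(text[i])
            i += 1
        else:
            i += 1  # skip spaces, digits and punctuation
    return symbols


def relative_frequencies(symbols):
    """Return a dict mapping each symbol to its relative frequency."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def entropy(freqs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in freqs.values())


def huffman_code_lengths(freqs):
    """Return the Huffman codeword length (in bits) of each symbol."""
    lengths = {s: 0 for s in freqs}
    if len(freqs) == 1:
        # A single-symbol source still needs one bit per symbol.
        return {s: 1 for s in freqs}
    # Heap entries: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol inside them.
        for s in group1 + group2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, group1 + group2))
        tie += 1
    return lengths


if __name__ == "__main__":
    sample = "njegova ljubav"  # arbitrary sample string, not from the article
    freqs = relative_frequencies(tokenize(sample))
    lengths = huffman_code_lengths(freqs)
    average = sum(freqs[s] * lengths[s] for s in freqs)
    print("entropy (bits/symbol):", entropy(freqs))
    print("average Huffman length (bits/symbol):", average)
```

Greedy digraph merging is an approximation: in some words the letters n and j, or d and ž, are adjacent but belong to separate letters, and the sketch does not distinguish those cases. The average Huffman codeword length should lie between the entropy and the entropy plus one bit, which gives a quick consistency check on the output.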