21. 11. 2023.
Extracting important layout features from the web page content
Visual layout has an enormous influence on human perception and is the subject of many studies, including research on comparing web pages for similarity. Structure-based approaches exploit direct access to a page's HTML content, whereas visual methods are widely used because they can analyze screenshots of entire web pages. The solution described in this paper focuses on extracting the web page layout in the forms required by both of these approaches.