Correlation of model quality between predicted proteins and their templates
Protein structure prediction benefits many areas of science and industry. Template-based modeling, which relies on solved structures from the Protein Data Bank, is the most reliable and most widely used approach. Template selection, for which different methods exist, is a key and challenging step in this approach, because choosing an appropriate template can lead to a more accurate predicted structure. This study examines, on a large scale, the relationships between predicted proteins taken from the SWISS-MODEL Repository and their templates. Features of the predicted proteins, including protein length, sequence identity, and sequence coverage, are taken into account, and quality assessment scores are compared between the predicted proteins and their templates. Overall, the quality assessment scores of predicted proteins show a moderate positive correlation with their sequence identity to the templates. Moreover, in our data, template quality is noticeably correlated with the quality of the predicted protein structures: templates with higher quality scores, on average, yield predicted proteins with higher quality scores.
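As an illustration of the kind of comparison summarized above, the following minimal sketch computes a correlation coefficient between paired quality scores of templates and predicted models. It rests on assumptions that are not taken from the study: the Pearson coefficient is used only as an example (the coefficient is not specified here), the function name `pearson` is hypothetical, and the score values are made-up placeholders rather than data from the study.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT data from the study.
    template_scores = [0.55, 0.62, 0.70, 0.78, 0.85]
    model_scores = [0.50, 0.58, 0.61, 0.74, 0.80]
    r = pearson(template_scores, model_scores)
    print(f"Pearson r between template and model scores: {r:.3f}")
```

The same function could be applied to pairs of sequence identity and model quality score. A rank-based coefficient such as Spearman's may be a better fit if the relationship is monotonic but not linear.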