Abstract:
Antimicrobial peptides (AMPs) have recently attracted attention as targets for antibacterial
drug research, and a growing number of machine learning methods now address AMP
recognition. Approaches that rely on whole-peptide properties struggle to construct effective features because of the great sequence diversity among AMPs.
This thesis proposes a novel and complementary method for feature construction which
relies on an extensive list of position-based amino acid physicochemical properties. These
features are shown to be effective in the context of classification by support vector machine
(SVM), both in comparison to related work in recognition of AMPs and in a novel study
on the cathelicidin family. Detailed analysis and careful construction of a decoy dataset
allow features related to antimicrobial activity in cathelicidins to be highlighted. Special attention is also given to residue positions involved in enzymatic cleavage. The
method presented in this thesis is a first step towards understanding what confers
activity on cathelicidins at the physicochemical level and may prove useful for future
AMP design efforts.
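
As an illustration of what position-based feature construction could look like, the sketch below encodes each of the first few residue positions of a peptide with two physicochemical values. It is a minimal example under stated assumptions, not the feature set used in this thesis: the function name position_features, the choice of the Kyte-Doolittle hydropathy scale, the simplified side-chain charge table, the zero-padding of short peptides, and the toy sequence are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified side-chain charge near neutral pH (assumption: histidine
# and all other residues are treated as uncharged).
SIDE_CHAIN_CHARGE: Dict[str, float] = {"K": 1.0, "R": 1.0, "D": -1.0, "E": -1.0}


def position_features(peptide: str, n_positions: int = 4) -> List[float]:
    """Return [hydropathy, charge] for each of the first n_positions residues.

    Peptides shorter than n_positions are zero-padded, and residues outside
    the standard alphabet map to 0.0; both are illustrative assumptions.
    """
    features: List[float] = []
    for i in range(n_positions):
        if i < len(peptide):
            residue = peptide[i].upper()
            features.append(KYTE_DOOLITTLE.get(residue, 0.0))
            features.append(SIDE_CHAIN_CHARGE.get(residue, 0.0))
        else:
            # Pad missing positions so every peptide yields the same length.
            features.extend([0.0, 0.0])
    return features


# Toy peptide (hypothetical, not taken from the thesis datasets).
vector = position_features("KLAK", n_positions=4)
print(vector)
```

Because every peptide maps to a vector of the same length, with each entry tied to a specific residue position, vectors of this kind could be passed to an SVM classifier, and the contribution of an individual position and property could then be examined.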