Algorithms to Improve Analysis and Classification for Small Data






Binary 2D images can be analyzed with an image operator algorithm based on theoretical shape proportions and encircled image histograms (SPEIs). These images can then be classified with an algorithm that reduces a complex classification problem to a series of simpler ones, using decision trees with automatic model generation (DAMG). DAMG provides exceptional classification rates for small data problems when the SPEI results are used as variables alongside other shape metrics. SPEIs describe shapes using two metrics: shape proportions (SPs) and encircled image histograms (EIs). SP and EI values are useful for both classifying and describing shapes. SPEI-based approaches outperform convolutional neural networks on small data problems, where few training examples are available. DAMG converts any multinomial classification problem into a series of binary classification problems. It is particularly effective in small data scenarios because it can convert imbalanced problems into balanced ones. The SPEI and DAMG tools are applied to the global issue of pill shape classification. The final models outperform current approaches and are more easily interpreted than many statistical or machine learning algorithms. Further, SPEIs and DAMG are applied to a variety of other data sets to show their wide applicability.
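
The idea of turning a multinomial problem into a series of binary ones can be illustrated with a small sketch. This is not the DAMG algorithm itself, whose details are not given in this abstract; it is a generic "peel off one class at a time" chain built from depth-1 decision trees (stumps). The function names, the two shape features (aspect ratio and fill fraction), and the toy pill data are all hypothetical and chosen only for illustration.

```python
def fit_stump(X, y):
    """Fit a depth-1 decision tree for binary labels in {0, 1}.

    Returns (errors, feature, threshold, left_label): rows with
    row[feature] <= threshold get left_label, all others 1 - left_label.
    """
    # Start from a constant stump that always predicts the majority label.
    majority = 1 if 2 * sum(y) > len(y) else 0
    best = (sum(label != majority for label in y), 0, float("inf"), majority)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for left_label in (0, 1):
                errors = sum(
                    (left_label if row[j] <= t else 1 - left_label) != label
                    for row, label in zip(X, y)
                )
                if errors < best[0]:
                    best = (errors, j, t, left_label)
    return best


def fit_binary_chain(X, y):
    """Reduce a multiclass problem to a chain of one-vs-rest binary problems.

    At each step, the class that a stump separates with the fewest training
    errors is split off, and its samples are removed before the next step.
    """
    X, y = list(X), list(y)
    chain = []
    while len(set(y)) > 1:
        best_target, best_stump = None, None
        for target in sorted(set(y)):
            stump = fit_stump(X, [1 if label == target else 0 for label in y])
            if best_stump is None or stump[0] < best_stump[0]:
                best_target, best_stump = target, stump
        chain.append((best_target, best_stump[1:]))
        kept = [(row, label) for row, label in zip(X, y) if label != best_target]
        X = [row for row, _ in kept]
        y = [label for _, label in kept]
    return chain, y[0]  # y[0] is the single class left at the end


def predict_chain(chain, default, row):
    """Walk the chain; the first binary model that says 'yes' decides."""
    for target, (j, t, left_label) in chain:
        if (left_label if row[j] <= t else 1 - left_label) == 1:
            return target
    return default


# Hypothetical toy data: [aspect ratio, fill fraction] for three pill shapes.
X = [[1.0, 0.78], [1.05, 0.80], [1.6, 0.77], [1.7, 0.79], [2.6, 0.90], [2.8, 0.92]]
y = ["round", "round", "oval", "oval", "capsule", "capsule"]

chain, default = fit_binary_chain(X, y)
print([target for target, _ in chain], default)
print(predict_chain(chain, default, [1.65, 0.78]))
```

Each link in the chain is a single threshold on a single feature, so the fitted model can be read off as a short list of rules. Removing a class after it is split off also changes the class proportions seen by later steps, although this sketch makes no attempt to balance them.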