Disaggregate Agricultural Statistics: An Application of Machine Learning and Nonlinear Constrained Optimization to Spatiotemporal Remotely Sensed Data







Location-specific information on crop area, yield, and production is vital for food security planning, especially in developing countries where the living standards of most of the population depend largely on agricultural activities. Teff is the most important staple crop in Ethiopia, accounting for 20% of all cultivated area. Teff area and production figures collected from sample surveys are reported only at the administrative level and lack finer spatial detail, and the surveys themselves are labor-intensive and time-consuming. As an alternative, spatiotemporal remotely sensed data have been widely applied in land cover mapping, and the normalized difference vegetation index (NDVI) is the most commonly used vegetation index in crop area and yield estimation. This research maps teff area using unsupervised classification combined with a statistical optimization algorithm, integrating spatiotemporal remotely sensed data with sub-national statistics. First, 16-day 250 m MODIS NDVI pixels are clustered with K-means unsupervised learning, with special attention paid to selecting the optimal number of clusters. A linear regression model, fitted by nonlinear constrained minimization, then integrates the sub-national statistics from the household survey with the clusters of NDVI pixels. The coefficients of the regression model are used to estimate the fraction of teff area at the pixel level. Validation shows an acceptable R² (0.66) between the modeled results and the survey data. The results demonstrate an innovative method for improving location-specific crop type mapping at the sub-pixel level by integrating remotely sensed data and machine learning techniques with sub-national statistics.
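The two-stage workflow described in the abstract (cluster NDVI time series, then regress district statistics on cluster composition under constraints) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the study: the NDVI profiles, district layout, and teff fractions are synthetic; the number of clusters is fixed at 3 rather than selected; the K-means is a plain Lloyd's implementation with farthest-point seeding; and the constraint is assumed to be a simple bound of 0 to 1 on each cluster's teff fraction, solved with SciPy's SLSQP method. The abstract does not state the exact objective or constraints used.

```python
import numpy as np
from scipy.optimize import minimize


def kmeans(X, k, n_iter=100):
    """Plain Lloyd's K-means with deterministic farthest-point seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels


# --- Synthetic stand-in data (all values hypothetical) ---
rng = np.random.default_rng(0)
n_dates, n_districts, n_per, k = 23, 12, 200, 3  # 23 16-day composites per year
t = np.linspace(0.0, 1.0, n_dates)
profiles = np.stack([
    0.20 + 0.6 * np.exp(-((t - 0.65) / 0.10) ** 2),  # late, sharp green-up
    0.30 + 0.4 * np.exp(-((t - 0.45) / 0.20) ** 2),  # earlier, broad green-up
    0.15 + 0.05 * t,                                 # sparse vegetation
])
true_frac = np.array([0.7, 0.1, 0.0])  # assumed teff fraction per land-cover class

district = np.repeat(np.arange(n_districts), n_per)
mix = rng.dirichlet(np.ones(k), size=n_districts)  # class mix differs by district
cls = np.concatenate([rng.choice(k, size=n_per, p=p) for p in mix])
ndvi = profiles[cls] + rng.normal(0.0, 0.02, size=(cls.size, n_dates))

# "Survey" statistic: teff share of each district's area, with 2% noise
true_share = np.array([true_frac[cls[district == d]].mean()
                       for d in range(n_districts)])
survey_share = true_share * rng.normal(1.0, 0.02, size=n_districts)

# --- Step 1: cluster the pixel-level NDVI time series ---
labels = kmeans(ndvi, k)

# --- Step 2: share of each district's pixels falling in each cluster ---
counts = np.zeros((n_districts, k))
np.add.at(counts, (district, labels), 1)
share = counts / counts.sum(axis=1, keepdims=True)


# --- Step 3: bounded least squares, survey_share ~ share @ beta ---
def loss(beta):
    return np.sum((share @ beta - survey_share) ** 2)


res = minimize(loss, x0=np.full(k, 0.5), method="SLSQP",
               bounds=[(0.0, 1.0)] * k, options={"ftol": 1e-12})
frac = res.x  # estimated teff fraction for each cluster

# --- Step 4: map the cluster coefficients back to pixels and validate ---
pixel_frac = frac[labels]
pred = share @ frac
r2 = 1.0 - np.sum((survey_share - pred) ** 2) / np.sum(
    (survey_share - survey_share.mean()) ** 2)
print("cluster teff fractions:", np.round(frac, 3), "R2:", round(r2, 3))
```

Working with area shares rather than hectares keeps the objective well scaled; a 250 m pixel covers 6.25 ha, so multiplying `pixel_frac` by 6.25 would give an estimated teff area per pixel. The fit on this synthetic data is far cleaner than a real survey comparison would be, so its R² should not be compared with the 0.66 reported above.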