Analyzing Accidents Among Specialty Contractors: A Data Mining Approach






Despite technological and regulatory improvements and extensive research in occupational safety, construction remains one of the most dangerous industries in the U.S. and around the world. This is mainly due to the large number of relatively small employers with limited safety personnel and budgets, multi-employer worksites, the presence of numerous hazards, and a highly mobile workforce. The uncertainty created by these conditions, combined with the limited personal experience of safety practitioners, can lead to poor safety decisions. Together, such factors contribute to the high number of fatal and non-fatal injuries in the industry and to the loss of millions of dollars each year. Analyzing historical incidents to understand their causes and consequences has been one of the main approaches in safety research for reducing the number and severity of occupational injuries. Indeed, the large amount of safety data collected on construction sites (e.g., accident reports) provides a valuable source of information for researchers seeking to better understand construction accidents. Recent developments in advanced analytical methods and computational tools can build on previous efforts and provide a more data-driven, objective approach to construction safety. To test this approach, this research defines three objectives. The first objective is to evaluate the cost of injuries (a main consequence of accidents) across various scenarios in order to quantify and compare their financial impact on companies and society. This objective can help contractors better quantify the risks of a construction project or task by estimating the severity of potential accidents in monetary terms. Furthermore, the proposed methods contribute to the current body of safety knowledge by assessing alternative hypothesis testing practices that do not require specific assumptions.
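As an illustration of hypothesis testing that avoids specific distributional assumptions, the sketch below runs a two-sided permutation test on the difference in mean injury cost between two accident scenarios. The permutation test is an assumed stand-in, since the abstract does not name the tests used in the study. The function name `permutation_test`, the scenario names, and the cost figures are likewise hypothetical and are not taken from the study's data.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Assumes only that observations are exchangeable under the null
    hypothesis; no normality or equal-variance assumption is needed.
    """
    rng = random.Random(seed)
    observed = abs(_mean(group_a) - _mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled costs to the two scenarios.
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:n_a]) - _mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical injury costs in USD for two accident scenarios.
fall_costs = [12_000, 45_000, 8_500, 150_000, 30_000, 22_000]
struck_by_costs = [3_000, 9_000, 4_500, 15_000, 7_000, 5_500]

p_value = permutation_test(fall_costs, struck_by_costs)
print(f"estimated p-value: {p_value:.4f}")
```

A small p-value would suggest that the cost difference between the two scenarios is unlikely to arise from random relabeling alone. Because injury costs are typically heavily skewed, a resampling test of this kind can be a reasonable alternative to tests that assume normality.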
The second objective is to use statistical tests and models to identify the most influential factors contributing to construction accidents. The proposed analysis and modeling approach can be applied across specialty contracting companies to identify and prioritize the more hazardous situations within specific trades. The proposed model development process also provides a framework for codifying data from accident reports and analyzing them with a multivariate logistic regression model. The last objective is to investigate potential correlations among accident outcomes and to propose a novel way of incorporating such correlations by building multi-label machine learning models. The results indicate that knowing the value of one accident outcome can significantly increase the probability of correctly predicting another outcome. The results further show that a particular multi-label method (i.e., classifier chains) can capture these latent relationships among accident outcomes during model training and significantly improve the performance of the predictive models. This research employs robust data and analytical models to reliably predict the outcomes of accident scenarios using variables available on construction sites. It is expected that the findings of this study will give safety practitioners valuable insight into accident patterns and consequences and will change the way machine learning models are used in safety studies.
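The classifier chain idea mentioned above can be sketched in a few lines: each outcome is predicted by its own model, and each model also receives the outcomes earlier in the chain as extra inputs. The sketch below is a minimal illustration only and is not the study's implementation. The base learner (`MajorityTable`, a simple lookup of the most common label), the class name `SimpleClassifierChain`, the feature names, the outcome names, and the records are all hypothetical.

```python
from collections import Counter, defaultdict


class MajorityTable:
    """Toy base learner: predicts the most common label seen for an exact input."""

    def fit(self, rows, labels):
        self.default = Counter(labels).most_common(1)[0][0]
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[tuple(row)][label] += 1
        self.table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
        return self

    def predict_one(self, row):
        # Fall back to the overall most common label for unseen inputs.
        return self.table.get(tuple(row), self.default)


class SimpleClassifierChain:
    """One model per outcome; model j also sees outcomes 0..j-1."""

    def __init__(self, n_labels):
        self.models = [MajorityTable() for _ in range(n_labels)]

    def fit(self, X, Y):
        for j, model in enumerate(self.models):
            # Training uses the true values of the earlier outcomes.
            augmented = [tuple(x) + tuple(y[:j]) for x, y in zip(X, Y)]
            model.fit(augmented, [y[j] for y in Y])
        return self

    def predict_one(self, x):
        predictions = []
        for model in self.models:
            # Prediction feeds earlier predicted outcomes down the chain.
            predictions.append(model.predict_one(tuple(x) + tuple(predictions)))
        return predictions


# Hypothetical accident records: (trade, event) -> (injury nature, work status).
X = [("roofing", "fall"), ("roofing", "fall"), ("roofing", "fall"),
     ("electrical", "shock"), ("electrical", "shock"), ("roofing", "struck_by")]
Y = [("fracture", "lost_time"), ("fracture", "lost_time"), ("sprain", "no_lost_time"),
     ("burn", "lost_time"), ("burn", "lost_time"), ("bruise", "no_lost_time")]

chain = SimpleClassifierChain(n_labels=2).fit(X, Y)
print(chain.predict_one(("roofing", "fall")))
```

In this toy data, the second model predicts work status from the trade, the event, and the predicted injury nature together, which is how a chain can exploit dependence between outcomes. A practical model would replace the lookup table with a learner that generalizes to unseen inputs, such as logistic regression.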



Construction Safety, Cost of Injuries, Machine Learning, Occupational Accidents, Statistical Modeling