Mason Archival Repository Service

Optimal Integration of Machine Learning Models: A Large-Scale Distributed Learning Framework with Application to Systematic Prediction of Adverse Drug Reactions


dc.contributor.advisor Wojtusiak, Janusz
dc.contributor.advisor Gentle, James
dc.contributor.author Ngufor, Che G
dc.creator Ngufor, Che G
dc.date.accessioned 2015-02-12T02:59:55Z
dc.date.available 2015-02-12T02:59:55Z
dc.date.issued 2014 en_US
dc.identifier.uri https://hdl.handle.net/1920/9190
dc.description.abstract In real-world settings, information from multiple sources, such as humans, experts, agents, or classifiers, often needs to be integrated to support a decision-making system. One popular approach in machine learning is to combine these sources through an ensemble learning method. Ensemble learning has been shown to provide appealing solutions to many complex and challenging problems in machine learning. These include, for example, learning under non-standard conditions, such as learning from large volumes of data, learning in the presence of uncertainty, learning from data streams, or learning when the concept to be learned drifts over time. Although a considerable amount of research has been done on ensemble learning in recent years, many open issues and challenges remain. This thesis explores three major challenges in this research area. First, the development of techniques that scale up to large and possibly physically distributed databases. Second, the construction of exact or approximately exact global models from distributed heterogeneous datasets with minimal data communication while preserving the privacy of the data. Third, efficient learning from modern large-scale datasets, which are often characterized by noisy data points, unlabeled or poorly labeled examples, sample bias, missing values, etc.
dc.format.extent 208 pages en_US
dc.language.iso en en_US
dc.rights Copyright 2014 Che G Ngufor en_US
dc.subject Statistics en_US
dc.subject Mathematics en_US
dc.subject Computer science en_US
dc.subject Active Learning en_US
dc.subject Adverse Drug Reactions en_US
dc.subject Ensemble Learning en_US
dc.subject MapReduce en_US
dc.subject Parallel and Distributed Machine Learning en_US
dc.subject Variational Bayesian Methods en_US
dc.title Optimal Integration of Machine Learning Models: A Large-Scale Distributed Learning Framework with Application to Systematic Prediction of Adverse Drug Reactions en_US
dc.type Dissertation en
thesis.degree.level Doctoral en
thesis.degree.discipline Computational Sciences and Informatics en
thesis.degree.grantor George Mason University en

