Mason Archival Repository Service

Handling Missing Data in Randomization-based Inference

dc.contributor.advisor Rosenberger, William
dc.contributor.author Tan, Xiao
dc.creator Tan, Xiao
dc.date.accessioned 2022-08-03T20:18:39Z
dc.date.available 2022-08-03T20:18:39Z
dc.date.issued 2022
dc.identifier.uri http://hdl.handle.net/1920/12962
dc.description.abstract Randomized controlled trials (RCTs) serve as the gold standard in researching and developing new therapeutics. A new treatment’s effectiveness is evaluated by comparing it to an existing or standard treatment in an RCT. However, an imbalance in participants’ characteristics between groups can undermine such a comparison. Randomizing patients mitigates the bias that such imbalance introduces into the evaluation of treatment effects. Randomization-based inference was first introduced by Sir R.A. Fisher as an approach to evaluating treatment effects in an RCT. Limited computing power slowed its development in the past, but the tremendous growth of computing technology now makes randomization tests easy to compute. Randomization-based inference is a natural way to analyze data from a clinical trial. But the presence of missing outcome data is problematic: if the data are removed, the randomization distribution is destroyed, and randomization tests have no validity. There are no randomization-based methods to handle missing data. In this thesis, the unconditional reference set method, the conditional reference set method, and randomization-based multiple imputation are described as ways to handle missingness while preserving the randomization distribution. These randomization-based missing data methods are compared to population-based and parametric imputation approaches in terms of type I error rate and power under both homogeneous and heterogeneous population models. Randomization-based analogs of standard missing data mechanisms are described, and a randomization-based procedure is proposed to determine whether data are missing completely at random. A large simulation protocol is implemented, and its results indicate that the unconditional reference set method, the conditional reference set method, and randomization-based multiple imputation are reasonable approaches to handling missing outcome data in the context of a two-armed RCT.
dc.format.extent 183 pages
dc.language.iso en
dc.rights Copyright 2022 Xiao Tan
dc.subject Statistics
dc.title Handling Missing Data in Randomization-based Inference
dc.type Dissertation
thesis.degree.name Ph.D. in Statistical Science
thesis.degree.level Ph.D.
thesis.degree.discipline Statistical Science
thesis.degree.grantor George Mason University

