Towards a Common Dimensionality Reduction Approach; Unifying PCA, tSNE, and UMAP through a Cohesive Framework

Draganov, Andrew

Files

Draganov_thesis_2021.pdf (182 KB)

Authors

Draganov, Andrew

Abstract

Dimensionality reduction is a widely studied field that is used to visualize data, cluster samples, and extract insights from high-dimensional distributions. Classical approaches such as PCA, Isomap, and Laplacian eigenmaps rely on clear optimization strategies, while more modern approaches such as tSNE and UMAP define gradient descent search spaces through disparities between the high- and low-dimensional datasets. In this work, we observe that all of these approaches can be interpreted as minimizing the difference between two kernel functions: one for the high-dimensional space and one for the low-dimensional space. In particular, once we abstract the kernel functions, we can develop a common framework for any dimensionality reduction problem. Namely, one needs to identify the high-dimensional distance kernel, the low-dimensional distance kernel, and the method used for minimization. With this in mind, we identify the relevant general framework and then discuss the ways in which PCA, tSNE, and UMAP all fit into it. For each, we discuss insights that were obtained during the process. Finally, we highlight next steps and directions for future work.
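
The three ingredients named in the abstract (a high-dimensional kernel, a low-dimensional kernel, and a minimization method) can be illustrated with a minimal sketch. It is not taken from the thesis: the Gaussian and Student-t kernels, the squared-error loss, the backtracking gradient descent, and all function names are assumptions chosen for illustration. tSNE itself, for instance, normalizes its kernels and minimizes a KL divergence rather than a squared error.

```python
import numpy as np


def sq_dists(Z):
    """Pairwise squared Euclidean distances between the rows of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    return (diff ** 2).sum(axis=-1)


def gaussian_kernel(X, sigma=1.0):
    """Assumed high-dimensional kernel: Gaussian similarity."""
    return np.exp(-sq_dists(X) / (2.0 * sigma ** 2))


def student_t_kernel(Y):
    """Assumed low-dimensional kernel: Student-t similarity (one degree of freedom)."""
    return 1.0 / (1.0 + sq_dists(Y))


def loss(P, Q):
    """Squared kernel difference summed over unordered pairs (diagonal terms are zero)."""
    return 0.5 * ((P - Q) ** 2).sum()


def embed(X, dim=2, steps=200, lr=0.1, seed=0):
    """Gradient descent on the embedding Y so that its kernel matches the kernel of X."""
    rng = np.random.default_rng(seed)
    P = gaussian_kernel(X)                            # fixed target similarities
    Y = rng.normal(scale=1e-2, size=(len(X), dim))    # small random initial embedding
    history = [loss(P, student_t_kernel(Y))]
    for _ in range(steps):
        Q = student_t_kernel(Y)
        W = (P - Q) * Q ** 2
        # Analytic gradient of the loss: grad_i = 4 * sum_j W_ij * (y_i - y_j)
        grad = 4.0 * (W.sum(axis=1, keepdims=True) * Y - W @ Y)
        # Backtracking: halve the step until the loss does not increase.
        step = lr
        while step > 1e-8 and loss(P, student_t_kernel(Y - step * grad)) > history[-1]:
            step *= 0.5
        if step <= 1e-8:
            break                                     # no acceptable step found
        Y = Y - step * grad
        history.append(loss(P, student_t_kernel(Y)))
    return Y, history
```

Calling `embed(X)` on an `(n, d)` array returns an `(n, 2)` embedding together with the loss after each accepted step. Swapping in a different kernel pair or a different minimizer changes the method while leaving the overall structure untouched, which is the kind of abstraction the abstract describes.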

Keywords

Dimensionality Reduction, Principal Component Analysis, TSNE, UMAP, Graph Laplacian

URI

https://hdl.handle.net/1920/12149

Collections

College of Science
