Mason Archival Repository Service

Using Multi-Task Learning For Large-Scale Document Classification

dc.contributor.advisor Rangwala, Huzefa Naik, Azad
dc.creator Naik, Azad 2013-05-03 2013-09-13T15:06:59Z 2013-09-13T15:06:59Z 2013-09-13
dc.description.abstract Multi-Task Learning (MTL) involves learning multiple tasks jointly. It seeks to improve the generalization performance of each task by leveraging the relationships among the different tasks. It extends Single-Task Learning (STL), the approach most widely used in classification. In STL, each task is treated as independent and learnt in isolation, whereas in MTL, multiple tasks are learnt simultaneously by exploiting task relatedness. The main intuition is that the training signal present in related tasks can help each task learn a better model. It also allows better models to be learnt from fewer labeled examples. In this thesis, our focus is on improving classification performance for a database that is organized as a hierarchy and archives a large number of documents. We improve the classification performance of this database (the source) by developing an MTL-based model that uses an external database to facilitate classification for the source database. We use logistic regression for the multiple classification tasks and a k-nearest neighbor (kNN) approach to find similarities between the classes in the two hierarchical databases. The kNN similarities allow us to define task relationships. Experiments on a sampled DMOZ dataset compare the performance of MTL with STL, Semi-Supervised Learning (SSL), and Transfer Learning (TL). We also use random projections to achieve better runtime performance with minimal effect on classification accuracy.
dc.language.iso en en_US
dc.subject Multi-Task Learning en_US
dc.subject classification en_US
dc.subject model selection en_US
dc.subject logistic regression en_US
dc.subject random projection (hashing) en_US
dc.title Using Multi-Task Learning For Large-Scale Document Classification en_US
dc.type Thesis en Master of Science in Computer Science en_US Master's en Computer Science en George Mason University en
