Using Multi-Task Learning For Large-Scale Document Classification

Naik, Azad

Files

Naik_thesis_2013.pdf (1.93 MB)

Date

2013-09-13

Authors

Naik, Azad

Abstract

Multi-Task Learning (MTL) involves learning multiple tasks jointly. It seeks to improve the generalization performance of each task by leveraging the relationships among the tasks. MTL extends Single-Task Learning (STL), which is most widely used in classification. In STL, each task is treated as independent and learned in isolation, whereas in MTL multiple tasks are learned simultaneously by exploiting task relatedness. The main intuition is that the training signal present in related tasks can help each task learn a better model. MTL also allows better models to be learned from fewer labeled examples. In this thesis, our focus is on improving classification performance for a database (the source) that is organized as a hierarchy and archives a large number of documents. We develop an MTL-based model in which an external database facilitates classification for the source database. We use logistic regression for the multiple classification tasks and a k-nearest neighbor (kNN) approach to find similarities between the classes in the two hierarchical databases; these kNN similarities allow us to define task relationships. Experiments on a sampled DMOZ dataset compare the performance of MTL with STL, Semi-Supervised Learning (SSL), and Transfer Learning (TL). We also use random projections to achieve better runtime performance with minimal effect on classification accuracy.

Keywords

Multi-Task Learning, Classification, Model selection, Logistic regression, Random projection (hashing)

URI

https://hdl.handle.net/1920/8479

Collections

College of Engineering and Computing
