Honors Thesis 2016 - Meera Hahn

Advances in Methods and Evaluations for Distributional Semantic Models using Computational Lexicons

Meera Hahn

Highest Honor in Computer Science


Abstract

Word embedding has drastically changed the field of natural language processing and has become the norm for distributional semantic models. Previous methods for generating word embeddings did not take advantage of the semantic information in sentence structures. In this work, we create a new approach to word embedding that leverages structural data from sentences to produce higher-quality word embeddings. We also introduce a framework to evaluate word embeddings from any part of speech. We use this framework to assess the quality of word embeddings produced with different semantic contexts and show that sentence structure is rich with semantic information. Our evaluations show that our new word embeddings far outperform the original word embeddings in all parts of speech. Furthermore, we examine the task of sentiment analysis in order to demonstrate the superiority of our system's word embeddings.

Department / School

Computer Science / Emory University

Degree / Year

BS / Spring 2016

Committee

Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Li Xiong, Computer Science, Emory University
Effrosyni Seitaridou, Physics, Oxford College of Emory University
