Honors Thesis 2020 - Haoqi Gu

Level-based Resume Classification on Nursing Job Positions

Haoqi Gu

High Honor in Mathematics


Abstract

This thesis focuses on real application resumes. Unlike most related work, we do not sort resumes into broad job categories (for example, IT, medical care, or teaching). Instead, we classify resumes submitted for a single level-based position, Clinical Research Coordinator (CRC), at the School of Nursing at Emory University. The position has four levels, CRC I, II, III, and IV, to which applicants can apply, and we aim to develop an algorithm that classifies resumes into these four levels based on their content. The methods used are string matching, feature vectors, bag-of-words representations, and ensemble models. The best model for predicting the admission result of a resume reaches 66.89%.
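
As a rough illustration of the bag-of-words idea named in the abstract, the sketch below assigns a resume to the level whose aggregated word counts it most resembles under cosine similarity. It is not the thesis's model: the training snippets, the function names, and the nearest-centroid rule are all assumptions made for this example.

```python
from collections import Counter
import math

LEVELS = ["CRC I", "CRC II", "CRC III", "CRC IV"]

# Hypothetical toy snippets, invented for illustration only
# (not taken from the thesis data).
TRAIN = [
    ("CRC I", "recent graduate data entry scheduling patient visits"),
    ("CRC II", "two years coordinating clinical trials consent patient recruitment"),
    ("CRC III", "managed multiple clinical trials regulatory submissions budget tracking"),
    ("CRC IV", "supervised coordinators led regulatory strategy budget oversight training staff"),
]


def bag_of_words(text):
    """Map a text to word counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_centroids(train):
    """Sum the word counts of all training texts for each level."""
    centroids = {level: Counter() for level in LEVELS}
    for level, text in train:
        centroids[level].update(bag_of_words(text))
    return centroids


def classify(text, centroids):
    """Return the level whose centroid is most similar to the text."""
    bow = bag_of_words(text)
    return max(LEVELS, key=lambda level: cosine(bow, centroids[level]))


centroids = build_centroids(TRAIN)
predicted = classify("supervised coordinators and led staff training", centroids)
```

In this toy setup a resume that shares no words with any centroid falls back to the first level in `LEVELS`; a real system would need a deliberate rule for that case.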

Department / School

Applied Mathematics and Statistics / Emory University

Degree / Year

BS / Spring 2020

Committee

Jinho D. Choi, Computer Science and QTM, Emory University (Chair)
Bree Ettinger, Mathematics, Emory University
Yuanzhe Xi, Mathematics, Emory University
