Classification of Job Offers on the World Wide Web

E. Dar, J. Dorn:
"Classification of Job Offers on the World Wide Web";
in: "Proceedings of iiWAS 2016", edited by: ACM; ACM, 2016, ISBN: 978-1-4503-4807-2.


To automate retrieval from the explosively growing number of online job opportunities, text classification is the only viable method and an initial step toward job offer retrieval. In this paper we investigated eight text classifiers to study their accuracy and generalization performance on new data. Data were collected from different job offer websites, preprocessed with different methods, and arranged into five groups. The classifiers were regularized to avoid high variance, and their accuracy and generalization errors were evaluated. All the classifiers showed >90% accuracy, but the generalization error varied. Ridge Regression and Stochastic Gradient Descent generalized well on new data in all groups; in contrast, Random Forest and Perceptron remained prone to high variance. The remaining classifiers exhibited both behaviors, depending on the group. We thus identified two classifiers that generalize well on new data, a successful step toward the ultimate goal of automating job offer retrieval.
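
The distinction the abstract draws between accuracy and generalization error can be illustrated with a minimal sketch. It is not the authors' pipeline: the bag-of-words features, the unregularized pure-Python perceptron, the toy texts, and every function name below are assumptions made for illustration only. It trains on a handful of labeled snippets and reports the gap between training and held-out accuracy, which is one simple way to quantify how well a classifier generalizes.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts over lowercase whitespace tokens (assumed features)."""
    return Counter(text.lower().split())


def score(model, text):
    """Linear score: bias plus the weighted sum of token counts."""
    weights, bias = model
    return bias + sum(weights[tok] * n for tok, n in featurize(text).items())


def train_perceptron(samples, epochs=10):
    """Classic perceptron updates; labels are +1 (job offer) or -1 (other)."""
    weights, bias = Counter(), 0
    for _ in range(epochs):
        for text, label in samples:
            if label * score((weights, bias), text) <= 0:  # misclassified
                for tok, n in featurize(text).items():
                    weights[tok] += label * n
                bias += label
    return weights, bias


def predict(model, text):
    return 1 if score(model, text) > 0 else -1


def accuracy(model, samples):
    return sum(predict(model, t) == y for t, y in samples) / len(samples)


def generalization_gap(model, train, test):
    """Training accuracy minus held-out accuracy; larger means worse generalization."""
    return accuracy(model, train) - accuracy(model, test)


# Hypothetical toy data, not taken from the paper.
train = [
    ("we are hiring a software engineer apply now", 1),
    ("job opening for a sales manager full time position", 1),
    ("vacancy nurse wanted apply with cv", 1),
    ("read our blog about summer travel tips", -1),
    ("buy cheap shoes online free shipping", -1),
    ("latest football news and match results", -1),
]
test = [
    ("hiring data engineer apply with cv", 1),
    ("cheap travel news online", -1),
    ("driver needed immediately", 1),  # vocabulary unseen during training
]

model = train_perceptron(train)
train_acc = accuracy(model, train)
test_acc = accuracy(model, test)
gap = generalization_gap(model, train, test)
print(f"train={train_acc:.2f} test={test_acc:.2f} gap={gap:.2f}")
```

The last held-out text uses only words absent from the training set, so the model has no evidence for it and the held-out accuracy falls below the training accuracy. A regularized classifier evaluated on a realistically sized corpus, as in the paper, would be compared in the same way, group by group.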