Tech Company Employee Review Analysis With Statistical Learning Methods
Job seeking and employment, whether full-time positions or internships, are pressing topics for most graduate students. As new graduates, however, we often lack working experience, which makes it difficult to understand what companies and job positions are actually like and to settle on career goals. One good way to obtain such information is to read the reviews that employees write about their employers. In this project, we focused on a dataset of employee reviews for several top IT companies, which contains information about employers, job positions, working conditions, and other aspects that potential employees may care about.
The dataset we found does not have a designated response variable, which means it is not limited to solving one specific problem. We therefore applied our theoretical knowledge and technical skills to extract insights from it in several ways. First, we conducted Exploratory Data Analysis (EDA) to inspect the data and form an initial analysis. Then, we used an open-source tool named AutoPhrase to extract key phrases from the review text and visualized the results as a word cloud. We also built regression models using different statistical methods on the numerical, categorical, and text features of the dataset. Finally, focusing on the review text only, we conducted a sentiment analysis with the help of the NLTK package to study the correlation between the overall rating and the compound sentiment score of each review.
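The sentiment step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the compound score comes from NLTK's VADER analyzer (`SentimentIntensityAnalyzer`) and that the correlation is a Pearson coefficient, neither of which is stated explicitly here. The helper names `make_vader_scorer`, `pearson`, and `rating_sentiment_correlation` are hypothetical.

```python
import math
from typing import Callable, Sequence


def make_vader_scorer() -> Callable[[str], float]:
    """Build a function mapping a review text to VADER's compound score in [-1, 1].

    Assumption: the compound score is produced by NLTK's VADER analyzer.
    NLTK is imported lazily so the rest of the sketch runs without it.
    """
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # fetches the lexicon if missing
    analyzer = SentimentIntensityAnalyzer()
    return lambda text: analyzer.polarity_scores(text)["compound"]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def rating_sentiment_correlation(
    reviews: Sequence[str],
    ratings: Sequence[float],
    scorer: Callable[[str], float],
) -> float:
    """Score every review with `scorer`, then correlate scores with ratings."""
    scores = [scorer(text) for text in reviews]
    return pearson(ratings, scores)
```

With NLTK installed, calling `rating_sentiment_correlation(reviews, ratings, make_vader_scorer())` on the review texts and overall ratings gives a single coefficient. Passing the scorer as an argument keeps the correlation logic independent of the sentiment model, so a different scorer can be swapped in for comparison.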