
Glassdoor PHP Web Crawler

About the Project

At Potent Pages, we developed a web crawler to collect the public company information published on Glassdoor. This data included company ratings, individual reviews, and general information about each company.

This information was used for company analysis to support investment decisions and for competitive analysis. The reviews were also used to build a neural network for sentiment analysis.

Why Glassdoor?

Our client wanted information from Glassdoor in order to gauge how employees felt about their companies. As a financial analysis firm, the client was looking to identify major changes in how a company's employees viewed that company.

The Crawler

At Potent Pages, we used our bulk crawler to download tens of thousands of public pages from Glassdoor. These pages contained embedded JSON with the data we needed. We wrote a PHP script to quickly extract the information from each page and store it in a database. From there, we wrote scripts to perform bulk analysis of the data.
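As a rough illustration of the extract-and-store step, here is a minimal sketch. It is written in Python rather than the PHP used on the project, and it rests on assumptions that are not taken from the actual crawler: that the JSON sits in a `<script type="application/ld+json">` tag, that the field names are `datePublished`, `ratingValue`, and `reviewBody`, and that a SQLite table named `reviews` stands in for the real database.

```python
import json
import re
import sqlite3

# Assumption: the JSON payload sits in a <script type="application/ld+json"> tag.
JSON_SCRIPT = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_json_blocks(html):
    """Return every JSON value embedded in the page, skipping malformed ones."""
    blocks = []
    for raw in JSON_SCRIPT.findall(html):
        try:
            blocks.append(json.loads(raw))
        except json.JSONDecodeError:
            continue  # ignore payloads that are not valid JSON
    return blocks


def store_reviews(conn, company, blocks):
    """Insert one row per JSON object; the field names are hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reviews ("
        "company TEXT, review_date TEXT, rating REAL, body TEXT)"
    )
    rows = [
        (company, b.get("datePublished"), b.get("ratingValue"), b.get("reviewBody"))
        for b in blocks
        if isinstance(b, dict)
    ]
    conn.executemany("INSERT INTO reviews VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Usage with a made-up page: one valid payload and one malformed payload.
sample_html = (
    '<html><script type="application/ld+json">'
    '{"datePublished": "2020-01-15", "ratingValue": 2.0, '
    '"reviewBody": "Poor management"}</script>'
    '<script type="application/ld+json">{not json}</script></html>'
)
conn = sqlite3.connect(":memory:")
stored = store_reviews(conn, "Example Corp", extract_json_blocks(sample_html))
```

Skipping malformed payloads instead of aborting keeps a bulk run moving when a handful of pages out of tens of thousands are broken.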

The Analysis

Using the database populated by the crawler, we extracted the keywords from each review. This let us search for negative reviews, monitor changes in employee-given ratings, and look for indications of illegal or detrimental activity.
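The keyword step might look something like the following sketch. It is shown in Python for illustration and is not the project's actual code. The stop-word list, the `FLAG_TERMS` watch list, and the rating cutoff of 2.0 are all hypothetical values chosen for the example.

```python
import re
from collections import Counter

# Hypothetical word lists; a real analysis would use far larger ones.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "is", "was",
              "in", "it", "for", "with", "are", "at", "this", "were"}
FLAG_TERMS = {"fraud", "harassment", "lawsuit", "unsafe", "layoffs"}


def extract_keywords(text):
    """Count the words in a review, ignoring stop words and very short tokens."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS and len(w) > 2)


def flag_review(text, rating, max_rating=2.0):
    """Mark a review as negative by rating and list any watch-list terms in it."""
    keywords = extract_keywords(text)
    hits = sorted(FLAG_TERMS.intersection(keywords))
    return {"negative": rating <= max_rating, "flag_terms": hits}


result = flag_review("Management committed fraud and the layoffs were sudden", 1.0)
# result == {"negative": True, "flag_terms": ["fraud", "layoffs"]}
```

Plain term matching like this only surfaces candidates for a person to read. It cannot tell "no fraud here" from an actual allegation.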

We were also able to track company ratings over time, looking for shifts toward a more negative or more positive outlook among each company's employees.
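One simple way to track ratings over time is to average them by month and report large month-to-month moves. The sketch below does that in Python, for illustration only. The `(ISO date, rating)` input shape and the 0.5-point threshold are assumptions, not details from the project.

```python
from collections import defaultdict


def monthly_averages(reviews):
    """Map 'YYYY-MM' to the mean rating, given (iso_date, rating) pairs."""
    buckets = defaultdict(list)
    for date, rating in reviews:
        buckets[date[:7]].append(rating)  # 'YYYY-MM-DD' -> 'YYYY-MM'
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}


def rating_shifts(averages, threshold=0.5):
    """List (month, change) where the average moved by at least `threshold`."""
    months = list(averages)
    shifts = []
    for prev, cur in zip(months, months[1:]):
        delta = averages[cur] - averages[prev]
        if abs(delta) >= threshold:
            shifts.append((cur, round(delta, 2)))
    return shifts


# Made-up ratings for a single company.
reviews = [
    ("2020-01-05", 4.0),
    ("2020-01-20", 5.0),
    ("2020-02-11", 3.0),
    ("2020-03-02", 3.0),
]
averages = monthly_averages(reviews)
shifts = rating_shifts(averages)
# shifts == [("2020-02", -1.5)]
```

A negative change marks a month in which employee ratings dropped. Months with only a few reviews produce noisy averages, so a minimum review count per month would be a sensible addition.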


