
Glassdoor Scraping – Scrape Glassdoor Job Postings using Python


Glassdoor is a very popular website for job searches, company reviews, employee reviews, and interview tips for a particular company or role. Millions of users visit Glassdoor every year and share their experience of working at a company, which helps others make good career choices. Hundreds of jobs are also posted on Glassdoor daily, making it a preferred website for job seekers as well as recruiters. Scraping Glassdoor job postings is therefore a useful way to collect such listings.

Using web scraping, one can find jobs that suit one's profile, or scrape reviews, run a sentiment analysis on them, and decide which company is a great place to work.

In this tutorial we will scrape job details from Glassdoor. We will search for data scientist jobs in Los Angeles: https://www.glassdoor.co.in/Job/los-angeles-data-scientist-jobs-SRCH_IL.0,11_IC1146821_KE12,26.htm

We will grab details such as company name, location, and job title, then collect the links to all jobs, visit each individual page, and scrape the complete job description.

See the complete code below, or watch the video for a detailed description.

Import Libraries:

import requests
from bs4 import BeautifulSoup as soup

Set Headers:

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36'}

Send GET request:

html = requests.get('https://www.glassdoor.co.in/Job/los-angeles-data-scientist-jobs-SRCH_IL.0,11_IC1146821_KE12,26.htm', headers = headers)
bsobj = soup(html.content,'lxml')

Scrape company name:
company_name =[]
for company in bsobj.findAll('div',{'class':'jobHeader'}):
  company_name.append(company.a.text.strip())

company_name
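The loop above assumes every jobHeader div contains a link; if one does not, company.a is None and the loop raises an AttributeError. A minimal sketch of a more defensive version follows. The sample markup and the helper name extract_company_names are made up for illustration, and the built-in html.parser is used here instead of lxml only so the snippet runs without extra dependencies. The live Glassdoor page may use different class names.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup that mimics the structure the tutorial relies on;
# the real page may differ.
SAMPLE_HTML = """
<div class="jobHeader"><a href="/x"> Acme Corp </a></div>
<div class="jobHeader"><span>No link here</span></div>
<div class="jobHeader"><a href="/y">Globex</a></div>
"""

def extract_company_names(html):
    """Return the link text of every div.jobHeader, or None where the link is missing."""
    page = BeautifulSoup(html, "html.parser")
    names = []
    for header in page.find_all("div", {"class": "jobHeader"}):
        anchor = header.find("a")
        # Keep a placeholder so this list stays aligned with the other fields.
        names.append(anchor.get_text(strip=True) if anchor else None)
    return names

print(extract_company_names(SAMPLE_HTML))
```

Keeping a None placeholder, rather than skipping the entry, means the company list still has one element per job card, which matters when the lists are later matched up by position.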

Scraping Job Title:

job_title = []
for title in bsobj.findAll('div',{'class':'jobContainer'}):
  job_title.append(title.findAll('a')[1].text.strip())

job_title

Scraping Job Location:

location = []
for i in bsobj.findAll('div',{'class':'jobInfoItem empLoc'}):
  location.append(i.span.text.strip())

location

Grab individual job links:

links = []
for i in bsobj.findAll('div',{'class':'jobContainer'}):
  link = 'https://www.glassdoor.co.in'+ i.a['href']
  links.append(link) 

links
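String concatenation works as long as every href is a site-relative path starting with a slash. As a hedged alternative, urljoin from the standard library also copes with hrefs that are already absolute or that lack the leading slash. The example hrefs below are invented for illustration and are not real Glassdoor paths.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.glassdoor.co.in"

def absolute_link(href, base=BASE_URL):
    """Resolve an href against the site root, leaving absolute URLs untouched."""
    return urljoin(base, href)

# Hypothetical hrefs covering the three common shapes.
print(absolute_link("/partner/jobListing.htm?pos=101"))
print(absolute_link("job-listing/example.htm"))
print(absolute_link("https://example.com/job/1"))
```

Inside the loop above, this would replace the concatenation with link = absolute_link(i.a['href']).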

Scrape job description by going to individual links:

description = []
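for link in links:
  page = requests.get(link,headers=headers)
  bs = soup(page.content,'lxml')
  # Look up the description container once per job page
  container = bs.find('div',{'id':'JobDescriptionContainer'})
  # Store exactly one entry per link; use an empty string if the container is missing
  description.append(container.get_text(' ', strip=True) if container else '')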
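Once all fields are collected, they can be combined into one record per job. The sketch below assumes each of the five lists holds exactly one entry per job, in the same order; the function names, the column names in FIELDS, and the sample values are illustrative and not part of the original script.

```python
import csv
import io

FIELDS = ["company", "title", "location", "link", "description"]

def build_rows(company_name, job_title, location, links, description):
    """Zip the parallel lists into one dict per job, refusing misaligned input."""
    columns = [company_name, job_title, location, links, description]
    if len({len(col) for col in columns}) != 1:
        raise ValueError("lists have different lengths; rows would be misaligned")
    return [dict(zip(FIELDS, values)) for values in zip(*columns)]

def rows_to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up sample values standing in for scraped data.
rows = build_rows(
    ["Acme"], ["Data Scientist"], ["Los Angeles, CA"],
    ["https://example.com/1"], ["Build models."],
)
print(rows_to_csv(rows))
```

The length check is there because a job card with a missing field would otherwise shift every later value into the wrong row without any error.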

For bulk data extraction from Glassdoor, our scraping services can help you scrape Glassdoor job postings according to your search criteria. For more insight into the data, download the sample data of Glassdoor scraping.