Our program is a web scraping tool that help the user scrape all the job postings in a single page from indeed.com.
We utilized the BeautifulSoup package to parse the HTML data from that site. Subsequently, we developed a Python program that manipulates the data in the html to return the job title, company name and link to the specific job posting. These data are then moved to a CSV file of the users choosing. Our program's process was eased by utilizing Git and GitHub to commit, push, and pull to merge our different codes.
Consequently, we've been able to ease the process of job searching for college students by creating a working program that creates a CSV file of job postings and their attributes without the grueling process of searching from different pages on indeed.
- Download the latest version of Visual Studio Code depending on your device(Mac/Linux/Windows). The proper documentation and guide can be found here: https://code.visualstudio.com/download
- Download Python for your device here: https://www.python.org/downloads/
- Open your command line on Windows (terminal on Mac/Linux) by searching on your start page and type
python --versionto ensure Python has been installed properly. If you run into errors, check the help guide at the end of this README for extra help. - Download the Python extension via VSCode which is found on the extensions menu of VSCode.
- To ensure everything is installed properly, copy this code
print('Hello World')and press the "Run" button to run the program. - Open your command line or terminal on your computer and type
pip install bs4. BeautifulSoup is a Python package that allows for pulling data out of HTML and XML files. - On your command line or terminal, at your current working directory, copy the following text to clone the project
git clone https://github.com/thuntran/Web-Scraper-BWH.git. A folder namedWeb-Scraper-BWHfolder should be created at the current directory on your local computer. - Run the program with this folder and add an indeed HTML link to get any job posting you want.
- Yayyy, congratulations! You now have a cool CSV file of great internships/jobs.
- NOTE: You can only use an indeed.com link and no other as the program is set up for indeed only.
- Python installation guide: https://realpython.com/installing-python/ and https://docs.microsoft.com/en-us/visualstudio/python/installing-python-support-in-visual-studio?view=vs-2022
- VSCode installation guide: https://code.visualstudio.com/docs/setup/setup-overview
- BeautifulSoup installation guide: https://www.tutorialspoint.com/beautiful_soup/beautiful_soup_installation.htm
- Any further errors: https://stackoverflow.com/ and of course YouTube videos :)
Proudly developed by Abi, Ella and Thu for Black Wings Hack 2022