Python Data Visualization Exercises

The goal of these Python data visualization exercises is to display some of the results of the 2018 New Coder Survey. The survey set out to gain insight into the methods and motivations of thousands of people who were learning to code in 2018.

Here you will find the dataset with the answers of the more than 30,000 students who took the survey:

dataset to practice python charts

Exercises

1) Read the dataset and rename columns accordingly.

2) Create a bar chart showing the age of the students. Use a light blue color for the bars and group the ages into 6 ranges:

  • Under 15 years old
  • 15 to 20 years old
  • 21 to 25 years old
  • 26 to 30 years old
  • 31 to 40 years old
  • 41 to 50 years old

Expected output:

python matplotlib bar chart exercise

3) Create a violin plot comparing the wages that men and women expect to earn after graduation.

Expected output:

python seaborn violin exercise

4) Create a heatmap showing the correlations between the age of the students, the wage they expect to earn, their gender and the number of hours they usually study.

Expected output:

 

python matplotlib heatmap practice

5) Display the students' country of origin in a bar plot with red bars. Make sure every country available in the dataset is shown.

Expected output:

python matplotlib histogram practice

 

SOLUTIONS


## Import the libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
%matplotlib inline

# Read the survey csv
prog = pd.read_csv('2018-new-coder-survey.csv')

## Rename columns
prog.rename(columns={
    'How old are you?': 'Age',
    'About how much money do you expect to earn per year at your first developer job (in US Dollars)?': 'Expected_wage',
    "What's your gender?": 'Gender',
    'About how many hours do you spend learning each week?': 'hours_study',
    'Which country are you a citizen of?': 'origin_country',
    'Have you attended a full-time coding bootcamp?': 'bootcamp_student',
}, inplace=True)

### Exercise 2
## Write a function to create age buckets
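def clasf(x):
    # A 15-year-old belongs to the '15-20 years' group, so 'Under 15' is strict
    if x<15:
        return 'Under 15 years'
    elif 15<=x<=20:
        return '15-20 years'
    elif 20<x<=25:
        return '21-25 years'
    elif 25<x<=30:
        return '26-30 years'
    elif 30<x<=40:
        return '31-40 years'
    elif 40<x<=50:
        return '41-50 years'
    elif x>50:
        return 'Over 51 years'
    else:
        # Missing ages (NaN) fail every comparison above and end up here
        return 'NS/NC'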
        
prog['Age_int'] = prog['Age'].map(clasf)

## Chart
(prog['Age_int'].value_counts(ascending=True).plot.barh(color='lightblue').set_title('Students Age',size=15))
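
As an alternative to a hand-written function, the same buckets can probably be built with pandas' `pd.cut`. The sketch below is only an illustration: it runs on a small made-up series rather than the survey file, assumes ages are whole numbers, and the outer bin edges (-1 and infinity) are arbitrary choices.

```python
import pandas as pd

# Hypothetical sample ages standing in for the survey's 'Age' column
ages = pd.Series([12, 15, 20, 21, 30, 31, 45, 50, 51, None], name="Age")

# Bins are right-inclusive by default: (14, 20] covers ages 15 to 20, and so on.
# This assumes whole-number ages; the outer edges are arbitrary choices.
bins = [-1, 14, 20, 25, 30, 40, 50, float("inf")]
labels = [
    "Under 15 years", "15-20 years", "21-25 years", "26-30 years",
    "31-40 years", "41-50 years", "Over 51 years",
]

age_group = pd.cut(ages, bins=bins, labels=labels)

# pd.cut leaves missing ages as NaN; give them the same 'NS/NC' label
# that the clasf function uses for unclassifiable values
age_group = age_group.cat.add_categories("NS/NC").fillna("NS/NC")

# sort=False keeps the buckets in age order instead of ordering by frequency
print(age_group.value_counts(sort=False))
```

Because the result is an ordered categorical, plotting `value_counts(sort=False)` keeps the bars in age order rather than ordering them by count.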

### Exercise 3
## Violin plot
sns.violinplot(y = prog['Expected_wage'], x = prog['Gender'])

### Exercise 4
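coders = prog[['Age','Expected_wage','Gender','hours_study','bootcamp_student']]
# corr() only works on numeric columns, so text columns such as Gender are
# dropped here; encode Gender as a number first if it should appear in the heatmap
sns.heatmap(coders.select_dtypes('number').corr(), annot = True, cmap = 'viridis')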
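
Gender is stored as text, so it needs a numeric encoding before it can take part in a correlation matrix. One possible approach is sketched below on made-up rows; the category strings `male` and `female` and the helper column `Gender_female` are assumptions for illustration, so check `prog['Gender'].unique()` for the real values first.

```python
import pandas as pd

# Hypothetical rows standing in for the renamed survey columns
sample = pd.DataFrame({
    "Age": [22, 35, 28, 41, 19, 30],
    "Expected_wage": [50000, 70000, 60000, 65000, 40000, 55000],
    "Gender": ["male", "female", "female", "male", "male", "female"],
    "hours_study": [10, 20, 15, 5, 25, 12],
})

# Keep only the two categories being compared and encode them as 0/1.
# The strings 'male' and 'female' are assumed, not taken from the dataset.
encoded = sample[sample["Gender"].isin(["male", "female"])].copy()
encoded["Gender_female"] = (encoded["Gender"] == "female").astype(int)

# Correlation matrix over numeric columns only
corr = encoded[["Age", "Expected_wage", "Gender_female", "hours_study"]].corr()
print(corr.round(2))
```

The resulting `corr` frame can then be passed to `sns.heatmap(corr, annot=True, cmap='viridis')` in the same way as above.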

### Exercise 5
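# value_counts() already returns the countries sorted from most to least frequent
country = prog['origin_country'].value_counts()

plt.figure(figsize=(10,40))
# Use the index (country names) and values (counts) directly, so the code does
# not depend on how a given pandas version names the counts column
sns.barplot(y=country.index,x=country.values,color='red')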
plt.title('Origin Country',size=15)
plt.xlabel('Students')
plt.ylabel('Country',size=20)
plt.show()
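
The fixed figure height of 40 works for this dataset, but the height could also be derived from the number of countries. The following is a minimal sketch using plain matplotlib on made-up answers; the 0.3 inches per bar and the 3-inch minimum are arbitrary assumptions, not values from the original solution.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical citizenship answers standing in for prog['origin_country']
answers = pd.Series(["India", "Spain", "India", "Brazil", "India", "Spain"])
counts = answers.value_counts()  # sorted from most to least frequent

# Allow roughly 0.3 inches per bar (arbitrary), with a 3-inch minimum
fig_height = max(3, 0.3 * len(counts))
fig, ax = plt.subplots(figsize=(10, fig_height))

# barh draws the first item at the bottom, so reverse the order
# to put the most common country at the top
ax.barh(counts.index[::-1], counts.values[::-1], color="red")
ax.set_title("Origin Country", size=15)
ax.set_xlabel("Students")
ax.set_ylabel("Country")
fig.tight_layout()
```

With the full survey data, every country still gets its own bar, and the figure grows with the number of distinct countries.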
