Using Pandas Data Frames and Pivot Tables in Python

Introduction
In this video, we'll use data frames and pivot tables to analyze some employee information and provide some insight to others on your team.

The version of Ubuntu I used for this video is 20.04.6 LTS (Focal Fossa).

The following commands will work for this version but not necessarily for other versions of Ubuntu.

sudo apt update

sudo apt-get install python3.12
python3 -V

sudo apt install pip
pip -V

pip install pandas
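
To confirm the installation worked, a short check like the one below can be run with python3. This is only a sketch: it prints the interpreter and pandas versions, and the exact numbers will depend on your system.

```python
import sys

import pandas as pd

# Report which interpreter and which pandas release are actually in use
python_version = sys.version.split()[0]
pandas_version = pd.__version__

print(f"Python: {python_version}")
print(f"pandas: {pandas_version}")
```

If the import fails with ModuleNotFoundError, pandas was most likely installed for a different Python interpreter than the one running the check.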

The following version of the script runs without producing warnings in its output.

import pandas as pd
from csv import DictReader

employee_info = []

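try:
    # Assumption: the file name is not given here; "employee_info.csv" is a placeholder
    with open("employee_info.csv", newline="") as f:
        reader = DictReader(f)
        for row in reader:
            try:
                # Convert numeric fields so pandas can aggregate them
                row["age"] = int(row["age"])
                row["salary"] = float(row["salary"])
            except ValueError as e:
                print(f"Skipping row due to error: {e}")
                continue
            employee_info.append(row)
except FileNotFoundError:
    print("Error: the CSV file was not found.")
    exit(1)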

# Check if employee_info is empty before proceeding
if not employee_info:
    print("Error: no valid rows were read.")
    exit(1)

# Proceed with the rest of the script if data is successfully read
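columns = list(employee_info[0].keys())
df = pd.DataFrame(employee_info, columns=columns)

# Average salary by gender. The pivot table definition is not shown here:
# the "department" index and the "gender" column name are assumptions.
gender_difference_table = df.pivot_table(
    index="department", columns="gender", values=["salary"], aggfunc="mean"
)
gender_difference_table[("salary", "difference")] = (
    gender_difference_table[("salary", "Female")]
    - gender_difference_table[("salary", "Male")]
)
print(gender_difference_table)

# Create age ranges (18-40 and 41-80)
df["age_range"] = pd.cut(df["age"], bins=[17, 40, 80], labels=["18-40", "41-80"])

# Group by age range and calculate average salary
# (observed is passed explicitly to avoid a pandas FutureWarning)
age_range_salary = df.groupby("age_range", observed=False)[["salary"]].mean()

# Display the results
print(age_range_salary)

# Calculate the salary difference between the two age ranges
salary_difference = age_range_salary["salary"].iloc[1] - age_range_salary["salary"].iloc[0]
print(f"Salary difference between age ranges 41-80 and 18-40: {salary_difference}")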
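
To try the same steps without a CSV file on disk, here is a minimal sketch that runs on a few made-up rows held in memory. Everything in the sample data is hypothetical, including the names, the "department" and "gender" columns, and the salaries. The helper load_rows is introduced only for this illustration. It shows the shape of the two tables, not real results.

```python
import io
from csv import DictReader

import pandas as pd

# Hypothetical sample data standing in for the real CSV file
SAMPLE_CSV = """name,gender,department,age,salary
Ana,Female,Sales,30,50000
Ben,Male,Sales,45,60000
Cara,Female,IT,50,80000
Dan,Male,IT,28,70000
Eve,Female,IT,not-a-number,65000
"""


def load_rows(handle):
    """Parse CSV rows, converting age and salary and skipping bad rows."""
    rows = []
    for row in DictReader(handle):
        try:
            row["age"] = int(row["age"])
            row["salary"] = float(row["salary"])
        except ValueError:
            continue  # skip rows whose numeric fields cannot be parsed
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
df = pd.DataFrame(rows)

# Mean salary per department (rows) and gender (columns)
by_gender = df.pivot_table(
    index="department", columns="gender", values=["salary"], aggfunc="mean"
)
by_gender[("salary", "difference")] = (
    by_gender[("salary", "Female")] - by_gender[("salary", "Male")]
)
print(by_gender)

# Bin ages into 18-40 and 41-80, then average salary per bin
df["age_range"] = pd.cut(df["age"], bins=[17, 40, 80], labels=["18-40", "41-80"])
by_age = df.groupby("age_range", observed=False)[["salary"]].mean()
print(by_age)
```

The row for Eve is dropped because its age cannot be converted to an integer, so only four rows reach the DataFrame. Passing values as a list gives the pivot table two-level column labels such as ("salary", "Female"), which is why the difference column is addressed with a tuple.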