Question 2: Interview questions on PySpark #pyspark #bigdata #dataengineering #interview

Interview Question 2
Imagine you have a PySpark DataFrame named df_employee with the following schema:

root
|-- employee_id: integer
|-- employee_name: string
|-- salary: double
|-- department: string
|-- hire_date: date
Your task is to perform the following operations:

1) Identify and count the null values in each column.
2) Replace null values in the salary column with the mean salary of all employees.
3) Replace null values in the department column with a default value of "Unknown".

#subscribe #share with your network.

#pyspark #interview #bigdata #pysparktutorial #questionswithsolutions #bigdatatraining
Comments
Author

Awesome. I was asked a similar question at JPMorgan.

soumyakantarath
Author

Bro, please put the data in the description too.
It will save time while practicing.

Paruu
Author

The explanation is insufficient. Please explain step by step how you solved the first question: why you used the unpacking operator, whether there is a simpler way to do it than one-liners, and so on.

premanandramesh