In this video, I discuss Deloitte PySpark scenario-based interview questions and answers.

Deloitte PySpark interview questions and answers

Create DataFrame:




Learn PySpark, an interface for Apache Spark in Python. PySpark is often used for large-scale data processing and machine learning.

Spark SQL and DataFrame solution:

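from pyspark.sql.functions import col, count, when

# when() without otherwise() yields null, which count() skips
df1 = df.groupBy("DeptName").agg(
    count(when(col("Gender") == "M", 1)).alias("Male"),
    count(when(col("Gender") == "F", 1)).alias("Female"),
    count("*").alias("Total"),
)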

select DeptName,
       count(case when Gender = 'M' then 1 end) as Male,
       count(case when Gender = 'F' then 1 end) as Female,
       count(*) as total_emp
from dept
group by DeptName
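To check the conditional-count query without a Spark cluster, here is a minimal sketch that runs the same SQL through Python's built-in sqlite3 module. SQLite is only a stand-in for Spark SQL here. The sample rows, the EmpName column, and the gender_counts helper are made up for illustration, and the "order by" clause is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical sample rows: (EmpName, DeptName, Gender).
ROWS = [
    ("Asha", "IT", "F"),
    ("Ben", "IT", "M"),
    ("Chen", "IT", "M"),
    ("Dina", "HR", "F"),
    ("Elan", "Payroll", "M"),
]

# Same conditional aggregation as the query above; CASE without ELSE
# yields NULL for non-matching rows, and COUNT ignores NULLs.
QUERY = """
select DeptName,
       count(case when Gender = 'M' then 1 end) as Male,
       count(case when Gender = 'F' then 1 end) as Female,
       count(*) as total_emp
from dept
group by DeptName
order by DeptName
"""


def gender_counts(rows):
    """Return (DeptName, Male, Female, total_emp) tuples, one per department."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table dept (EmpName text, DeptName text, Gender text)")
        conn.executemany("insert into dept values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in gender_counts(ROWS):
        print(row)
```

With these sample rows, a department with no employees of one gender still appears, with a count of 0 in that column.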


Great video! Very informative and well-explained.
Another solution I tried was:


Hello sir, I hope you are doing well. I am currently attending interviews and am having difficulty explaining the SSIS projects on my resume. Could you please help me with that? Please respond. Thank you.


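This will also produce the result:

from pyspark.sql import functions as F

# otherwise(0) keeps the sum at 0 (not null) when a department has no matching rows
df.groupBy("DeptName").agg(
    F.count(df.Gender).alias("TotalEmp"),
    F.sum(F.when(df.Gender == "M", 1).otherwise(0)).alias("MaleCount"),
    F.sum(F.when(df.Gender == "F", 1).otherwise(0)).alias("FemaleCount"),
).show()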
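One thing to watch when swapping count for sum: a sum over a CASE/when expression with no else branch returns NULL, not 0, for a group with no matching rows, while count returns 0. The sketch below shows this with Python's built-in sqlite3 as a stand-in for Spark SQL. Both follow standard SQL here, but it is worth confirming on your own Spark version. The table name emp, the sample rows, and the compare_aggregates helper are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows: the HR department has no male employees.
ROWS = [("IT", "M"), ("IT", "F"), ("HR", "F")]

QUERY = """
select DeptName,
       count(case when Gender = 'M' then 1 end)      as male_count,
       sum(case when Gender = 'M' then 1 end)        as male_sum,
       sum(case when Gender = 'M' then 1 else 0 end) as male_sum_else
from emp
group by DeptName
order by DeptName
"""


def compare_aggregates(rows):
    """Return (DeptName, male_count, male_sum, male_sum_else) per department."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table emp (DeptName text, Gender text)")
        conn.executemany("insert into emp values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in compare_aggregates(ROWS):
        print(row)
```

For HR, male_count is 0 and male_sum is None (SQL NULL), while the version with an else branch gives 0. In PySpark the equivalent of the else branch is .otherwise(0) on the when expression.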
