16. deloitte azure databricks interview questions | #databricks #pyspark #interview #azure

#azuredatabricks #pyspark #sql #ssunitech #adf #adb
#Databricks #PysparkInterviewQuestions #deltalake

Deloitte PySpark second-round interview questions and answers
PySpark interview Q&A
Databricks interview questions and answers
Azure Databricks #spark #pyspark #azuredatabricks #azure
In this video, I discuss Deloitte PySpark scenario-based interview questions and answers.

Create dataframe:

Learn PySpark, an interface for Apache Spark in Python. PySpark is often used for large-scale data processing and machine learning.

Azure data factory tutorial playlist:

ADF interview question & answer:

1. pyspark introduction | pyspark tutorial for beginners | pyspark tutorial for data engineers:

2. what is dataframe in pyspark | dataframe in azure databricks | pyspark tutorial for data engineer:

3. How to read write csv file in PySpark | Databricks Tutorial | pyspark tutorial for data engineer:

4. Different types of write modes in Dataframe using PySpark | pyspark tutorial for data engineers:

5. read data from parquet file in pyspark | write data to parquet file in pyspark:

6. datatypes in PySpark | pyspark data types | pyspark tutorial for beginners:

7. how to define the schema in pyspark | structtype & structfield in pyspark | Pyspark tutorial:

8. how to read CSV file using PySpark | How to read csv file with schema option in pyspark:

9. read json file in pyspark | read nested json file in pyspark | read multiline json file:

10. add, modify, rename and drop columns in dataframe | withcolumn and withcolumnrename in pyspark:

11. filter in pyspark | how to filter dataframe using like operator | like in pyspark:

12. startswith in pyspark | endswith in pyspark | contains in pyspark | pyspark tutorial:

13. isin in pyspark and not isin in pyspark | in and not in in pyspark | pyspark tutorial:

14. select in PySpark | alias in pyspark | azure Databricks #spark #pyspark #azuredatabricks #azure

15. when in pyspark | otherwise in pyspark | alias in pyspark | case statement in pyspark:

16. Null handling in pySpark DataFrame | isNull function in pyspark | isNotNull function in pyspark:

17. fill() & fillna() functions in PySpark | how to replace null values in pyspark | Azure Databrick:

18. GroupBy function in PySpark | agg function in pyspark | aggregate function in pyspark:

19. count function in pyspark | countDistinct function in pyspark | pyspark tutorial for beginners:

20. orderBy in pyspark | sort in pyspark | difference between orderby and sort in pyspark:

21. distinct and dropduplicates in pyspark | how to remove duplicate in pyspark | pyspark tutorial:
Comments

Spark SQL and DataFrame solutions:

from pyspark.sql.functions import col, count, when

df1 = df.groupBy("DeptName").agg(
    count(when(col("Gender") == 'M', 1)).alias("Male"),
    count(when(col("Gender") == 'F', 1)).alias("Female"),
    count("*").alias("Total"))
df1.show()

spark.sql('''
select DeptName,
       count(case when Gender = 'M' then 1 end) as Male,
       count(case when Gender = 'F' then 1 end) as Female,
       count(*) as total_emp
from dept
group by DeptName
''').show()

sushantbhardwaj

Great video! Very informative and well-explained.
Another solution I tried (count_if requires PySpark 3.5+):

from pyspark.sql.functions import count, count_if

df.groupBy("DeptName").agg(
    count("Gender").alias("TotalEmp"),
    count_if(df.Gender == 'M').alias("MaleEmp"),
    count_if(df.Gender == 'F').alias("FemaleEmp"),
).show()

Javi-gpjw

Hello sir, I hope you are doing well. I am currently attending interviews and facing difficulty explaining the SSIS projects on my resume. Can you please help me with that? Please respond. Thank you.

kt-qguv

This also produces the result:

from pyspark.sql.functions import count, sum, when

df.groupBy('DeptName').agg(
    count(df.Gender).alias('TotalEmp'),
    sum(when(df.Gender == 'M', 1)).alias('MaleCount'),
    sum(when(df.Gender == 'F', 1)).alias('FemaleCount')
).show()

ardsha