PySpark Coding Question - Accenture and TCS | PySpark Interview Question

Hello Everyone,

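from pyspark.sql import Row, SparkSession
from pyspark.sql.types import (
    StructType, StructField, IntegerType, StringType, DoubleType
)

# Databricks notebooks already provide `spark`; getOrCreate() simply reuses it.
spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField("id", IntegerType(), nullable=False),
    StructField("name", StringType(), nullable=False),
    StructField("age", IntegerType(), nullable=False),
    StructField("department", StringType(), nullable=False),
    StructField("salary", DoubleType(), nullable=False)
])
data = [
    Row(1, "John", 30, "Sales", 50000.0),
    Row(2, "Alice", 28, "Marketing", 60000.0),
    Row(3, "Bob", 32, "Finance", 55000.0),
    Row(4, "Sarah", 29, "Sales", 52000.0),
    Row(5, "Mike", 31, "Finance", 58000.0)
]
# Build the DataFrame from the rows and the explicit schema.
employeeDF = spark.createDataFrame(data, schema)
display(employeeDF)  # Databricks helper; use employeeDF.show() outside Databricks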

This series is for beginner and intermediate-level candidates who want to crack PySpark interviews.

#pyspark #interviewquestions #interview #pysparkinterview #dataengineer #aws #databricks #python
Comments

I have 7 activities in an Azure Synapse pipeline and want to deactivate 2 of them (Execute Pipeline activities) only in the Production environment, while still running all activities in the other environments such as QA, STG, and Dev.

Is it possible to deactivate some activities only for the prod environment?

Hope-xbjv

Id | level
1 | lvl1
1 | lvl2
2 | lvl3
2 | lvl2
3 | lvl1
3 | lvl2
3 | lvl3

Result
id | level1 | level2 | level3
1 | lvl1 | lvl2 |
2 | lvl2 | lvl3 |
3 | lvl1 | lvl2 | lvl3
Please provide the solution in PySpark (question asked at Telstra).

agarwalankita

TeamA | TeamB | Won
Ind | WI | WI
Ban | Ind | Ind
Ind | Aus | Ind
NZ | Ban | NZ

Result should be like below

TeamName | Won | Lost
Ind | 2 | 1
Ban | 0 | 1
WI | 1 | 0
NZ | 1 | 0


What is the solution?

agarwalankita