filmov
tv
top interview questions and answers in pyspark | drop duplicates from dataframe | #interview

Показать описание
top interview question and answer in pyspark | drop duplicates from dataframe | #interview
Create dataframe :
===============
# schema of the above table
schema = StructType([
StructField("emp_id", IntegerType(), True),
StructField("emp_name", StringType(), True),
StructField("emp_gender", StringType(), True),
StructField("emp_age", IntegerType(), True),
StructField("emp_salary", IntegerType(), True),
StructField("emp_manager", StringType(), True)
])
# creating the dataframe !
data = [
(1, "Arjun Patel", "Male", 30, 60000, "Aarav Sharma"),
(2, "Aarav Sharma", "Male", 28, 55000, "Zara Singh"),
(3, "Zara Singh", "Female", 35, 70000, "Arjun Patel"),
(4, "Priya Reddy", "Female", 32, 65000, "Aarav Sharma"),
(1, "Arjun Patel", "Male", 30, 60000, "Aarav Sharma"),
(6, "Naina Verma", "Female", 31, 72000, "Arjun Patel"),
(1, "Arjun Patel", "Male", 30, 60000, "Aarav Sharma"),
(4, "Priya Reddy", "Female", 32, 65000, "Aarav Sharma"),
(5, "Aditya Kapoor", "Male", 28, 58000, "Zara Singh"),
(10, "Anaya Joshi", "Female", 27, 59000, "Aarav Sharma"),
(11, "Rohan Malhotra", "Male", 36, 73000, "Zara Singh"),
(3, "Zara Singh", "Female", 35, 70000, "Arjun Patel")
]
top interview question and answer in pyspark :
#pwc #fang #pyspark #sql #interview #dataengineers #dataanalytics #datascience #StrataScratch #Facebook #data #dataengineeringinterview #codechallenge #datascientist #pyspark #CodingInterview
#dsafordataguy #dewithdhairy #DEwithDhairy #dhiarjgupta #leetcode #topinterviewquestion
Create dataframe :
===============
# schema of the above table
schema = StructType([
StructField("emp_id", IntegerType(), True),
StructField("emp_name", StringType(), True),
StructField("emp_gender", StringType(), True),
StructField("emp_age", IntegerType(), True),
StructField("emp_salary", IntegerType(), True),
StructField("emp_manager", StringType(), True)
])
# creating the dataframe !
data = [
(1, "Arjun Patel", "Male", 30, 60000, "Aarav Sharma"),
(2, "Aarav Sharma", "Male", 28, 55000, "Zara Singh"),
(3, "Zara Singh", "Female", 35, 70000, "Arjun Patel"),
(4, "Priya Reddy", "Female", 32, 65000, "Aarav Sharma"),
(1, "Arjun Patel", "Male", 30, 60000, "Aarav Sharma"),
(6, "Naina Verma", "Female", 31, 72000, "Arjun Patel"),
(1, "Arjun Patel", "Male", 30, 60000, "Aarav Sharma"),
(4, "Priya Reddy", "Female", 32, 65000, "Aarav Sharma"),
(5, "Aditya Kapoor", "Male", 28, 58000, "Zara Singh"),
(10, "Anaya Joshi", "Female", 27, 59000, "Aarav Sharma"),
(11, "Rohan Malhotra", "Male", 36, 73000, "Zara Singh"),
(3, "Zara Singh", "Female", 35, 70000, "Arjun Patel")
]
top interview question and answer in pyspark :
#pwc #fang #pyspark #sql #interview #dataengineers #dataanalytics #datascience #StrataScratch #Facebook #data #dataengineeringinterview #codechallenge #datascientist #pyspark #CodingInterview
#dsafordataguy #dewithdhairy #DEwithDhairy #dhiarjgupta #leetcode #topinterviewquestion
Комментарии