Day 11: American Express scenario-based interview questions and answers in PySpark

# Create DataFrame Code

friends_data = [(1, 2),
(1, 3),
(1, 4),
(2, 1),
(3, 1),
(3, 4),
(4, 1),
(4, 3)]

friend_schema = "user_id int , friend_id int"

likes_data = [
(1, 'A'),
(1, 'B'),
(1, 'C'),
(2, 'A'),
(3, 'B'),
(3, 'C'),
(4, 'B')
]

like_schema = "user_id int , page_id string"

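# Build the DataFrames (assumes an active SparkSession named `spark`, as in a Databricks notebook)
friends_df = spark.createDataFrame(friends_data, friend_schema)
likes_df = spark.createDataFrame(likes_data, like_schema)

display(friends_df)
display(likes_df)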
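The description does not spell out the task, so the following is only a sketch under an assumption: that the goal is the usual page-recommendation problem, i.e. suggest to each user the pages that their friends have liked and that the user has not liked yet. It is written in plain Python rather than PySpark so the result can be checked by hand; the names `liked`, `candidates`, and `recommendations` are illustrative and do not come from the video.

```python
# Plain-Python sketch (no Spark). Assumption: recommend to each user the pages
# liked by their friends that the user has not already liked.
friends_data = [(1, 2), (1, 3), (1, 4), (2, 1), (3, 1), (3, 4), (4, 1), (4, 3)]
likes_data = [(1, 'A'), (1, 'B'), (1, 'C'), (2, 'A'), (3, 'B'), (3, 'C'), (4, 'B')]

# Pages liked per user, e.g. user 1 -> {'A', 'B', 'C'}
liked = {}
for user_id, page_id in likes_data:
    liked.setdefault(user_id, set()).add(page_id)

# Pair each friendship row with the pages liked by user_id (the "join"),
# then keep only pages that friend_id has not liked (the "filter").
candidates = [
    (friend_id, page_id)
    for user_id, friend_id in friends_data
    for page_id in sorted(liked.get(user_id, set()))
    if page_id not in liked.get(friend_id, set())
]

# Same role as a distinct step: drop repeated (user, page) pairs.
recommendations = sorted(set(candidates))

print(candidates)
print(recommendations)
```

With this sample data the pair `(4, 'C')` is produced twice, once through friend 1 and once through friend 3, which is why a de-duplication step is needed after the filter.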


#pyspark #americanexpress #dataanalytics #dataengineers #youtube #coding #interview #faang
Comments
Author

We also need this step after the filter condition.

# Keep only the unique (user_id, page_id) records.
from pyspark.sql.functions import col

answer_df = friend_page_concat_df.select(col("friend_id").alias("user_id"), col("page_id")).distinct()
answer_df.show()

DEwithDhairy