Walmart PySpark Interview Question | Data Engineering |


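from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType, StringType, FloatType

# Initialize Spark session
spark = SparkSession.builder \
    .appName("Create Datasets") \
    .getOrCreate()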

# Define the schema for transactions
transaction_schema = StructType([
    StructField("customer_id", IntegerType(), True),
    StructField("transaction_type", StringType(), True),
    StructField("transaction_amount", FloatType(), True)
])

# Create the transactions DataFrame
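transactions_data = [
    (1, "credit", 30.0),
    (1, "debit", 90.0),
    (2, "credit", 50.0),
    (3, "debit", 57.0),
    (2, "debit", 90.0)
]
transactions_df = spark.createDataFrame(transactions_data, schema=transaction_schema)

# Show the transactions DataFrame
transactions_df.show()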

# Define the schema for amounts
amount_schema = StructType([
    StructField("customer_id", IntegerType(), True),
    StructField("current_amount", FloatType(), True)
])

# Create the amounts DataFrame
amounts_data = [
    (1, 1000.0),
    (2, 2000.0),
    (3, 3000.0),
    (4, 4000.0)
]
amounts_df = spark.createDataFrame(amounts_data, schema=amount_schema)

# Show the amounts DataFrame
amounts_df.show()

This series is for beginner and intermediate-level candidates who want to crack PySpark interviews.

#data #walmart #dataengineering #kafka
#python #sql #azuredatabrickswithpyspark #llm

#pyspark #interviewquestions #interview #pysparkinterview #dataengineer #aws #databricks #python


Comments

this was a very tough question but overall amazing

prajju

Sagar, your code will fail if Total Credit is more than Total Debit. You have to reverse the summing and finally add when generating the result. Hope this has been captured and raised by someone else also 🙂

SanjaySuryavanshi-rk

I am not a PySpark developer, so correct me if my logic won't work here: separate Dr and Cr, multiply the debit side by -1, then union all three tables/datasets, then select the distinct records; the system will then add and subtract those values.

Lemme know if you think this logic won't work.

sumit
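The signed-sum idea raised in the two comments above can be sketched in plain Python. This is only an illustration, not the solution from the video: it assumes the task is to compute each customer's final balance as the current amount plus credits minus debits, and `final_balances` is a hypothetical helper name. The rows are the sample data from the description.

```python
from collections import defaultdict

# Sample rows copied from the video description.
transactions = [
    (1, "credit", 30.0),
    (1, "debit", 90.0),
    (2, "credit", 50.0),
    (3, "debit", 57.0),
    (2, "debit", 90.0),
]
amounts = [(1, 1000.0), (2, 2000.0), (3, 3000.0), (4, 4000.0)]


def final_balances(transactions, amounts):
    """Hypothetical helper: current amount plus credits minus debits, per customer.

    Assumption: "credit" increases the balance and "debit" decreases it.
    """
    net_change = defaultdict(float)
    for customer_id, transaction_type, amount in transactions:
        # Debits are multiplied by -1 so a single sum handles both directions,
        # whether total credit is larger or smaller than total debit.
        sign = -1.0 if transaction_type == "debit" else 1.0
        net_change[customer_id] += sign * amount
    # Customers with no transactions (customer 4 here) keep their current amount.
    return {cid: current + net_change.get(cid, 0.0) for cid, current in amounts}


balances = final_balances(transactions, amounts)
print(balances)
```

In PySpark the same idea would typically be a signed column built with `when`/`otherwise`, a `groupBy("customer_id")` sum, and a left join from the amounts DataFrame so that customers without transactions are kept.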

Bro, tell me one thing honestly: if I watch your videos, can I become a data engineer? Your content is next level.

praveenbhandari
