Top Big Data Interview Questions asked in 2024 | Cloud Data Engineer | Azure | Spark | SQL #interview

I have trained over 20,000 professionals in the field of Data Engineering in the last 5 years.

Want to Master SQL? Learn SQL the right way through the most sought-after course - SQL Champions Program!

"An 8-week program designed to help you crack the interviews of top product-based companies by developing a thought process and an approach to solve an unseen problem."

Here is how you can register for the program -

BIG DATA INTERVIEW SERIES

This mock interview series was launched as a community initiative under Data Engineers Club, aimed at aiding the community's growth and development.

Links to the free SQL & Python series developed by me are given below -

Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!

Social Media Links :

TIMESTAMPS : Questions Discussed
01:00 Introduction
01:47 What is Hadoop and how does it work?
03:09 Why move from MapReduce to Spark?
05:07 Does Spark provide storage?
05:47 Give a high-level explanation of Spark.
06:50 Why switch from RDDs to DataFrames in Spark?
07:53 Which languages does Spark support?
08:27 What are RDDs and their importance?
09:47 What happens during actions/transformations in Spark?
11:15 Explain Spark architecture.
13:06 What are deployment modes and their use cases?
14:30 Describe the plans created when executing a Spark job.
16:00 What is predicate pushdown?
18:10 Explain jobs, stages, and tasks in Spark.
19:10 What are the types of transformations in Spark?
20:38 Difference between repartition and coalesce?
23:30 Should you infer schema or specify it when creating a DataFrame?
24:19 What are the ways to enforce schema? Provide an example.
24:54 SQL coding questions
41:09 Which Azure cloud services have you used?
41:35 Explain Databricks architecture at a high level.
42:40 How do you run SQL queries in Databricks?
44:10 How can one notebook run another in Databricks?
45:35 Can you use parameters when running Databricks notebooks?
46:07 Difference between Data Lake and Delta Lake? Pros and cons of each.
48:11 What activities are available in ADF?
49:09 Scenario-Based question

Music track: Retro by Chill Pulse
Background Music for Video (Free)

Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs
Comments

The guy answered very well! Got a good idea of what to say and what to avoid during an interview.

rishabhkesarwani-brrx

Though it is a mock interview, I appreciate his calm and pleasant responses to all the questions!

lazzybirdflying

16:53
A broadcast join that is decided on the fly at run time is chosen by Adaptive Query Execution, not by the Spark SQL engine or the Catalyst optimizer as said.

gudiatoka

He is always looking at his left side. xD

voxdiary

This is awesome. Literally, every concept from Spark is covered. A must watch interview.

ShashankVankadari

The million-dollar question is... "Is he selected?" And how did he do in the 2nd round? 2nd round questions, please.

TarunChakraborty-kw

Hadoop uses Java
You are bound to work with MapReduce
MapReduce can only do batch processing, not real time

hdr-tech

Whenever a transformation is applied, it never creates a DAG; rather, it creates a lineage between RDDs, and the action creates the DAG.

gudiatoka
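
As a rough illustration of the lazy-evaluation point in the comment above: the plain-Python toy below (not Spark code; `LazyDataset` and its methods are hypothetical names invented for this sketch) only records each transformation as a lineage entry, and nothing is computed until the `collect` action runs. Real Spark additionally plans stages and tasks, which this sketch does not attempt to model.

```python
class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, not executed."""

    def __init__(self, source, lineage=()):
        self._source = source
        self.lineage = tuple(lineage)  # ordered record of pending transformations

    def map(self, fn):
        # Transformation: returns a new dataset with one more lineage entry.
        return LazyDataset(self._source, self.lineage + (("map", fn),))

    def filter(self, pred):
        # Transformation: also lazy, nothing is evaluated here.
        return LazyDataset(self._source, self.lineage + (("filter", pred),))

    def collect(self):
        # Action: only now is the recorded lineage replayed over the data.
        data = iter(self._source)
        for kind, fn in self.lineage:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


calls = []  # tracks when the mapped function actually runs


def double(x):
    calls.append(x)
    return x * 2


ds = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
calls_before_action = list(calls)  # still empty: no work has been done
result = ds.collect()              # the action triggers the computation

print([kind for kind, _ in ds.lineage])
print(calls_before_action, result)
```

Keeping each dataset immutable and returning a new object per transformation mirrors the idea that a lineage can be replayed from the source at any time.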

Thanks for the videos.
It's very helpful!

shaileshchile

The row_number values for marks are not correct (35:16).
The correct output is:

Marks Row_number
100 1
100 2
99 1
98 1
98 2
98 3
97 1
96 1
95 1

rushirajkadge
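
The numbering in the comment above is what you would get if the query partitions by marks. The video's actual query is not reproduced on this page, so the sketch below assumes `ROW_NUMBER() OVER (PARTITION BY marks ...)`; the `scores` table name is hypothetical, and the example uses Python's built-in `sqlite3` module (window functions need SQLite 3.25 or newer).

```python
import sqlite3

# Marks taken from the comment's expected output.
marks = [100, 100, 99, 98, 98, 98, 97, 96, 95]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (marks INTEGER)")  # hypothetical table name
conn.executemany("INSERT INTO scores (marks) VALUES (?)", [(m,) for m in marks])

# Assumed query: numbering restarts at 1 inside each group of equal marks.
query = """
SELECT marks,
       ROW_NUMBER() OVER (PARTITION BY marks ORDER BY marks) AS row_num
FROM scores
ORDER BY marks DESC, row_num
"""
rows = conn.execute(query).fetchall()
conn.close()

for mark, row_num in rows:
    print(mark, row_num)
```

Without the `PARTITION BY` clause, `ROW_NUMBER()` would instead number all nine rows from 1 to 9.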

Sir, please provide the questions in the description.

suvenduku

He answered most of the questions to the point. Very good.

shrikantkorate

Spark Core - RDD (flexible)
High-level APIs -
DataFrame and Spark SQL (easy to write queries)

Transformations and actions
spark-submit process
Deployment modes
Types of transformations
Repartition and coalesce

Methods for schema enforcement - DDL string, StructType

Consecutive wins in SQL

hdr-tech