Live Big Data Mock Interview | Techno Managerial #interview | PySpark, Hive, SQL, Python #question

I have trained over 20,000 professionals in the field of Data Engineering in the last 5 years.

Want to Master SQL? Learn SQL the right way through the most sought-after course - the SQL Champions Program!

"An 8-week program designed to help you crack the interviews of top product-based companies by developing a thought process and an approach to solving an unseen problem."

Here is how you can register for the Program -

30 INTERVIEWS IN 30 DAYS - BIG DATA INTERVIEW SERIES

This mock interview series is launched as a community initiative under the Data Engineers Club, aimed at aiding the community's growth and development.

Links to the free SQL & Python series developed by me are given below -

Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!

Social Media Links:

TIMESTAMPS: Questions Discussed
00:00 Introduction
01:12 PySpark and Azure integration for pipelines
02:44 Analytics setup and data warehousing
04:38 Configuring Spark job
06:21 Spark optimization
08:58 Shuffling avoidance techniques
10:22 Understanding and minimizing shuffling
11:04 Initial Spark job steps for shuffling reduction
12:40 Spark job partitions
13:47 CPU cores and partition relationship
16:55 Partitioning and bucketing use cases
20:00 Hash functions and tables
23:40 Decreasing partitions
24:23 Coalesce vs. repartition
25:14 Dealing with data skewness
26:06 Partition skew solutions
26:34 Salting purpose
27:35 Scenario-based question
30:01 Narrow and wide transformation examples
31:31 Spark's lazy evaluation
32:25 RDD vs. DataFrame comparison
33:38 Optimizers in higher-level APIs
34:50 Out-of-memory error handling
37:24 Another scenario-based query
42:00 Job scheduling with Azure Data Factory
43:26 Coding questions

Music track: Retro by Chill Pulse
Background Music for Video (Free)

Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs
Comments
ะะฒั‚ะพั€

The interviewer and interviewee are both very knowledgeable. Can we have more discussion between them? One more suggestion - the interviewer could explain the possible answer after one round of discussion; that would be helpful. Thanks

jjayeshpawar
ะะฒั‚ะพั€

29:17 repartition will work to reduce the skewness
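
Repartitioning on the join key can even out partition sizes, but for a single hot key the salting technique covered at 26:34 is the more common fix. Below is a minimal, self-contained PySpark sketch of the idea; the data, app name, column names, and the number of salt buckets are all made up for illustration.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("salting-sketch").getOrCreate()

# Made-up skewed data: key 1 dominates the left side
facts = spark.createDataFrame([(1, "a")] * 8 + [(2, "b"), (3, "c")], ["key", "val"])
dims = spark.createDataFrame([(1, "x"), (2, "y"), (3, "z")], ["key", "attr"])

SALT_BUCKETS = 4

# Add a random salt to the skewed side so the hot key spreads across partitions
facts_salted = facts.withColumn("salt", (F.rand() * SALT_BUCKETS).cast("int"))

# Replicate the small side once per salt value so every pair still finds a match
dims_salted = dims.withColumn(
    "salt", F.explode(F.array([F.lit(i) for i in range(SALT_BUCKETS)]))
)

# Join on (key, salt) instead of key alone, then drop the helper column
joined = facts_salted.join(dims_salted, ["key", "salt"]).drop("salt")
print(joined.count())  # 10 rows, same result as joining on key alone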

BalaMurugan-
ะะฒั‚ะพั€

Very knowledgeable interviewee and interviewer.

tulsimalviya
ะะฒั‚ะพั€

Every join output will be 10 rows... each 1 will pair with every 1, and the same goes for 2.

HemantkumarSharma-ns
ะะฒั‚ะพั€

Assuming both tables (A and B) have a column named 'col' with the data provided (1, 1, 1, 2, 2 for A and 1, 1, 2, 2 for B), here's the count for each join type:
* Inner Join: 10
* Right Join: 10
* Left Join: 10
An inner join only keeps rows where there's a match in both tables on the join column ('col' in this case). With duplicate keys, every matching pair is returned: the three 1s in A each pair with the two 1s in B (6 rows), and the two 2s in A each pair with the two 2s in B (4 rows), giving 10 rows.
A right join keeps all rows from the right table (B) and matches them with the left table (A); unmatched rows in B would get nulls in the columns coming from A. Here every value in B has a match in A, so the result is the same 10 rows.
A left join keeps all rows from the left table (A) and matches them with the right table (B); unmatched rows in A would get nulls in the columns coming from B. Here every value in A has a match in B, so the result is again 10 rows.
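
A quick PySpark sketch to verify those counts; the column name 'col' follows the example above, and the rest of the setup is illustrative.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("join-counts").getOrCreate()

# Table A: 1, 1, 1, 2, 2  and  Table B: 1, 1, 2, 2
a = spark.createDataFrame([(1,), (1,), (1,), (2,), (2,)], ["col"])
b = spark.createDataFrame([(1,), (1,), (2,), (2,)], ["col"])

# Duplicates multiply: 3*2 matches for key 1 plus 2*2 matches for key 2 = 10
print(a.join(b, "col", "inner").count())  # 10
print(a.join(b, "col", "left").count())   # 10, every row in A has a match
print(a.join(b, "col", "right").count())  # 10, every row in B has a match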

singhjirajeev