Data Engineering Interview | Apache Spark Interview | Live Big Data Interview

This video is part of the Spark Interview Questions Series.
Many subscribers have asked me to show what an actual Big Data interview looks like. In this video we cover what usually happens in a Big Data or data engineering interview.
There will be more videos covering different aspects of Data Engineering Interviews.

Here are a few links useful for you

If you are interested in joining our community, please join the following groups

You can drop me an email for any queries at

#apachespark #sparktutorial #bigdata
#spark #hadoop #spark3 #bigdata #dataengineer
Comments

Questions:
1) Why did you shift from MapReduce development to Spark development?
2) How is the Spark engine different from the Hadoop MapReduce engine?
3) What are the steps for Spark job optimization?
4) What are an executor and an executor core? Explain in terms of processes and threads.
5) How do you identify that your Hive script is slow?
6) When do we use partitioning and bucketing in Hive? (see the sketch below)
7) Small file problem in Hive? ---> skewness
8) How do you handle a high-cardinality issue in a dataset, with respect to Hive?
9) How do you handle code merging with other teams? Explain your development process.
10) Again, the small files issue in Hadoop?
11) Metadata size in Hadoop?
12) How is Spark differentiated from MapReduce?
13) In a class having 3 fields (name, age, salary), you create a series of objects from this class. How do you compare the objects? (I didn't get the question exactly)
14) Scala: what is === in join conditions? What does it mean?

Hope it helps!
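For question 6, here is a minimal Spark SQL sketch (Scala) of partitioning versus bucketing when writing a Hive-backed table. The table names, column names, input path, and bucket count are hypothetical placeholders, not a definitive recipe.

```scala
// Sketch: partitioning vs. bucketing in Spark SQL (Scala).
// Table/column names and paths are hypothetical.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("partition-bucket-demo")
  .enableHiveSupport() // persist as Hive-compatible tables
  .getOrCreate()

val sales = spark.read.parquet("/data/sales") // hypothetical input

// Partitioning: one directory per distinct value. Best for
// low-cardinality columns that queries filter on (partition pruning).
sales.write
  .partitionBy("country")
  .mode("overwrite")
  .saveAsTable("sales_by_country")

// Bucketing: a fixed number of files hashed on a high-cardinality key.
// Helps joins/aggregations on that key without exploding directories.
sales.write
  .bucketBy(32, "user_id")
  .sortBy("user_id")
  .mode("overwrite")
  .saveAsTable("sales_bucketed")
```

Rule of thumb: partition on low-cardinality columns you filter by; bucket on high-cardinality keys you join or aggregate on.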

tradingtransformation

I really appreciate you posting this interview in the public domain. This is a really good one. It would be really great to see a video on the process of optimizing a job.

bramar

Really great video. It would have been even better if you could answer the questions the candidate was not able to answer, like: what are the symptoms of a job that tell you whether to increase the number of executors or the memory per executor? Can anyone answer here, so it benefits other candidates? Thanks a lot for this video.
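One hedged way to answer the question above: in the Spark UI, many pending tasks alongside idle cluster capacity usually point to too few executors or cores, while long GC times, shuffle spill to disk, or OutOfMemoryError point to too little memory per executor. A minimal Scala sketch of the corresponding settings follows; the numbers are placeholders, and in practice these are usually passed to spark-submit rather than hard-coded.

```scala
// Sketch only: placeholder values, normally passed via spark-submit --conf.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("executor-tuning-demo")
  // Symptom: many pending tasks, cluster under-utilised
  //  -> more executors and/or cores (assumes dynamic allocation is off).
  .config("spark.executor.instances", "8")
  .config("spark.executor.cores", "4")
  // Symptom: long GC pauses, shuffle spill, OutOfMemoryError
  //  -> more memory per executor (or fewer cores sharing the same heap).
  .config("spark.executor.memory", "8g")
  .getOrCreate()
```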

tradingtexi

@Data Savvy
It can be watched in one stretch. Really helpful. 👍🏻🙌🏻

amansinghshrinet

Since this is a mock interview, the interviewers should have given feedback at the end of the call itself, so it's helpful for viewers.

ajithkannan

Wish I didn't have the haircut that day😂😂😀😀😂😂😂

arindampatra

Awesome, Harjeet sir!!
I could watch a thousand such videos at a stretch 😁
Very informative!!!
Can't wait for long, please upload as many as you can, sir.

rohitrathod

Your interview is very helpful.
Keep up the good work 👍👍👍

chaitanya

Hadoop is meant for handling a small number of big files; the small file problem arises when file sizes are less than the HDFS block size (64 or 128 MB). Moreover, handling a bulk of small files increases pressure on the NameNode when we could handle one big file instead. So in Hadoop, file size matters a lot, which is why partitioning and bucketing came into the picture. Correct me if I made a mistake.
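Building on this comment, a common mitigation is to compact the small files so each output file approaches the HDFS block size. A minimal Spark (Scala) sketch; the paths are hypothetical and the target file count is a placeholder you would derive from total input size divided by block size.

```scala
// Sketch: compacting many small Parquet files before rewriting to HDFS.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("compact-small-files").getOrCreate()

val df = spark.read.parquet("/data/raw/events") // thousands of tiny files

// Rough sizing: targetFiles ~ total input bytes / HDFS block size (128 MB).
val targetFiles = 16 // placeholder

df.coalesce(targetFiles) // merges partitions without a full shuffle
  .write
  .mode("overwrite")
  .parquet("/data/compacted/events")
```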

kaladharnaidusompalyam

Hello, I was asked the following questions in an AWS developer interview:
Q1. We have *sensitive data* coming in from a source and API. Help me design a pipeline to bring in data, clean and transform it and park it.
Q2. So where does pyspark come into play in this?
Q3. Which all libraries will you need to import to run the above glue job?
Q4. What are shared variables in PySpark? (see the sketch after this list)
Q5. How to optimize glue jobs
Q6. How do you protect sensitive data in your dataset?
Q7. How do you identify sensitive information in your data?
Q8. How do you provision a S3 bucket?
Q9. How do I check if a file has been changed or deleted?
Q10. How do I protect my file having sensitive data stored in S3?
Q11. How does KMS work?
Q12. Do you know S3 glacier?
Q13. Have you worked on S3 glacier?
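For Q4: Spark's shared variables are broadcast variables (read-only data shipped once to every executor) and accumulators (write-only counters that only the driver reads back). A minimal sketch, written in Scala here although the question asked about PySpark; the lookup map and record codes are made up.

```scala
// Sketch of Spark's two shared-variable kinds; data is made up.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("shared-vars-demo").getOrCreate()
val sc = spark.sparkContext

// Broadcast variable: read-only lookup shipped once per executor.
val countryNames = sc.broadcast(Map("IN" -> "India", "US" -> "United States"))

// Accumulator: executors add to it, the driver reads the total.
val badRecords = sc.longAccumulator("badRecords")

val codes = sc.parallelize(Seq("IN", "US", "XX"))
val resolved = codes.map { code =>
  // Note: accumulator updates inside transformations can double-count
  // on task retries; updates inside actions are exactly-once.
  countryNames.value.getOrElse(code, { badRecords.add(1); "unknown" })
}

resolved.collect().foreach(println)
println(s"bad records: ${badRecords.value}")
```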

Anonymous-feep

This is very useful. Please make more videos like this.

kranthikumarjorrigala

Nice video. The purpose of using '===' while joining is to make sure that we are comparing the right values (join key values) and the right data types as well. Please correct me if my understanding is wrong.
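That is roughly right. In the Scala API, === is Column.equalTo: it builds an equality expression over two columns, whereas plain == is ordinary Scala object equality and returns a Boolean, which is not a valid join condition. A minimal sketch with made-up data:

```scala
// Sketch: === builds a join expression; plain == would not compile here.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("triple-equals-demo").getOrCreate()
import spark.implicits._

val users  = Seq((1, "alice"), (2, "bob")).toDF("id", "name")
val orders = Seq((1, 100.0), (2, 250.0)).toDF("user_id", "amount")

// users("id") === orders("user_id") yields a Column expression that
// Spark evaluates row by row during the join.
val joined = users.join(orders, users("id") === orders("user_id"))
joined.show()
```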

kiranmudradi

The default block size is 128 MB; when partitioning creates lots of small files, much of that storage goes to waste and more horizontal scaling is required (defeating the purpose of distribution).

ShashankGupta

Awesome video. Thank you for putting this out. It's helpful.

sukanyapatnaik

Thank you very much, this is very useful!!!

sujaijain

Very Informative.. Thanks a lot Guys...

rahulpandit

Keep up the excellent work 👍 Expecting more such videos.

sathyansathyan

Amazing job, really interesting. Thank you for sharing this interview with us.

AhlamLamo

Very, very helpful, and please do one or two more interviews of the same level.
Great effort by both the interviewer and the interviewee.

MoinKhan-cgcu

It would be really helpful if you could make more mock interviews like this. I think we have only 3 live interviews on the channel yet.

shubhamshingi