Must-Know PySpark Interview Question for Data Engineers - Live Demo & Tips!

#ApacheSpark #DataEngineering #AzureDataEngineer #SparkSQL #DataTransformation #DataFrame #InterviewQuestion #BigData #AzureDatabricks #PySpark #DataAnalysis #DataScience #SQLQuery #Optimization #Efficiency #Tutorial

In this video, we'll dive into a popular PySpark interview question often asked by financial and banking companies—calculating the running total for grouped data. We'll explore the concept step-by-step using a simple DataFrame in Databricks, breaking down the logic behind partitioning data by ID and implementing a running total using PySpark’s window function. Whether you're prepping for an interview or looking to enhance your PySpark skills, this tutorial will guide you through the nuances of this essential data transformation technique. Don't miss out on this key topic for any aspiring Data Engineer!
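As a rough sketch of the idea described above (not the exact code from the video), a running total per ID could look like the following. The column names `id`, `seq`, and `total` and both function names are assumptions made for illustration. The plain-Python helper only mirrors the window logic so it can be checked without a Spark session.

```python
from collections import defaultdict


def running_total_spark(df):
    """Hypothetical helper: add a per-id running total to a Spark DataFrame.

    Assumes the DataFrame has columns `id`, `seq` (ordering column) and
    `total`; these names are illustrative, not taken from the video.
    """
    # Imported lazily so the rest of this sketch runs without PySpark installed.
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    window_spec = (
        Window.partitionBy("id")      # one running total per id
        .orderBy("seq")               # explicit order keeps the result deterministic
        .rowsBetween(Window.unboundedPreceding, Window.currentRow)
    )
    return df.withColumn("running_total", F.sum("total").over(window_spec))


def running_total_plain(rows):
    """Same logic in plain Python, for (id, seq, total) tuples."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)    # "partition by id"

    result = []
    for key in sorted(partitions):
        accumulated = 0
        # "order by seq", then sum from the first row up to the current row
        for id_, seq, total in sorted(partitions[key], key=lambda r: r[1]):
            accumulated += total
            result.append((id_, seq, total, accumulated))
    return result


sample = [(1, 1, 10), (1, 2, 20), (2, 1, 5), (1, 3, 30), (2, 2, 15)]
for row in running_total_plain(sample):
    print(row)
```

On the frame: when a window has an ordering but no explicit frame, Spark defaults to a range frame from unboundedPreceding to currentRow. That matches the `rowsBetween` version only when the ordering column has no ties within a partition, since a range frame sums tied rows together.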

– – – Let’s Connect: – – –

Instagram: mrk_talkstech

– – – About me: – – –

Mr. K is a passionate teacher who created this channel with a single goal: "TO HELP PEOPLE LEARN ABOUT THE MODERN DATA PLATFORM SOLUTIONS USING CLOUD TECHNOLOGIES"

I will be creating playlists covering the topics below (with demos):

1. Azure Beginner Tutorials
2. Azure Data Factory
3. Azure Synapse Analytics
4. Azure Databricks
5. Microsoft Power BI
6. Azure Data Lake Gen2
7. Azure DevOps
8. GitHub (and several other topics)

After creating some foundational videos, I will create videos covering real-world scenarios and use cases specific to three common data roles:

1. Data Engineer
2. Data Analyst
3. Data Scientist

Can't wait to help people with my videos.

Comments

Even if we don't add rowsBetween, it works the same way, right? I mean, it's the default, right?

DheerajMaddula

Waiting on your full data engineer tutorial video, Mr KT. Some months ago you said you were working on something; I hope you are still working on it. I really look forward to that video, as it will help me a lot.


Thank you for this too.

billionairemindset

Sir, please upload scenario-based questions for ADF, Key Vault, etc.
They're asked in interviews.

moyeenshaikh

Very nice explanation. Very good.
Waiting on your full data engineer tutorial video.

Abhinavkumar-ktgj

Your explanation skills are too good ❤️ Hoping for more videos on topics such as projects, Airflow, dbt, Snowflake :)

jaypandya

Great video, brother! Looking forward to the upcoming videos. Thanks for the effort 🎉

nanthagopalm

Thank you for the dedication you put into your videos, man. They are very helpful for hands-on projects and learning.

emil

Don't we also need to add an order by col(total) in the window spec? That would make the code deterministic.

reachrishav

Good one. Waiting for your next project video :)

sharaniyaswaminathan

unboundedPreceding to currentRow is the default, right?

rahulmittal

@mr.ktalkstech, please post videos on a regular basis; your subject is awesome.

manibaddireddy