Master Databricks and Apache Spark Step by Step: Lesson 22 - PySpark Using SQL

In this video, we use PySpark to load Spark dataframes from queries and perform data analysis at scale. You'll learn why using SQL with Python is so important and how it jump-starts your productivity on Databricks.

Video demo notebook at:

Apache Spark SQL Docs

For information on how to upload files to Databricks see:
Comments

Thank you Bryan for the series of videos on Databricks and Spark. I like the way you elaborate on and explain the concepts, which makes them easy to understand for beginners like me trying to get into data engineering.
Thanks again, keep up the good work.

tsri

Extremely useful teaching approach and content, thank you so much! I've found lessons 22 and 23 to be especially relevant at this stage, but I listened to all of the preceding videos, which filled in a lot of holes I had in my understanding. Great stuff!

loisf

Thanks for the series of videos. The best of its kind that can be found on YouTube.

harryzhang

You are the best. We were eagerly waiting for this and are looking forward to more. Thanks 😀

hmishra

Bryan, that "Databricks is smart about the query for visualization" part is not clear to me. Can you please explain what that means?

Raaj_ML

Hello Bryan, thank you for your video again. Could you advise what the differences are between Spark SQL and using SQL with PySpark? To me, it seems they both can work with a Spark dataframe using SQL clauses. Is the only difference that Spark SQL is the Spark-native runtime, while PySpark interacts with Spark Core via the DataFrame API? It would be much appreciated if you could clarify this.

JasonZhang-sejo

Hi Bryan - thanks for the video. I don't really understand why one would want to use PySpark SQL vs. just using Spark SQL. Are there use cases where it makes more sense? It seems like it would be significantly easier to just write and run the Spark SQL code in a very intuitive and familiar way, and then convert the result to a dataframe. Am I missing something?

eugenezhelezniak

The section on the Spark DataFrame writer is not clear.

boxiangwang

Just wondering if I am supposed to know Pandas before embarking on this? I don't recall a prior lesson on Pandas, but Bryan, you make references to Pandas on more than one occasion!

anandmahadevanFromTrivandrum

What is the equivalent of the EXISTS and WITH clauses in Spark SQL?

rydmerlin