PySpark Tutorial 17: PySpark Correlation Analysis | PySpark with Python

About this video: In this video, you will learn how to perform correlation analysis in PySpark.

#llamaindex #openai #llm #ai #huggingface #api #genai #generativeai #statswire #spark #pyspark #python #pythonprogramming #pythontutorial
Comments

How do you create a csv file of your final data frame correlation results? Excellent video by the way, great job!

kike
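
One possible way to export the correlation results to CSV, as a minimal sketch rather than the method shown in the video. It assumes the matrix comes from `Correlation.corr` on a vector column built with `VectorAssembler`; the function names `correlation_matrix` and `write_correlation_csv` and the `features` column name are illustrative, not from the video.

```python
import csv


def correlation_matrix(df, columns):
    """Return the Pearson correlation matrix of numeric `columns` as nested lists."""
    # Imported here so the CSV helper below can be used without Spark installed.
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.stat import Correlation

    # "features" is an assumed name for the assembled vector column.
    assembler = VectorAssembler(inputCols=columns, outputCol="features")
    df_vector = assembler.transform(df).select("features")
    # Correlation.corr returns a one-row DataFrame holding a dense matrix.
    matrix = Correlation.corr(df_vector, "features").collect()[0][0]
    return matrix.toArray().tolist()


def write_correlation_csv(matrix, columns, path):
    """Write a square matrix to CSV with column names as header and row labels."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([""] + list(columns))
        for name, row in zip(columns, matrix):
            writer.writerow([name] + list(row))
```

With a Spark DataFrame `df`, usage would look like `write_correlation_csv(correlation_matrix(df, cols), cols, "correlation.csv")`, where `cols` is your own list of numeric column names. The matrix is small (one row per column), so collecting it to the driver and writing it with the standard `csv` module is usually fine.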

I followed your video and it worked well, but I am trying to create a heatmap instead of the table/data frame.
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(20, 18))
#corr = Tablematrix[cor_col]
ax = sns.heatmap(dataset.correlation.corr(),
                 cmap='viridis', vmax=1.0, vmin=-1.0, linewidths=1,
                 annot=False, annot_kws={"size": 1}, square=True)
# set_xticklabels is an Axes method, so call it on the Axes returned by sns.heatmap
ax.set_xticklabels(ax.get_xticklabels(), rotation=30)
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
plt.show()

How can I do this without Pandas?

tutorb
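
A heatmap can be drawn without Pandas by passing the correlation matrix straight to matplotlib. The sketch below assumes the matrix is already available as nested lists or a NumPy array (for example from `Correlation.corr(...).collect()[0][0].toArray()`); the function name `plot_correlation_heatmap` is hypothetical, and `imshow` stands in for `sns.heatmap` because seaborn itself depends on Pandas.

```python
import matplotlib.pyplot as plt


def plot_correlation_heatmap(matrix, columns, figsize=(8, 7)):
    """Draw a correlation matrix as a heatmap and return the Figure."""
    fig, ax = plt.subplots(figsize=figsize)
    # Fix the colour scale to the full correlation range [-1, 1].
    image = ax.imshow(matrix, cmap="viridis", vmin=-1.0, vmax=1.0)
    fig.colorbar(image, ax=ax)
    # Label every row and column with its column name.
    positions = range(len(columns))
    ax.set_xticks(positions)
    ax.set_xticklabels(columns, rotation=30)
    ax.set_yticks(positions)
    ax.set_yticklabels(columns)
    return fig
```

Call `plt.show()` after the function in a notebook, or `fig.savefig("heatmap.png")` to write the image to a file.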

Could you please send me the link to the house.csv dataset?

mahdisaid

Hi Amir, in Colab the following line gives an error:
matrix = Correlation.corr(df_vector, vector_col).collect()[0][0]
I used your house_price.csv dataset.
Everything up to that point executed properly.

pritishbanerjee
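
The comment does not include the error message, so the actual cause is unknown. Two things that often make this call fail are non-numeric columns and null values in the assembled vector, and the sketch below guards against both. This is an assumption about the problem, not a confirmed fix; `numeric_columns` and `assemble_numeric` are hypothetical helper names.

```python
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double"}


def numeric_columns(dtypes):
    """Pick column names whose Spark type string is numeric.

    `dtypes` is a list of (name, type) pairs, as returned by `DataFrame.dtypes`.
    """
    return [
        name
        for name, dtype in dtypes
        if dtype in NUMERIC_TYPES or dtype.startswith("decimal")
    ]


def assemble_numeric(df, output_col="features"):
    """Assemble all numeric columns into one vector column, skipping invalid rows."""
    # Imported here so numeric_columns can be used without Spark installed.
    from pyspark.ml.feature import VectorAssembler

    assembler = VectorAssembler(
        inputCols=numeric_columns(df.dtypes),
        outputCol=output_col,
        handleInvalid="skip",  # drop rows with null/NaN instead of raising
    )
    return assembler.transform(df).select(output_col)
```

The result of `assemble_numeric(df)` can then be passed to `Correlation.corr(df_vector, "features")`. If the error persists, the full traceback is needed to narrow it down.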

Bro, why are 9 unavailable videos hidden?

kashyaprathore