Saving PySpark DataFrame as One .CSV File | Big Data

In PySpark, the DataFrame write method doesn't save the DataFrame as a single CSV file; it writes a directory of part files, typically one per partition. In this video I talk about several ways of solving this problem and compare them in terms of memory use and speed.
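The description does not list the individual approaches, so the following is only a sketch of one commonly used option, not necessarily one shown in the video: coalesce the DataFrame to a single partition, let Spark write its output directory, then move the lone part file to the desired name. The helper name save_as_single_csv and the tmp_dir default are hypothetical, and the sketch assumes the output lands on a local filesystem and that the data fits on a single executor.

```python
import glob
import os
import shutil


def save_as_single_csv(df, out_file, tmp_dir="_tmp_single_csv"):
    """Write a PySpark DataFrame to exactly one CSV file (hypothetical helper)."""
    # coalesce(1) merges all partitions into one, so Spark emits one part file.
    # Assumption: the whole dataset fits on one executor; parallelism is lost.
    df.coalesce(1).write.mode("overwrite").option("header", True).csv(tmp_dir)

    # Spark still writes a directory; the rows live in the part-*.csv inside it.
    part_files = glob.glob(os.path.join(tmp_dir, "part-*.csv"))
    if len(part_files) != 1:
        raise RuntimeError(f"expected one part file, found {len(part_files)}")

    shutil.move(part_files[0], out_file)
    shutil.rmtree(tmp_dir)  # remove the leftover _SUCCESS and checksum files
    return out_file
```

With an existing DataFrame df, a call might look like save_as_single_csv(df, "data.csv"). Because every row passes through one task, this is simple but can be slow or memory-hungry on large data, which is the kind of trade-off the video sets out to compare.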
Comments

You have a good TV voice. You should be narrating a science documentary or something.

kernelab

very straightforward, thanks for sharing!

theunwaveringkeynote

This trick is a lifesaver :) Thanks a lot!!!

hoangng

In my opinion the first cat command isn't needed either, just do: cat ./cleaned_data/*.csv > data.csv

kernelab
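One caveat with a plain cat over the part files: it concatenates them byte for byte, so if each part was written with a header row, that header repeats once per part in data.csv. A minimal Python sketch of the same merge that keeps only the first header follows; the function name merge_csv_parts is hypothetical, and it assumes the header occupies exactly one line with no embedded newlines.

```python
import glob


def merge_csv_parts(pattern, out_file, has_header=True):
    """Concatenate CSV part files into one file, keeping a single header."""
    # The file list is built before out_file is opened; sorting keeps
    # part-00000 ahead of part-00001.
    paths = sorted(glob.glob(pattern))
    header_written = False
    with open(out_file, "w", newline="") as out:
        for path in paths:
            with open(path, newline="") as part:
                if has_header:
                    header = part.readline()  # empty string for an empty part
                    if header and not header_written:
                        out.write(header)
                        header_written = True
                out.writelines(part)  # copy the remaining data rows unchanged
    return len(paths)
```

For the directory named in the comment, the call would be merge_csv_parts("./cleaned_data/*.csv", "data.csv"). If the parts were written without headers, pass has_header=False and the result matches what cat produces.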