Why and How to Use Dask (Python API) for Large Datasets?

This video shows how to import large datasets in Python using the Dask DataFrame, which can be faster than the pandas DataFrame.
Dask is open source and freely available. It is developed in coordination with other community projects like NumPy, pandas, and scikit-learn. Dask provides multi-core execution on larger-than-memory datasets.
Dask supports the pandas DataFrame and NumPy array data structures and can either run on your local computer or scale up to run on a cluster.
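A minimal sketch of that workflow, assuming a hypothetical CSV file ("large_dataset.csv") with "category" and "value" columns; these names are placeholders and do not come from the video:

import dask.dataframe as dd

# Lazily read the CSV; Dask splits it into partitions instead of
# loading the whole file into memory the way pandas.read_csv does.
df = dd.read_csv("large_dataset.csv")

# Operations build a lazy task graph; nothing is computed yet.
mean_by_group = df.groupby("category")["value"].mean()

# .compute() executes the graph across the available CPU cores
# and returns an ordinary pandas object.
result = mean_by_group.compute()
print(result)

Because the same dask.dataframe code can also run on a distributed scheduler, a script like this does not need to change much when scaled from a laptop to a cluster.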
Comments
Thank you so much Nisha for such an informative video. I've been thinking of shifting to Dask because our dataset has around 12 lakh (1.2 million) records and it is literally taking days to process. I don't know if that is normal with pandas or if we are doing something wrong.

— ahmadjamalmughal