Big Data with Small Computers: Building a Hadoop Cluster with Raspberry Pis

Alexandria Kalika

1. Details of the hardware setup: the Raspberry Pis and the network configuration needed to create a functional cluster.
2. Installing Hadoop, YARN, HDFS, Spark, and other software.
3. Different datasets and how to use your newly built cluster to analyze data.
4. Using powerful Spark technologies to quickly analyze datasets.
5. Overview of open source technologies in creating a personal, powerful data cluster.

The Hadoop ecosystem has produced a wide array of tools and technologies that make processing large amounts of data easier and more fun. In this talk I will go through how to use Raspberry Pi 2s to create a distributed cluster worthy of interesting data analysis. I will use Apache Spark and other open-source, easy-to-obtain software and hardware to extract insights from data.
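As a rough illustration of the kind of analysis such a cluster distributes, here is a minimal single-process sketch of the map, shuffle, and reduce pattern behind a word count. It is not taken from the talk: the function names and the sample data are hypothetical, and on a real cluster Hadoop or Spark would run the map step on each node and move the grouped data over the network.

```python
from collections import defaultdict
from itertools import chain


def map_phase(partition):
    """Emit a (word, 1) pair for every word in one partition of lines."""
    return [(word.lower(), 1) for line in partition for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(partitions):
    """Run map on each partition, then shuffle and reduce the combined output."""
    mapped = chain.from_iterable(map_phase(p) for p in partitions)
    return reduce_phase(shuffle(mapped))


# Hypothetical sample data: two partitions standing in for blocks on two nodes.
partitions = [
    ["big data small computers", "small cluster"],
    ["big cluster of small computers"],
]
counts = word_count(partitions)
print(counts)
```

Each partition is mapped independently, which is the property that lets a framework spread the work across several small machines.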

===

A FREE annual conference for anyone interested in Python in and around Ohio, the entire Midwest, maybe even the whole world.

Sat Jul 27 10:30:00 2019 at Barbie Tootle
Comments

I got it all working great on a Pi 4 with Java 11, Python 3.7.3, Hadoop 3.2.2, and Spark 3.0.1. I boot from SSD and it smokes for a little 5-node cluster plus edge node. Doing data science and experimenting.

knjpollard

Thanks. Great inputs with concise presentation.

jesska.parker