Tutorial - Hadoop Streaming using Python scripts on an AWS EC2 Cluster

This video tutorial shows how to run a Hadoop Streaming job using a mapper and a reducer written in Python. The setup consists of:
1. A 3-node cluster running Hadoop 3.2.2 on Ubuntu 20.04
2. Python 3 installed

The sample code is the classic Word Count example, using the Shakespeare dataset.
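
For readers who want a feel for the code before watching, here is a minimal sketch of a Word Count mapper and reducer in the Hadoop Streaming style. It is not the script used in the video: the file name `wordcount.py`, the function names, the `map`/`reduce` command-line argument, and the choice to lowercase words and split on whitespace are all assumptions made for illustration. It relies only on the Streaming convention that mappers and reducers read lines from standard input and write tab-separated key/value lines to standard output, with the reducer receiving its input sorted by key.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper step: emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        # Assumption: words are lowercased and split on whitespace only.
        for word in line.strip().lower().split():
            yield word, 1


def parse_mapper_output(lines):
    """Turn 'word<TAB>count' lines back into (word, int) pairs."""
    for line in lines:
        word, _, count = line.rstrip("\n").partition("\t")
        if word and count:
            yield word, int(count)


def reduce_pairs(pairs):
    """Reducer step: sum the counts per word. Pairs must arrive sorted by word."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


# Hypothetical entry point: run as `wordcount.py map` or `wordcount.py reduce`.
if __name__ == "__main__" and len(sys.argv) > 1:
    if sys.argv[1] == "map":
        results = map_lines(sys.stdin)
    else:
        results = reduce_pairs(parse_mapper_output(sys.stdin))
    for word, count in results:
        print(f"{word}\t{count}")
```

A sketch like this can be checked locally without a cluster by piping a text file through `python3 wordcount.py map | sort | python3 wordcount.py reduce`, where `sort` stands in for the shuffle phase. On the cluster, the same two commands would be supplied through the `-mapper` and `-reducer` options of Hadoop Streaming.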

Enjoy the video and happy learning!