Why Spark is Faster Than Hadoop MapReduce


In this video I talk about Apache Spark's in-memory processing, which is why Spark is so much faster than MapReduce and other analytics frameworks. It's simple but awesome for both stream processing and batch processing, so I first explain what stream and batch processing are.

►Learn Data Engineering with my Data Engineering Academy:

Check out my free 100+ page Data Engineering Cookbook on GitHub:

Please SUPPORT WHAT YOU LIKE:

- As an Amazon Associate I earn from qualifying purchases from Amazon. Just use this link:

#ApacheSpark #DataEngineering #PlumbersofDataScience #bigdata

Related videos

Comments

In short: MR materializes intermediate state, whereas dataflow engines like Spark do not; they operate in memory. Another important point is that sorting is implicit in MR, so mappers always sort their output. That is not the case with Spark, which sorts only when it is needed.

qwaszx
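
The two points in the comment above (materialized intermediate state and implicit sorting) can be illustrated with a toy word count. This is only a sketch in plain Python, not Hadoop or Spark code: the function names `mapreduce_style` and `dataflow_style`, the sample lines, and the `map_output.json` file name are all made up for illustration, and real Spark can still write shuffle data or spill to disk when it has to.

```python
import json
import os
import tempfile
from collections import Counter
from itertools import groupby

# Hypothetical sample input, invented for this sketch.
LINES = ["spark is fast", "mapreduce is batch", "spark is in memory"]


def mapreduce_style(lines, workdir):
    """Two-stage word count that writes sorted map output to disk."""
    # Map phase: emit (word, 1) pairs and sort them by key before writing,
    # imitating the implicit sort of map output.
    pairs = sorted((word, 1) for line in lines for word in line.split())
    spill_path = os.path.join(workdir, "map_output.json")
    with open(spill_path, "w") as f:
        json.dump(pairs, f)  # intermediate state is materialized on disk

    # Reduce phase: read the materialized file back and sum counts per key.
    with open(spill_path) as f:
        loaded = json.load(f)
    return {
        key: sum(count for _, count in group)
        for key, group in groupby(loaded, key=lambda kv: kv[0])
    }


def dataflow_style(lines):
    """Same word count as one chained in-memory pipeline, with no sort."""
    words = (word for line in lines for word in line.split())  # lazy generator
    return dict(Counter(words))  # hash aggregation, no sorting needed


with tempfile.TemporaryDirectory() as workdir:
    mr_result = mapreduce_style(LINES, workdir)
    files_written = os.listdir(workdir)  # what the first approach left on disk

df_result = dataflow_style(LINES)
print(mr_result == df_result, files_written)
```

Both functions return the same counts; the difference is that the first one pays for a sort and a round trip through a file between its two phases, while the second keeps everything in one in-memory pipeline.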

I still don't quite understand the picture at 6:54. If the only difference between Spark and MapReduce is that Spark stores data in RAM and MapReduce uses the hard disk, how come the MapReduce developers didn't think of it (everyone knows RAM is much faster than a hard disk)? And it seems one can't say Spark is really any different from MapReduce if RAM is the only difference (just like running the same quicksort algorithm on a slow and a fast computer: it is still the same algorithm).

leecharlie

Thanks for the video! It's really helping me to get the concept.

fajarnadril

Thank you! Could have been a 4-minute video, too!

onewithsixonewithsix

A good 7 minutes totally wasted. I wish YouTube would bring back dislike stats.

louuuuuu

Should be a 2-minute video for this content.

shirsendubasu

Why Spark is Faster Than Hadoop MapReduce


1.1 Why Spark is Faster Than Hadoop | hadoop Vs spark |Hadoop Interview questions

Hadoop vs Spark | Map Reduce vs Spark | Interview Question

Spark Architecture Part 1: Spark Vs Hadoop MR spark vs mapreduce,spark vs hadoop #bigdata #pyspark

Hadoop vs Spark | Hadoop MapReduce vs Spark | Difference Between Spark & Hadoop | Intellipaat

SPARK VS HADOOP | BEST REVIEW YOU CAN FIND

Learn Apache Spark in 10 Minutes | Step by Step Guide

What Is Apache Spark?

Which is best ? | Spark vs Pandas

Hadoop vs Spark | Lec-3 | In depth explanation

Spark Dataframes vs SparkSQL

95% reduction in Apache Spark processing time with correct usage of repartition() function

Performance of Spark vs MapReduce | Spark Tutorial | Big data tools comparison | Edureka

Features Of Apache Spark | Learnomate technologies

Why Apache Flink is better than Spark by Rubén Casado

Why Spark part 1? what was before spark? #whyspark #spark #bigdata #shorts #pyspark #sparksql

Easy Hack To Get More Orders With Walmart Spark!

Why Should you Learn Spark | Intellipaat

Qubole Sparklens: understanding the scalability limits of Spark applications - Rohit Karlupia

Difference Between Spark and Hadoop Map Reduce #whyspark #spark #bigdata #shorts #pyspark #sparksql

Enabling Vectorized Engine in Apache Spark

Apache Spark Core—Deep Dive—Proper Optimization Daniel Tomes Databricks

Spark's Performance: The Past, Present, and Future (Sameer Agarwal)

The Perfect Match: Apache Spark Meets Swift