WTF is MapReduce?? [Batch Processing] | Systems Design Interview 0 to 1 with Ex-Google SWE


I'd post this to Blind but I think they're more in need of FapReduce instead of MapReduce

Jordan has no life


Comments

I like the humor you incorporate into these informative videos, def subbing 🔥

fallencheeto

Do not believe this man's lies. He did not choose to leave Google for HFT; he was fired for taking an excessive amount of Fairlife milks from the cafeteria, after multiple warnings. It was determined that the net loss from his excessive milk consumption exceeded his positive contributions to the company.

brendansullivan

Also, if I am understanding correctly, the reason we want sorted keys is that when we shuffle the keys to their respective nodes, we can perform that O(n) merge join on multiple lists of key-value pairs, correct?

So, for example, we take all (k3, v) pairs from node 1 and all (k9, v) pairs from node 3. Their hashes determine that they will all be sent to node 3, where a merge join will occur for those two lists. Is this accurate?

Great video!

timothyh
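To make the question above concrete, here is a minimal sketch of the merge step on the receiving node. It assumes each incoming list is already sorted by key, so one linear pass can interleave them and group the values per key. The function name `merge_sorted_runs` and the sample data are hypothetical, and this is an illustration rather than how any particular framework implements it.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sorted_runs(*runs):
    """Merge lists of (key, value) pairs that are each sorted by key,
    then collect the values for every key into one list."""
    # heapq.merge reads the runs lazily and never re-sorts them.
    merged = heapq.merge(*runs, key=itemgetter(0))
    return [
        (key, [value for _, value in group])
        for key, group in groupby(merged, key=itemgetter(0))
    ]


# Hypothetical sorted runs arriving at node 3 from two other nodes.
run_from_node_1 = [("k3", 1), ("k3", 4), ("k9", 2)]
run_from_node_3 = [("k3", 7), ("k9", 5), ("k9", 6)]

grouped = merge_sorted_runs(run_from_node_1, run_from_node_3)
print(grouped)
```

Because the runs arrive sorted, the receiving node never needs to hold or sort the full data set at once; it only compares the current head of each run.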

At 3:20, the signature should be Reducer: (key, List[value]) -> (key, value)

John-nhoJ
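The signature in the comment above can be illustrated with a small in-memory word count. This is only a sketch under simple assumptions: `mapper`, `reducer`, and `run_job` are hypothetical names, everything runs in one process, and a dictionary stands in for the shuffle that a real cluster would perform.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def mapper(line: str) -> Iterator[tuple[str, int]]:
    """Mapper: one input record -> zero or more (key, value) pairs."""
    for word in line.split():
        yield (word.lower(), 1)


def reducer(key: str, values: list[int]) -> tuple[str, int]:
    """Reducer: (key, List[value]) -> (key, value)."""
    return (key, sum(values))


def run_job(lines: Iterable[str]) -> list[tuple[str, int]]:
    # Group every mapped value under its key (stand-in for the shuffle).
    grouped: dict[str, list[int]] = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)
    # Call the reducer once per key, in sorted key order.
    return [reducer(key, grouped[key]) for key in sorted(grouped)]


result = run_job(["the cat", "The dog the end"])
print(result)
```

The point of the type hints is that the reducer receives all values for one key at once, which is why the framework has to group by key before calling it.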

In the MapReduce architecture, should the shuffle step happen before the sort? I get the sense that the shuffle basically groups the same key (say k3) from different nodes onto a single node (node 3 in this case). Then, by sorting the shuffled keys, we take advantage of the reducer mechanism you demonstrated.

PavanBommana-

I found it a bit unclear how keys go from sort to shuffle. Do keys get redistributed first and then sorted locally? Or do keys get globally sorted first (e.g., with an n-way merge sort) and then redistributed based on key ranges? I think the first flow sounds more reasonable, but it conflicts with the merge join mentioned in the video, because in that case a merge join wouldn't be necessary.

zhonglin
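One ordering that is consistent with the merge step discussed in these comments is: each mapper node splits its output by target reducer and sorts each bucket locally, the shuffle moves bucket r to reducer r, and the reducer merges the sorted runs it receives. The sketch below simulates that flow in one process. It is an assumption-laden illustration: `NUM_REDUCERS`, `partition`, `map_side`, and `reduce_side` are hypothetical names, and `zlib.crc32` is used only to get a hash that is stable between runs.

```python
import heapq
import zlib
from itertools import groupby
from operator import itemgetter

NUM_REDUCERS = 2  # hypothetical cluster size


def partition(key: str) -> int:
    """Pick the reducer for a key with a deterministic hash."""
    return zlib.crc32(key.encode()) % NUM_REDUCERS


def map_side(pairs):
    """Split one mapper's output by target reducer, then sort each bucket."""
    buckets = [[] for _ in range(NUM_REDUCERS)]
    for key, value in pairs:
        buckets[partition(key)].append((key, value))
    return [sorted(bucket, key=itemgetter(0)) for bucket in buckets]


def reduce_side(runs):
    """Merge the sorted runs received by one reducer and group by key."""
    merged = heapq.merge(*runs, key=itemgetter(0))
    return [(k, [v for _, v in g]) for k, g in groupby(merged, key=itemgetter(0))]


# Hypothetical unsorted output of three mapper nodes.
mapper_outputs = [
    [("k9", 1), ("k3", 2), ("k1", 3)],
    [("k3", 4), ("k1", 5)],
    [("k9", 6), ("k3", 7)],
]

sorted_buckets = [map_side(pairs) for pairs in mapper_outputs]
# "Shuffle": reducer r collects bucket r from every mapper node.
reducer_inputs = [
    reduce_side([node_buckets[r] for node_buckets in sorted_buckets])
    for r in range(NUM_REDUCERS)
]
print(reducer_inputs)
```

Under this ordering there is no global sort. The local sort happens before the shuffle, and the merge is still needed afterwards because each reducer receives several sorted runs, one per mapper node.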

Are nodes 1, 2, and 3 supposed to be replicas of each other? Or simply 3 nodes storing three different data files?

timothyh

Hey, loving the videos so far. Just one question: at 4:12, if these are nodes of a Hadoop cluster, aren't they supposed to be replicas and hence hold roughly the same data?
How do they begin with different raw data on each node?
Are these some sort of partitions?
I googled and, on the surface, found that there isn't a partition system in Hadoop as such.

varundubey
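On the replica-versus-partition question raised in the two comments above: HDFS stores a file as a sequence of blocks and replicates each block on several nodes, so a node typically holds copies of some blocks rather than the whole file. The toy placement below illustrates why different nodes can start with different raw data. It is a deliberate simplification: `place_blocks` is a hypothetical helper, the round-robin rule is not HDFS's real rack-aware policy, and the replication factor of 2 is chosen only to keep the example small (the HDFS default is 3).

```python
def place_blocks(num_blocks, nodes, replication):
    """Assign each block to `replication` distinct nodes, round-robin.

    Toy rule for illustration only, not the actual HDFS placement policy.
    """
    placement = {}
    for block in range(num_blocks):
        placement[block] = [
            nodes[(block + i) % len(nodes)] for i in range(replication)
        ]
    return placement


nodes = ["node1", "node2", "node3"]
placement = place_blocks(num_blocks=3, nodes=nodes, replication=2)

# Invert the mapping to see which blocks each node stores.
blocks_on_node = {
    node: sorted(b for b, holders in placement.items() if node in holders)
    for node in nodes
}
print(blocks_on_node)
```

Every block has two copies, yet no node holds all three blocks, so each node's local input differs. A map task is usually scheduled on a node that already holds a copy of its input block.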
