2.5 Transformations Vs Actions | Spark Interview Questions

As part of our Spark interview question series, we want to help you prepare for your Spark interviews. We will discuss various Spark topics such as lineage, reduceByKey vs groupByKey, YARN client mode vs YARN cluster mode, etc.
Please subscribe to our channel.
Here is a link to other Spark interview questions

Here is a link to other Hadoop interview questions
Comments
Author

To summarize the answer to this question: `Transformations` are functions that take an RDD as input and produce one or more RDDs as output. Examples are `map()` and `filter()`. Note that when a `transformation`, or a series of `transformations`, is called, the result is not produced immediately. Transformations are lazily evaluated: instead of producing new RDDs right away, Spark builds a DAG (Directed Acyclic Graph) from the input RDD and the functions called. This graph keeps growing until an `Action` such as `collect()` or `take()` is triggered.
An `Action` does not produce a new RDD the way a transformation does. It produces a non-RDD result, which can be returned to the driver or written to an external system. Calling an action is what sets Spark's lazy evaluation in motion. An `Action` is also one of the ways to send data from the `Executor` to the `Driver`.
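
To make the lazy part concrete, here is a rough sketch in plain Python. It is only a toy model and not Spark's API: the `ToyRDD` class, its `_lineage` field, and the `double` helper are all made up for illustration. Transformations only record a step and return a new `ToyRDD`; the action replays the recorded steps and returns an ordinary list.

```python
from typing import Any, Callable, Iterable, List, Tuple


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, NOT Spark's real API)."""

    def __init__(self, source: Iterable[Any], lineage: Tuple = ()):
        self._source = list(source)
        self._lineage = lineage  # recorded steps; nothing has been executed

    # Transformations: record the step and return a new ToyRDD.
    def map(self, fn: Callable[[Any], Any]) -> "ToyRDD":
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "ToyRDD":
        return ToyRDD(self._source, self._lineage + (("filter", pred),))

    # Action: replay the recorded steps and return a plain list, not a ToyRDD.
    def collect(self) -> List[Any]:
        data = iter(self._source)
        for kind, fn in self._lineage:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


calls = []  # remembers every element that double() actually processed


def double(x: int) -> int:
    calls.append(x)
    return x * 2


rdd = ToyRDD([1, 2, 3, 4])
doubled = rdd.map(double).filter(lambda x: x > 4)  # transformations only
calls_before_action = len(calls)  # still 0, double() has not run yet
result = doubled.collect()        # the action triggers the whole chain
print(calls_before_action, result)
```

In this sketch `double()` runs only when `collect()` is called, which mirrors how Spark defers work until an action. Real Spark additionally plans stages and distributes the work across executors, which this toy does not attempt.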

ankurranjan
Author

Transformation:
1. Transformations are operations on RDDs or DataFrames that create a new RDD or DataFrame from an existing one.
2. They are lazily evaluated, meaning execution is deferred until an action is called.
3. They return no computed result to the driver, only a new RDD or DataFrame.

Action:
1. Actions are operations that return a value to the driver program.

pandurangbhadange
Author

Please make a video on cluster size and daily data size to process.

swapnilnavkar
Author

Please make a video on cluster size and daily data size to process.

subhashreebehera
Author

Before taking on a topic, please list the key points on a slide for the explanation; it will also be useful for viewers taking notes. I see something lacking here: you missed saying that the output of an action will not be an RDD, and that it may be a file, a string, or a collection. If you prepare the major points and give the video a structure, the tutorial would be more informative. After all, we have to learn the skill instead of just answering the interview questions.

rajinisharma
Author

Please upload the machine learning tutorial....

rameshthamizhselvan
Author

Please define wide and narrow transformations in brief.

varultyagi
Author

What is the difference between save and write?

shivarajuyalagala
Author

I could not understand everything you are saying.

veeresh