Distributed cache | Hadoop Interview questions

As part of our Hadoop interview question series, we intend to share and guide the community on the kinds of questions generally asked in Hadoop interviews. We intend to cover topics like split size, block size, the distributed cache, etc.
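As a quick illustration of the split-size topic, the number of input splits for a single file can be estimated from the file size and the configured block size. This is a minimal sketch; the real InputFormat logic also honors min/max split-size settings and a small slop factor, which are omitted here.

```python
import math

def num_splits(file_size_bytes, block_size_bytes=128 * 1024 * 1024):
    """Rough estimate of input splits for one file: by default,
    the split size equals the HDFS block size (128 MB)."""
    if file_size_bytes == 0:
        return 0
    return math.ceil(file_size_bytes / block_size_bytes)

# A 300 MB file with the default 128 MB block size spans 3 splits.
print(num_splits(300 * 1024 * 1024))  # 3
```

So a file slightly larger than one block still gets a second (small) split, which is why split count, not file count, drives the number of map tasks.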

Please check out our other questions on Hadoop.

Spark Interview questions
Comments

Please add a video on optimization in Spark, and on how to monitor performance in the Spark UI.

prajaktadange

When you do a spark-submit, Spark takes the jar file (if Java), the hive.xml file, properties files, etc., and uploads all of them into Hadoop's distributed cache. It then uses YARN to provision and create the containers for each executor plus the driver, uploads the jar file into the driver program, which runs the program, and eventually initializes the SparkContext and commands the data nodes to perform transformations. Did I get that correct?
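The submission flow described in this comment corresponds roughly to a spark-submit invocation in YARN cluster mode: the application jar and anything passed via --files or --jars is staged in HDFS and shipped to each container through YARN's distributed cache before the driver and executors start. The sketch below is a hypothetical invocation; the class name, jar names, and file names are made up for illustration.

```shell
# Hypothetical example: submit a Spark application to YARN in
# cluster mode. The application jar plus the --files and --jars
# arguments are uploaded to an HDFS staging directory and
# distributed to every container via YARN's distributed cache.
# com.example.MyApp, extra-lib.jar, and my-app.jar are placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.MyApp \
  --files hive-site.xml,app.properties \
  --jars extra-lib.jar \
  my-app.jar
```

This fragment requires a running YARN cluster, so it is shown as a configuration sketch rather than a runnable command.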

mmm-iews

Again, another nice video. Please make some videos on Spark Streaming and Kafka if possible.

bhargavhr

Could you explain the rank() and dense_rank() functions in Hive?

ampolusantosh

I think it is stored in the cache of the data node, so it's called the distributed cache. Please correct me if I am wrong.

ajaypratap