[Tech Talk] Enhancing Apache Spark for robust data processing

In this session, we present our research on Apache Spark, an open-source distributed framework for large-scale data analytics and AI workloads. Our goal is to achieve faster execution and better fault tolerance in such applications by improving the system's memory management. To that end, we investigated a chronic memory issue in Spark and developed an advanced memory management scheme to address it. Specifically, we will introduce the lineage-checkpoint approach, our solution to the long-lineage problem in Spark.

#Samsung, #SDC21, #DataProcessing