Efficient Multiple Input handling using R - Marco Mingione

In many real-world data applications, there is a need to process and analyze input data that is originally distributed across multiple (possibly huge) files. The natural solution would be to read these files sequentially and merge them into a single data object for later analysis. However, such an approach is inefficient, as its execution time grows linearly with the total size of the input. We exploit the distributed nature of TeraStat2 by developing a procedure in R that reads the content of several data files in parallel using multiple processors. Talk given at the 1st Workshop on Supercomputing, organized by the Department of Statistical Sciences of Sapienza University of Rome.
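The general idea of reading several files on multiple processors and then merging them can be illustrated with a minimal sketch. This is not the procedure presented in the talk, and it does not reproduce any TeraStat2-specific setup. It assumes only base R's `parallel` package on a single machine; the helper names `read_one` and `read_files_parallel`, the worker count, and the toy CSV files are all hypothetical.

```r
library(parallel)

# Hypothetical reader for a single file (an assumption: inputs are plain CSV).
read_one <- function(path) {
  utils::read.csv(path, stringsAsFactors = FALSE)
}

# Hypothetical helper: read each file on a pool of worker processes,
# then stack the resulting data frames into a single object.
read_files_parallel <- function(paths, n_workers = 2L) {
  cl <- makeCluster(n_workers)          # start local worker processes
  on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
  pieces <- parLapply(cl, paths, read_one)
  do.call(rbind, pieces)
}

# Toy input: three small CSV files written to a temporary directory.
tmp_dir <- tempfile("inputs_")
dir.create(tmp_dir)
for (i in 1:3) {
  write.csv(
    data.frame(file_id = i, value = seq_len(4) * i),
    file.path(tmp_dir, sprintf("part_%d.csv", i)),
    row.names = FALSE
  )
}

paths <- sort(list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE))
merged <- read_files_parallel(paths, n_workers = 2L)
```

With this layout each worker reads its share of the files independently, so with enough workers the reading time is governed by the files assigned to the busiest worker rather than by the total input size. The final `rbind` still runs on the calling process, so the merge step itself remains sequential.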