Using multiple GPUs for Machine Learning

In this seminar, we will demonstrate how to run machine learning code (TensorFlow and PyTorch) on Compute Canada systems using multiple GPUs. We will consider two cases: multiple GPUs inside a single node, and GPUs spread across multiple nodes.
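The single-node and multi-node cases differ mainly in how each process (rank) is mapped to a GPU. The sketch below is not taken from the seminar; it is a minimal, library-free illustration that assumes one process per GPU, the same number of GPUs on every node, and ranks numbered node by node. The helper names `local_device` and `placement` are hypothetical. In PyTorch, the local GPU index computed this way would typically be the value passed to `torch.cuda.set_device` before the process group is initialized.

```python
from typing import Dict, Tuple


def local_device(rank: int, gpus_per_node: int) -> Tuple[int, int]:
    """Return (node index, local GPU index) for a global rank.

    Assumes ranks are numbered node by node and that every node
    has the same number of GPUs (an assumption of this sketch).
    """
    if gpus_per_node <= 0:
        raise ValueError("gpus_per_node must be positive")
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return divmod(rank, gpus_per_node)


def placement(world_size: int, gpus_per_node: int) -> Dict[int, Tuple[int, int]]:
    """Map every global rank to its (node index, local GPU index)."""
    return {rank: local_device(rank, gpus_per_node) for rank in range(world_size)}


if __name__ == "__main__":
    # Single-node case: 2 processes on one node with 2 GPUs.
    print(placement(world_size=2, gpus_per_node=2))
    # Multi-node case: 4 processes across 2 nodes with 2 GPUs each.
    print(placement(world_size=4, gpus_per_node=2))
```

With this mapping, no two ranks on the same node share a GPU as long as the number of processes per node does not exceed `gpus_per_node`.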

Comments

This is exactly what I've been looking for! Thanks for the clear explanations

emilefortier

Stupid question... Can I use two different GPU classes in the same machine? For example, use a 970 and a 3070 to do compute? I know this isn't possible for graphics.

matthewwillox

How can I overcome this issue?
RuntimeError: NCCL error in: /opt/conda/conda-bld/pytorch_1670525552843/work/torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1269, internal error, NCCL version 2.14.3
ncclInternalError: Internal check failed.
Last error:
Duplicate GPU detected : rank 0 and rank 1 both on CUDA device 1a000

GarimaKumari-slgq