Graph Neural Networks User Group Meetup 10/27/2022


Agenda 10/27/2022:
• 4:00 - 4:05 PM (PDT): Welcome and Updates.
• 4:05 - 4:35 PM (PDT): GNN Tool - A Tool for Building End-to-End Workflows with Graph Neural Networks (GNNs) (Onur Yilmaz, NVIDIA).
• 4:35 - 5:05 PM (PDT): PotentialNet on DGL-LifeSci (Tatsuya Arai, Amazon ML Solutions Lab).
• 5:05 - 5:35 PM (PDT): Open Discussion.

Meeting Details:
4:05 - 4:35 PM (PDT): GNN Tool - A Tool for Building End-to-End Workflows with Graph Neural Networks (GNNs) [RECORDING REMOVED PER SPEAKER REQUEST]
Abstract: Building end-to-end workflows with GNNs is more challenging than building workflows with other neural network models because of the additional challenges posed by the structure of the underlying data. First, the raw data needs to be modeled and transformed into a graph with featured nodes and edges. Transforming big data into a graph at scale is even more challenging and requires advanced knowledge of big data tools such as Spark and Dask. Loading the graph structure and features of big data and constructing the graph in memory with popular GNN libraries such as DGL or PyG is another challenge faced by data scientists. Managing the graph data in memory is an equally important problem: if the data is large and doesn’t fit into GPU memory, it needs to be kept in host memory and shared among the processes in multi-GPU training; otherwise, the data should be moved to GPU memory for performance reasons. Lastly, Triton does not support GNN model deployment. We present a tool with a set of accessible APIs for managing the challenges stated above. The tool includes functions to transform large raw data into a graph, along with APIs to load and construct DGL and PyG graphs in memory. It also automatically creates multiple processes and shares the graph with them for multi-GPU GNN training.

Speaker Bio: Onur Yilmaz is a lead deep learning software engineer who has been with NVIDIA for almost 6 years. He is one of the main engineers who contributed to RAPIDS cuML, the GPU-accelerated open-source machine learning library, and he also contributed to Merlin, an open-source framework for building large-scale deep learning recommender systems. He is currently working on making graph neural network training and deployment easy for data scientists, researchers, and engineers. Onur holds a Ph.D. in computer engineering from the New Jersey Institute of Technology. His dissertation focused on traditional machine learning and high-performance computing for finance.

4:35 - 5:05 PM (PDT): PotentialNet on DGL-LifeSci
Abstract: The Amazon Machine Learning Solutions Lab pairs customers with ML experts to help them identify and build ML solutions that address their organization’s highest return-on-investment ML opportunities. The field of drug discovery is competitive, and customers are often hesitant to use open-source tools because of concerns about intellectual property. In this presentation, we discuss a practical industry application of PotentialNet on DGL-LifeSci and share some tips on data anonymization to protect your intellectual assets. We also showcase a scalable end-to-end ML pipeline for PotentialNet on Amazon SageMaker. PotentialNet is a type of graph neural network that predicts the binding affinity between ligands and their corresponding target proteins. DGL-LifeSci supports various graph algorithms to help biomedical scientists.

Speaker Bio: Tatsuya Arai is a Senior Research Scientist at the Amazon Machine Learning Solutions Lab. He received his Ph.D. in Bioengineering from the University of California, San Diego. His research interests are in the fields of quantitative functional medical imaging, image-guided radiation therapy, and the application of machine learning to medical imaging.