Learn to Use a CUDA GPU to Dramatically Speed Up Code In Python

I explain the end of exponential growth in computing power and the rise of application-specific hardware like GPUs and TPUs. Includes a demo of using the Numba library to run Python code on Nvidia GPUs orders of magnitude faster via just-in-time compilation and CUDA.
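To give a flavour of the approach, here is a minimal sketch of the pattern the demo is built around (the function name and array size are illustrative, not the exact code from the video): decorate an ordinary Python function with @cuda.jit and launch it across many GPU threads.

import numpy as np
from numba import cuda

@cuda.jit
def double_kernel(arr):
    # one GPU thread per element; guard against threads past the end of the array
    i = cuda.grid(1)
    if i < arr.size:
        arr[i] *= 2

data = np.arange(1_000_000, dtype=np.float32)
threads_per_block = 256
blocks_per_grid = (data.size + threads_per_block - 1) // threads_per_block
double_kernel[blocks_per_grid, threads_per_block](data)  # Numba copies data to the GPU and back
print(data[:4])  # [0. 2. 4. 6.]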

00:00 Start of Video
00:16 End of Moore's Law
01:15 What is a TPU and ASIC
02:25 How a GPU works
03:05 Enabling GPU in Colab Notebook
04:16 Using Python Numba
05:40 Building Mandelbrots with and without GPU and Numba
07:49 CUDA Vectorize Functions
08:27 Copy Data to GPU Memory
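The 07:49 chapter covers numba.vectorize with target='cuda', which turns an element-wise Python function into a GPU ufunc. A minimal sketch (the function name and array sizes here are illustrative assumptions, not the video's code):

import numpy as np
from numba import vectorize

@vectorize(['float32(float32, float32)'], target='cuda')
def add_cuda(a, b):
    # compiled into a CUDA kernel and applied element-wise
    return a + b

x = np.arange(10_000_000, dtype=np.float32)
y = np.ones_like(x)
z = add_cuda(x, y)  # arrays are shipped to the GPU, result comes back as a NumPy array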

📝 Guided Projects:
Comments

Holy shit, I was looking into this to speed up my Mandelbrot zooms, and that's exactly what you use as an example! This is a dream come true!

pkeric

Hello,

Thank you for this great introduction to Numba and, more specifically, Numba + CUDA.
It is indeed a very easy way to harness the power of CUDA in simple Python scripts.

There is a mistake in the "cuda" example:
you are calling the regular "create_fractal" instead of the CUDA version, "mandel_kernel".
And if you do call the CUDA version, "mandel_kernel", you also have to specify the size of the grid (be careful: x and y are reversed).
Therefore, the final version of the call for the CUDA Mandelbrot is:

import numpy as np
from timeit import default_timer as timer
from matplotlib.pyplot import imshow, show

image = np.zeros((1024, 1536), dtype=np.uint8)
start = timer()
# launch configuration: 1536 blocks of 1024 threads each
mandel_kernel[1536, 1024](-2.0, 1.0, -1.0, 1.0, image, 20)
dt = timer() - start
print("Mandelbrot created in %f s" % dt)
imshow(image)
show()
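For context, this launch configuration assumes a 2-D, grid-stride kernel along the lines of the standard Numba CUDA Mandelbrot example (a sketch, not necessarily the exact code from the video; the mandel() per-pixel device function is assumed from the notebook and not reproduced here). The first launch dimension walks x, the 1536-pixel width, which is why x and y look reversed relative to the array shape:

from numba import cuda

@cuda.jit
def mandel_kernel(min_x, max_x, min_y, max_y, image, iters):
    height = image.shape[0]  # 1024 rows
    width = image.shape[1]   # 1536 columns
    pixel_size_x = (max_x - min_x) / width
    pixel_size_y = (max_y - min_y) / height
    start_x, start_y = cuda.grid(2)
    stride_x = cuda.gridDim.x * cuda.blockDim.x
    stride_y = cuda.gridDim.y * cuda.blockDim.y
    # grid-stride loops: launch dimension 0 walks x (width), dimension 1 walks y (height)
    for x in range(start_x, width, stride_x):
        real = min_x + x * pixel_size_x
        for y in range(start_y, height, stride_y):
            imag = min_y + y * pixel_size_y
            image[y, x] = mandel(real, imag, iters)  # mandel() is the per-pixel device function from the notebook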

ChristopheKumsta

This is amazing! Thank you for taking the effort to make it!

somefriday

Nice demo - I am getting into CUDA/GPU programming and have a workstation built with a 1950X 16-core CPU and two RTX 2080 Ti GPUs. I would like to try this demo on that machine and observe the results without using Colab - I will definitely check this out today. By the way, with a Python 3 notebook environment, do I just use pip to install the Numba library as shown, or do I have to create a new virtual environment? I am curious about that. Thank you.
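For what it's worth, a plain pip install into the existing Python 3 environment is normally enough, and a separate virtual environment is only a matter of preference for isolation. A quick, hedged sanity check that Numba can see the local GPUs:

# After "pip install numba", confirm that Numba can see the CUDA driver and GPUs.
from numba import cuda
print(cuda.is_available())  # True if a usable CUDA GPU is visible
cuda.detect()               # prints the detected devices (e.g. the two RTX 2080 Ti cards)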

gjubbar

I tried to follow this on my Windows 10 machine. The function you call at 7:16 is still create_fractal() and not mandel_kernel(), so I don't see why it would be faster. When I changed it to mandel_kernel(), it complained that I had to provide a launch configuration telling the GPU how many grids and blocks to create. I added it like so (after first properly setting grid and block variables): mandel_kernel[grid, block](-2.0, 1.0, -1.0, 1.0, image, 20). It then worked and really was nearly 100x faster than the jit version.
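For anyone reproducing this, grid and block can be chosen to give one thread per pixel of the 1536x1024 image, for example (illustrative values, not necessarily what this commenter used):

block = (32, 8)                              # threads per block (x, y)
grid = (1536 // block[0], 1024 // block[1])  # 48 x 128 blocks -> one thread per pixel
mandel_kernel[grid, block](-2.0, 1.0, -1.0, 1.0, image, 20)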

ramoni

This is very helpful. Most people don't realize the overhead and code refactoring necessary to take advantage of GPUs. I am going to refactor a simple MNIST training program I have which currently uses only NumPy and see if I can get meaningful improvements in training time.

vallurirajesh

6:41 - except for the first time you run the function; all the later runs will be fast.
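The first call pays the just-in-time compilation cost; a small sketch (illustrative function, not from the video) makes the effect visible:

from timeit import default_timer as timer
import numpy as np
from numba import njit

@njit
def total(arr):
    s = 0.0
    for v in arr:
        s += v
    return s

data = np.random.rand(10_000_000)
for run in range(3):
    start = timer()
    total(data)
    print(run, timer() - start)  # run 0 includes compilation; later runs measure execution only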

alexzander__

Can I use this in an app that has a Kivy GUI?

agnichatian

A great and unique video. Thanks a lot for sharing.

bernietgn

Thanks for this. I was able to replicate it locally using a Jupyter Notebook with Nvidia and WSL2; it worked like a charm.

ShaunPrince

Good stuff on here :)
I like how you made a website documenting the video notes for reference later.

chetana

Is the GPU script correct? There are no to_device and copy_to_host calls to copy the image to and from the GPU, and the script uses the create_fractal function rather than mandel_kernel.
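For reference, explicit transfers around the same kernel would look roughly like this (a sketch that assumes the notebook's mandel_kernel; keeping the copies outside the timed region is the usual motivation for doing it explicitly):

from numba import cuda
import numpy as np

image = np.zeros((1024, 1536), dtype=np.uint8)
d_image = cuda.to_device(image)                       # copy the empty image to GPU memory once
mandel_kernel[(48, 128), (32, 8)](-2.0, 1.0, -1.0, 1.0, d_image, 20)
image = d_image.copy_to_host()                        # copy the finished fractal back to the host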

QuantumWormhole

Hi, can you show the same problem solved in code on the CPU, to compare CPU vs. GPU performance?

PP-tczp

Can I use Numba for training models with the sklearn libraries?

yogeshwarshendye

How can I speed up my machine learning code (sklearn and TensorFlow)? It's very slow, ahhh 😡

saebifar

Can you use this to speed up k-means?
I have 60 million rows to cluster; on 16 cores it has been running for hours.

knowledgelover

I don't understand the example with the NumPy array sum - why do it this way? You can just add the two arrays directly with NumPy: df = df2 + df2. The effect is the same without the GPU, and it's immediate.
So why use the GPU for this operation? I don't see any advantage in the array example.
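That is a fair point for a single addition: NumPy's df2 + df2 is already a compiled, vectorized operation, and for work this simple the host-to-device copies often eat any GPU gain. A rough way to compare the two (a sketch; the function and sizes are illustrative):

from timeit import default_timer as timer
import numpy as np
from numba import vectorize

@vectorize(['float32(float32, float32)'], target='cuda')
def add_on_gpu(a, b):
    return a + b

a = np.random.rand(50_000_000).astype(np.float32)
b = np.random.rand(50_000_000).astype(np.float32)

start = timer()
c_numpy = a + b
print('numpy:', timer() - start)

start = timer()
c_cuda = add_on_gpu(a, b)   # includes host<->device copies on every call
print('cuda :', timer() - start)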

PP-tczp

Is there something other than CUDA that I can use? I don't plan to use any Nvidia GPUs, so CUDA is useless for me. In addition, unless you work in game development or some kind of niche research, work computers will not have an Nvidia-based GPU. I own several computers and none of them use Nvidia.

ajflink

Sir, I am still having some doubts. Can you please share your contact number / mail ID?

Actually, I have downloaded 2 files from GitHub: one is a .cu file and the other is a .sh file.
The thing is, both files are interconnected - the .cu file takes its input from the .sh file. I don't know how to run them or how to upload them.
I request you to please guide me. I will be highly thankful to you. My project review is coming up.

summercamp

My head is gonna explode from all of this, but I feel like if I learn this, I will become powerful... still no idea how to make my program run on the GPU, even when it's HIGHLY parallel stuff...

jakubkahoun