Multiprocessing - Intermediate Python Programming p.10

Welcome to part 10 of the intermediate Python programming tutorial series. In this part, we're going to talk about the built-in library: multiprocessing.

Let us take a moment to talk about the GIL. GIL stands for Global Interpreter Lock. What the GIL does for us is... hmm... well, it serves as a sort of memory management safeguard: only one thread can execute Python bytecode at a time, which keeps the interpreter's internal state consistent. Sounds like a good idea, I guess, but not really. The real problem is that, since the GIL has existed for so long, people have built infrastructure around it. There are better options today, but ripping out the GIL would be catastrophic.

Alright, so the GIL is here to stay for probably a while. What does that mean? Python (specifically CPython, the standard interpreter) effectively behaves as if it were single-threaded. Even if you use threading, only one thread runs Python code at any moment, so a CPU-bound program is limited to roughly one core. If you have 4 physical cores, your computer probably thinks you have 8, and you're using 1 of those 8, including when threading. All threading really lets you do is make use of time that would otherwise be spent idle, for example while waiting on network or disk I/O, and nothing more. It's not using more power, it's just using idle power. If you monitor your CPU %, you might see that you're only using 10 or 15%, instead of the desired 100%, or at least close to it.
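
A rough way to see this for yourself, as a sketch rather than a proper benchmark: the snippet below runs the same CPU-bound function twice in a row and then in two threads. The `count_down` function and the loop size are made up for illustration. On a standard CPython build with the GIL, the threaded run typically takes about as long as the sequential one; exact timings depend on your machine and Python build.

```python
import threading
import time


def count_down(n):
    """Pure-Python busy loop: CPU-bound, so it holds the GIL while it runs."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n, repeats=2):
    """Run the busy loop `repeats` times, one after another."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats=2):
    """Run the busy loop in `repeats` threads at once."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary workload size, chosen only for the demo
    print(f"sequential: {run_sequential(n):.2f}s")
    print(f"threaded:   {run_threaded(n):.2f}s")
```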

Well that stinks! What can we do?! Enter multiprocessing!
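
As a minimal sketch of what that looks like (the `spawn` function name and the count of 5 are illustrative choices, not necessarily what the video uses): each `multiprocessing.Process` runs its target in a separate interpreter process with its own GIL, so the operating system is free to schedule them on different cores.

```python
import multiprocessing
import os


def spawn(num):
    """Work done in the child process; here it just reports who it is."""
    message = f"worker {num} running in process {os.getpid()}"
    print(message)
    return message


if __name__ == "__main__":
    # The guard stops child processes from re-running this block when the
    # module is imported by a freshly started interpreter (e.g. on Windows).
    processes = []
    for i in range(5):
        # args must be a tuple, hence the trailing comma in (i,)
        p = multiprocessing.Process(target=spawn, args=(i,))
        p.start()
        processes.append(p)

    for p in processes:
        p.join()  # wait for every worker to finish
```

The return value of `spawn` is not sent back to the parent; it is only there so the function is easy to call directly when experimenting.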

Comments

args takes a tuple, and Python recognizes a tuple by the comma. Try this: with t = (1, ), type(t) is <class 'tuple'>, whereas with t = (1), type(t) is <class 'int'>.

PhantomKenTen

For those a bit confused about why we need multiprocessing at all, there are a number of scenarios when you might want that extra power. Or when you want to run more than one process at the same time.
For example, say you have made yourself a YouTube backup tool that downloads all your own videos. Yes, there is a tool for that in Python that lets you do that in less than 20 lines of code. It's awesome. BUT it only downloads one video at a time, which is a real hassle, since it takes about 10 seconds to download each video. And if you're a music producer and want to convert it to MP3 or FLAC instead of keeping it as a video (and of course at the highest possible bitrate), it will take up to 30 seconds to convert as well. So now you're sitting there, with 150 songs waiting to download, taking 40 seconds per song.

Or you could use multiprocessing and start 10 processes at a time, cutting your waiting time down from 1 hour 40 minutes to just 10 minutes.

Or if you have a very time-consuming task that uses a lot of calculations, like rendering video, and have access to a lot of computer cores, like a stack of Raspberry Pis, you can connect them all in a "supercluster" (or a supercomputer) using multiprocessing, letting all cores of all computers act as if they were part of one single computer and render one frame each at a time. This is what major film studios do when they render their projects (but with proper, purpose-built computers, not microcomputers; the principle is the same, though).

The example given in the video, web spiders, is another great example. Usually, the program will only look at a single URL at a time, but with multiprocessing, even running on a single core, you can look at several links at a time.

morphman
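
The arithmetic behind the download estimate above, as a small sketch. It assumes every song takes the same 40 seconds and that 10 workers really do run fully in parallel, which ignores bandwidth and CPU limits; `estimated_minutes` is a made-up helper, not part of any library.

```python
import math


def estimated_minutes(items, seconds_each, workers=1):
    """Idealized wall-clock time: items are split evenly across workers."""
    rounds = math.ceil(items / workers)  # batches each worker has to get through
    return rounds * seconds_each / 60


print(estimated_minutes(150, 40))              # one at a time: 100.0 minutes
print(estimated_minutes(150, 40, workers=10))  # ten at a time: 10.0 minutes
```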

When I ran the multiprocess loop for 100 iterations, my CPU maxed out. But I went with 500 to follow the tutorial, and the consequences were never to be forgotten!

prashantsingh

(i, ) is the syntax for creating a tuple with a single element, and multiprocessing.Process expects args to be a tuple

slapusillydawg

I followed the tutorial and made the 500 processes with infinite loops in each of them. The CPU maxed out at around 130. Everything froze, so I panicked and tried to force-close the running script, which was only cancelled after some time. In the meantime, I think Windows killed a lot of the background operations that it runs for some reason. Now my PC runs smoother and better. It seems I made myself a poor man's RAM cleaner. Not sure what its consequences are going to be, but the PC runs a hell of a lot faster now.

vishalAmbreappbuzz

Thanks a lot for all you've shared; half of what I know in programming comes from your channel. A very generous gesture and a rare mix of enthusiasm make you one of the best teachers around! An idea for a next video in the intermediate Python series: create and distribute a Python package (installable from pip)?

JulienPy

Process loops over the elements of args, so you need a sequence. Adding the comma makes it a one-element tuple.

huegckb
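
A quick way to check the comment above without starting any processes. As far as I know, CPython's `Process` constructor converts `args` with `tuple(args)`, so a bare int is rejected right away; treat that as an implementation detail rather than a documented guarantee. The `show` and `can_build_process` names are hypothetical.

```python
import multiprocessing


def show(value):
    """Placeholder target; it is never actually run in this example."""
    print(value)


def can_build_process(args):
    """Return True if Process accepts `args`; no process is started here."""
    try:
        multiprocessing.Process(target=show, args=args)
    except TypeError:
        return False
    return True


i = 5
print(can_build_process((i,)))  # one-element tuple: True
print(can_build_process((i)))   # just the int 5, not iterable: False
```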

thank you, this video should show up as the very first result when anyone searches for multiprocessing in python tutorial.

muhammadsarimmehdi

8:08 because the function takes a tuple as an argument
and adding a ', ' makes it a tuple.

For example:
type((3)) # <class 'int'>
type((3, )) # <class 'tuple'>

eddie

There have already been a few comments on the tuple question raised in the video, but I thought I would add a little more detail about why it works the way it does:
It all boils down to how the Python language parser works. Parentheses can be used either for a tuple or to establish priority in an expression:
Example:
y = (1 + 2) * 3 # Result 9 (not a tuple)

In this case, the parens are used for priority in the order of operations. If (1 + 2) were converted to a tuple, then the math would be wrong.
We could write a simpler example:
y = (1 + 2) # Result 3 (still not a tuple)

If we want it to be a tuple instead, then we need a way to tell the parser to consider the value within the parens to be a tuple. The easy thing to do is add a trailing comma:
y = (1 + 2, ) # y is a tuple containing a single value of 3.

We could also simplify this by writing it as:
y = (3, ) # same as above

It is also important to note that () is also a tuple (note that there are no commas in there). That is because the parser can tell that there is no expression within the parens, so it can figure out that it's a tuple.
y = () # empty tuple

AdamBecker
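
The cases from the comment above, collected into one runnable snippet. The `bare` line is an extra case not mentioned in the comment; it is included to show that the comma, not the parentheses, is what builds the tuple.

```python
grouped = (1 + 2) * 3   # parentheses only group the expression
plain = (1 + 2)         # still just an int
single = (1 + 2,)       # trailing comma makes a one-element tuple
empty = ()              # special case: no comma needed
bare = 3,               # no parentheses at all, still a tuple

for name, value in [("grouped", grouped), ("plain", plain),
                    ("single", single), ("empty", empty), ("bare", bare)]:
    print(f"{name}: {value!r} is {type(value).__name__}")
```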

@sentdex #8:06 The comma is necessary so that the interpreter understands that it is a one element tuple. If there is no comma, the parentheses are not enough to identify it as a tuple. Thanks for your great videos!

DavidBudaghyan

How cool is that. I just started using multiprocessing yesterday at work and you post a video about it. Thanks!

Xblade

8:20 The reason you need the comma is because it means it's a tuple. Interesting you didn't know this because I learnt it from your video :P

Example:
i = 5
type((i))
#<class 'int'>

type((i, ))
#<class 'tuple'>

janekmuric

The trailing comma is required to explicitly make the single variable a tuple, which is what the args parameter expects. That way, when Python unpacks args into the function's *args, it passes the whole variable as one argument rather than splitting it into separate arguments.

dave
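
The unpacking described above can be tried without multiprocessing at all. This sketch uses a hypothetical stand-in function, `target`, and relies only on the documented behavior that a process calls its target with the positional arguments taken from `args`.

```python
def target(*received):
    """Stand-in for a worker function; returns whatever arguments arrive."""
    return received


word = "spam"

# A one-element tuple unpacks to a single argument.
print(target(*(word,)))   # ('spam',)

# Without the comma, (word) is just the string, and unpacking a string
# yields one argument per character.
print(target(*(word)))    # ('s', 'p', 'a', 'm')
```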

Can't wait for the next vid! You're doing great and I love this series!

nic

Shouldn't the p.join() be outside of the for loop? That makes more sense, right? Otherwise, it would simply be sequential processing. Put in a time.sleep(2) to see the difference.

dsinghr
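
A sketch of the experiment suggested above, with a shorter sleep (0.5 seconds instead of 2) so it finishes quickly. The function names are made up for illustration. With `join()` inside the loop, the total time should come out near count × sleep time; with `join()` after the loop, it should be closer to a single sleep plus process startup overhead.

```python
import multiprocessing
import time


def worker(seconds):
    """Pretend to do some work by sleeping."""
    time.sleep(seconds)


def join_inside_loop(count, seconds):
    """Start a process and wait for it before starting the next one."""
    start = time.perf_counter()
    for _ in range(count):
        p = multiprocessing.Process(target=worker, args=(seconds,))
        p.start()
        p.join()  # blocks here, so the processes run one after another
    return time.perf_counter() - start


def join_after_loop(count, seconds):
    """Start every process first, then wait for all of them."""
    start = time.perf_counter()
    processes = [
        multiprocessing.Process(target=worker, args=(seconds,))
        for _ in range(count)
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()  # all processes are already running while we wait
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"join inside loop: {join_inside_loop(4, 0.5):.1f}s")
    print(f"join after loop:  {join_after_loop(4, 0.5):.1f}s")
```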

brilliant examples and fantastic explanation

random_act

You can use generators to simulate channels (as in Go) so that you can communicate through them in a thread-safe way.

codruterdei

The comma is required because otherwise the interpreter interprets the parentheses as mathematical parentheses and not a tuple construct.

AdrianHannah

Hi I'm still learning and struggling to understand how using join is no different from just doing normal loops without processing, since the next process waits for the last one to finish, just like how the next loop starts after the previous one is finished. Thanks

thefamousdjx