Image Recognition and Python Part 7

This is the seventh video in my image recognition basics series. Image recognition can be used for all sorts of things, like facial recognition, identifying what is in pictures, character recognition, and more.

Comments

If you get "Line 30: ValueError: assignment destination is read-only" (Python 2.7) try checking if you have used the wrong function around lines 40-50:
iar3 = np.asarray(i3)
should be:
iar3 = np.array(i3)
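
A minimal sketch of why the two calls behave differently. The read-only image data is only simulated here with a NumPy array whose writeable flag is cleared (an assumption, since the actual image object is not shown in this thread):

```python
import numpy as np

# Stand-in for image data that NumPy is not allowed to modify (an assumption;
# in the tutorial the source is an image object).
source = np.zeros((2, 2, 4), dtype=np.uint8)
source.setflags(write=False)

view = np.asarray(source)   # no copy is made, so the read-only flag carries over
copy = np.array(source)     # copies by default, and the copy is writable

try:
    view[0, 0, 3] = 255
    view_error = None
except ValueError as exc:
    view_error = str(exc)   # the "read-only" message from the comment above

copy[0, 0, 3] = 255         # succeeds, and the original data stays untouched
```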

lindsay

The ubyte_scalars overflow warnings appear because the pixels are stored in a ubyte array (maximum value 255), but the running sum in the reducer can exceed 255 during its calculation.

The program still works, but you can remove the warnings by having it use a larger type instead (e.g. int):
reduce(lambda x, y: int(x) + int(y), eachPix[:3])
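
The wraparound and the fix can be reproduced on a single pixel. This is a sketch in Python 3 syntax (where reduce has to be imported from functools); the sample pixel values are illustrative, and the warning is silenced only to keep the output clean:

```python
from functools import reduce
import numpy as np

pixel = np.array([255, 242, 0, 255], dtype=np.uint8)   # sample RGBA pixel

# Adding uint8 scalars stays in uint8, so 255 + 242 wraps around modulo 256.
with np.errstate(over="ignore"):
    wrapped = reduce(lambda x, y: x + y, pixel[:3])

# Converting each value to a Python int first avoids the wraparound.
safe = reduce(lambda x, y: int(x) + int(y), pixel[:3])

print(int(wrapped), safe)
```

Here the wrapped sum is 241 (497 modulo 256), while the int version gives the true sum of 497.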

caleb

Hello Harrison, please read this, as it has kept me busy debugging, ha. The same goes for anyone else who has inverted images.

Okay, in the video you're printing eachPix with a 5-second delay. The values are uint8 (for those who don't know, uint8 has a maximum value of 0xFF, i.e. 255),
so you're not seeing the true values of the later calculations. When using the lambda function, it seems the calculations are done in uint8 and only then converted to float64 after the reduce? (I don't know; this has taken me longer than I would like to admit, a few days, but I have also been playing with the Kinect IR/depth sensors.)
For example:
for eachPix in eachRow:
    avgNum = reduce(lambda x, y: x + y, eachPix[:3]) / len(eachPix[:3])
    print((type(avgNum), avgNum, eachPix))

Output:
(<class 'numpy.float64'>, array([255, 242, 0, 255], dtype=uint8)) /3 !=

*The following change was only for debugging:*
avgNum = reduce(lambda x, y: x + y, eachPix[:3]) / 1.0 # dividing by 1.0 is for debugging only
Output:
(<class 'numpy.float64'>, 241.0, array([255, 242, 0], dtype=uint8))
which makes sense: 255 + 242 + 0 wraps around to 241 in uint8, and 241 / 1.0 = 241.0

Adding dtype=numpy.uint16 to the image array that you pass through to the function:
aImage3 = threshold1(numpy.array(image3, dtype=numpy.uint16))
and reverting back to:
avgNum = reduce(lambda x, y: x + y, eachPix[:3]) / len(eachPix[:3])
gives the correct results:

(<class 'numpy.float64'>, array([255, 242, 0, 255], dtype=uint16)) <--- 255 + 242 + 0 = 497 / 3 =
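
The effect of the wider dtype can be checked in isolation. A small sketch, assuming the same pixel values as above; average_rgb is a hypothetical helper name wrapping the expression from the tutorial:

```python
from functools import reduce
import numpy as np

pixel_u8 = np.array([255, 242, 0, 255], dtype=np.uint8)
pixel_u16 = pixel_u8.astype(np.uint16)   # same values, 16 bits per channel

def average_rgb(pix):
    # Same expression as in the tutorial, applied to one RGBA pixel.
    return reduce(lambda x, y: x + y, pix[:3]) / len(pix[:3])

with np.errstate(over="ignore"):
    avg_u8 = average_rgb(pixel_u8)    # the sum wraps to 241 before the division
avg_u16 = average_rgb(pixel_u16)      # the sum is 497, so the average is correct

print(avg_u8, avg_u16)
```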

My image is still inverted, but at least we now know why, although I am not sure whether this changes anything in the code. Perhaps an additional check on the four corners is needed: if they are black, invert the whole image.
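
One way the corner check could look. This is only a sketch of the idea, not code from the tutorial: invert_if_dark_corners is a hypothetical helper, and it assumes an already thresholded RGBA array containing only black and white pixels:

```python
import numpy as np

def invert_if_dark_corners(thresholded):
    """Hypothetical helper: flip black and white when all four corners are black."""
    out = np.array(thresholded, dtype=np.uint8)          # writable copy
    corners = out[[0, 0, -1, -1], [0, -1, 0, -1], :3]    # RGB of the four corners
    if not corners.any():                                # every corner channel is 0
        out[..., :3] = 255 - out[..., :3]                # alpha is left untouched
    return out
```

The heuristic assumes the corners belong to the background, which will not hold for every image.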

This will also clear the runtime overflow warning that people were getting.

I did notice I wasn't getting any printed output until I closed the window, but that can wait until another day.
One more thing: on your website, the page for part 6 of this tutorial has the code from part 7. No biggie, of course; fixing it would just help anyone who comes across the site.

ProgrammingWithRook

To get the average of a list we can use `numpy.mean(list_name)`. In our case we imported numpy as np, so we would use:
average = np.mean(list_name)
I find it much simpler than using the reduce function.
Example:
l = [1, 2, 3, 4]
average = np.mean(l[:3])
# average will be 2.0

In sentdex's example that would be for the pixel and for the image:
avgNum = np.mean(eachPixel[:3])
balance = np.mean(balanceAr)
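
A quick check that np.mean also sidesteps the uint8 overflow discussed in this thread. The pixel values are illustrative:

```python
import numpy as np

pixel = np.array([255, 242, 0, 255], dtype=np.uint8)   # sample RGBA pixel

# For integer input, np.mean accumulates in float64, so the sum does not wrap.
avg = np.mean(pixel[:3])
print(avg)
```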


toutenunmot

Elliot, for some reason I am unable to respond to your comment via reply. The issue is probably that you left out the conversion to a new numpy array, since the array is read-only when it is read in. Did you do the conversion to newAr?

Could you paste the full error message along with the code that triggered it?

sentdex

Well, we didn't have to return anything. The function goes through the array and edits that same array in place. That's why you don't have to save the result.
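
A small illustration of that behaviour, using a hypothetical function name rather than the tutorial's threshold function:

```python
import numpy as np

def blacken_first_pixel(arr):
    """Hypothetical example: modify the caller's array in place, return nothing."""
    arr[0, 0, :3] = 0

image = np.full((2, 2, 4), 255, dtype=np.uint8)
result = blacken_first_pixel(image)   # result is None

# The change is visible through the original name, because the function
# received a reference to the same array rather than a copy.
print(image[0, 0])
```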

sentdex

So, I've been following along and everything's working out well until we get to threshold(iar3), which should return me a picture of a black 0 on a white background.  What I'm getting is a white 0 on a black background.  I'm getting the same thing you are for the other times you used threshold, but on the y0.5.png image, I'm getting the opposite of what you're getting.  Any ideas on why that's happening?  From what I can tell, it looks like the portion of the y0.5.png that creates the 0 is a lighter yellow than the background, but on the other images, the background is lighter than the foreground.  If I use "<" in the if statement that determines if the pixel is above or below the balance threshold, I get the same image as you on y0.5.png, but I get the reverse for the two other images.
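
The behaviour described here follows from how a mean-based threshold works. Below is a sketch reconstructed from what the comments in this thread describe, not the tutorial's exact code: it uses vectorized NumPy operations instead of the loops and reduce, and the pixel colours in the two test images are made up for illustration:

```python
import numpy as np

def threshold(image_array):
    """Sketch: pixels brighter than the image-wide average become white, the rest black."""
    out = np.array(image_array, dtype=np.uint8)    # writable copy of the RGBA data
    brightness = out[..., :3].mean(axis=2)         # per-pixel RGB average (float64)
    balance = brightness.mean()                    # image-wide average brightness
    is_light = brightness > balance
    out[..., :3] = np.where(is_light, 255, 0)[..., None]
    out[..., 3] = 255                              # alpha channel: fully opaque
    return out

# The "figure" is the pixel at row 0, column 1; everything else is background.
dark_on_light = np.array([[[255, 255, 255, 255], [0, 0, 0, 255]],
                          [[255, 255, 255, 255], [255, 255, 255, 255]]], dtype=np.uint8)
light_on_dark = np.array([[[200, 180, 0, 255], [255, 242, 0, 255]],
                          [[200, 180, 0, 255], [200, 180, 0, 255]]], dtype=np.uint8)

print(threshold(dark_on_light)[0, 1], threshold(light_on_dark)[0, 1])
```

With a dark figure on a light background the figure comes out black; with a figure lighter than its background it comes out white, which matches the inversion reported above.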

stejaka

Thank you for putting these videos out,  I find it wonderful to hear your thoughts as you code.

A hopefully quick question for you: My Python version (2.76 on Windows 8.0 64bit) is making a fuss every time it runs across one of the lines similar to:

avgNum = reduce(lambda x, y: x + y, eachPix[:3]) / len(eachPix[:3])

They produce runtime warnings that look like this:

RuntimeWarning:  overflow encountered in ubyte_scalars

which I gather is related to the mathing of data types as we are trying to take the color averages, so I was wondering if you could suggest an alternative?

PS.  The code does function despite the runtime warning.

stephenbooth

Hi sentdex, would you mind explaining more about your formula for avgNum and balance? You could also point me to any related source that I could study myself. Thanks!

Jsheng

I know it's not directly on topic. However, would you be so kind as to create a video showing how to make a Gaussian filter for an image? Please!

rodrigoloza

Can you tell me what reduce does?
I mean its parameters and the output it gives.
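
For reference, a short sketch of reduce in Python 3, where it lives in functools (in Python 2 it is a builtin). It takes a two-argument function, an iterable, and an optional starting value, and returns a single accumulated result:

```python
from functools import reduce

values = [1, 2, 3, 4]

# Applies the function cumulatively from left to right: ((1 + 2) + 3) + 4
total = reduce(lambda x, y: x + y, values)

# The optional third argument is the starting value: (((10 + 1) + 2) + 3) + 4
total_from_10 = reduce(lambda x, y: x + y, values, 10)

print(total, total_from_10)
```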

samiasaman

Ay ay ay you procured reduce out of nowhere, how did that happen?

GregMcRegor

Hi, is it possible to recognize a graph and extract the plotted points?

chanukyalakamsani

Hey Harrison, I get an error when I type in eachPix[3] = 255. Can you help me find out why it's popping up?

DIYGUY

I didn't understand this part: avgNum = reduce(lambda x, y: x + y, ... Would you please clarify it for me?
eachPixel[:3] means indices 0 to 2 (i.e. pix[0] + pix[1] + pix[2]), but what are x and y?
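
Recording each call makes the roles of x and y visible: x is the running total so far and y is the next element. A sketch with plain Python ints (so no uint8 overflow is involved); the names add and steps are only for illustration:

```python
from functools import reduce

steps = []

def add(x, y):
    # x: accumulated result so far, y: next value from the sequence
    steps.append((x, y))
    return x + y

pixel = [255, 242, 0, 255]          # sample RGBA values as plain ints
total = reduce(add, pixel[:3])      # first call gets pixel[0] and pixel[1]
avg = total / len(pixel[:3])

print(steps, total, avg)
```

The recorded calls are (255, 242) and then (497, 0), giving a total of 497.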

tushant

My code is working, but after using the threshold function on iar1, iar2, and iar3, the plot shows completely black images for all three. Please give me a solution.

rahultalekar

Not sure what is going on with my threshold function, works fine on the 0 images but not the logo image. When it goes to do the logo image it returns an all black image with a single white pixel at about x=58 y=15. I ran the function on a random image I had on my computer and it worked fine. Any guess what is the problem?

MattCamp

Why did you use reduce instead of sum()?
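
The thread does not say why reduce was chosen. For comparison, here are two summing alternatives that give the true total for a uint8 pixel (a sketch with illustrative values):

```python
import numpy as np

pixel = np.array([255, 242, 0, 255], dtype=np.uint8)   # sample RGBA pixel

# ndarray.sum() accumulates small integer types in a wider integer type.
total_np = pixel[:3].sum()

# Converting to Python ints first makes the builtin sum() safe as well.
total_py = sum(int(v) for v in pixel[:3])

print(int(total_np), total_py)
```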

saminchowdhury

I am using 32-bit Python; how can I get that to work? :(

mohamedgabr

Why do we leave eachPix[3] = 255 in the else portion of the program? Shouldn't it be set to zero? Please help!

SagarKumar-jzwq