My Python Projects

Tuesday, September 10, 2013

Draw in the air with OpenCV Python: Part II (Explanation)

Note: This explanation is written for people who have little experience with image processing techniques. It may sound trivial to many of you out there. Sorry for the inconvenience, if any.

The essence of this project's idea is color thresholding to differentiate objects in the frame. Using color thresholding, we map the original (color) frame into a binary frame where the pixels with our color of interest are mapped to one value (1 or 255) and all other pixels to another value (0). In our little project the binary frame is shown in the window 'Eroded', and you will see it as a black-and-white image.

Let us start with the color space. The color space used in this project is HSV (Hue, Saturation and Value), a close relative of HSI. Like the popular RGB that many of you know, it is a 3-channel color space.

Here is the color space visualized using a cone.

You can read more about it on Wikipedia.

What's important is that it is less sensitive to lighting variations than the ordinary RGB color space, because the color itself (hue) is stored separately from the brightness (value). That's one of the reasons why we use this non-linear color space in this project. The other reason is its convenience, which you will understand later.

OpenCV does not use the usual HSV ranges. For 8-bit images it follows a slightly different set of ranges for H, S and V:

H --> 0 to 179 (hue in degrees divided by 2, instead of 0 to 360)
S --> 0 to 255
V --> 0 to 255
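To get a feel for these ranges, here is a small sketch that converts an ordinary RGB color into OpenCV-style HSV values using only Python's standard colorsys module. The helper name rgb_to_opencv_hsv is my own invention, and the rounding only approximates what OpenCV does for 8-bit images, so treat the result as a starting point for tuning rather than an exact match.

```python
import colorsys


def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB values to approximate OpenCV 8-bit HSV values."""
    # colorsys works with floats in the range 0.0 to 1.0.
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # OpenCV stores hue as degrees / 2 so that it fits in one byte (0 to 179).
    hue = int(round(h * 180)) % 180
    return hue, int(round(s * 255)), int(round(v * 255))


print(rgb_to_opencv_hsv(0, 0, 255))   # pure blue
print(rgb_to_opencv_hsv(255, 0, 0))   # pure red
```

Pure blue lands at a hue of 120 and pure red at a hue of 0, which is why a blue object is the easier one to threshold.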

In the first stage of this program we threshold the acquired frame based on the color of the object with which we would like to draw in the air (in front of the camera, of course). So we use trial and error to find the minimum and maximum HSV values of our object of interest. A red object is a little trickier because its hue covers both the higher end (close to 180) and the lower end (close to 0) of the Hue range. In this case we have to check for both ranges by using 'or' logic in the 3rd line of the 'preprocess' method.

cv2.inRange() does the trick for us. For each pixel in the actual frame whose HSV values lie in the range between blue_low and blue_hi (these are just variables), it writes 255 to the corresponding pixel of the thresholded image; all other pixels are set to 0.

What we now have is a binary image which marks the places where our color of interest appears (in my program it's blue).

In the resulting binary image we can see one or more clusters of pixels with the value 255 (white). These will be called 'blobs' from now on.

If the captured frame is too noisy, we see many white dots in the thresholded frame. These create a lot of problems, and our aim is to minimize these noisy blobs. To do that, we keep only the blobs whose aspect ratio and area fall within a particular range (with the maximum just above the dimensions of our object and the minimum just below them). This is what I have done using the conditional statements in the 'segment' method. This method also draws a rectangle around each valid blob (one that 'survives' the segmentation based on the area and aspect ratio constraints) and returns the center pixel co-ordinates of the first blob that satisfies the constraints.


The center of the object (in the current frame) is thus obtained, and a line is drawn between the previous center (obtained from the previous frame) and the current center. This gives the effect of the lines following the path of the object.
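The bookkeeping behind this can be sketched in plain Python. The function name build_path is hypothetical, and skipping frames where no blob was found (marked here as None) is my assumption about how missed detections could be handled.

```python
def build_path(centers):
    """Turn per-frame centers into a list of (previous, current) line segments."""
    segments = []
    previous = None
    for current in centers:
        if current is None:
            # Assumption: no valid blob in this frame, so keep the last known center.
            continue
        if previous is not None:
            segments.append((previous, current))
        previous = current
    return segments


# Centers from four consecutive frames; the third frame had no valid blob.
centers = [(50, 35), (54, 36), None, (61, 40)]
segments = build_path(centers)
print(segments)
```

Each pair in segments is one line to draw, so the drawing grows by one short stroke per frame.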

Note: Since we take the center of the first blob that satisfies the constraints, the drawn path may deviate from the actual path of the object if a noise blob is large enough to pass the constraints and is found first.

The Pygame Part

Here I have used the pygame module to draw the path of the object on the screen. pygame.draw.line() draws a line between the two pixel co-ordinates given in its argument list. The first parameter is the surface (window) on which to draw, the second is the color of the line (in RGB format) and the next two are the pixel co-ordinates of the start and end points.
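A minimal sketch of that call is shown below. To keep it runnable without opening a window, it draws on an off-screen pygame.Surface; the sizes, colors, co-ordinates and the line width of 3 are just assumed example values.

```python
import pygame

# An off-screen surface stands in for the window in this sketch.
surface = pygame.Surface((200, 100))
surface.fill((255, 255, 255))          # white background

blue = (0, 0, 255)                     # line color in RGB
previous_center = (20, 50)             # center from the previous frame
current_center = (120, 50)             # center from the current frame

# Arguments: surface, color, start point, end point, optional width in pixels.
pygame.draw.line(surface, blue, previous_center, current_center, 3)

print(surface.get_at((70, 50)))        # a pixel on the line
```

In a real program the surface would be the one returned by pygame.display.set_mode(), followed by a call to pygame.display.flip() to show the result.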


So that's it. If you want more explanations or wish to give feedback, please comment here.
