
Recognize, yo!

Recognize Successful

I now have a basic recognizer working off of a video recording of the game.  It’s finding the top and bottom boundaries, and most of the time, it knows where the left and right paddles and the ball are.  There are still occasional glitches, like where an especially prominent piece of noise decides it wants to be the ball, or where, for some reason, nearly overlapping contours are found over one of the paddles.  However, I think it’s at a point where I can proceed to the gameplay logic calculations.

I’m sure there’s some good reason that OpenCV is returning duplicate or near duplicate contours, and I’m sure there’s some known way to deal with the problem.  I probably should read the book a bit more…

Nah, the book’s only for when I get completely stuck.

Anyway, here’s a description of what I’ve done so far, for those of you playing at home:

  1. Pull a frame from a pre-recorded video of Pong.  This frame is in the upper left video box.
  2. Run the frame through a Gaussian smoothing operation twice to tone down the annoying noise in the video.
  3. Turn the frame greyscale for the next operation.
  4. Run a Canny edge detector on the image.  The results of this are in the upper right box.
  5. Feed the Canny image into a contour finder.  The results, shown in the lower left, look pretty much the same as the Canny image, but the difference is that the Canny results are a regular bitmap image, while the contours are sets of vectors that are more easily manipulated.
  6. The contours are run through in order to find the playfield elements.  The assumptions used to find the playfield elements are as follows:
    1. Anything wider than 70% of the width of the screen is either the top boundary or the bottom boundary.  If it’s above the middle of the screen, it’s the top; if it’s below the middle, it’s the bottom.
    2. The ball and paddles are between the top and bottom boundaries.  This lets me throw out the score.
    3. While I’m at it, throw out any contours that are really dark or black.  They’re probably just noise.
    4. Sort what’s left by area, descending.
    5. The two biggest things are the left and right paddles.  Whichever one is farthest left is the left paddle.
    6. The third biggest thing is the ball.
    7. Everything else is noise.
  7. I package up the recognized playfield elements into a nice class containing a bunch of rectangles.  These rectangles are drawn in the window on the lower right.
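For those following along, the sorting rules in step 6 can be sketched in a few lines of plain Python. This is only an illustration under assumptions, not the checked-in code: the `Rect` and `Playfield` classes, the `classify` function, the `brightness` field, and the `dark_threshold` value are all made-up names and numbers, and each contour is assumed to have already been reduced to a bounding rectangle.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rect:
    """Bounding box of one contour (hypothetical helper type)."""
    x: float                    # left edge
    y: float                    # top edge
    w: float                    # width
    h: float                    # height
    brightness: float = 255.0   # assumed mean intensity inside the box

    @property
    def area(self) -> float:
        return self.w * self.h


@dataclass
class Playfield:
    """The recognized elements; anything not found stays None."""
    top: Optional[Rect] = None
    bottom: Optional[Rect] = None
    left_paddle: Optional[Rect] = None
    right_paddle: Optional[Rect] = None
    ball: Optional[Rect] = None


def classify(rects: List[Rect], frame_w: float, frame_h: float,
             wide_fraction: float = 0.7,
             dark_threshold: float = 40.0) -> Playfield:
    field = Playfield()
    candidates = []
    for r in rects:
        # Rule 1: very wide shapes are the top or bottom boundary.
        if r.w > wide_fraction * frame_w:
            if r.y + r.h / 2 < frame_h / 2:
                field.top = r
            else:
                field.bottom = r
        else:
            candidates.append(r)

    # Rule 2: keep only shapes between the boundaries (drops the score).
    top_limit = field.top.y + field.top.h if field.top else 0
    bottom_limit = field.bottom.y if field.bottom else frame_h
    # Rule 3: drop dark shapes, which are probably noise.
    inside = [r for r in candidates
              if r.y >= top_limit
              and r.y + r.h <= bottom_limit
              and r.brightness > dark_threshold]

    # Rule 4: sort what is left by area, largest first.
    inside.sort(key=lambda r: r.area, reverse=True)

    # Rule 5: the two biggest are the paddles; leftmost is the left paddle.
    if len(inside) >= 2:
        field.left_paddle, field.right_paddle = sorted(
            inside[:2], key=lambda r: r.x)
    # Rule 6: the third biggest is the ball.  Rule 7: the rest is noise.
    if len(inside) >= 3:
        field.ball = inside[2]
    return field
```

Note that this sketch inherits the glitch mentioned earlier: a noise blob bigger than the ball will happily take the ball's slot, because size is the only thing being compared.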

Going forward, I’ll keep a list of the recognized playfield elements from the last X frames and use them to calculate the position and trajectory of the ball.  Since I know where the ball is across multiple frames and where the boundaries are, I can extrapolate where the paddle needs to be.  The goal for this phase will be to get the calculations written and to display an overlay on a live game that will tell me exactly where I need to place my paddle as soon as the ball starts heading my way.  Most of this is pure algebra, but there may be complications if the recognition isn’t good enough.
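As a rough idea of what that algebra might look like, here is a minimal Python sketch. It assumes the ball moves in a straight line at constant speed, bounces perfectly off the top and bottom boundaries, and does not bounce anywhere inside the window of frames being averaged. The function name `predict_intercept_y` and its parameters are hypothetical, and the ball is treated as a single point.

```python
from typing import List, Optional, Tuple


def predict_intercept_y(positions: List[Tuple[float, float]],
                        paddle_x: float,
                        top_y: float,
                        bottom_y: float) -> Optional[float]:
    """Estimate the y coordinate where the ball crosses x == paddle_x.

    positions holds (x, y) ball centers from recent frames, oldest first.
    Returns None if there is not enough data or the ball is not heading
    toward the paddle.
    """
    if len(positions) < 2:
        return None
    height = bottom_y - top_y
    if height <= 0:
        return None

    # Average per-frame velocity across the window of frames.
    (x0, y0), (x1, y1) = positions[0], positions[-1]
    frames = len(positions) - 1
    vx = (x1 - x0) / frames
    vy = (y1 - y0) / frames
    if vx == 0:
        return None

    # Frames until the ball reaches the paddle's column.
    t = (paddle_x - x1) / vx
    if t < 0:
        return None  # ball is moving away from this paddle

    # Where the ball would end up if there were no walls...
    raw_y = y1 + vy * t
    # ...then fold that back into the playfield, one mirror per bounce.
    offset = (raw_y - top_y) % (2 * height)
    if offset > height:
        offset = 2 * height - offset
    return top_y + offset
```

For example, with boundaries at y = 0 and y = 100, a ball seen at (50, 50) and then (60, 70) heading for a paddle at x = 100 would sail to y = 150 if there were no wall, so the fold puts the intercept at y = 50 after one bounce off the bottom. Noisy recognition is exactly where this gets shaky, since one bad ball position in the window skews the velocity estimate.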

The code has been checked in up to this point.  https://mathpirate.net/svn/

