
Category — Crazy Weekend Project 2: Wesley Crusher Removal

Boldly Going…

As I said in the beginning, detecting the faces is only part of the equation.  Just knowing that something is a face isn’t all that useful for me.  I need to know whose face it is.  In order to write that part of the code, I’m going to need a whole bunch of faces to use as training data.  For this to be useful, I’m going to need a large number of views of the same person, as well as sets for different people.  You can’t rightly have something you call “facial recognition” if all it can do is recognize your own face.  So, that means that the webcam is ruled out for this phase.  There are only so many faces I have around my apartment, and it just won’t do.

There are, of course, other options.  I could try to raid someplace like Flickr and steal someone’s family photos, or scour gossip sites for celebrity images.  However, what I have in mind is much easier and, as you’ll see, has a practical benefit, as well.

Now, I just wrote a face detector.  It’s able to detect faces in video streams. So, why not put it to work on a video stream, detecting faces for me to use as my training set?  I’ll only get faces that I know the system can detect and they’ll already be cropped to the location of the face for me.
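The "already cropped" part boils down to copying the detected rectangle out of each frame.  Here is a minimal sketch of that step, assuming a grayscale frame held in a plain byte array.  The FaceCropSketch class and its Crop method are made-up names for illustration, not part of OpenCV or of the wrapper used in these posts.

```csharp
using System;

static class FaceCropSketch
{
    // Copies the detected rectangle out of a grayscale frame, clipped to the frame edges.
    // frame is indexed [row, column]; x and y are the rectangle's top-left corner.
    public static byte[,] Crop(byte[,] frame, int x, int y, int width, int height)
    {
        int frameHeight = frame.GetLength(0), frameWidth = frame.GetLength(1);
        int left = Math.Max(0, x), top = Math.Max(0, y);
        int right = Math.Min(frameWidth, x + width), bottom = Math.Min(frameHeight, y + height);

        // Rectangle lies entirely outside the frame: nothing to copy.
        if (right <= left || bottom <= top)
            return new byte[0, 0];

        var face = new byte[bottom - top, right - left];
        for (int row = top; row < bottom; row++)
            for (int col = left; col < right; col++)
                face[row - top, col - left] = frame[row, col];
        return face;
    }

    static void Main()
    {
        var frame = new byte[480, 640];                 // assumed 640x480 frame
        byte[,] face = Crop(frame, 600, 100, 80, 80);   // detection hanging off the right edge
        Console.WriteLine("{0} x {1}", face.GetLength(1), face.GetLength(0));
    }
}
```

Clipping matters because a detection rectangle near the border can extend past the frame, and a training image should only contain pixels that actually exist.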

[Image: STTNGFaceDetection]

Obviously, the best source material to use is an episode of Star Trek: The Next Generation.  This has two main benefits.

First, once I get all the faces detected and classified and the recognizer written, I’ll be able to run it against a different episode of TNG as a test.  Many of the people will be the same between episodes, so it should be able to recognize them, but the system should not recognize members of the supporting cast and guest stars, since they’re not in the source episode.  This is important, because I need to make sure it handles people it doesn’t know about correctly.
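Handling people the system doesn’t know about usually comes down to a rejection threshold: if a face isn’t close enough to anyone in the training set, call it unknown instead of forcing a match.  The sketch below is only an illustration of that idea.  The two-number "feature vectors", the 0.3 threshold, and the UnknownFaceSketch class are all invented for the example and say nothing about how the recognizer will eventually work.

```csharp
using System;
using System.Collections.Generic;

static class UnknownFaceSketch
{
    // Returns the closest known name, or "unknown" when nothing is within maxDistance.
    public static string Identify(double[] face, Dictionary<string, double[]> known, double maxDistance)
    {
        string bestName = "unknown";
        double bestDistance = maxDistance;
        foreach (KeyValuePair<string, double[]> entry in known)
        {
            double distance = Distance(face, entry.Value);
            if (distance <= bestDistance)
            {
                bestDistance = distance;
                bestName = entry.Key;
            }
        }
        return bestName;
    }

    // Euclidean distance between two feature vectors of equal length.
    static double Distance(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
            sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.Sqrt(sum);
    }

    static void Main()
    {
        // Made-up feature vectors standing in for the training set.
        var known = new Dictionary<string, double[]>
        {
            { "Picard", new[] { 0.9, 0.1 } },
            { "Wesley", new[] { 0.1, 0.8 } }
        };
        Console.WriteLine(Identify(new[] { 0.85, 0.15 }, known, 0.3)); // close to Picard
        Console.WriteLine(Identify(new[] { 0.5, 0.5 }, known, 0.3));   // a guest star: close to nobody
    }
}
```

Running against a second episode is what would show whether a threshold like this is set sensibly: regulars should come back with names, and guest stars should come back as unknown.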

The second and far more important benefit is that if I am successful, I will have written a program that is able to watch an episode of Star Trek: The Next Generation and let me know if Wesley Crusher is in it, so that I can skip those scenes.  That sort of technology has money-making potential.

November 26, 2009

Speed +5, Accuracy +3

As I hoped, tweaking the parameters to the function greatly sped up the processing of the face detection.  It’s now acceptably fast.  Not 120 fps fast, but it’s unlikely that faces in video will change that fast.  It’s running probably ten frames a second, which is good enough for what I wanted.  Additionally, the false positives have been cut down quite a bit.  So, here’s what I did:

The key function in face detection is called cvHaarDetectObjects.  It has something to do with Haar Classification and Cascades and all sorts of other technical mumbo jumbo that I don’t really care about at the moment because someone else has already figured it out and written it all for me.  All I know is that I have to call it with some parameters and the function returns a list of faces to me.  (And things that it thinks are faces because it’s delusional and psychotic…)  It’s not strictly a face detector.  It’s a generalized object detector that just happens to be used for detecting faces.

The first few parameters are straightforward, the image and storage, etc.  Then you get into less apparent parameters, like “Scale Factor” and “Min Neighbors”.

Scale Factor is how much to increase the size of the scanning window between passes.  I think that translates into something like “Higher numbers run faster with less accuracy.”
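To get a feel for why a bigger Scale Factor runs faster, here is a back-of-the-envelope sketch that counts how many window sizes get tried as the window grows from the minimum size up to the frame size.  It is a simplification, not what OpenCV does internally.  The 480-pixel frame side and the ScaleFactorSketch class are assumptions for the example.

```csharp
using System;

static class ScaleFactorSketch
{
    // Counts how many window sizes fit between minWindow and frameSide
    // when the window grows by scaleFactor on every pass.
    public static int CountPasses(double minWindow, double frameSide, double scaleFactor)
    {
        if (scaleFactor <= 1.0)
            throw new ArgumentOutOfRangeException(nameof(scaleFactor), "Must be greater than 1.");

        int passes = 0;
        for (double window = minWindow; window <= frameSide; window *= scaleFactor)
            passes++;
        return passes;
    }

    static void Main()
    {
        // 40 is the Min Size used in these posts; 480 is an assumed frame height.
        foreach (double factor in new[] { 1.1, 1.2, 1.5 })
            Console.WriteLine("scale factor {0}: {1} passes", factor, CountPasses(40, 480, factor));
    }
}
```

With these numbers, 1.1 needs 27 passes, 1.2 needs 14, and 1.5 needs only 7.  Fewer passes means less work per frame, but also bigger jumps between window sizes, so a face that falls between two sizes is easier to miss.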

Min Neighbors has something to do with overlaps and multiple detected objects or something like that.  Basically, the higher the number, the fewer false positives you get, but if you set it too high, you risk getting fewer true positives, too.
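As a rough illustration of the Min Neighbors idea: a real face tends to trigger several overlapping raw hits at slightly different positions and sizes, while a false positive often fires only once.  The sketch below groups similar hits and keeps only the groups with enough members.  The 20% similarity rule, the greedy grouping, and the MinNeighborsSketch class are assumptions made for the example, and OpenCV’s actual grouping is more involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MinNeighborsSketch
{
    // A square raw detection: top-left corner plus side length.
    public struct Box
    {
        public int X, Y, Size;
        public Box(int x, int y, int size) { X = x; Y = y; Size = size; }
    }

    // Hypothetical similarity rule: position and size differ by at most 20% of the smaller side.
    static bool Similar(Box a, Box b)
    {
        double tolerance = 0.2 * Math.Min(a.Size, b.Size);
        return Math.Abs(a.X - b.X) <= tolerance
            && Math.Abs(a.Y - b.Y) <= tolerance
            && Math.Abs(a.Size - b.Size) <= tolerance;
    }

    // Greedily groups raw hits, then keeps the first box of every group with at least minNeighbors hits.
    public static List<Box> Filter(IEnumerable<Box> rawHits, int minNeighbors)
    {
        var clusters = new List<List<Box>>();
        foreach (Box hit in rawHits)
        {
            List<Box> home = clusters.FirstOrDefault(c => Similar(c[0], hit));
            if (home == null)
            {
                home = new List<Box>();
                clusters.Add(home);
            }
            home.Add(hit);
        }
        return clusters.Where(c => c.Count >= minNeighbors).Select(c => c[0]).ToList();
    }

    static void Main()
    {
        var hits = new List<Box>
        {
            new Box(100, 100, 50), new Box(102, 99, 52), new Box(98, 103, 49), // one face, three raw hits
            new Box(300, 40, 60)                                               // a lone hit on a cardboard box
        };
        Console.WriteLine(Filter(hits, 1).Count); // both groups survive
        Console.WriteLine(Filter(hits, 3).Count); // only the face survives
        Console.WriteLine(Filter(hits, 5).Count); // set too high: the face is lost as well
    }
}
```

The last call shows the risk mentioned above: push the number too high and the true positives start disappearing along with the false ones.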

Then there’s a flag parameter which alters how the operation works.  I used the “Do Canny Pruning” value because that’s what someone else said to do.  Supposedly that value makes things run faster and results in fewer false positives.

Finally, there’s Min Size.  That’s the smallest scan window to use, which translates roughly to the smallest object you expect to find.  The smaller the window, the more things you’ll pick up, but at the expense of speed.

Right now, I’m calling the following and it seems to be doing a pretty good job:

CvSeq sequence = _classifierCascade.HaarDetectObjects(image, _memStorage, 1.2, 30, HaarDetectionType.DoCannyPruning, new CvSize(40, 40));

Now, it’s on to acquiring the training images…

November 26, 2009

Next steps…

Now, I need to do three things:

  1. Tweak the face detection settings so it’s not lame-ass slow.
  2. Tweak the face detection settings so that it doesn’t think walls and boxes look like faces.
  3. Get a whole bunch of identifiable faces for detection training.

Steps 1 and 2 might turn out to be difficult and error prone.  I know there were settings like minimum face size and tricks like grayscaling images and performing noise reduction that I skipped, which may help with those problems.

Step 3, though…  That I’ve already figured out…

November 26, 2009

Putting People In Boxes

So, after fixing the crashing issue, I started up the detection algorithm and let it roll.

[Image: FaceDetection1]

Despite the fact that I’m in a poorly lit room and wasn’t even looking at the camera, it found me.  I like that it’s sensitive enough for that, since my planned application of this can’t rely on well-lit clear images all the time.

Of course, the sensitivity has a downside…

[Image: FalsePositives]

I turned on the lights and it still found me, which was good.  However, it also believes that there are faces on the wall, on the ceiling, a big face made up of shelving and boxes, as well as three separate faces on the Commodore Plus/4 box.

It’s confused, of course, because it’s actually the Tomy Tutor box that has all the faces on it.

[Image: TomyTutorFaces]

So, in summary:

  • Multiple faces: good.
  • Faces where they don’t exist: psychotic and delusional.

Must fix the psychotic and delusional part before I let this loose on the world.

There’s one other slight problem…  The facial detection is pegging my CPU and processing about two frames a second with a two second delay.  That’s not going to be acceptable, either…

November 26, 2009

When In Doubt, Link Debug.

So, there’s apparently a bug somewhere in OpenCV that happens when you try to load a Haar Classifier file.  The error you get is the following:

—————————
OpenCV GUI Error Handler
—————————
Unspecified error (The node does not represent a user object (unknown type?))
in function cvRead, .\cxpersistence.cpp(5061)

Press “Abort” to terminate application.
Press “Retry” to debug (if the app is running under debugger).
Press “Ignore” to continue (this is not safe).

—————————
Abort   Retry   Ignore  
—————————

I’ve found references to this crash which indicate that the error has been happening for several years and several iterations and was not fixed in the version I’m using from early September.  (It may be gone in 2.0, not sure.)  The “fix” that most people recommend is to link against cvd.lib, rather than cv.lib. 

Now…  I haven’t fully researched this since I’m not dealing with the C libraries directly, but from the looks of it, they’re suggesting that you link against the debug build of the library and your problems will go away.

What a wonderful fix for a bug in a high-performance math-heavy visual processing library!

There is, however, a less commonly described workaround that won’t be totally dumb.  You do some other operation before calling load, such as creating a junk image and doing something to it. This is what I used and it worked for me:

//Workaround for crashing bug while loading classifier.
IplImage fakeImage = new IplImage(100, 100, BitDepth.U8, 1);
fakeImage.Erode(fakeImage);
fakeImage.Dispose();

November 26, 2009

i++: Unreachable Code Detected

[Image: UnreachableCode]

Apparently for loops are taking the day off.

November 26, 2009

Version 2.0?

Looks like OpenCV has leveled up since early September.  It’s now at 2.0, up from 1.1pre1a.

Oh well.  Not upgrading now.  Just note that if you’re taking any of the code that’s here, it might not work the same way in version 2.  Hooray for instant obsolescence!

November 26, 2009

So… Facial Recognition.

The second part of the project is to attempt to recognize faces.  Not just say that something is a face, but specifically, whose face it is.  This will be done in two parts.  First, it needs to detect a face in an image.  That should be fairly easy.  OpenCV has functions and data for face detection built in.  Then the recognition needs to happen on the detected face.  That won’t be quite as easy, because it’s going to involve training the system to figure out who is who.  In other words, the recognition phase will require actual work.

November 26, 2009

End of Day 1

So, at the end of the first day, one thing is clear:  It’s not just easy to use the speech recognition classes in .NET 3.0’s System.Speech.Recognition namespace, it’s really really freaking easy to use them.  It’s just a handful of lines to start recognizing speech.

The harder part is figuring out what you want it to recognize and how to deal with the speech after you’ve recognized it.

There’s really not that much left for me to do with the speech recognition at this point.  I wanted to spend time getting it working, but it didn’t actually take any time to get it working…  Any more work will be toward a specific use, which I wasn’t really planning on doing here.  So then, that means tomorrow will focus on the facial recognition side of the project.

November 25, 2009

I need to implement this property on myself.

November 25, 2009