Tuesday, May 31, 2011

Improvements learned by messing with O data

When I retraced the steps I had taken during the first 6 weeks with my X data and applied them to training a classifier for an O, I discovered multiple things I could do better, and I even fixed a bug! (And by fixing the bug I got even better accuracy :D).

Improvements:
1) Intra-rater reliability, just like David. But instead of a trained dermatologist classifying wrinkles (and ranking them in a different order), it is me labeling a positive X or a positive O. Every time I extract a positive X or a positive O from an image, I extract slightly different areas.

To attempt to compensate for this, for each training frame I extract the positive example twice. What I learned from this is that the area I, as a human, classify as the X or the O being made can vary by up to 5% in size/location. While not much, I felt it was significant enough to matter.
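To make that ~5% figure concrete, here is a rough Python sketch of how the two extractions of the same positive example could be compared (this is my own illustration with a made-up label_variation helper, not the project's actual code):

```python
import numpy as np

def label_variation(box_a, box_b):
    """Rough measure of how far apart the centers of two hand-labeled
    boxes for the same positive example are, as a fraction of the
    average box width. Boxes are (x, y, width, height) tuples."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Distance between the two box centers.
    center_shift = np.hypot((ax + aw / 2) - (bx + bw / 2),
                            (ay + ah / 2) - (by + bh / 2))
    size_ref = (aw + bw) / 2
    return center_shift / size_ref

# Two extractions of the same O that differ slightly:
print(label_variation((100, 80, 60, 32), (102, 81, 61, 33)))  # ~0.05, i.e. about 5%
```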

2) Throwing out the top and bottom 10% is not ALWAYS doing what I want it to do. What I was attempting to achieve with this was a quick hack to throw out most false positives. Whether it actually does depends on where in the image the X or O is and on the amount of data, but for the most part it doesn't.

Proof of concept, and an example where it does work: with very small amounts of data, throwing out the top and bottom 10% does what I want it to do. For the following I only used 4 frames of a sequence.

Without throwing out data: (computer thinks the center of the O is somewhere within the green).

Throwing out top and bottom 10%



As you can see, a couple of false positives GREATLY reduce the accuracy when dealing with a small amount of data. Even though throwing out the top and bottom 10% means the correct area is not selected, the answer is a lot more precise. However, again, this is with a crazy small amount of data.
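In rough Python terms, the hack looks something like this (an illustrative sketch with made-up detection data, not the project's actual code):

```python
import numpy as np

def trimmed_center(points, trim=0.10):
    """Quick hack against false positives: for each coordinate, drop
    the top and bottom `trim` fraction of the values before averaging
    the rest to get the estimated center of the X/O."""
    pts = np.asarray(points, dtype=float)
    lo, hi = int(len(pts) * trim), int(len(pts) * (1 - trim))
    return tuple(np.sort(pts[:, axis])[lo:hi].mean()
                 for axis in range(pts.shape[1]))

# Four frames' worth of detections, one of them a false positive:
detections = [(200, 150), (205, 148), (198, 152), (400, 300)]
print(trimmed_center(detections))  # the (400, 300) outlier gets dropped
```

With this few points the 10% trim only removes the single largest value per coordinate, which is exactly why it behaves well in the tiny 4-frame case above but not in general.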


Changed this to weighting each positive based on how far it is from the mean (a sketch of the idea follows the result images below).
Without any weighting:

First attempt: Some improvements, but not as much as I hoped.

Second attempt: Ooohhh yeaahhhh.





Sample final result of 10 frames (Note: this result is FAR BETTER THAN THE AVERAGE RESULT WILL BE), and it is pretty much as good as I will be able to make it.
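The post doesn't pin down the exact weighting function, so here is one plausible reading of "weighting each positive by how far it is from the mean" (Python; the inverse-distance weights are my assumption, not necessarily what the second attempt used):

```python
import numpy as np

def weighted_center(points):
    """Estimate the center by weighting each positive detection by its
    distance from the mean: far-away detections (likely false positives)
    contribute less, instead of being discarded outright."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    dists = np.linalg.norm(pts - mean, axis=1)
    # Inverse-distance weights; the epsilon avoids division by zero
    # for detections that land exactly on the mean.
    weights = 1.0 / (dists + 1e-6)
    return (pts * weights[:, None]).sum(axis=0) / weights.sum()

detections = [(200, 150), (205, 148), (198, 152), (400, 300)]
print(weighted_center(detections))  # less pulled toward the outlier than a plain mean
```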


3) Use the positive examples I trained on to determine the ratio of the boxes to scan when looking for the X or O. What this means is that when I labeled all my positive X's, they all had roughly the same aspect ratio, and this ratio did not really change with distance from the Kinect or hand size (this is a good thing). When scanning frames for an X, I was using a width-to-height ratio of 4/3 (1.3333); the actual ratio is 1.336. Close enough. However, for O's the actual ratio is 1.848, which is quite different from 1.33.
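Measuring that ratio is just an average over the labeled boxes. A minimal sketch (Python, with hypothetical labeled positives; the post only reports the final numbers, ~1.336 for X's and ~1.848 for O's):

```python
import numpy as np

def mean_aspect_ratio(positive_boxes):
    """Average width-to-height ratio of the labeled positives; used to
    size the boxes scanned when searching a frame for an X or an O."""
    boxes = np.asarray(positive_boxes, dtype=float)
    return (boxes[:, 2] / boxes[:, 3]).mean()  # columns: x, y, w, h

# Hypothetical labeled X positives at different distances/hand sizes:
x_positives = [(10, 20, 80, 60), (15, 22, 84, 63), (12, 19, 79, 59)]
print(mean_aspect_ratio(x_positives))  # close to 4/3 for X's
```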


Bug Fixed!

Due to a bug where I performed a shallow copy instead of a deep copy, I was incorrectly passing data to be scanned by the X classifier. The exact bug was that once I found a positive in an image, the pixels were modified to label this region based on the tree that detected it as a positive, and this modified data was then passed on for further scanning.
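The post doesn't include the code, but the failure mode in miniature looks like this (a Python/numpy illustration of my own, not the original code):

```python
import numpy as np

# The bug in miniature: writing the detection label into the image
# mutates the caller's array too, because the array is shared, not copied.
frame = np.zeros((4, 4), dtype=np.uint8)

def scan_and_label(image):
    image[1:3, 1:3] = 255   # mark the detected region in place
    return image

scan_and_label(frame)       # shallow: `frame` itself is now modified,
print(frame[1, 1])          # 255 -- corrupted data fed to later scans

frame2 = np.zeros((4, 4), dtype=np.uint8)
scan_and_label(frame2.copy())  # deep copy: the original stays clean
print(frame2[1, 1])            # 0
```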
