Monday, May 16, 2011

Math is cool

This week I worked on finding the most probable location for an X (or an O) based on the data from the multiple trees. Given a live stream of data from the Kinect, this will probably use the last second or so of data (30 frames). However, since my desktop (which I have been using for the live Kinect data) is still broken, I worked with what I had: the static images I already had, and MATLAB to experiment.

This is what I came up with, and it seems to be working the best so far. I do not know whether it can run in real time, but if it cannot, it should at least be close and should only need a few modifications to get there.


1) Scan over the whole image with 3 different block sizes: 1/6, 1/7 and 1/8 of the image. Run each block through all three trees. Scan with blocks that overlap by 25%. This will result in a lot of positives, some of which will be false positives. However, the highest concentration of positives should be the correct area.
2) Keep the data from the last several frames (however many turn out to be needed), and add the data from each new frame to this collection.
3) Throw out the top and bottom 10% of the data.
4) Find the area where the CENTER of the X (or O) should be by using the mean and the standard deviation of the remaining data. So for the results below, the computer's best guess for the CENTER of the X is the region within the green box; the box does NOT outline the whole area of the X.
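
The four steps above can be sketched roughly as follows. This is a minimal Python sketch, not the MATLAB code I actually ran, and it makes several assumptions: the names (`scan_windows`, `DetectionHistory`, `trimmed`, `center_region`) are made up for illustration, the 10% trimming is applied to the X and Y coordinates independently, the box is taken as the mean plus or minus one standard deviation, and the tree classifiers themselves are left out (only the centers of blocks they flagged as positive are passed in).

```python
from collections import deque
from statistics import mean, pstdev


def scan_windows(width, height, fractions=(6, 7, 8), overlap=0.25):
    """Step 1: yield (x, y, w, h) blocks at 1/6, 1/7 and 1/8 of the
    image size, with neighbouring blocks overlapping by 25%."""
    for n in fractions:
        w, h = width // n, height // n
        step_x = max(1, int(w * (1 - overlap)))
        step_y = max(1, int(h * (1 - overlap)))
        for y in range(0, height - h + 1, step_y):
            for x in range(0, width - w + 1, step_x):
                yield x, y, w, h


class DetectionHistory:
    """Step 2: keep the positive block centers from the last N frames."""

    def __init__(self, max_frames=30):
        # A deque with maxlen drops the oldest frame automatically.
        self.frames = deque(maxlen=max_frames)

    def add_frame(self, centers):
        self.frames.append(list(centers))

    def all_centers(self):
        return [c for frame in self.frames for c in frame]


def trimmed(values, trim=0.10):
    """Step 3: drop the lowest and highest `trim` fraction of the values.
    Assumption: trimming is done per coordinate, on sorted values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    return ordered[k:len(ordered) - k] if k else ordered


def center_region(centers, trim=0.10):
    """Step 4: return (x_min, y_min, x_max, y_max) for the likely CENTER.
    Assumption: the box is mean +/- one standard deviation per axis."""
    xs = trimmed([x for x, _ in centers], trim)
    ys = trimmed([y for _, y in centers], trim)
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    return mx - sx, my - sy, mx + sx, my + sy
```

With a live stream, each new frame would be scanned, its positive block centers added to the history, and `center_region(history.all_centers())` recomputed to get the green box.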

Results!
Note: Since I did not have access to a live stream of data, I manually picked out images where the X's are close together. The final image, with the box drawn on it, is just the first image of the series that was passed in.

Some of the images that were passed in for this (18 images were passed in; I am not going to upload them all):






Some more result images.






Since I was not using a live stream of data, and was instead eyeballing groups of images that appeared to have Xs in the same area of the image, the final results are not as good as they should be once real live stream data is used. Also, as you can see in the results, the computer is less certain about the Y coordinate of the center, but generally pretty sure about the X coordinate. This makes sense, as most false positives are heads, elbows, and knees/upper legs. These throw off the Y coordinate more than the X.
