Older blog entries for motters (starting at number 77)

Whilst there is a big song and dance about the Kinect, progress on Sentience continues. Since they're not prohibitively expensive and I have plenty of time on my hands, I'll try to acquire a Kinect and evaluate how suitable it is for robotics use. Willow Garage already seem to be doing something Kinect-related.

A new dense stereo algorithm called ELAS, developed by Andreas Geiger, has been added to the v4l2stereo utility. This works well and at a reasonable frame rate on the Minoru. It's probably the best dense stereo method that I've tried to date.


It may turn out that the structured light method which the Kinect uses isn't very useful outdoors, or suffers from interference when multiple units are used in close proximity, so there may still be a place for stereo vision as a depth sensing method.

8 Apr 2010 (updated 8 Apr 2010 at 20:56 UTC) »

Whilst testing out omnidirectional stereo vision I thought it would be a good idea to try to apply a dense stereo method to images like the ones used to produce this anaglyph.


Here the disparity is vertical rather than horizontal as is usually the case for stereo cameras.

However, it became apparent that I didn't yet have a dense stereo algorithm for v4l2stereo, so I decided to take some time out to develop one, in the hope that whatever is developed in a conventional stereo vision setup can be similarly applied to the omnidirectional case.

The stereo correspondence method which I've used for dense stereo is a fairly conventional one, and I've made extensive use of OpenMP to make it as multi-core scalable as possible. It uses the simple "patch matching" approach which is commonly described in the literature, and works reasonably well on the Minoru provided that some initial correction is done to make the colour mean and variance in the left and right images as similar as possible, so that comparing pixels becomes a less haphazard affair.
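To make the idea concrete, here is a minimal sketch of the two steps described above: matching the mean and variance of one image to the other, then picking the best patch match along a scanline. It is an illustration only, not the v4l2stereo code. It works on single rows of grey values, and the function names, the sum-of-absolute-differences cost, the patch size and the disparity limit are all assumptions made for the example.

```python
from statistics import mean, pstdev

def match_mean_variance(src, ref):
    """Rescale the values in `src` so their mean and spread match `ref`."""
    src_mean, ref_mean = mean(src), mean(ref)
    src_sd, ref_sd = pstdev(src), pstdev(ref)
    gain = ref_sd / src_sd if src_sd > 0 else 1.0
    return [(v - src_mean) * gain + ref_mean for v in src]

def best_disparity(left_row, right_row, x, patch=2, max_disp=8):
    """Find the shift whose patch in the right row best matches the patch
    centred on `x` in the left row, using sum of absolute differences."""
    if x - patch < 0 or x + patch >= len(left_row):
        raise ValueError("patch around x does not fit inside the row")
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d - patch < 0:
            break  # candidate patch would fall off the left edge
        cost = sum(abs(left_row[x + k] - right_row[x - d + k])
                   for k in range(-patch, patch + 1))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Without the correction step, a right image which is simply darker than the left would add a constant penalty to every candidate patch, which makes weak matches harder to tell apart from good ones.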

An example of the end result appears in the following video. It shows "big blob detection" reminiscent of what I had running on the Rodney humanoid over five years ago.


The depth resolution isn't fantastic, but it's functional and may be of use for obstacle detection or just detecting people nearby.

I also wanted to experiment with integrating this code with the Willow Garage ROS system. This would potentially enable the very expensive stereo cameras traditionally used in academic research to be replaced by something like a Minoru webcam, or a pair of webcams, which would be affordable to the hobbyist. The current source code release for v4l2stereo includes an example ROS publisher and subscriber, which should make integration with ROS based robots fairly straightforward.



The current plan is to attempt to construct a 2D map based upon the features from omnidirectional stereo vision. I can locate edge features close to the ground plane quite well, but the trouble with edges is that they're not very distinctive. I could use the edge data, projected into cartesian coordinates, to begin building a local map, but after a short time the map would begin to degenerate.

So what's needed are features which are more distinctive than edges. These could be tracked between frames (data association), and I could then use an off-the-shelf graph based SLAM algorithm, such as TORO, to build a map. At first I thought of using SIFT, which would be the obvious choice if I were an academic researcher, but there are software patent issues associated with that method that I'd rather not have to deal with. FAST corners would be nice, but the relatively low resolution caused by the mirror distortion means that this algorithm doesn't work well. But I can use the Harris corner features from "good features to track", which is already built into OpenCV. Having been an OpenCV refusenik for quite a number of years I'm now slowly growing to like it. Harris corners seem to work quite reliably, despite the low resolution.

Detecting the ground using an omnidirectional stereo vision system.


The green features in the centre mirror have been identified as being close to the ground plane.

Edge features close to the ground plane are detected by projecting all features from the centre mirror to the ground plane (the height of the camera is known), then reprojecting the ground features back into the image plane of the four peripheral mirrors. The reverse operation is then applied, and features within the centre mirror are compared. Features with small reprojection error must belong somewhere close to the ground plane.
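The reprojection test can be sketched in a much simplified form. The example below swaps the spherical mirrors for two idealised pinhole cameras looking straight down from the same known height, which is an assumption made purely to keep the geometry short. It also shows only the forward half of the check (one view to the ground, then into the other view). All function and parameter names are hypothetical.

```python
import math

def to_ground(pixel, cam_xy, height, focal):
    """Intersect the ray through `pixel` with the ground plane z = 0."""
    u, v = pixel
    return (cam_xy[0] + u * height / focal,
            cam_xy[1] + v * height / focal)

def to_image(point_xyz, cam_xy, height, focal):
    """Project a 3D point into a camera looking straight down from `height`."""
    x, y, z = point_xyz
    depth = height - z  # distance below the camera
    return (focal * (x - cam_xy[0]) / depth,
            focal * (y - cam_xy[1]) / depth)

def reprojection_error(pixel_a, pixel_b, cam_a, cam_b, height, focal):
    """Assume the feature seen at `pixel_a` lies on the ground, predict
    where it should appear in camera B, and return the distance in pixels
    from where it was actually observed (`pixel_b`)."""
    gx, gy = to_ground(pixel_a, cam_a, height, focal)
    pu, pv = to_image((gx, gy, 0.0), cam_b, height, focal)
    return math.hypot(pu - pixel_b[0], pv - pixel_b[1])
```

A feature which really is on the ground gives an error close to zero, whilst one raised above the ground lands in the wrong place in the second view, so a simple threshold on the error separates the two.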

This provides a convenient and general way of locating the ground, which does not depend upon unreliable texture, colour histogram or image segmentation methods.

A brief explanation about the new combined stereo and omnidirectional vision system on the GROK2 robot.


Having tried the classical space carving/voxel colouring techniques, and found them wanting, I'm going to try simpler methods such as edge, line and motion detection. If the robot is stationary, an observed moving object, like a person, should show up well and be easy to triangulate.

1 Feb 2010 (updated 1 Feb 2010 at 23:35 UTC) »

Looks like it's been a while since I last blogged here, so here's an update. In the last six months I've mainly been writing more stereo vision code. This was primarily for use with the Surveyor SVS, but I also wrote a version which runs on a PC under Linux.


v4l2stereo was initially written as an easy way to test the stereo algorithm before transferring it to the Blackfin, but later developed into a piece of software in its own right. You can see an example of the stereo disparities obtained with the Minoru webcam here:


The Minoru only has a short 6cm baseline, so the effective stereo range is not very great (probably less than two metres), but it works well on Linux. As is always the case with webcams the image capture is not synchronized, so if the cameras are moving quickly the delay does become a problem - but for most slow-moving robots it would be OK.
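The link between baseline and useful range follows from the standard pinhole stereo relation, depth = focal length x baseline / disparity. The sketch below works through it. The focal length in pixels used in the usage note is a made-up figure for illustration and not a measured Minoru value, and the function names are hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a given disparity in pixels."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity means the point is at infinity
    return focal_px * baseline_m / disparity_px

def depth_resolution(depth_m, focal_px, baseline_m):
    """Approximate change in depth caused by a one pixel change in
    disparity at the given depth (first order approximation)."""
    return depth_m ** 2 / (focal_px * baseline_m)
```

With an assumed focal length of 400 pixels and the 6cm baseline, `depth_resolution(2.0, 400, 0.06)` says how coarse the depth steps have become at two metres. Since the step size grows with the square of the distance, a short baseline runs out of useful resolution quickly.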

A nice recent feature of v4l2stereo is that it can be run in "headless" mode with no GUI output to the screen, and it can also stream the images over a network using GStreamer.

I also replaced Rodney's head with a simplified version which has the Minoru webcam mounted on it.


As of December 2009 I've also been experimenting with omnidirectional vision. I saw one of the videos from Pi Robot a while back, and had been meaning to try out something similar using a Christmas tree decoration as a spherical mirror. This actually works very well, and I've now created a new project called Omniclops for this code, since I didn't want to mix it up with anything else.


Fortunately the geometry for a spherical mirror is pretty simple to deal with, and the results look promising. So promising in fact that it's a cause for regret that I didn't try doing this many years ago. I've been aware of this type of vision for at least a decade, but the mirrors always looked too exotic or expensive to be worth bothering with, and making a parabolic mirror by hand without milling machinery seemed unlikely to be very successful.
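As an illustration of why the spherical case is manageable, the sketch below traces a single camera ray to a perfect sphere and reflects it about the surface normal, which gives the direction in the scene which that pixel is looking at. It is not taken from the Omniclops code. The point camera, the ideal mirror and the function name are assumptions made for the example.

```python
import math

def reflect_off_sphere(origin, direction, centre, radius):
    """Return (hit_point, reflected_direction) for a ray which hits a
    sphere, or None if it misses. `direction` must be a unit vector."""
    oc = [o - c for o, c in zip(origin, centre)]
    b = sum(d * v for d, v in zip(direction, oc))
    k = sum(v * v for v in oc) - radius * radius
    disc = b * b - k
    if disc < 0:
        return None  # the ray passes to one side of the sphere
    t = -b - math.sqrt(disc)  # nearest intersection along the ray
    if t <= 0:
        return None  # the sphere is behind the ray origin
    hit = [o + t * d for o, d in zip(origin, direction)]
    # For a sphere the surface normal is just the line from the centre
    # to the hit point, which is what keeps the geometry simple.
    normal = [(h - c) / radius for h, c in zip(hit, centre)]
    d_dot_n = sum(d * n for d, n in zip(direction, normal))
    reflected = [d - 2 * d_dot_n * n for d, n in zip(direction, normal)]
    return hit, reflected
```

A ray aimed straight at the centre of the mirror comes straight back towards the camera, whilst rays which strike nearer the rim are turned outwards, which is where the wide field of view comes from.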

Whilst fooling around with omnidirectional vision using a Christmas decoration a thought occurred to me. Could I somehow combine stereo vision with omnidirectional vision, so that objects could be ranged without needing to do structure from motion? At first I just thought of using a couple of mirrors with one of the stereo cameras, but then I thought, why not just use a single camera with a wide field of view looking at multiple mirrors spaced some distance apart? This seems like a good way to do things for the following reasons:

- You only need a single camera

- There are no camera synchronisation issues

- There are no illumination/colour correction issues

- Ultra wide field of view compared to conventional stereo vision

- Very cheap to build

On the down side the resolution of the image within each mirror is rather low, but this probably isn't a major handicap. Also the geometry is more complex than for ordinary stereo vision, but not prohibitively so. I lashed up a prototype from aluminium and cardboard, using five mirrors made from Christmas decorations (carefully) sawn in half to make hemispheres. You can see the resulting effect like so:


This is effectively the same as having five cameras with overlapping fields of view and fisheye lenses. Currently I'm thinking that this approach may be well suited to voxel colouring/space carving volumetric techniques, since it complies with the simple plane ordering constraint and the positions of the mirrors are known.

Workwise, I'm pretty much unemployed now - like a lot of software engineers at present - so I can work on this full time and see if I can get any useful volumetric modeling.

Calibrating the pan and tilt mechanism (again)


I've adapted older code to simplify this procedure and make it more integrated with the rest of the system.

Views from both cameras as an animated gif.


The effective stereo range with the cameras spaced 12cm apart looks like it's 4-5 metres. Objects in the far distance shouldn't appear to move.

GROK2 gets new cameras.


A brief guide to using the Minoru stereo webcam.


It seems to me that this device might be quite useful for robot projects. It wasn't very long ago that such a device would cost a couple of thousand dollars or more.

In addition to the feature based stereo I may also try implementing a dense stereo algorithm. My thoughts on using this as a replacement for the cameras on GROK2 are that the baseline is probably a little on the short side, but that it probably would work.

