Older blog entries for motters (starting at number 21)

In order to get better occupancy grid results I'm trying to improve the accuracy of the stereo matching algorithm.

Pretty much all the stereo algorithms which other people have devised perform abominably, even on high quality images. There are some examples of stereo matching from various researchers which at first glance appear to give impressive results, but when I substitute in my own images these algorithms produce an unintelligible fog. Traditionally stereo matching algorithms look for unique feature points in both images and then attempt to match them. The only system to date which performs half decently is one devised by Stan Birchfield, and my own system is based on his idea.

Instead of trying to look for identifiable feature points I decompose the image into a set of horizontal slices, where each slice represents an area of continuous colour or texture. I then attempt to match the slices between the two images. Matching of the slices occurs using the weighted sum of various attributes, such as colour, length and position relative to nearby slices.
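As a rough sketch of what such a weighted-sum slice cost might look like (the structure fields, weights and function names here are my own illustration, not the actual Rodney code):

```cpp
#include <cmath>

// Hypothetical attributes of a horizontal slice (illustrative only).
struct Slice {
    float r, g, b;   // mean colour of the slice
    float length;    // slice length in pixels
    float x;         // horizontal position of the slice centre
};

// Weighted-sum matching cost between a slice in the left image and a
// candidate slice in the right image; lower means a better match.
// The default weights are arbitrary placeholders to be tuned.
float matchCost(const Slice &left, const Slice &right,
                float wColour = 1.0f, float wLength = 0.5f, float wPos = 0.25f)
{
    float colourDiff = std::fabs(left.r - right.r)
                     + std::fabs(left.g - right.g)
                     + std::fabs(left.b - right.b);
    float lengthDiff = std::fabs(left.length - right.length);
    float posDiff    = std::fabs(left.x - right.x);
    return wColour * colourDiff + wLength * lengthDiff + wPos * posDiff;
}
```

Each left-image slice would then be paired with whichever right-image candidate minimises this cost.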

Until recently my stereo matching system only used mono images. The original colour images were converted to mono and then matched. However, I found that using colour gives a huge improvement in matching accuracy.

The algorithm that I've got at the moment is pretty good, but there is still room for improvement. Matching of horizontal slices is now about 90% accurate, but the estimation of distance based upon horizontal displacement still seems fairly inaccurate.
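For reference, the standard pinhole-camera relation behind the distance estimate is Z = f·B/d: depth is focal length times baseline divided by the horizontal displacement, so small errors in a small disparity translate into large depth errors. A minimal sketch (parameter names are my own):

```cpp
// Depth from disparity for a parallel stereo pair:
//   focalPixels - focal length expressed in pixels
//   baseline    - separation between the two cameras
//   disparity   - horizontal displacement of the matched feature/slice
// Returns -1 when there is no valid match (zero or negative disparity).
float depthFromDisparity(float focalPixels, float baseline, float disparity)
{
    if (disparity <= 0.0f) return -1.0f;
    return focalPixels * baseline / disparity;
}
```

Because depth varies as 1/d, a one-pixel matching error matters far more for distant objects than near ones, which may explain why the distance estimates feel less accurate than the 90% slice-matching rate suggests.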

I'm developing the new stereo algorithm in isolation at the moment using VB and taking images from the robot in order to test the accuracy of the depth images. The system runs sluggishly in VB, but once perfected I'll transfer it back to C++.

I've now done some reorganisation of Rodney's control system and turned it into a subsumption architecture. Previously the main part of the control program - called "executive control" - wasn't particularly well structured, having been hacked about rather a lot during development.

Using the subsumption system makes it much easier for me to add and test new behaviors without disturbing older ones too much.

Using the new ASC-16 controller I've been able to get significantly better movement of the robot, with control over speed and acceleration of the servos. Consequently I've improved the "blink" behavior so that it looks a lot more realistic.

I've implemented a crude 3D occupancy grid in the style of Moravec. This doesn't work brilliantly at the moment, and I'm not sure that my maths is correct for transforming camera coordinates into 3D coordinates. It needs more testing and debugging, but I think in principle I should be able to get it to work (although it won't be as accurate as Moravec's system).
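The camera-to-3D transform that needs checking is essentially a rotation by the head angles plus a translation by the camera position. A sketch of the simplest case, rotation about the vertical (pan) axis only (names are illustrative; tilt would add a second rotation):

```cpp
#include <cmath>

struct Point3 { float x, y, z; };

// Transform a point from camera coordinates into robot-centred
// coordinates: rotate by the head pan angle, then translate by the
// camera's position on the robot. Tilt is omitted for brevity.
Point3 cameraToRobot(Point3 p, float panRadians, Point3 camPos)
{
    float c = std::cos(panRadians), s = std::sin(panRadians);
    Point3 out;
    out.x =  c * p.x + s * p.z + camPos.x;
    out.y =  p.y + camPos.y;
    out.z = -s * p.x + c * p.z + camPos.z;
    return out;
}
```

A useful sanity check is that with zero pan the transform reduces to a pure translation, and that rotating by +90 degrees maps the camera's forward axis onto the sideways axis.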

I've now got Rodney version 4 up and running, and have been working on his head and eye movements. The system which I've used is similar to the previous tracking system. It's still pretty simple and uses a "centre of gravity" type algorithm to calculate a target position for the eyes.

It took quite a long time to tune the feedback loop which controls the saccade movements of the eyes, but I've now got them moving fairly accurately. If I place any object within stereo vision range the robot quickly moves its eyes and orients its head/torso to focus on it. The head and torso movements are just slaved from the eye movements, such that if the eyes move towards the limits of their motion the head and body make compensating movements to bring the eyes back to a central position. This orientation behavior looks fairly natural, and fixation of a nearby target by the eyes is almost completely accurate.
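The head-slaved-from-eyes idea can be sketched as a dead-zone proportional rule: the head only moves once the eyes drift beyond some comfortable range, and then by an amount proportional to the overshoot (the gain and dead-zone values here are placeholders, not the tuned ones):

```cpp
// Compensating head movement for a given eye pan angle (degrees).
// Inside the dead zone the head stays put; beyond it, the head turns
// proportionally so the eyes can re-centre.
float headCompensation(float eyeAngle, float deadZone = 10.0f, float gain = 0.5f)
{
    if (eyeAngle >  deadZone) return gain * (eyeAngle - deadZone);
    if (eyeAngle < -deadZone) return gain * (eyeAngle + deadZone);
    return 0.0f; // eyes near centre: no head movement needed
}
```

The same rule cascaded once more gives the torso movement slaved from the head.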

A new feature which I've added is a visual attention system. This is based on a similar system used on the MIT Cog and Kismet robots. Rodney presently pays attention to both stereo and motion cues. Physical movement of the robot's head suppresses attention to visual motion in a classical subsumption style.
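The suppression can be sketched as a weighted cue combination where the motion weight is zeroed while the head moves, since self-motion would otherwise swamp the motion detector (weights and names are my own illustration):

```cpp
// Combined attention score for an image region from stereo and motion
// cues. While the head itself is moving, the motion cue is suppressed
// subsumption-style by zeroing its weight.
float attention(float stereoSaliency, float motionSaliency, bool headMoving)
{
    float wMotion = headMoving ? 0.0f : 1.0f;
    return stereoSaliency + wMotion * motionSaliency;
}
```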

There are a couple of things to work on next. Firstly, Rodney's behavior at the moment is entirely "bottom up": he is driven purely by events in his immediate environment. I need to add some internal goals and motivations which help to structure the behavior in a rational manner (for example a desire to explore his surroundings, rather than just being obsessed by whatever object is closest to him). This leads on to the second thing which I'll have a go at, which is a 3D occupancy grid similar to those devised by Hans Moravec. Using some crude stereo methods and servo position feedback I may be able to have the robot build up a rough 3D model of his immediate surroundings.

19 Jan 2003 (updated 19 Jan 2003 at 22:25 UTC) »

I've made some mechanical and electronics changes to Rodney. The head has been completely rebuilt to accommodate the new firewire cameras, and the eyes can now verge together or move from side to side. There have also been some minor changes to the torso and the new ASC-16 servo controller has been fitted.

I'll be testing out the new robot over the next few days. It's possible that the improved movement of the eyes may help me to devise a much simpler method for calculating the distance to an object by using the vergence angle together with some sort of disparity minimisation function.
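The geometry behind distance-from-vergence is simple: with baseline b between the eyes and a total vergence angle theta, symmetric fixation gives Z = (b/2) / tan(theta/2). A sketch (my own formulation of the standard triangulation, not the author's eventual method):

```cpp
#include <cmath>

// Distance to the fixated point from the vergence angle alone.
//   baseline     - separation between the two cameras
//   thetaRadians - total vergence angle (sum of both eyes' inward turn)
// Returns -1 for parallel eyes (target effectively at infinity).
float distanceFromVergence(float baseline, float thetaRadians)
{
    if (thetaRadians <= 0.0f) return -1.0f;
    return (baseline * 0.5f) / std::tan(thetaRadians * 0.5f);
}
```

The larger the vergence angle, the nearer the fixated object, so a disparity-minimisation loop that drives the eyes to fixate a target yields its distance as a by-product.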

I've made a modified version of the stereo vision system to detect faces. The face detector looks for 'head and shoulders' type shapes within the stereo image, and doesn't rely on colour as most similar systems do.

The face detector seems to work well provided that the person is within 2 metres of the robot (the effective range of the stereo matching), and I think I could make further additions to estimate the gaze direction of the person and also guess their identity.
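One crude way to express a "head and shoulders" test is on a column-wise width profile of the segmented silhouette taken from the depth map: narrow at the top (head), distinctly wider below (shoulders). This sketch and its threshold are purely illustrative, not the detector's actual logic:

```cpp
#include <vector>

// Crude shape test on a silhouette width profile, ordered top to bottom:
// a head-and-shoulders outline should widen markedly below the head.
// The 1.5x widening threshold is an arbitrary placeholder.
bool looksLikeHeadAndShoulders(const std::vector<float> &widthTopToBottom)
{
    if (widthTopToBottom.size() < 2) return false;
    float head      = widthTopToBottom.front();
    float shoulders = widthTopToBottom.back();
    return head > 0.0f && shoulders > head * 1.5f;
}
```

Working on the depth image rather than colour is what lets this run on people of any skin tone and under any lighting, at the cost of only working within stereo range.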

I've now finished the first version of my stereoscopic vision demo for Windows. Formerly this only worked under Linux, but I've been able to make my way through the jungle of DirectShow to produce a Windows version of the same thing.

The Windows version performs significantly faster, since there are no problems with support for the camera image compression.

Since the stereo vision system doesn't rely upon motion detection or having the cameras in a fixed position it would be particularly suitable for mobile robotics. You could use it for visual obstacle avoidance, but my first thought is to use it to produce a more reliable face location system. I've tested the system on my 1.7GHz PC and I think that sort of processing speed is probably the minimum for practical applications.

More tomfoolery continues in the Rodney camp. I've now got both quickcams working under Red Hat Linux 8.0 and put together a stereo vision demo using a matching algorithm devised by Stan Birchfield.

The matching is far from perfect and frequently subject to phantom detections, but it does seem to be the most successful algorithm that I've come across so far.

Running the quickcams under Linux is far from ideal and gives an agonisingly slow frame rate because the images are transferred from the cameras in an uncompressed format.

You can find the source code at http://www.fuzzgun.btinternet.co.uk/rodney/vision.htm

Have been experimenting with various webcams which I might use for the next version of Rodney. All the cameras which I've tried have Direct Media compatible drivers, so that I can access a pair of cameras simultaneously.

The tiny Digital Dream L'espion is about the size of a matchbox and produces an image which is similar in quality to the Quickcam Express. It's very lightweight and would be ideal for a robot of some sort, but I think the way that the USB cable plugs into the side might make it awkward to incorporate into a stereo vision head.

I also have an old pair of USB ZoomCams. I originally intended to use these on the first version of Rodney, but decided against it because the drivers under Windows 98 were a bit dodgy, to put it mildly. However, under Windows XP the drivers seem OK. The main advantage of this camera is the quality of the image, which looks significantly cleaner and less grainy than the Quickcams. If I take off the outer shell of the camera to expose the bare circuit it looks like it would be relatively easy to mount this so that it could have both vertical and horizontal movement. I could maybe use half of a ping-pong ball to cover the electronics and make it look aesthetically more like an eye.

I've now got hold of an ASC-16 servo control board from www.medonis.com, which is going to replace the miniSSCs in the next version of Rodney.

The new board will be better because it includes speed/acceleration control in hardware, whereas previously all that was done in software. It also includes a few analogue inputs which might be useful, and it runs off a single 5V power supply, whereas the SSC needed two separate supplies.

I've tried the recently released open edition of Borland's Kylix 3. It looks nice, and is virtually identical to its Windows equivalent, C++ Builder, which I've been using to develop Rodney's vision system.

Actually, I chose C++ Builder precisely because I knew that Borland were developing a Linux version. This should mean that I can port Rodney's code to Linux with minimal effort, whilst not losing the modern drag-and-drop style of development.

There are a couple of unknowns when dealing with Linux. Firstly, how does video for Linux work, and does it support simultaneous access to two cameras? Secondly, how do serial comms work under Linux? Under Windows I'm using the MSComm control, and presumably there is something equivalent in Linux.
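The Linux equivalent of MSComm is the POSIX termios API: a serial port is just a device file you open and configure. A minimal sketch of opening a port at 9600 baud (the device path is an example; the ASC-16's actual settings would need checking):

```cpp
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

// Open and configure a serial device, e.g. "/dev/ttyS0".
// Returns the file descriptor, or -1 on failure. Bytes are then sent
// and received with plain write() and read() on the descriptor.
int openSerial(const char *device)
{
    int fd = open(device, O_RDWR | O_NOCTTY);
    if (fd < 0) return -1;

    termios tio;
    tcgetattr(fd, &tio);          // start from the current settings
    cfsetispeed(&tio, B9600);     // 9600 baud in
    cfsetospeed(&tio, B9600);     // 9600 baud out
    tio.c_cflag |= (CLOCAL | CREAD); // ignore modem lines, enable receiver
    tcsetattr(fd, TCSANOW, &tio); // apply immediately
    return fd;
}
```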
