Older blog entries for rudybrian (starting at number 30)

Zaza's replacement belts and motors arrived on the 3rd, but there is a problem with the encoder on one of the motors. The new motor uses the current HP encoder technology and a different cable from the old one. Unfortunately, iRobot didn't ship the needed adapter cable with the order, so we have to wait yet another week until it arrives before we can do the hardware checkout and qualification and return Zaza to service.

On a positive note, I spent some time over the last few weeks improving the client-server communication scheme for the voiceServer and face applet. The old method used a CGI-based polling technique that was rather slow and inefficient even with mod_perl. The new method uses a Perl POE-based server-side interface (POEfaceClient) to manage each of the connected face applets, which greatly reduces the amount of handshaking needed to keep current with the voiceServer's cue stack. Because the cue data lives in shared memory, backward compatibility is retained: clients connecting through a firewall can still use the old CGI interface method if needed.
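The difference between the two interface styles can be sketched roughly like this (an illustrative Python sketch of the idea only; the real server is Perl/POE, and the class and method names here are hypothetical):

```python
class CueBroadcaster:
    """Push new speech cues to subscribers as they arrive,
    instead of having each client poll for changes."""

    def __init__(self):
        self.subscribers = []  # one callback per connected face client
        self.cue_stack = []    # history kept for polling (legacy) clients

    def subscribe(self, callback):
        """Register a push-style client; returns the index it is current to."""
        self.subscribers.append(callback)
        return len(self.cue_stack)

    def publish(self, cue):
        """New cue arrives: record it, then push it to every subscriber.
        No client-initiated round trip is needed."""
        self.cue_stack.append(cue)
        for notify in self.subscribers:
            notify(cue)

    def poll(self, since):
        """Legacy CGI-style interface: return cues added after 'since'.
        Works through firewalls, at the cost of repeated handshakes."""
        return self.cue_stack[since:], len(self.cue_stack)
```

The push path removes the per-request handshake entirely, while the poll path stays available for clients that can only make outbound HTTP-style requests.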

The new client handshaking technique required rewriting several key areas of the face applet, so I renamed it zazaface2 to avoid caching problems with older browsers. The 2.01 release supports auto-reconnect on socket error and can now handle a server disconnect gracefully. In 2.02 I'll probably add an auto-fallback-to-CGI handshaking option and expose a few of the options as applet parameters instead of hard-coding them.
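The auto-reconnect behavior boils down to a retry loop with backoff between attempts. A minimal sketch of the idea (Python here rather than the applet's Java; the function names and delay values are hypothetical):

```python
import time

def connect_with_retry(connect, max_attempts=5, base_delay=0.5):
    """Try to (re)connect after a socket error, backing off
    exponentially between attempts.

    'connect' is any callable that raises OSError on failure and
    returns a connection object on success."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except OSError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * 2 ** attempt)
```

The backoff keeps a flapping server from being hammered with reconnect attempts, and the final re-raise lets the caller fall back to another transport (such as CGI polling) once retries are exhausted.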

The Tech received a rather sizable donation of Intel PRO/Wireless 5000 (802.11a and 802.11b) wireless LAN gear and is getting ready to install it. This opens up all kinds of options for Zaza, including real-time high-rate video and audio streaming. I have started the search for an Ethernet-to-802.11a bridge that will allow use of both of the onboard computers, or alternatively a PCI card that has Linux drivers available and an external antenna that can be positioned somewhere in the acrylic hood...

It turns out that Zaza's base troubles are a bit worse than I had originally estimated. After pulling each of the motors and running them through a few tests, I found that all three drive (translation) motors are bad. Fortunately, the Tech is closed for renovation for the entire month of September, so we should be able to get her back up and running by the time the museum re-opens.

During the month-long downtime we should be able to make a few other enhancements, like adding cooling and speaker vents in the robot's upper enclosure similar to what the newer B21rs have. This should allow us to keep the enclosure doors closed all the time without the potential for overheating. The speaker audio vents should help improve speech intelligibility too.

I recently added support for Sable markup to the voiceServer. There are some quirks, but it's now possible to re-introduce support for the sound effects we used with the original VB-based face application. Sable also allows multiple voices and languages, as well as specifying pronunciation and inflection to improve the quality of the speech.
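For illustration, a cue using Sable markup might look like the fragment below. The tag names follow the Sable specification that Festival understands; the text, voice name, and attribute values are made-up examples, and exact support varies with the Festival version:

```xml
<SABLE>
  <SPEAKER NAME="male1">
    Welcome to The Tech Museum of Innovation.
    <BREAK LEVEL="large"/>
    <AUDIO SRC="beep.wav"/>
    My name is <EMPH>Zaza</EMPH>, and I will be your
    <PITCH BASE="high">tour guide</PITCH> today.
  </SPEAKER>
</SABLE>
```

The AUDIO tag is what makes re-introducing the old sound effects possible, while SPEAKER, EMPH, and PITCH cover the multiple-voice and inflection control mentioned above.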

It's been a while since the last post, so I guess it's time ;)

We have had to push back the Zaza Phase IV deployment plans until we can resolve a voice intelligibility problem introduced with the new speech system. I had originally been using the MBROLA 'us1' voice with Festival, but both the quality and pitch of the voice were too low. I made several attempts to improve recognition accuracy by pitch-shifting the waveform and applying high-pass filters to block out some of the resonant frequencies of the enclosure, but they seemed to have no effect on intelligibility. I recently switched to the OGI CSLU 'tll' voice, which isn't quite as good as the M$ SAPI4/5 'Mary' voice, but is a marked improvement over the MBROLA voice. Initial tests this past weekend showed a remarkable improvement in recognition accuracy. Some of the informational text monologues for the exhibit locations were taken directly from the museum's webpage and should probably be re-worded to improve pronunciation with the TTS engine. It might also be beneficial to introduce support for Sable in the near future.
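For reference, the high-pass filtering attempt amounts to something like the first-order filter below (an illustrative pure-Python sketch, not the actual processing chain; the cutoff and sample-rate values are hypothetical):

```python
import math

def high_pass(samples, cutoff_hz, sample_rate=16000):
    """Simple first-order high-pass filter (discretized RC filter).
    Attenuates content below cutoff_hz, such as low-frequency
    enclosure resonances, while passing the rest of the speech band."""
    rc = 1.0 / (2 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # y[i] = alpha * (y[i-1] + x[i] - x[i-1])
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out
```

A first-order filter like this rolls off gently (6 dB/octave), which may be part of why it had little audible effect; a steeper filter or a notch at the specific resonant frequencies would be more targeted.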

Zaza had another major hardware failure on Saturday. Since her hardware has seen quite a bit of use, the belts that link all of the wheels to the motors are worn and due for replacement. It also looks, or rather sounds, like one of the motors is in desperate need of new brushes. The amount of odometric error has been growing progressively worse and has begun to negatively influence the localizer. The 'MCP' has begun 'limping' the drive motors after most motion commands with a 'TERR' (translate error). This needs to be fixed before we can run the robot again. Hopefully RWI will be forthcoming with the info we need to service the robot...

17 Jun 2002 (updated 17 Jun 2002 at 19:01 UTC) »

Whooboy... The Zaza Phase IV deployment target is edging ever closer, and quite a bit has been done to wrap things up, but there is more to do. I have made major updates to reaction, poslibtcx and poslib and a bunch of minor updates to the map applet, face, and voice system since the last post.

Last weekend's test run went very well. All of the software modules performed flawlessly. The only problem observed was an ACCESS.bus lockup at the end of the run.

Three weeks ago the DC/DC converter that supplies the -12V reference voltage for both of the motherboards on the robot failed and needed to be replaced. Unfortunately, the RocketPort serial board needs this voltage to drive the serial interfaces for three of the robot's subsystems. A drop-in replacement part wasn't available, so a TI PT4224 had to be installed in its place. The new part is substantially more efficient, but it is larger and has a different form factor, so an adapter board had to be made. Removing the old part was a real exercise: the power distribution board had to be removed from the robot, which took about an hour and a half, and then the old part had to be delicately removed with a Dremel tool and a large pair of Vise-Grips ;) The new part was installed and tested operational less than a week after the old part failed, but I might have stressed the ACCESS.bus cables a bit while re-installing the power distribution board, making the overly sensitive bus even more flaky. I'll need to fix this before next weekend's run.

I'll be announcing a limited public run of the new web-based tourguide functionality (Phase IV) for TRCY club members this week for next Saturday's run. If you aren't already a member, join and get in on the fun!

Last Saturday's Phase IV test went pretty much as planned. We didn't have any collisions or close calls during the run. During the test, another demo activity converged in the 'atrium area' on the lower level of the museum, blocking Zaza's only route to her next goal. The large number of people, plus the unmapped obstacles currently in the area from the installation of the 'Play' exhibit in the temporary exhibit space, prevented Zaza's localizer from getting a revised position for an extended period of time. She began searching for an obstacle that looked familiar, but the visitors were so engaged in the demo that they would not let her pass. The 'reaction' module was running, and the robot began verbalizing her dislike of being blocked to the demo audience, to the dismay of the presenter ;) To keep Zaza from further interfering with the demo, we manually joysticked her out of the area. The remainder of the test was fairly uneventful. I was finally able to get the high-level people detection code working about halfway through the run, and we used it for the remainder of the test. If it proves better at detecting people, I'll write a new version of reaction to support it for the Phase III and IV operation modes.

I made a few architectural improvements to the voice/face system over the last few days. The voiceServer now maintains a 'stack' of the last n cues in shared memory to provide slower asynchronous clients a loss-free way to get speech cue data. This should eliminate the possibility of losing cues that are sent too quickly to be spoken in real time. The change required updating the clients and applet, but it was worth it.
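The cue stack amounts to a bounded history with monotonically increasing cue IDs, so a slow client can ask for "everything since the last cue I saw." A minimal Python sketch of the idea (the real implementation is Perl with shared memory; the class and method names here are hypothetical):

```python
from collections import deque

class CueStack:
    """Keep the last n speech cues so a slow asynchronous client can
    catch up without losing cues, using a monotonically increasing
    cue counter as the ID."""

    def __init__(self, depth=32):
        self.cues = deque(maxlen=depth)  # oldest cues fall off the end
        self.next_id = 0

    def push(self, text):
        self.cues.append((self.next_id, text))
        self.next_id += 1

    def since(self, last_seen_id):
        """Return every retained cue the client has not yet seen."""
        return [(i, t) for i, t in self.cues if i > last_seen_id]
```

As long as a client polls before `depth` new cues arrive, it sees every cue exactly once; only a client that falls more than `depth` cues behind can lose anything.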

Just for grins, I ran the code I have written since acquiring the robot through SLOCCount; quite amusing this obsession is ;)

The Tech Museum's web folks now have the 'official' Zaza website online.


I finally had a chance to update the Zaza Info and Technical sections on my site this week. Quite a bit has happened in the last five months!

I'm looking forward to tomorrow's Phase IV test and hoping everything goes smoothly ;)

Made a fair amount of progress today.

I solved a fairly serious problem where Zaza would occasionally crash into obstacles after visitors 'herded' her into them. It turns out we weren't using a critical component of the high-level collision avoidance code and hadn't noticed until now. Oops ;)

I fixed a bug in the voiceClient Perl code that caused the Java face applet to get 'shy' and stop talking when it received a bad speech request from a client.

Last week I submitted a few pages I have been working on for the Tech's website about Zaza. Hopefully the Tech's webfolks will have it up in a week or so.

The FFmpeg team is nearly ready to release another official version that will fix some of the problems with live streaming. It's been nearly nine months since the last release, so it's been a long time in coming. This will allow me to replace the JPEG-based 'webcam' scripts and applet/ActiveX control with a true video streaming system using MPEG4/H263 compression and MP3 audio.

23 Apr 2002 (updated 23 Apr 2002 at 21:15 UTC) »

The Zaza project is picking up momentum.

After three months of software development work I was finally able to switch Zaza's second onboard computer over to Linux. There were some quirks to getting the machine working properly, though. The SMP kernel/hardware combination had some trouble with two of the network cards we tried, but I eventually found one that worked properly. We couldn't get the composite NTSC video-out port on the video card to sync when running under X, so a dedicated external VGA-to-NTSC converter had to be added. Last Saturday we did the first public run with the new face/voice setup made possible by the upgrade. Things went pretty well, all things considered.

We have been doing software tests of the Phase IV code on Saturday afternoons after the regular public demo. Slow progress is being made on the 'tour' behavior code, but we should have something usable by mid-summer.

Lots of updates on Zaza's code development since last month:

I think I finally resolved the bug in the zazacam applet that caused it to slowly consume virtual memory. Apparently Java likes to cache images in memory indefinitely if the getImage() function is used. Garbage collection and flush()ing manually don't seem to help.

I fixed a bug accidentally introduced into zazamap applet during the re-write for 0.60 that prevented the graphics canvas from repainting automatically after a position update. The applet still consumes far too much of the CPU in auto-track and higher zoom modes, so I still have some work to do.

As mentioned in the last entry, the zazaface applet now works with the faceServer and performs fairly well. I expected both the scripts and waveforms to be cached on the client side, but it appears that only the waveform is. Performance is still acceptable. I also updated the viseme image handling routine to auto-scale to the graphics canvas. I still need to tweak the sections of the code related to running in application mode, and correct the polling delay so that it is consistent between platforms.

Before yesterday's Phase IV test run I found and fixed one longstanding bug in poslibtcx's goal arrival code that caused unpredictable behavior. For some reason GCC isn't issuing warnings about variables that are declared twice ;)

I started researching possibilities for the new video distribution system that switching zaza2 to Linux enables. MPEG4IP has some great tools for live streaming, but until the licensing fee issue is resolved and MPEG4 clients are standardized, it isn't an option. FFmpeg worked quite well in tests with an earlier version, but FFServer in the current version is broken, so we won't be able to use it either. I guess we are still stuck with MJPEG until a better option becomes available.

20 Feb 2002 (updated 20 Feb 2002 at 22:35 UTC) »

Last week I finished a reference implementation of Zaza's new face/voice server in Perl. I debated using either Perl or Java for the server-side components, as each has aspects that could be of some use. I finally settled on Perl because it would take less time to put together something stable.

The 'faceServer' application is designed to be scalable to the available computing resources both onboard and offboard the robot. It connects to a Festival server by use of the Festival::Client::Async module. Since Festival's rendering of speech can be rather slow, the Festival server can be located on a higher-speed offboard computer. Speech 'cues' are cached on the local file system for better performance with pre-rendered cues. The Java face applet is notified when new cues have been sent to the server by use of cue ID numbers (10-digit CRC32s of the cue string). The client downloads the waveform and 'script' data and begins playback as soon as it has both files. Since the filenames of each cue are unique, they are cached by the applet so no download is required if the cue has already been 'performed' by the applet.
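The cue-ID scheme described above can be sketched in a few lines (Python here rather than the server's Perl; the filename suffixes and helper names are assumptions for illustration):

```python
import zlib

def cue_id(cue_text):
    """Derive a stable 10-digit numeric ID from the cue string,
    mirroring the CRC32-based naming described above. The same text
    always maps to the same ID, so renders can be cached by name."""
    return "%010d" % (zlib.crc32(cue_text.encode("utf-8")) & 0xFFFFFFFF)

def cue_filenames(cue_text):
    """Waveform and 'script' filenames for a cue (suffixes assumed)."""
    cid = cue_id(cue_text)
    return cid + ".wav", cid + ".script"
```

Because the ID is a pure function of the cue text, both the server-side file cache and the applet's download cache can check for a hit before rendering or fetching anything.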

I still need to spend some time with the applet, but expect to have it operational by Friday afternoon.

As always, the project code can be found here.

