Memory Leak Blamed for Princeton's Urban Challenge Loss

Posted 18 Nov 2007 at 23:41 UTC (updated 20 Nov 2007 at 16:53 UTC) by steve

The Princeton Autonomous Vehicle Engineering (PAVE) team chose Microsoft's proprietary Robotics Studio and the C# programming language as the development platform for their robot. As it turns out, this choice may be partially responsible for their failure in the qualifying rounds. The robot's response time deteriorated slowly and ended in complete software failure after about 40 minutes. As a workaround, the team had a timer reboot their computers every 40 minutes. However, the problem turned out to be related to garbage collection of memory used to store information about obstacles, which meant the more obstacles the robot saw, the faster the software failed. In the qualifying rounds, there were many more obstacles than expected, leading to a crash after 28 minutes. Analysis of the code performed later with a proprietary .NET code profiler revealed that under certain conditions, the C# garbage collector wasn't freeing memory as the programmers expected. For more, see PAVE team member Bryan Cattle's more detailed description of the problem. A discussion of the incident can also be found in a recent Slashdot posting. Correction: The events mentioned in the referenced article occurred in the 2005 DARPA Grand Challenge, not the 2007 Urban Challenge as suggested by the article's release date. See comments below from Microsoft's Tandy Trower for further details.

CLR or user at fault?, posted 19 Nov 2007 at 17:33 UTC by Rog-a-matic » (Master)

Are they sure this was a problem with the runtime itself and not the user's code? A quick read of the /. postings reveals a debate about this.

The article sort of looks like a promo piece for ANTS.

Or both?, posted 19 Nov 2007 at 19:54 UTC by steve » (Master)

I've seen similar disasters with Java development. I think the fundamental problem is that some programmers rely too much on garbage collectors doing the "right thing", for definitions of "right thing" held by the programmer and not the language designers. :-)

Do what I mean, not what I say!, posted 19 Nov 2007 at 20:10 UTC by Rog-a-matic » (Master)

Maybe the programmers left objects around that they didn't need anymore but that were still referenced somewhere, and the whole thing just bogged down after memory ran out. I've written programs that do that :) If the garbage collector could just read our minds!
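
For readers who haven't run into this, here is a minimal sketch of the kind of leak described above, written in Java rather than C# (both runtimes behave the same way on this point). Everything in it is hypothetical: the Obstacle, LeakyTracker and BoundedTracker names are made up for illustration and are not taken from PAVE's code. The leaky version keeps a reference to every obstacle it has ever seen, so the collector can never reclaim any of them; the bounded version drops its reference to the oldest entry once a limit is reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ObstacleLeakSketch {
    /** Hypothetical obstacle record; the team's real data structures are not shown in the article. */
    static final class Obstacle {
        final double x, y;
        Obstacle(double x, double y) { this.x = x; this.y = y; }
    }

    /** Leaky: every obstacle stays reachable through the list, so it is never garbage. */
    static final class LeakyTracker {
        private final List<Obstacle> seen = new ArrayList<>();
        void onObstacle(Obstacle o) { seen.add(o); }
        int retained() { return seen.size(); }
    }

    /** Bounded: keeps only the most recent obstacles and releases the oldest reference. */
    static final class BoundedTracker {
        private final Deque<Obstacle> recent = new ArrayDeque<>();
        private final int capacity;

        BoundedTracker(int capacity) {
            if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
            this.capacity = capacity;
        }

        void onObstacle(Obstacle o) {
            // Once full, drop the oldest entry so the collector is free to reclaim it.
            if (recent.size() == capacity) recent.removeFirst();
            recent.addLast(o);
        }

        int retained() { return recent.size(); }
    }

    public static void main(String[] args) {
        LeakyTracker leaky = new LeakyTracker();
        BoundedTracker bounded = new BoundedTracker(100);
        for (int i = 0; i < 10_000; i++) {
            leaky.onObstacle(new Obstacle(i, i));
            bounded.onObstacle(new Obstacle(i, i));
        }
        System.out.println("leaky retains " + leaky.retained());     // leaky retains 10000
        System.out.println("bounded retains " + bounded.retained()); // bounded retains 100
    }
}
```

The garbage collector is doing its job in both cases. The difference is only in whether the program still holds a reference, which is why the leaky version grows faster the more obstacles it sees.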


Stress test, posted 19 Nov 2007 at 20:47 UTC by motters » (Master)

I've also run into garbage collection problems with C# in a few cases, but generally it works well. The bottom line is that whenever developing realtime systems you always need to benchmark the code and see what the timings are and how they change. Also for a robot like this there's no substitute for doing "stress testing" just by running the system for long periods of time to see if anything breaks.
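
A rough sketch of what that kind of timing check might look like, again in Java and with made-up names (TimingDriftSketch, timeIterations, driftRatio): time each iteration of the control loop, then compare the average of the last few samples with the average of the first few. The workload here is a stand-in that deliberately gets slower as it accumulates state; a real test would call the robot's actual update step and run for far longer.

```java
import java.util.ArrayList;
import java.util.List;

public class TimingDriftSketch {
    /** Times each call to step and returns the per-iteration durations in nanoseconds. */
    static long[] timeIterations(Runnable step, int iterations) {
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            step.run();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    /** Mean of the last `window` samples divided by the mean of the first `window` samples. */
    static double driftRatio(long[] samplesNanos, int window) {
        if (window <= 0 || samplesNanos.length < 2 * window) {
            throw new IllegalArgumentException("need at least 2 * window samples");
        }
        double first = 0, last = 0;
        for (int i = 0; i < window; i++) {
            first += samplesNanos[i];
            last += samplesNanos[samplesNanos.length - window + i];
        }
        // Both sums cover `window` samples, so the ratio of sums equals the ratio of means.
        return last / first;
    }

    public static void main(String[] args) {
        // Stand-in workload: each step retains more state and rescans all of it.
        List<double[]> retained = new ArrayList<>();
        long[] samples = timeIterations(() -> {
            retained.add(new double[64]);
            double sum = 0;
            for (double[] block : retained) sum += block[0];
            if (sum < 0) throw new IllegalStateException("unreachable; keeps the scan from being optimized away");
        }, 5_000);

        double ratio = driftRatio(samples, 500);
        System.out.println("late/early step time ratio: " + ratio);
        // A ratio that keeps climbing as the run gets longer suggests something is accumulating.
    }
}
```

Wall-clock samples are noisy (JIT warm-up, GC pauses, other processes), so a single ratio proves little. The point is to log it over a long run and watch the trend.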

Microsoft responds, posted 20 Nov 2007 at 16:39 UTC by steve » (Master)

I received this comment in an email from Tandy Trower of Microsoft this morning, offering corrections to some of the facts of the story. Most importantly, while the referenced article was published this month, the events it describes actually took place in the 2005 Grand Challenge, not the 2007 Urban Challenge. That date precedes both the release of Microsoft's Robotics Studio and PAVE's use of the software:

I wanted to offer some corrections to your recent post on where you suggested that PAVE’s failure at the DARPA Challenge might have been due to their selection and use of Microsoft Robotics Studio. Perhaps you have also received from other sources.

First, the reference you used from Slashdot, which was in turn derived from the article posted on the Code Project website, was about Princeton’s participation in the 2005 DARPA Grand Challenge, not the 2007 event, which you can see if you take a closer look. Microsoft Robotics Studio was only used by the PAVE team in the 2007 event, and wasn’t even announced or previewed in 2005. So the conclusion and linkage to our robotics SDK is incorrect.

Second, the article written by Bryan Cattle cited on Code Project I believe was intended to talk about how the ANTS profiler tool could have been used to identify a problem, not a C# or CLR GC issue.

Third, it is important to know that you can still leak in a managed memory environment, if you don’t remove references to resources.

As an aside, Microsoft Robotics Studio actually does more than any system we are aware of, in that it will automatically discard messages accumulating on DSS ports, if a certain limit is exceeded.

Even the Slashdot post you derived your post from never mentions any reference to Microsoft Robotics Studio nor does the Code Project article. I am certain if you contact Bryan Cattle or the other folks at PAVE they would also confirm that your inclusion of Microsoft Robotics Studio in your post was in error. Again the article references only their 2005 participation, not 2007. While they indeed did not make it into the finals at the 2007 event, we have no information that suggests that it had anything to do with their use of our SDK.

Tandy Trower
