
3D-environment modeling as an enabler for autonomous vehicles



COVER STORY AUTONOMOUS DRIVING

The ability to generate and interpret a three-dimensional model of the real world is a key technology for autonomous vehicle movements. This article discusses a powerful approach, developed at the BMW Group, which provides dense and accurate point clouds of a vehicle's environment, generated with a motion stereo algorithm. The potential of the approach is showcased using a parking assistance system.

AUTHORS

DR. RER. NAT. ERIC WAHL analyzes the potential of image-based approaches for the BMW Group in Munich (Germany).

CHRISTIAN UNGER, M.SC. investigates 3D environment models as a doctoral student at the BMW Group in Munich (Germany).

DIPL.-ING. ARMIN ZELLER is Team Leader responsible for near-field systems at the BMW Group in Munich (Germany).

DR.-ING. DIRK ROSSBERG is Department Leader for camera-based driver assistance systems at the BMW Group in Munich (Germany).

02I2010 Volume 112

3D-DATA-ACQUISITION

In recent years, camera-based driver assistance systems have become established in the automotive industry. Currently, the most popular applications, such as lane departure warning (LDW), traffic sign recognition (TSR) and blind spot surveillance (BSS), target the space in front of or behind a car.

For assistance functions that ease the driving task but still keep the driver in the loop, these systems fulfill all essential requirements. But the step from an assisting vehicle to a fully autonomous vehicle also requires information about a car's lateral space, ideally in a spatial representation.

However, applications interpreting the lateral space of a car are still rare and mostly limited to ultrasound or lidar sensors. By nature, these sensors strongly reduce the details acquired from the world to a small set of measurements. Thus every model derived from these sensor outputs is very coarse, which limits the number of possible applications. This might be one reason why parking assistance systems are the only application established in the lateral environment of a car, since the required accuracy of information is relatively low.

The use of cameras could end this restriction, enabling a mass of new applications on both sides of a car. So far, cameras have mostly been used as a "visual enhancement" for the driver, observing areas around a car that are hidden due to geometric restrictions. Furthermore, the negative effects of high velocities on the quality of image acquisition, which result in blurry images and large changes between subsequent shots, have obscured the potential of camera systems with respect to lateral driver assistance systems.

In this article, we demonstrate a powerful approach for 3D data acquisition in the context of parking assistance. However, this system is just one example in a line of promising possibilities.

A BASE TECHNOLOGY FOR ADVANCED PARKING ASSISTANCE

The basic principle of every camera is the projection of the three-dimensional world onto a flat two-dimensional image. In the context of modeling a car's environment, this lost depth information can be retrieved either by estimating it, based on complex and often wrong assumptions about the world, or by adding movement to the data acquisition process. The difference between the two strategies is that the first approach estimates depth while the second measures it. The measuring principle is known as "motion stereo": the basic idea is to use images retrieved from different known positions and to calculate the depth information from corresponding image parts.

(Figures: display of side-view images; junction with low visibility; side-view camera system in the front wheel housing)
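The motion stereo principle can be illustrated with the standard triangulation relation for a rectified image pair. This is a minimal sketch of the geometry only, not the BMW implementation; the focal length, baseline, and disparity values in the example are invented for illustration:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a scene point from the pixel disparity between two
    images taken at camera positions baseline_m apart.
    Assumes a rectified pinhole setup: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("corresponding point must shift between images")
    return focal_px * baseline_m / disparity_px

# Example: f = 800 px, the car moved 0.20 m between frames,
# and a corresponding image feature shifted by 32 px.
z = depth_from_disparity(800.0, 0.20, 32.0)  # 5.0 m
```

The same relation explains why a larger baseline improves depth resolution for distant objects but reduces image overlap for nearby ones.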

Some of the challenges one has to face are shearing effects when using rolling shutter cameras, smearing with global shutter, and misalignments whenever interlaced images are involved.

Moreover, current cameras suffer from weak sensitivity in low-light conditions. If an application should also run at night, current systems would require some kind of active illumination. This involves additional costs and installation space, and often leads to legal conflicts in some countries.

Since parking maneuvers are performed at relatively low speed, these weaknesses of camera technology are tolerable to some extent.

SENSORS

The position and orientation of a camera with respect to the vehicle is a parameter that has to be taken into account as well. Cameras are mostly divided into two categories with respect to their orientation and the functions applied.

The first class is the family of side-view cameras, which are mostly located in the front part of a vehicle. The optical axis of these cameras is parallel to the ground, so that they are well suited for "first views" in situations where the driver has a hidden line of sight, such as at the exit of a car park. Accordingly, side-view cameras are mostly equipped with standard lenses. From the geometrical point of view, these cameras are best suited for motion stereo, since the optical axis is perpendicular to the motion vector.

Another class of lateral cameras is the family of top-view cameras. Here the goal is to provide the driver with a virtual view of the close environment around the car. Accordingly, these cameras are positioned in more central parts of the body shell, where a wide-angle lens allows the right or left near field to be displayed. The orientation and opening angle of these systems negatively affect the quality of the retrieved depth values. On the other hand, the central positioning is also beneficial for parking maneuvers that include driving backwards.

Our approach supports both camera classes. However, the first realization is based on a side-view system in order to demonstrate the potential with respect to the accuracy of the retrieved 3D environment model.

COMPUTATION STRATEGIES

Once the camera setup has been fixed, the computation strategy has to be chosen. Again, two approaches can be discussed.

Since the computational power of automotive systems is limited to the possibilities of embedded hardware, one possible solution for retrieving depth information is to limit the number of complex arithmetic operations.

Accordingly, our first approach [1, 2] followed a feature-based strategy. The basic idea was to calculate characteristic features in subsequent images, which can be performed efficiently with current FPGA or DSP chips. In the next step, the expensive 3D correlations were calculated only for this relatively small number of spatial supporting points. This approach performed well in friendly conditions, i.e. as long as enough features could be derived from the environment.
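The feature stage can be pictured with a deliberately simple interest-point detector. The article does not describe the actual feature extractor used on the FPGA/DSP, so the variance-based criterion below is only a stand-in to show the principle of selecting a small set of textured supporting points before any expensive 3D computation:

```python
import numpy as np

def detect_features(img, patch=3, thresh=50.0, max_feats=200):
    """Keep pixels whose local intensity variance is high, i.e.
    textured corners and edges.  Only these sparse points would
    then enter the expensive correspondence / 3D stage."""
    h, w = img.shape
    scores = np.zeros_like(img, dtype=float)
    for y in range(patch, h - patch):
        for x in range(patch, w - patch):
            win = img[y - patch:y + patch + 1, x - patch:x + patch + 1]
            scores[y, x] = win.var()
    ys, xs = np.where(scores > thresh)
    order = np.argsort(-scores[ys, xs])[:max_feats]
    return list(zip(xs[order], ys[order]))
```

On a textureless surface this detector returns nothing, which is exactly the failure mode described below for darkness and homogeneous scenes.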

However, in challenging situations such as darkness, too few features could be retrieved, so that the resulting point clouds were very sparse and the determination of free parking areas varied in an unacceptable way.

The feature-based results and the higher performance of the embedded hardware available today led us to radically change our concept of depth calculation.

We use the principle of motion stereo to determine a depth for every pixel of every camera image using optical flow. Since these calculations are relatively expensive in terms of processing power, the use of efficient methods is advisable [3]. For these techniques a correct camera calibration is of eminent importance. In particular, a correction of the radial lens distortion (which is usually quite strong for automotive cameras) is indispensable. Large steering angles can also substantially increase the computational complexity and must be compensated in order to maintain real-time capability.
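The distortion correction can be sketched with the common two-coefficient radial model. The article does not state which model or coefficients BMW uses, so the polynomial form and the fixed-point inversion below are generic textbook choices, shown in normalized image coordinates:

```python
def undistort_point(xd, yd, k1, k2, iters=10):
    """Iteratively invert the radial distortion model
    x_d = x_u * (1 + k1*r^2 + k2*r^4), where r is the radius of
    the undistorted point in normalized image coordinates
    (principal point at the origin, unit focal length)."""
    xu, yu = xd, yd                      # initial guess: no distortion
    for _ in range(iters):
        r2 = xu * xu + yu * yu
        factor = 1.0 + k1 * r2 + k2 * r2 * r2
        xu, yu = xd / factor, yd / factor
    return xu, yu

# Example: barrel distortion (k1 < 0), as is typical for
# wide-angle automotive lenses.
x, y = undistort_point(0.29, 0.19, k1=-0.2, k2=0.05)
```

For moderate distortion this fixed-point iteration converges in a handful of steps, which is why it is popular on embedded targets.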

Using optical flow, a depth profile can be calculated from two camera images. From this profile a 3D reconstruction of the environment can be determined. The metric distance between the two camera centres (also called the baseline) can be obtained via odometry in the simple case. In combination with a visual computation of the travelled distance, robustness is increased enormously.

(Figures: top-view camera mounted in the exterior mirror; schematic field of view of a top-view system)
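One simple way to combine the odometric baseline with a visually computed travelled distance is inverse-variance weighting. The article does not specify the fusion rule actually used, so this is only a minimal, hypothetical illustration of why combining the two estimates increases robustness:

```python
def fuse_baseline(odo_m, odo_var, vis_m, vis_var):
    """Inverse-variance weighted average of two independent
    estimates of the distance travelled between two frames.
    The noisier estimate contributes less to the result."""
    w_odo = 1.0 / odo_var
    w_vis = 1.0 / vis_var
    return (w_odo * odo_m + w_vis * vis_m) / (w_odo + w_vis)

# Example: wheel odometry reports 0.22 m (noisy at low speed),
# the visual estimate reports 0.20 m with lower variance.
b = fuse_baseline(0.22, 0.004, 0.20, 0.001)
```

The fused value lands closer to the more reliable visual estimate, while the odometry still anchors the metric scale.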

The individual, partial 3D reconstructions are fused using efficient methods, so that over time a global, incremental 3D model of the environment evolves. Within this model, the detection of parking slots is achieved with classifiers that detect passable areas and obstacles. If such a passable area fulfills size constraints and is bounded by obstacles, then this area is a candidate for a parking slot. In a further step, an exact metric size is computed for these candidates.
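The slot-search step can be sketched on a simplified 1D occupancy profile along the driving direction. The real system operates on the fused 3D model; the cell size and minimum slot length below are illustrative assumptions, not values from the article:

```python
def find_parking_slots(profile, cell_m=0.1, min_len_m=5.0):
    """profile: booleans along the driving direction, True = obstacle,
    False = passable.  A candidate slot is a passable run that is long
    enough and bounded by obstacles on both sides.
    Returns a list of (start_m, length_m) tuples."""
    slots, run_start = [], None
    for i, occupied in enumerate(profile):
        if not occupied and run_start is None:
            run_start = i                      # free run begins
        elif occupied and run_start is not None:
            length = (i - run_start) * cell_m  # free run ends at an obstacle
            bounded_left = run_start > 0       # an obstacle preceded the run
            if bounded_left and length >= min_len_m:
                slots.append((run_start * cell_m, length))
            run_start = None
    return slots
```

A free run that reaches the end of the profile is deliberately not reported, since it lacks the bounding obstacle required for a slot candidate.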

While the feature-based approach computes distances only for a few points of a camera image, the optical flow approach has several advantages due to the immense amount of data:

: a higher detection rate of obstacles with homogeneous color

: better detection of object boundaries, which results in a higher measurement accuracy

However, there are still technical limitations; it is predominantly occlusions that cause problems. If specific parts of a scene are visible in only one camera image, then the distances for these areas are likely to be wrong. Furthermore, it must be noted that there are mathematical limitations for monocular systems when moving objects come into play. In certain cases, if the motion vectors of the own vehicle and an obstacle are collinear, then the motion of the obstacle is hard or impossible to determine without additional knowledge or interpretation of the data. In the worst case, this would lead to wrong distances.

TESTING AND RESULTS

During development, we evaluated several approaches, and we will now disclose some of our results. The feature-based approach mainly had problems with measurement accuracy. These resulted largely from the sporadic misassignment of features. In practice, this leads to wrong measurements of the size of parking slots or even to false detections. In particular, repetitive structures (foliage on the ground, for example) turned out to be a challenge. The optical flow based approach, on the other hand, is more robust in such cases and delivers stable and reproducible results.

During our analyses we focused mainly on quantifying the accuracy of the measurement of parking slots under varying modes of operation (high and low velocities, acceleration, braking, driving on a sinuous line). We present the results of a test run on 20 parking slots of different lengths. The obstacles were walls and other vehicles. The database for our analysis was created by repeatedly passing these slots (approximately 1000 measurements). As a reference, we measured every parking slot manually.

Regarding the analysis of the modes of operation, we discovered an increase in wrong measurements starting at around 30 km/h; hard braking or acceleration had only a marginal impact. While the optimal range of velocities is between 10 and 25 km/h, from 35 km/h onwards the camera baseline is too large for reliable distance estimation.
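The speed limit follows directly from the geometry: at a fixed frame rate, the baseline between consecutive frames grows linearly with speed. A back-of-the-envelope check (the 30 fps frame rate is an assumption for illustration; the article does not state the camera's frame rate):

```python
def baseline_m(speed_kmh, fps=30.0):
    """Distance travelled between two consecutive frames,
    i.e. the motion-stereo baseline at a given speed."""
    return (speed_kmh / 3.6) / fps

# At 25 km/h the per-frame baseline is about 0.23 m; at 35 km/h
# it exceeds 0.32 m, reducing image overlap for nearby lateral
# obstacles and making correspondence search less reliable.
```

Skipping frames or lowering the frame rate would shrink the effective overlap further, which is consistent with the reported degradation above roughly 30 km/h.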

The detection rate, at about 98.6 %, is relatively high. The distribution of the measurement error also exemplifies the robustness of the system. The crucial insight here is that the chance of estimating a parking slot as substantially wrong (too large) is less than 1 %. Furthermore, the shape of the distribution can be explained by a systematic error: the already mentioned occlusions cause specific parts of the obstacles to be estimated as too large, which is at the expense of free space.

Using highly optimized implementations, the computational load can be handled in real time for both concepts. Computing the optical flow between two camera frames is the most time-consuming part and can be accomplished in about 20 milliseconds on standard general-purpose CPUs. Most parts of the algorithms are parallelizable, since they perform localized operations which may be distributed to several processing units.

CONCLUSION

We presented how future advanced driver assistance systems could automatically detect and measure parking slots. Besides the well-known ultrasound sensors, cameras might play an important role due to their flexibility and their attractive price. We described different techniques for the automated acquisition of the environment: either using a feature-based approach or using optical flow. In our experiments we demonstrated that, by employing recent technologies from image and information processing, a robust detection and automated measurement of parking slots is possible in real time.

REFERENCES
[1] Wahl, E.; Strobel, T.; Ruß, A.; Rossberg, D.; Therburg, R.-D.: Realisierung eines Parkassistenten basierend auf Motion-Stereo (Realization of a Parking Assistant Based on Motion Stereo). 16. Aachener Kolloquium, VDI, 2007
[2] Wahl, E.; Therburg, R.-D.: Developing a Motion-Stereo Parking Assistant at BMW. MATLAB Digest, The MathWorks Inc., November 2008
[3] Unger, C.; Benhimane, S.; Wahl, E.; Navab, N.: Efficient Disparity Computation without Maximum Disparity for Real-Time Stereo Vision. British Machine Vision Conference, September 2009

(Figure: the distribution of the measurement error)

