Multisampling and Motion Blur in a Ray Tracer
Atulit Kumar∗ and Harry Gifford†
Carnegie Mellon University
Abstract
We describe a simple extension of MSAA to ray tracing and show how to
integrate motion blur with MSAA. MSAA extends naturally to a ray
tracer, but it does not integrate well with motion blur; to solve this
problem, we add a cache that reduces unnecessary shading. We then show
that the quality and performance of our MSAA + motion blur solution
make it worthwhile for many ray tracing workloads.
1 Introduction
Consider the following two functions from a generic ray
tracer:
// Get the closest intersection.
TriangleId intersect(Ray r);

// Get the color of the closest intersection.
Color shade(Ray r);
In general, the shade function is considerably more expensive to
compute than the intersect function. Even in the simplest shaders,
shade will call intersect dozens of times, and in most common
physically based rendering techniques the shade function is vastly
more complex still. We would therefore like to reduce the number of
times we shade as much as possible, while assuming that computing the
nearest intersection remains relatively cheap.
Coupled with the assumption that luminance varies relatively slowly
over the surface of a geometry, as well as over time, we can adapt a
reduced-shading anti-aliasing technique known as multi-sample
anti-aliasing (MSAA). The idea of MSAA is to reduce the number of
shades we perform per pixel.
Added complexity arises when we want to combine MSAA with motion blur.
To deal with this, we investigate the performance of caching.
In this paper we extend a ray tracer with MSAA and motion blur and
then evaluate the performance.
We largely extend the ideas presented in [Ragan-Kelley et al. 2011],
which explored rendering distributed effects with MSAA in a
rasterization setting. [Burns et al. 2010] propose a similar system
that essentially reorders the shading computations to reduce the
amount of shading, again in a rasterization setting.
[Apodaca and Gritz 1999] seems to describe the only ray tracer to
incorporate motion blur and multi-sample anti-aliasing (MSAA); Pixar
used it while rendering the movie ‘Cars’ [Christensen et al. 2006].
2 MSAA and Motion Blur
First we describe a simple extension of MSAA to ray tracing, and
then we describe how to extend MSAA to support motion blur.
∗e-mail: [email protected]
†e-mail: [email protected]
2.1 Extending MSAA to Ray Tracing
In Algorithm 1 we show the basic pseudo-code for implementing MSAA
in a ray tracer.
The main idea of MSAA is to exploit the assumption that surface
luminance varies gradually in space and time. This allows us to
assume that nearby shade points will have reasonably similar
luminance (color).
Our MSAA algorithm exploits cheap intersection tests to estimate the
relative weight of each primitive geometry within each pixel: the
ratio of a primitive's intersections to the total number of
intersections approximates how much of the pixel that primitive
covers.
We then shade one particular point on each geometry at random,
weight it by the estimated coverage for that pixel and then add
that color to the fragment buffer.
Ray differentials are computed between shaded ray angles only,
which requires a slight modification of the standard ray
differential computation.
Algorithm 1: Extension of MSAA for ray tracing.
input : pixels: image sample points, rays: set of sample points
output: MSAA anti-aliased image.
begin
    for (i, j) ∈ pixels do
        // Count the number of intersections with each primitive.
        weights ← map from primitiveId to int
        for ray ∈ rays do
            primitiveId ← intersect(ray)
            weights[primitiveId] ← weights[primitiveId] + 1
        end
        // Color one intersection point per primitive and weight it
        // by the number of intersections.
        pixels[i, j] ← Black
        for (id, numIsects) ∈ weights do
            ray ← randomRay(id)    // pick a ray that intersects ‘id’
            color ← shade(ray)
            weight ← numIsects / len(rays)
            pixels[i, j] ← pixels[i, j] + weight × color
        end
    end
end
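To make the per-pixel loop concrete, the following is a minimal Python sketch of Algorithm 1. It is an illustration only, not our renderer's code: `intersect` and `shade` are passed in as stand-ins, and color is a scalar luminance rather than an RGB triple for brevity.

```python
import random
from collections import defaultdict

def msaa_pixel(rays, intersect, shade):
    """One pixel of Algorithm 1: intersect every sample ray, then shade
    only one ray per covered primitive, weighted by estimated coverage."""
    weights = defaultdict(int)   # primitiveId -> intersection count
    hits = defaultdict(list)     # primitiveId -> sample rays that hit it
    for ray in rays:
        prim = intersect(ray)
        if prim is None:         # ray missed the scene entirely
            continue
        weights[prim] += 1
        hits[prim].append(ray)
    color = 0.0                  # "Black"; scalar luminance for brevity
    for prim, num_isects in weights.items():
        ray = random.choice(hits[prim])            # one shade point per primitive
        color += (num_isects / len(rays)) * shade(ray)
    return color
```

For example, with eight sample rays of which six hit one primitive and two hit another, only two shade calls are made instead of eight.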
Figure 1 shows the quality difference between super-sampling at 8
shades per pixel, versus MSAA at less than 1.4 shades per pixel.
The loss in quality is minimal, despite the significant reduction
in shading rate.
We would like to get this shading rate when combining MSAA with
motion blur, which turns out to be a non-trivial
modification.
Figure 1: Close up comparison between super-sampling (left) and
MSAA (right), of Bunny (above). Notice that there is minimal loss
in quality. A slight difference can be seen in the specular
highlights, which MSAA does not attempt to anti-alias.
2.2 Motion Blur
Briefly, we will describe a basic algorithm for implementing motion
blur.
The basic idea is to apply stochastic sampling to time as well as
space.
Consider rendering a frame whose shutter is open from time t0 to t1.
Each time we perform an intersection with a geometry we pick a time
point t at which to shade. To get t, we pick a random sample
δ ∈ [0, 1) and let t = t0 + δ(t1 − t0).
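That sampling step can be sketched in Python as follows; this is a hypothetical helper with an injectable random source so the example is reproducible, not code from our implementation.

```python
import random

def sample_time(t0, t1, rng=random.random):
    """Pick a stochastic shading time for motion blur:
    t = t0 + delta * (t1 - t0), with delta drawn from [0, 1)."""
    delta = rng()
    return t0 + delta * (t1 - t0)
```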
2.3 Combining MSAA with Motion Blur
The pseudo-code for our algorithm is presented in Algorithm
2.
Algorithm 2: Implementation of the shade cache for ray tracing.
input : scene: the 3D scene, rays: set of sample points
output: MSAA anti-aliased image.
begin
    primitiveId ← intersect(ray)
    point ← point of intersection on the differential geometry
    binId ← getBinIndex(point, primitiveId, rayDiffXY)
    if binId in cache then
        color ← interpolate(point, rayDiffXY)
    else
        color ← shade(ray)
        add color to cache
    end
end
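The control flow of Algorithm 2 can be sketched in Python as follows. This is an illustration only: the helpers `intersect`, `hit_point`, `bin_index` and `shade` are assumed stand-ins, and a plain dictionary lookup stands in for the weighted interpolation.

```python
def cached_shade(ray, intersect, hit_point, bin_index, shade, cache):
    """Shade with a per-block cache (Algorithm 2): each primitive is
    split into bins, and each bin is shaded at most once per frame."""
    prim = intersect(ray)
    point = hit_point(ray, prim)       # point on the differential geometry
    bin_id = bin_index(point, prim)    # block id (would also use ray differentials)
    if bin_id in cache:
        return cache[bin_id]           # hit: reuse; stands in for interpolate()
    color = shade(ray)                 # miss: shade and remember the result
    cache[bin_id] = color
    return color
```

Two rays that land in the same bin trigger only a single shade call; the second ray gets the cached color.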
While the basic algorithm described in the previous section works
well on scenes with static geometry, it breaks down once a point on a
geometry can move from one pixel to another within the same frame.
MSAA does not handle this because nearby points on the geometry can
move between pixels: we might shade one point p in pixel a, then move
to a nearby pixel b and intersect a point q that is very near p,
essentially wasting a shade. We would therefore like a way to detect
such cases and reuse the shading. This problem is shown clearly in
Figure 3: the naive MSAA algorithm performs very poorly at reusing
shades, even with very modest motion.
In order to preserve the MSAA assumptions listed in the introduction,
we added a caching mechanism behind the basic MSAA algorithm, which
catches the redundant shades introduced by motion blur.
The basic idea behind the algorithm is to split each primitive into
blocks. A block is a small subregion of a primitive, over which we
will ensure that we do not shade more than once. The size of each
block is dependent on the ray differentials of the particular ray
we are tracing [Igehy 1999].
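One plausible way to derive a bin index is to quantize the surface point on a grid whose cell size matches the ray-differential footprint. The sketch below is hypothetical and not the exact scheme in our implementation; `footprint` stands in for the ray-differential extent.

```python
import math

def bin_index(point, prim_id, footprint):
    """Quantize a 3D surface point to a grid cell roughly the size of
    the ray-differential footprint, so one bin covers about one
    pixel's worth of the primitive's surface."""
    cell = max(footprint, 1e-6)  # guard against a degenerate zero footprint
    return (prim_id,
            math.floor(point[0] / cell),
            math.floor(point[1] / cell),
            math.floor(point[2] / cell))
```

Points closer together than the footprint map to the same bin and so share a single shade; points farther apart get distinct bins.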
For each ray shade request, we first check whether the block has
been shaded or not.
If the block has been shaded, we take a weighted interpolation of the
surrounding shade regions (the actual interpolation scheme is not all
that important) and set the shade color to the interpolated color.
This scheme turns out to work well in practice and does not cost much
quality. This is at least partially explained by the fact that as
blur increases (longer blurs), the number of cache requests also
increases, which would result in some loss of accuracy because we
compute fewer shade points. However, this is offset by the general
rule that as objects get blurrier, inaccuracy becomes harder to spot.

(a) Super-sampling (b) MSAA + cache

Figure 2: Comparison between super-sampling and MSAA with caching.
(a) took 78.3s to render and (b) took 20.0s to render. This shows
that we get much better performance at minimal loss in quality.
If the block has not been shaded then we shade and store the point
in our cache.
In this project we did not investigate how large the cache should
be; we only checked that it did not overflow the free memory
available on our system. As shading becomes more and more expensive,
it may well be useful to have effectively unbounded caches so that
only in the very worst case do we ever miss a shade. Cycle-counting
the mean and standard deviation of a set of typical shades would give
a good indication of how much space to dedicate to the cache, and
which level of the memory hierarchy to spill it to.
3 Evaluation
To evaluate our workload, we started by varying the thread task size
and investigating its impact on cache performance. Fig. 4 shows the
impact of thread tasks of sizes 4, 128 and 512 pixels, with lighter
color meaning more shading. As the task width per thread increases,
we first see a speedup due to a decrease in the amount of overlapping
shading: since the blur generally stays within the same grid region,
we get more cache hits per core. On the other hand, if the tasks
become too large, render times steadily increase because of the high
variability in the workload. This theory is corroborated by the
heatmap in Fig. 4. In general we hope that the per-core caches are
disjoint, with as little redundancy across cores as possible, and
Fig. 4 shows that for large tile sizes this is generally true. A task
width of 256 pixels per thread was found to be the sweet spot for
most of the scenes.
The maximum speedup of our implementation is achieved on Bunny. To
make cache access thread-safe, we initially used mutex locks to
serialize access for each thread, but this resulted in a dramatic
drop in performance, primarily because each pixel was now shaded
sequentially. To overcome this problem, we made the cache local to
each thread, so we no longer had to worry about two threads reading
and writing data at the same time. This approach resulted in the
speedup seen in Fig. 6.
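In Python terms, the thread-local cache can be sketched with `threading.local`; this is an illustrative analogue rather than our implementation, which is not written in Python.

```python
import threading

_tls = threading.local()

def local_cache():
    """Return this thread's private shade cache, creating it on first
    use. No lock is needed, at the cost of some redundant shading when
    two threads touch the same block."""
    if not hasattr(_tls, "cache"):
        _tls.cache = {}
    return _tls.cache
```

Each worker thread sees its own dictionary, so reads and writes never race; the trade-off is that a block shaded by one thread may be shaded again by another.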
(a) Heatmap showing the number of shades for the MSAA algorithm on
a static scene, left, and a scene with motion blur, right. Red
indicates 64 shades per pixel and blue indicates one shade per
pixel. This figure shows that MSAA fails to save shading in scenes
with excessive motion blur.
(b) Above are two neighboring pixels. The blue crosses indicate
intersection tests and the red crosses indicate shades. The bold
red cross indicates two shades at different points in time on the
same triangle. In this case both shades should get the same color,
but the naive MSAA algorithm will overshade.
Figure 3: Motion blur can lead to over-shading.
Figure 4: Effect of thread task size on cache performance. Thread
task size of 4, 128 and 512 pixels from left to right. Lighter
color means more shading. Since we cache shades per thread, task
size has a large impact on the number of shades performed, as can
be seen from this heatmap.
(Plot: render time in seconds versus task width per thread, from 4 to
1,024 pixels.)
Figure 5: Performance of the algorithm as task width per thread
increases. As the granularity of the tasks gets bigger (fewer tasks),
we first see a speed-up as the amount of overlapping shading
decreases, but as tasks become too large we begin to get variability
in workload, which hurts performance.
(Bar chart: speedup per scene, for scenes (a) through (d).)
Figure 6: a: bunny, b: killeroo-simple, c: anim-killeroo-moving, d:
anim-moving-reflection. Speedup on scenes with 64 shades per sample,
compared to super-sampling.
4 Conclusion
The main idea of this paper was to show that MSAA and motion blur can
be combined and integrated into a ray tracer. We solve the problem by
using a cache to speed up the rendering process, shading as little as
possible and storing the results. Finally, we demonstrated a
significant performance increase achieved by using our method over
the more traditional super-sampling approach.
References
APODACA, A. A., AND GRITZ, L. 1999. Advanced RenderMan: Creating
CGI for Motion Pictures, 1st ed. Morgan Kaufmann Publishers Inc.,
San Francisco, CA, USA.
BURNS, C. A., FATAHALIAN, K., AND MARK, W. R. 2010. A lazy
object-space shading architecture with decoupled sampling. In
Proceedings of the Conference on High Performance Graphics,
Eurographics Association, Aire-la-Ville, Switzerland, Switzerland,
HPG ’10, 19–28.
CHRISTENSEN, P. H., FONG, J., LAUR, D. M., AND BATALI, D. 2006. Ray
Tracing for the Movie ‘Cars’. Symposium on Interactive Ray Tracing
0, 1–6.
IGEHY, H. 1999. Tracing ray differentials. In Proceedings of the
26th Annual Conference on Computer Graphics and Interactive
Techniques, ACM Press/Addison-Wesley Publishing Co., New York, NY,
USA, SIGGRAPH ’99, 179–186.