Multisampling and Motion Blur in a Ray Tracer
Atulit Kumar∗ and Harry Gifford†
Carnegie Mellon University
Abstract
We describe a simple extension of MSAA to ray tracing and show how to
integrate motion blur with MSAA. MSAA extends naturally to a ray
tracer, but it does not integrate well with motion blur; to solve this
problem, we add a cache that reduces unnecessary shading. We then show
that the quality and performance of our MSAA + motion blur solution
make it worthwhile for many ray tracing workloads.
1 Introduction
Consider the following two functions from a generic ray
tracer:
// Get the closest intersection.
TriangleId intersect(Ray r);

// Get the color of the closest intersection.
Color shade(Ray r);
In general, the shade function is considerably more expensive to
compute than the intersect function. Even in the simplest shaders,
shade will call intersect dozens of times, and in most common
physically based rendering techniques the shade function is vastly
more complex still. We would therefore like to reduce the number of
times we shade as much as possible, while assuming that computing the
nearest intersection remains relatively cheap.
Coupled with the assumption that luminance varies relatively slowly
over the surface of a geometry, as well as over time, we can adapt a
reduced-shading anti-aliasing technique known as multi-sample
anti-aliasing (MSAA). The idea of MSAA is to reduce the number of
shades we perform per pixel.
Added complexity arises when we want to combine MSAA with motion blur.
To deal with this, we investigate the performance of caching.
In this paper we extend a ray tracer with MSAA and motion blur and
then evaluate the performance.
We largely extend the ideas presented in [Ragan-Kelley et al. 2011],
which explored rendering distributed effects with MSAA in a
rasterization setting. [Burns et al. 2010] propose a similar system
that essentially reorders the shading computations to reduce the
amount of shading, again in a rasterization setting.
[Apodaca and Gritz 1999] seems to describe the only ray tracer to
incorporate motion blur and multi-sample anti-aliasing (MSAA); Pixar
used it while rendering the movie ‘Cars’ [Christensen et al. 2006].
2 MSAA and Motion Blur
First we describe a simple extension of MSAA to ray tracing, and
then we describe how to extend MSAA to support motion blur.
∗e-mail: [email protected]
†e-mail: [email protected]
2.1 Extending MSAA to Ray Tracing
In Algorithm 1 we show the basic pseudo-code for implementing MSAA
in a ray tracer.
The main idea of MSAA is to exploit the assumption that surface
luminance varies gradually in space and time. This allows us to
assume that nearby shade points will have reasonably similar
luminance (color).
Our MSAA algorithm exploits cheap intersection tests to estimate the
relative weight of each primitive geometry within each pixel: the
ratio of a primitive's intersections to the total number of
intersections approximates how much of the pixel that primitive
covers.
We then shade one particular point on each geometry at random,
weight it by the estimated coverage for that pixel and then add
that color to the fragment buffer.
Ray differentials are computed between shaded ray angles only,
which requires a slight modification of the standard ray
differential computation.
Algorithm 1: Extension of MSAA for ray tracing.
input : pixels: image sample points, rays: set of sample points
output: MSAA anti-aliased image.
begin
    for (i, j) ∈ pixels do
        // Count the number of intersections with each primitive.
        weights ← map from primitiveId to int
        for ray ∈ rays do
            primitiveId ← intersect(ray)
            weights[primitiveId] ← weights[primitiveId] + 1
        end
        // Color one intersection point per primitive and weight it
        // by the number of intersections.
        pixels[i, j] ← Black
        for (id, numIsects) ∈ weights do
            ray ← randomRay(id)    // pick a ray that intersects ‘id’
            color ← shade(ray)
            weight ← numIsects / len(rays)
            pixels[i, j] ← pixels[i, j] + weight × color
        end
    end
end
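To make the per-pixel loop concrete, the following is a minimal Python sketch of Algorithm 1. It is an illustration only, not our renderer's code: `intersect` and `shade` are passed in as stand-ins, and color is a scalar luminance rather than an RGB triple for brevity.

```python
import random
from collections import defaultdict

def msaa_pixel(rays, intersect, shade):
    """One pixel of Algorithm 1: intersect every sample ray, then shade
    only one ray per covered primitive, weighted by estimated coverage."""
    weights = defaultdict(int)   # primitiveId -> intersection count
    hits = defaultdict(list)     # primitiveId -> sample rays that hit it
    for ray in rays:
        prim = intersect(ray)
        if prim is None:         # ray missed the scene entirely
            continue
        weights[prim] += 1
        hits[prim].append(ray)
    color = 0.0                  # "Black"; scalar luminance for brevity
    for prim, num_isects in weights.items():
        ray = random.choice(hits[prim])            # one shade point per primitive
        color += (num_isects / len(rays)) * shade(ray)
    return color
```

For example, with eight sample rays of which six hit one primitive and two hit another, only two shade calls are made instead of eight.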
Figure 1 shows the quality difference between super-sampling at 8
shades per pixel, versus MSAA at less than 1.4 shades per pixel.
The loss in quality is minimal, despite the significant reduction
in shading rate.
We would like to get this shading rate when combining MSAA with
motion blur, which turns out to be a non-trivial
modification.
Figure 1: Close up comparison between super-sampling (left) and
MSAA (right), of Bunny (above). Notice that there is minimal loss
in quality. A slight difference can be seen in the specular
highlights, which MSAA does not attempt to anti-alias.
2.2 Motion Blur
Briefly, we will describe a basic algorithm for implementing motion
blur.
The basic idea is to apply stochastic sampling to time as well as
space.
Consider rendering a frame whose shutter is open from time t0 to t1.
Each time we perform an intersection with a geometry we pick a time
point t at which to shade. To get t, we pick a random sample
δ ∈ [0, 1) and let t = t0 + δ(t1 − t0).
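That sampling step can be sketched in Python as follows; this is a hypothetical helper with an injectable random source so the example is reproducible, not code from our implementation.

```python
import random

def sample_time(t0, t1, rng=random.random):
    """Pick a stochastic shading time for motion blur:
    t = t0 + delta * (t1 - t0), with delta drawn from [0, 1)."""
    delta = rng()
    return t0 + delta * (t1 - t0)
```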
2.3 Combining MSAA with Motion Blur
The pseudo-code for our algorithm is presented in Algorithm
2.
Algorithm 2: Implementation of the shade cache for ray tracing.
input : scene: the 3D scene, rays: set of sample points
output: MSAA anti-aliased image.
begin
    primitiveId ← intersect(ray)
    point ← point of intersection on the differential geometry
    binId ← getBinIndex(point, primitiveId, rayDiffXY)
    if binId in cache then
        color ← interpolate(point, rayDiffXY)
    else
        color ← shade(ray)
        add color to cache
    end
end
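The control flow of Algorithm 2 can be sketched in Python as follows. This is an illustration only: the helpers `intersect`, `hit_point`, `bin_index` and `shade` are assumed stand-ins, and a plain dictionary lookup stands in for the weighted interpolation.

```python
def cached_shade(ray, intersect, hit_point, bin_index, shade, cache):
    """Shade with a per-block cache (Algorithm 2): each primitive is
    split into bins, and each bin is shaded at most once per frame."""
    prim = intersect(ray)
    point = hit_point(ray, prim)       # point on the differential geometry
    bin_id = bin_index(point, prim)    # block id (would also use ray differentials)
    if bin_id in cache:
        return cache[bin_id]           # hit: reuse; stands in for interpolate()
    color = shade(ray)                 # miss: shade and remember the result
    cache[bin_id] = color
    return color
```

Two rays that land in the same bin trigger only a single shade call; the second ray gets the cached color.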
While the basic algorithm described in the previous section works
well on scenes with static geometry, it breaks down once a point on a
geometry can move from one pixel to another within the same frame.
MSAA does not handle this because nearby points on the geometry can
move between pixels: we might shade one point p in pixel a, then move
to a nearby pixel b and intersect a point q that is very near p,
essentially wasting a shade. We would therefore like a way to detect
such cases and reuse the shading. This problem is shown clearly in
Figure 3: the naive MSAA algorithm performs very poorly at reusing
shades, even with very modest motion.
In order to preserve the MSAA assumptions listed in the introduction,
we added a caching mechanism behind the basic MSAA algorithm, which
catches the redundant shades introduced by motion blur.
The basic idea behind the algorithm is to split each primitive into
blocks. A block is a small subregion of a primitive, over which we
will ensure that we do not shade more than once. The size of each
block is dependent on the ray differentials of the particular ray
we are tracing [Igehy 1999].
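One plausible way to derive a bin index is to quantize the surface point on a grid whose cell size matches the ray-differential footprint. The sketch below is hypothetical and not the exact scheme in our implementation; `footprint` stands in for the ray-differential extent.

```python
import math

def bin_index(point, prim_id, footprint):
    """Quantize a 3D surface point to a grid cell roughly the size of
    the ray-differential footprint, so one bin covers about one
    pixel's worth of the primitive's surface."""
    cell = max(footprint, 1e-6)  # guard against a degenerate zero footprint
    return (prim_id,
            math.floor(point[0] / cell),
            math.floor(point[1] / cell),
            math.floor(point[2] / cell))
```

Points closer together than the footprint map to the same bin and so share a single shade; points farther apart get distinct bins.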
For each ray shade request, we first check whether the block has
been shaded or not.
If the block has been shaded, we take a weighted interpolation of the
surrounding shade regions (the actual interpolation scheme is not all
that important) and set the shade color to the interpolated color.
This scheme turns out to work well in practice and does not cost much
quality. This is at least partially explained by the fact that as
blur increases (longer blurs), the number of cache requests also
increases, which would result in some loss of accuracy because we
compute fewer shade points. However, this is offset by the general
rule that as objects get blurrier, inaccuracy becomes harder to spot.

(a) Super-sampling (b) MSAA + cache

Figure 2: Comparison between super-sampling and MSAA with caching.
(a) took 78.3s to render and (b) took 20.0s to render. This shows
that we get much better performance at minimal loss in quality.
If the block has not been shaded then we shade and store the point
in our cache.
In this project we did not investigate how large the cache should
be; we only checked that it did not overflow the free memory
available on our system. As shading becomes more and more expensive,
it may well be useful to have effectively unbounded caches so that
only in the very worst case do we ever miss a shade. Cycle-counting
the mean and standard deviation of a set of typical shades would give
a good indication of how much space to dedicate to the cache, and
which level of the memory hierarchy to spill it to.
3 Evaluation
To evaluate our workload, we started by varying the thread task size
and investigating its impact on cache performance. Fig. 4 shows the
impact of thread tasks of sizes 4, 128 and 512 pixels, with lighter
color meaning more shading. As the task width per thread increases,
we first see a speedup due to a decrease in the amount of overlapping
shading: since the blur generally stays within the same grid region,
we get more cache hits per core. On the other hand, if the tasks
become too large, render times steadily increase because of the high
variability in the workload. This theory is corroborated by the
heatmap in Fig. 4. In general we hope that the per-core caches are
disjoint, with as little redundancy across cores as possible, and
Fig. 4 shows that for large tile sizes this is generally true. A task
width of 256 pixels per thread was found to be the sweet spot for
most of the scenes.
The maximum speedup of our implementation is achieved on Bunny. To
make cache access thread-safe, we initially used mutex locks to
serialize access for each thread, but this resulted in a dramatic
drop in performance, primarily because each pixel was now shaded
sequentially. To overcome this problem, we made the cache local to
each thread, so we no longer had to worry about two threads reading
and writing data at the same time. This approach resulted in the
speedup seen in Fig. 6.
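In Python terms, the thread-local cache can be sketched with `threading.local`; this is an illustrative analogue rather than our implementation, which is not written in Python.

```python
import threading

_tls = threading.local()

def local_cache():
    """Return this thread's private shade cache, creating it on first
    use. No lock is needed, at the cost of some redundant shading when
    two threads touch the same block."""
    if not hasattr(_tls, "cache"):
        _tls.cache = {}
    return _tls.cache
```

Each worker thread sees its own dictionary, so reads and writes never race; the trade-off is that a block shaded by one thread may be shaded again by another.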
(a) Heatmap showing the number of shades for the MSAA algorithm on
a static scene, left, and a scene with motion blur, right. Red
indicates 64 shades per pixel and blue indicates one shade per
pixel. This figure shows that MSAA fails to save shading in scenes
with excessive motion blur.
(b) Above are two neighboring pixels. The blue crosses indicate
intersection tests and the red crosses indicate shades. The bold
red cross indicates two shades at different points in time on the
same triangle. In this case both shades should get the same color,
but the naive MSAA algorithm will overshade.
Figure 3: Motion blur can lead to over-shading.
Figure 4: Effect of thread task size on cache performance. Thread
task size of 4, 128 and 512 pixels from left to right. Lighter
color means more shading. Since we cache shades per thread, task
size has a large impact on the number of shades performed, as can
be seen from this heatmap.
(Plot: render time in seconds versus task width per thread, from 4 to
1,024 pixels.)
Figure 5: Performance of the algorithm as task width per thread
increases. As the granularity of the tasks gets bigger (fewer tasks),
we first see a speed-up as the amount of overlapping shading
decreases, but as tasks become too large we begin to get variability
in workload, which hurts performance.
(Bar chart: speedup per scene, for scenes (a) through (d).)
Figure 6: a: bunny, b: killeroo-simple, c: anim-killeroo-moving, d:
anim-moving-reflection. Speedup on scenes with 64 shades per sample,
compared to super-sampling.
4 Conclusion
The main idea of this paper was to show that MSAA and motion blur can
be combined and integrated into a ray tracer. We solve the problem by
using a cache to speed up the rendering process, shading as little as
possible and storing the results. Finally, we demonstrated a
significant performance increase achieved by using our method over
the more traditional super-sampling approach.
References
APODACA, A. A., AND GRITZ, L. 1999. Advanced RenderMan: Creating
CGI for Motion Pictures, 1st ed. Morgan Kaufmann Publishers Inc.,
San Francisco, CA, USA.
BURNS, C. A., FATAHALIAN, K., AND MARK, W. R. 2010. A lazy
object-space shading architecture with decoupled sampling. In
Proceedings of the Conference on High Performance Graphics,
Eurographics Association, Aire-la-Ville, Switzerland, Switzerland,
HPG ’10, 19–28.
CHRISTENSEN, P. H., FONG, J., LAUR, D. M., AND BATALI, D. 2006. Ray
Tracing for the Movie ‘Cars’. Symposium on Interactive Ray Tracing
0, 1–6.
IGEHY, H. 1999. Tracing ray differentials. In Proceedings of the
26th Annual Conference on Computer Graphics and Interactive
Techniques, ACM Press/Addison-Wesley Publishing Co., New York, NY,
USA, SIGGRAPH ’99, 179–186.